Aligning human and computational coherence evaluations

Automated coherence metrics constitute an efficient and popular way to evaluate topic models. Previous work presents a mixed picture of their presumed correlation with human judgment. This work proposes a novel sampling approach to mining topic representations at a large scale while seeking to mitigate bias from sampling, enabling the investigation of widely used automated coherence metrics via large corpora. Additionally, this article proposes a novel user study design, an amalgamation of different proxy tasks, to derive a finer insight into the human decision-making processes. This design subsumes the purpose of simple rating and outlier-detection user studies. Similar to the sampling approach, the user study conducted is extensive, comprising 40 study participants split into eight different study groups tasked with evaluating their respective set of 100 topic representations. Usually, when substantiating the use of these metrics, human responses are treated as the gold standard. This article further investigates the reliability of human judgment by flipping the comparison and conducting a novel extended analysis of human response at the group and individual level against a generic corpus. The investigation results show a moderate to good correlation between these metrics and human judgment, especially for generic corpora, and derive further insights into the human perception of coherence. Analyzing inter-metric correlations across corpora shows moderate to good correlation among these metrics. As these metrics depend on corpus statistics, this article further investigates the topical differences between corpora, revealing nuances in applications of these metrics.

Bibliographic Details
Main Authors: LIM, Jia Peng; LAUW, Hady Wirawan
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2024
Subjects: Vocabulary; decision-making processes; topic models; Computational Engineering; Databases and Information Systems; Linguistics
Online Access:https://ink.library.smu.edu.sg/sis_research/9427
https://ink.library.smu.edu.sg/context/sis_research/article/10427/viewcontent/coli_a_00518_pvoa_cc_nc_nd.pdf
DOI: 10.1162/coli_a_00518
License: CC BY-NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Collection: Research Collection School Of Computing and Information Systems
Institution: Singapore Management University