Large-scale correlation analysis of automated metrics for topic models
Automated coherence metrics constitute an important and popular way to evaluate topic models. Previous works present a mixed picture of their presumed correlation with human judgement. In this paper, we conduct a large-scale correlation analysis of coherence metrics. We propose a novel sampling appr...
Main Authors: | LIM, Jia Peng; LAUW, Hady Wirawan |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2023 |
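Subjects: | Automated metric; Coherence metric; Correlation analysis; Human decision-making; Human judgments; Large corpora; Large-scale correlations; Metric evaluation; Topic Modeling; User study; Databases and Information Systems |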
Online Access: | https://ink.library.smu.edu.sg/sis_research/8346 https://ink.library.smu.edu.sg/context/sis_research/article/9349/viewcontent/acl23.pdf |
Institution: | Singapore Management University |
id | sg-smu-ink.sis_research-9349 |
---|---|
record_format | dspace |
spelling | sg-smu-ink.sis_research-9349; 2023-12-13T03:38:04Z; Large-scale correlation analysis of automated metrics for topic models; LIM, Jia Peng; LAUW, Hady Wirawan; Automated coherence metrics constitute an important and popular way to evaluate topic models. Previous works present a mixed picture of their presumed correlation with human judgement. In this paper, we conduct a large-scale correlation analysis of coherence metrics. We propose a novel sampling approach to mine topics for the purpose of metric evaluation, and conduct the analysis via three large corpora showing that certain automated coherence metrics are correlated. Moreover, we extend the analysis to measure topical differences between corpora. Lastly, we examine the reliability of human judgement by conducting an extensive user study, which is designed as an amalgamation of different proxy tasks to derive a finer insight into the human decision-making processes. Our findings reveal some correlation between automated coherence metrics and human judgement, especially for generic corpora.; 2023-07-01T07:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/8346; info:doi/10.18653/v1/2023.acl-long.776; https://ink.library.smu.edu.sg/context/sis_research/article/9349/viewcontent/acl23.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Automated metric; Coherence metric; Correlation analysis; Human decision-making; Human judgments; Large corpora; Large-scale correlations; Metric evaluation; Topic Modeling; User study; Databases and Information Systems |
institution | Singapore Management University |
building | SMU Libraries |
continent | Asia |
country | Singapore |
content_provider | SMU Libraries |
collection | InK@SMU |
language | English |
topic | Automated metric; Coherence metric; Correlation analysis; Human decision-making; Human judgments; Large corpora; Large-scale correlations; Metric evaluation; Topic Modeling; User study; Databases and Information Systems |
description | Automated coherence metrics constitute an important and popular way to evaluate topic models. Previous works present a mixed picture of their presumed correlation with human judgement. In this paper, we conduct a large-scale correlation analysis of coherence metrics. We propose a novel sampling approach to mine topics for the purpose of metric evaluation, and conduct the analysis via three large corpora showing that certain automated coherence metrics are correlated. Moreover, we extend the analysis to measure topical differences between corpora. Lastly, we examine the reliability of human judgement by conducting an extensive user study, which is designed as an amalgamation of different proxy tasks to derive a finer insight into the human decision-making processes. Our findings reveal some correlation between automated coherence metrics and human judgement, especially for generic corpora. |
format | text |
author | LIM, Jia Peng; LAUW, Hady Wirawan |
author_facet | LIM, Jia Peng; LAUW, Hady Wirawan |
author_sort | LIM, Jia Peng |
title | Large-scale correlation analysis of automated metrics for topic models |
publisher | Institutional Knowledge at Singapore Management University |
publishDate | 2023 |
url | https://ink.library.smu.edu.sg/sis_research/8346 https://ink.library.smu.edu.sg/context/sis_research/article/9349/viewcontent/acl23.pdf |
_version_ | 1787136838288277504 |