Disentangling transformer language models as superposed topic models
Topic Modelling is an established research area where the quality of a given topic is measured using coherence metrics. Often, we infer topics from Neural Topic Models (NTM) by interpreting their decoder weights, consisting of top-activated words projected from individual neurons. Transformer-based Language Models (TLM) similarly consist of decoder weights. However, due to its hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple coherent topics. Our results show that it is empirically feasible to disentangle coherent topics from GPT-2 models using the Wikipedia corpus. We validate this approach for GPT-2 models using Zero-Shot Topic Modelling. Finally, we extend the proposed approach to disentangle and analyse LLaMA models.
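The sketch below is an illustrative assumption, not code from the paper: it shows the NTM-style reading of decoder weights that the abstract refers to, where a single residual-stream dimension ("neuron") of GPT-2's unembedding matrix is read off as the tokens it most strongly promotes. The Hugging Face "gpt2" checkpoint and the neuron index are hypothetical choices made only for the example.

```python
# Minimal sketch (not the authors' released code), assuming the Hugging Face
# "gpt2" checkpoint: read one decoder/unembedding neuron NTM-style by listing
# the vocabulary tokens whose output logits that dimension boosts the most.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

W_U = model.lm_head.weight    # unembedding matrix, shape (vocab_size, d_model)
neuron = 42                   # illustrative residual-stream dimension, chosen arbitrarily

# Top-activated tokens for this neuron: rows of W_U with the largest weight
# in column `neuron` correspond to the tokens this dimension promotes.
top_ids = torch.topk(W_U[:, neuron], k=10).indices.tolist()
print([tokenizer.decode([i]) for i in top_ids])
```

Under the superposition hypothesis discussed in the abstract, a single column like this typically mixes several topics; the paper's weight-based, corpus-agnostic search aims to separate such neurons into multiple coherent topics rather than reading each column directly.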
Main Authors: LIM, Jia Peng; LAUW, Hady Wirawan
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2023
Subjects: Databases and Information Systems; Programming Languages and Compilers
Online Access: https://ink.library.smu.edu.sg/sis_research/8470 ; https://ink.library.smu.edu.sg/context/sis_research/article/9473/viewcontent/2023.emnlp_main.534__1_.pdf
Institution: Singapore Management University
id: sg-smu-ink.sis_research-9473
record_format: dspace
license: http://creativecommons.org/licenses/by-nc-nd/4.0/
institution: Singapore Management University
building: SMU Libraries
continent: Asia
country: Singapore
content_provider: SMU Libraries
collection: InK@SMU (Research Collection School Of Computing and Information Systems)
language: English
topic: Databases and Information Systems; Programming Languages and Compilers
format: text (application/pdf)
author: LIM, Jia Peng; LAUW, Hady Wirawan
author_sort: LIM, Jia Peng
title: Disentangling transformer language models as superposed topic models
publisher: Institutional Knowledge at Singapore Management University
publishDate: 2023 (2023-12-01)
url: https://ink.library.smu.edu.sg/sis_research/8470
url: https://ink.library.smu.edu.sg/context/sis_research/article/9473/viewcontent/2023.emnlp_main.534__1_.pdf
_version_: 1787590775475798016