Disentangling transformer language models as superposed topic models

Topic Modelling is an established research area where the quality of a given topic is measured using coherence metrics. Often, we infer topics from Neural Topic Models (NTM) by interpreting their decoder weights, consisting of top-activated words projected from individual neurons. Transformer-based Language Models (TLM) similarly consist of decoder weights. However, due to their hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple coherent topics. Our results show that it is empirically feasible to disentangle coherent topics from GPT-2 models using the Wikipedia corpus. We validate this approach for GPT-2 models using Zero-Shot Topic Modelling. Finally, we extend the proposed approach to disentangle and analyse LLaMA models.
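As context for the abstract above, the following is a minimal sketch of the decoder-weight reading it refers to: projecting a single hidden-space direction through a decoder-only model's unembedding (decoder) weights and treating the top-scoring vocabulary words as a candidate topic. It assumes the Hugging Face transformers library and the public gpt2 checkpoint, and the neuron index is hypothetical; this illustrates the general idea only, not the authors' actual search-and-disentanglement procedure.

```python
# Sketch only (not the paper's method): read which vocabulary words a single
# residual-stream direction in GPT-2 promotes most strongly, and treat the
# top-k words as a candidate "topic".
# Assumes: transformers + torch installed, "gpt2" checkpoint available.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# lm_head.weight has shape (vocab_size, hidden_size); column j gives each
# word's logit contribution from hidden dimension j.
W = model.lm_head.weight.detach()  # (50257, 768) for gpt2

def top_words_for_direction(direction: torch.Tensor, k: int = 10):
    """Project a hidden-space direction onto the vocabulary and return the
    k words with the largest logit contribution."""
    scores = W @ direction                 # (vocab_size,)
    top = torch.topk(scores, k).indices
    return [tokenizer.decode([i]).strip() for i in top.tolist()]

# Example: inspect one basis direction (a single "neuron" of the residual stream).
basis = torch.zeros(W.shape[1])
basis[42] = 1.0                            # hypothetical neuron index
print(top_words_for_direction(basis))
```

A superposed view, as the abstract describes, would instead search for combinations of such directions whose top words form multiple coherent topics per neuron, rather than reading basis directions one at a time.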

Bibliographic Details
Main Authors: LIM, Jia Peng; LAUW, Hady Wirawan
Format: text (application/pdf)
Language: English
Published: Institutional Knowledge at Singapore Management University, 2023
Subjects: Databases and Information Systems; Programming Languages and Compilers
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems, InK@SMU
Online Access: https://ink.library.smu.edu.sg/sis_research/8470
https://ink.library.smu.edu.sg/context/sis_research/article/9473/viewcontent/2023.emnlp_main.534__1_.pdf
Institution: Singapore Management University