Towards reinterpreting neural topic models via composite activations


Full description

Bibliographic Details
Main Authors: LIM, Jia Peng, LAUW, Hady Wirawan
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2022
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/7610
https://ink.library.smu.edu.sg/context/sis_research/article/8613/viewcontent/emnlp22.pdf
Physical Description
Summary: Most Neural Topic Models (NTM) use a variational auto-encoder framework producing K topics limited to the size of the encoder’s output. These topics are interpreted through the selection of the top activated words via the weights or reconstructed vector of the decoder that are directly connected to each neuron. In this paper, we present a model-free two-stage process to reinterpret NTM and derive further insights on the state of the trained model. First, building on the original information from a trained NTM, we generate a pool of potential candidate “composite topics” by exploiting possible co-occurrences within the original set of topics, which decouples the strict interpretation of topics from the original NTM. This is followed by a combinatorial formulation to select a final set of composite topics, which we evaluate for coherence and diversity on a large external corpus. Lastly, we employ a user study to derive further insights on the reinterpretation process.
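The two-stage process in the summary can be sketched minimally. The toy topic word lists, the pairwise-overlap candidate generation, the `min_overlap` threshold, and the greedy diversity selection below are all illustrative assumptions standing in for the paper's actual combinatorial formulation and coherence/diversity evaluation.

```python
import itertools

# Hypothetical top-word lists for K=4 topics from a trained NTM
# (illustrative data, not from the paper).
topics = [
    ["game", "team", "player", "season", "score"],
    ["player", "music", "band", "album", "score"],
    ["market", "stock", "price", "trade", "share"],
    ["price", "market", "game", "sale", "deal"],
]

def candidate_composites(topics, min_overlap=2):
    """Stage 1 (sketch): pool candidate 'composite topics' from word
    co-occurrences across pairs of original topics."""
    pool = []
    for a, b in itertools.combinations(topics, 2):
        shared = [w for w in a if w in b]  # words the two topics share
        if len(shared) >= min_overlap:
            pool.append(shared)
    return pool

def select_diverse(pool, k=2):
    """Stage 2 (sketch): greedily select k composites with disjoint word
    sets, a crude stand-in for the combinatorial selection step."""
    chosen = []
    for cand in sorted(pool, key=len, reverse=True):
        if all(not set(cand) & set(c) for c in chosen):
            chosen.append(cand)
        if len(chosen) == k:
            break
    return chosen

pool = candidate_composites(topics)
final = select_diverse(pool)
```

In the paper the selection stage is a combinatorial optimization scored for coherence and diversity on a large external corpus; the greedy disjoint-set rule here only conveys the overall shape of the pipeline.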