A Survey on Forms of Visualization and Tools Used in Topic Modelling
In this paper, we surveyed recent publications on topic modeling and analyzed the forms of visualizations and tools used. Expectedly, this information will help Natural Language Processing (NLP) researchers to make better decisions about which types of visualization are appropriate for them and whic...
Saved in:
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
JOIV
2023
|
Subjects: | |
Online Access: | http://eprints.uthm.edu.my/11627/1/J16136_dd38e6fdb59711e4af7fdf32b619097b.pdf http://eprints.uthm.edu.my/11627/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Universiti Tun Hussein Onn Malaysia |
Language: | English |
Summary: | In this paper, we surveyed recent publications on topic modeling and analyzed the forms of visualizations and tools used. Expectedly, this information will help Natural Language Processing (NLP) researchers to make better decisions about which types of visualization are appropriate for them and which tools can help them. This could also spark further development of existing visualizations or the emergence of new visualizations if a gap is present. Topic modeling is an NLP technique used to identify topics hidden in a collection of documents. Visualizing these topics permits a faster understanding of the underlying subject matter in terms
of its domain. This survey covered publications from 2017 to early 2022. The PRISMA methodology was used to review the publications. One hundred articles were collected, and 42 were found eligible for this study after filtration. Two research questions were formulated. The first question asks, "What are the different forms of visualizations used to display the result of topic modeling?" and the second question is "What visualization software or API is used? From our results, we discovered that different forms of visualizations meet
different purposes of their display. We categorized them as maps, networks, evolution-based charts, and others. We also discovered that LDAvis is the most frequently used software/API, followed by the R language packages and D3.js. The primary limitation of this survey is it is not exhaustive. Hence, some eligible publications may not be included. |
---|