LA-HCN: label-based attention for hierarchical multi-label text classification neural network

Hierarchical multi-label text classification (HMTC) has been gaining popularity in recent years thanks to its applicability to a plethora of real-world applications. The existing HMTC algorithms largely focus on the design of classifiers, such as the local, global, or a combination of them. However,...

Full description

Saved in:
Bibliographic Details
Main Authors: Zhang, Xinyi, Xu, Jiahao, Soh, Charlie, Chen, Lihui
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language:English
Published: 2022
Subjects:
Online Access:https://hdl.handle.net/10356/160673
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Hierarchical multi-label text classification (HMTC) has been gaining popularity in recent years thanks to its applicability to a plethora of real-world applications. The existing HMTC algorithms largely focus on the design of classifiers, such as the local, global, or a combination of them. However, very few studies have focused on hierarchical feature extraction and explore the association between the hierarchical labels and the text. In this paper, we propose a Label-based Attention for Hierarchical Multi-label Text Classification Neural Network (LA-HCN), where the novel label-based attention module is designed to hierarchically extract important information from the text based on the labels from different hierarchy levels. Besides, hierarchical information is shared across levels while preserving the hierarchical label-based information. Separate local and global document embeddings are obtained and used to facilitate the respective local and global classifications. In our experiments, LA-HCN outperforms other state-of-the-art neural network-based HMTC algorithms on four public HMTC datasets. The ablation study also demonstrates the effectiveness of the proposed label-based attention module as well as the novel local and global embeddings and classifications. By visualizing the learned attention (words), we find that LA-HCN is able to extract meaningful information corresponding to the different labels which provides explainability that may be helpful for the human analyst.