MAGNET ARCHITECTURE OPTIMIZATION ON MULTI-LABEL TEXT CLASSIFICATION
Multi-label text classification is the task of assigning each text to one or more categories. MAGNET is a deep learning model architecture that combines Graph Attention Networks, BiLSTM, and BERT embeddings to address this task. MAGNET uses Graph Attention Networks to capture dependency information between labels...
Main Author: | Adrinta Abdurrazzaq, Muhammad
---|---
Format: | Theses
Language: | Indonesia
Online Access: | https://digilib.itb.ac.id/gdl/view/58050
Institution: | Institut Teknologi Bandung
id | id-itb.:58050
---|---
spelling | id-itb.:58050 2021-08-30T12:42:59Z; MAGNET ARCHITECTURE OPTIMIZATION ON MULTI-LABEL TEXT CLASSIFICATION; Adrinta Abdurrazzaq, Muhammad; Indonesia; Theses; multi-label text classification, graph representation learning, natural language processing, pretraining; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/58050; text
institution | Institut Teknologi Bandung
building | Institut Teknologi Bandung Library
continent | Asia
country | Indonesia
content_provider | Institut Teknologi Bandung
collection | Digital ITB
language | Indonesia
description |
Multi-label text classification is the task of assigning each text to one or more categories. MAGNET is a deep learning model architecture that combines Graph Attention Networks, BiLSTM, and BERT embeddings to address the multi-label text classification task. MAGNET uses Graph Attention Networks to capture dependency information between labels by applying attention over the label dependency graph. MAGNET has limitations in handling data with many labels: the adjacency matrix that is formed becomes very large, and the model becomes difficult to train because it requires large computational resources.
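To see why the matrix grows so quickly, consider how a label dependency graph is commonly built from the training targets. The sketch below assumes a simple co-occurrence construction; the function and variable names are illustrative and not taken from the MAGNET implementation.

```python
# Hypothetical sketch: building an L x L label co-occurrence matrix
# from a binary indicator matrix of multi-label targets.
import numpy as np

def label_adjacency(y: np.ndarray) -> np.ndarray:
    """y: (n_samples, n_labels) binary indicator matrix."""
    adj = y.T @ y                # co-occurrence counts, shape (L, L)
    np.fill_diagonal(adj, 0)     # drop self-loops
    return adj
```

Because the matrix is L x L, both its memory footprint and the cost of attending over it grow quadratically with the number of labels L.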
In this research, label clustering was used to reduce the dimension of the adjacency matrix. The labels are first grouped into several clusters; the labels in the same cluster then form their own, smaller adjacency matrix. The Louvain algorithm is used to cluster the labels because it operates on graph data structures: since the adjacency matrix already represents a graph of the dependencies between labels, it can be used directly as input to the Louvain algorithm.
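A minimal sketch of this clustering step, building on the hypothetical `label_adjacency` helper above and using NetworkX's Louvain implementation (the thesis does not specify which implementation was used):

```python
# Hypothetical sketch: split one L x L adjacency matrix into smaller
# per-cluster blocks using Louvain community detection.
import numpy as np
import networkx as nx
from networkx.algorithms.community import louvain_communities

def cluster_adjacency(adj: np.ndarray) -> list:
    graph = nx.from_numpy_array(adj)  # weighted label graph
    clusters = louvain_communities(graph, weight="weight", seed=0)
    blocks = []
    for cluster in clusters:
        idx = np.array(sorted(cluster))
        # Keep only intra-cluster weights; dependency weights between
        # labels in different clusters are discarded.
        blocks.append(adj[np.ix_(idx, idx)])
    return blocks
```

Each per-cluster block can then serve as the (much smaller) input graph for attention over labels; the discarded inter-cluster weights are the information loss discussed in the results below.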
In addition, fine-tuning layers using BiGRU and an embedding method using XLNet were tried, because BiGRU and XLNet have shown better performance than BiLSTM and BERT in other studies.
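The swap itself is a small architectural change. The sketch below shows one plausible way to wire XLNet embeddings into a BiGRU fine-tuning layer using the Hugging Face transformers library; the checkpoint name and hidden size are assumptions, not details from the thesis.

```python
# Hypothetical sketch: XLNet embeddings feeding a BiGRU, replacing
# the BERT + BiLSTM pair used by the base MAGNET architecture.
import torch.nn as nn
from transformers import XLNetModel

class XLNetBiGRUEncoder(nn.Module):
    def __init__(self, hidden_size: int = 256):
        super().__init__()
        self.xlnet = XLNetModel.from_pretrained("xlnet-base-cased")
        self.bigru = nn.GRU(
            input_size=self.xlnet.config.hidden_size,
            hidden_size=hidden_size,
            bidirectional=True,
            batch_first=True,
        )

    def forward(self, input_ids, attention_mask):
        emb = self.xlnet(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        out, _ = self.bigru(emb)  # (batch, seq_len, 2 * hidden_size)
        return out
```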
From the results of the research, the two proposed architectures, tested on three different datasets, produced performance similar to or better than the base MAGNET architecture. However, when too much of the dependency weight between labels is lost to clustering, the proposed architecture cannot perform as well as the model that uses the base MAGNET architecture. Meanwhile, the combination of BiGRU and XLNet embeddings outperformed the combination of BiLSTM and BERT embeddings used in previous research.
format | Theses
author | Adrinta Abdurrazzaq, Muhammad
title | MAGNET ARCHITECTURE OPTIMIZATION ON MULTI-LABEL TEXT CLASSIFICATION
url | https://digilib.itb.ac.id/gdl/view/58050
_version_ | 1822930652605972480