Graph convolutional neural networks for text categorization

Bibliographic Details
Main Author: Lakhotia, Suyash
Other Authors: Xavier Bresson
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/74095
Institution: Nanyang Technological University
Description
Summary: Text categorization is the task of assigning text data labels from a predetermined set of thematic labels. It has become increasingly important in recent years as we generate large volumes of data and need to search these vast datasets with flexible queries. However, manually labelling text data is an extremely tedious task that is prone to human error. Text classification has therefore become a key focus of machine learning research, with the goal of producing models that are more efficient and accurate than traditional methods. This project explores recently enhanced deep learning techniques, namely convolutional neural networks and their fusion with graph analysis (i.e. graph convolutional neural networks), in the field of text categorization, and compares their performance to established baseline models and simpler multilayer perceptrons. Experiments on three major text classification datasets (Rotten Tomatoes Sentence Polarity, 20 Newsgroups and Reuters Corpus Volume 1) show that graph convolutional neural networks can work naturally in the space of words represented as a graph and achieve test accuracy greater than or similar to that of standard convolutional neural networks and simpler baseline models.