ONLINE NEWS ARTICLES AUTOMATIC SUMMARIZATION USING GRAPH CONVOLUTIONAL NETWORK
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/27460 |
Institution: | Institut Teknologi Bandung |
Summary: | Automatic multi-document summarization transforms a collection of documents on the same topic into a single summary that contains both the information common to the articles and the information unique to each one. This final project aims to produce an automatic extractive summarization system that uses neural network learning with a GCN (Graph Convolutional Network) topology. The GCN topology lets the system take a sentence relationship graph as input when generating summaries.

The system consists of four main components: preprocessing, graph construction, sentence scoring, and sentence selection. The sentence scoring component uses a neural network architecture that combines RNN (Recurrent Neural Network) and GCN topologies to estimate a score for each sentence, taking word embedding sequences and the sentence relationship graph as input. Three graph input representations are used. The sentence selection component builds summaries with two techniques: greedy selection and MMR (Maximum Marginal Relevance).

The experiments produce models with the best parameters for 100-word and 200-word summaries. On the test data set, the system using GCN produces summaries with an average ROUGE-2 recall of 0.370 for 100-word summaries and 0.378 for 200-word summaries. Adding a sentence relationship graph in the PDG (Personalized Discourse Graph) representation improves performance over the system without a graph. The best selection technique is greedy selection for 100-word summaries and MMR for 200-word summaries. |
---|
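The two sentence selection techniques named in the summary can be illustrated with a minimal Python sketch. The record does not give the thesis's exact formulation, so everything here is an assumption made for illustration: the function names `greedy_select` and `mmr_select`, the word-overlap (Jaccard) similarity, the trade-off weight `lam`, and the rule that a sentence is skipped when it would exceed the word budget. The scores stand in for the output of the sentence scoring component.

```python
from typing import Callable, List, Sequence


def word_count(sentence: str) -> int:
    """Count whitespace-separated words in a sentence."""
    return len(sentence.split())


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity (an assumed stand-in; the thesis may use another measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def greedy_select(sentences: Sequence[str], scores: Sequence[float],
                  max_words: int) -> List[int]:
    """Take sentences in descending score order while they fit the word budget."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    used = 0
    for i in order:
        n = word_count(sentences[i])
        if used + n <= max_words:
            chosen.append(i)
            used += n
    return chosen


def mmr_select(sentences: Sequence[str], scores: Sequence[float], max_words: int,
               lam: float = 0.7,
               sim: Callable[[str, str], float] = jaccard) -> List[int]:
    """Repeatedly pick the sentence that balances its score against its
    similarity to the sentences already chosen (assumed MMR variant)."""
    remaining = list(range(len(sentences)))
    chosen: List[int] = []
    used = 0
    while remaining:
        def mmr(i: int) -> float:
            # Redundancy is the highest similarity to any chosen sentence.
            redundancy = max((sim(sentences[i], sentences[j]) for j in chosen),
                             default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr)
        remaining.remove(best)
        n = word_count(sentences[best])
        if used + n <= max_words:
            chosen.append(best)
            used += n
    return chosen


if __name__ == "__main__":
    # Hypothetical sentences and scores, for demonstration only.
    docs = ["the cat sat on the mat",
            "the cat sat on a mat",
            "dogs bark loudly at night"]
    est = [0.9, 0.8, 0.5]
    print(greedy_select(docs, est, max_words=12))
    print(mmr_select(docs, est, max_words=12))
```

In this toy input the two highest-scoring sentences are near duplicates, so greedy selection spends the whole budget on them, while the MMR variant penalizes the second one and picks the unrelated third sentence instead.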