AUTOMATED SUMMARIZATION FOR INDONESIAN NEWS ARTICLE USING ABSTRACT MEANING REPRESENTATION

With the growth of online news sources, summaries are needed to obtain important information in less reading time. Summarization with Abstract Meaning Representation (AMR) was first done for Indonesian using a rule-based AMR parser. Thus, the said AMR p...

Bibliographic Details
Main Author:	Akhyar, Amany
Format:	Theses
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/55526

id	id-itb.:55526
spelling	id-itb.:55526 2021-06-18T05:46:16Z AUTOMATED SUMMARIZATION FOR INDONESIAN NEWS ARTICLE USING ABSTRACT MEANING REPRESENTATION Akhyar, Amany Indonesia Theses Summarization, IndoSum, Abstract Meaning Representation INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/55526 text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	With the growth of online news sources, summaries are needed to obtain important information in less reading time. Summarization with Abstract Meaning Representation (AMR) was first done for Indonesian using a rule-based AMR parser. However, that parser has a limitation: it generates nodes containing phrases, which causes problems in the concept merging process of the summarization system. In this research, a machine learning-based AMR parser for Indonesian is used to represent news article sentences from the IndoSum dataset. This AMR parser only generates nodes containing single words. The concepts from the generated AMR graphs are then merged, based on identical words and synonyms, to form a source graph. Subgraphs (also called the summary graph) are then selected from the source graph and converted into a word set using Simple Natural Language Generation (Simple NLG). From the word set, the system extracts three sentences from the news article based on the highest score of matching words normalized by sentence length. The results show that AMR graphs generated by the machine learning-based AMR parser go through the concept merging process well. As a baseline, the top three most similar news article sentences are extracted based on cosine similarity, using a retrained Word2Vec representation. The proposed system does not yet exceed the baseline. The analysis indicates that the system tends to choose nodes whose original word appears in the initial sentence.
format	Theses
author	Akhyar, Amany
title	AUTOMATED SUMMARIZATION FOR INDONESIAN NEWS ARTICLE USING ABSTRACT MEANING REPRESENTATION
url	https://digilib.itb.ac.id/gdl/view/55526
_version_	1823643727061581824
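
The final extraction step described in the abstract (scoring each news sentence by its matching words, normalized by sentence length, and keeping the top three) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names, the regex tokenizer, counting every matching token occurrence, and returning the selected sentences in document order are all assumptions.

```python
import re


def tokenize(text):
    """Lowercase and keep alphabetic runs only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def score_sentence(sentence, word_set):
    """Number of tokens found in word_set, divided by the sentence length."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    matches = sum(1 for token in tokens if token in word_set)
    return matches / len(tokens)


def extract_summary(sentences, word_set, k=3):
    """Pick the k highest-scoring sentences; return them in document order."""
    word_set = {word.lower() for word in word_set}
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(sentences[i], word_set),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]
```

Here `word_set` stands in for the word set generated from the summary graph. Dividing by the token count keeps long sentences from winning only because they contain more words.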
