RHETORICAL SENTENCE CATEGORIZATION IN SCIENTIFIC PAPERS WITH DEEP LEARNING

Rhetorical Document Profile (RDP) is an information framework used to structure the contents of a scientific paper. RDP divides the sentences of a scientific paper into 7 to 16 rhetorical categories according to each sentence's rhetorical function. RDP can be used as structured input for other systems such as scientific...

Bibliographic Details
Main Author:	- NIM : 13514032 , Chalvin
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/26230
Institution:	Institut Teknologi Bandung

id	id-itb.:26230
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Rhetorical Document Profile (RDP) is an information framework used to structure the contents of a scientific paper. RDP divides the sentences of a scientific paper into 7 to 16 rhetorical categories according to each sentence's rhetorical function. RDP can be used as structured input for other systems, such as scientific paper summarization systems. Several studies have tried to automate rhetorical classification. For the 7-category task, Teufel obtained an F-score of 0.51 using naive Bayes in 2002, and Merity et al. obtained an F-score of 0.93 using a maximum entropy classifier in 2009. Research on the 16-category task was pioneered by Widyantoro et al., who obtained an F-score of 0.25 using various techniques. Rachman obtained an F-measure of around 0.43 in 2017 using a shallow learning method with word2vec and sequence labeling. This suggests that the feature engineering in previous research was not optimal. Recently, much research in natural language processing has used deep learning, because deep learning can capture high-level features automatically by combining simpler features. Without deep learning, high-level features can be obtained only if a system can be built that understands the data in a way similar to human understanding. With deep learning, various models built for natural language processing problems have achieved the best performance. In this study, various deep learning architectures were tested to optimize feature engineering in rhetorical sentence categorization. The architectures tested include CNN, GRU, LSTM, Bi-GRU, and Bi-LSTM. These architectures were chosen because they have been shown to give good results in sentence categorization. The best model in this study obtains an F-measure of 0.457.
format	Final Project
author	- NIM : 13514032 , Chalvin
title	RHETORICAL SENTENCE CATEGORIZATION IN SCIENTIFIC PAPERS WITH DEEP LEARNING
url	https://digilib.itb.ac.id/gdl/view/26230

Similar Items