FEATURES DEVELOPMENT AND FEATURE EXTRACTION METHODS FOR IDENTIFICATION OF SCIENTIFIC RELATION SCHEMES BASED ON RHETORICAL CITATION
This dissertation discusses the identification of relations between scientific papers based on rhetorical citation, obtained by analyzing the citation context contained in a citation sentence. This is known as the citation context-based approach, and it is more detailed compar...
Main Author: | Sibaroni, Yuliant |
---|---|
Format: | Dissertations |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/49322 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:49322 |
---|---|
spelling |
id-itb.:49322 2020-09-14T13:36:01Z FEATURES DEVELOPMENT AND FEATURE EXTRACTION METHODS FOR IDENTIFICATION OF SCIENTIFIC RELATION SCHEMES BASED ON RHETORICAL CITATION Sibaroni, Yuliant Indonesia Dissertations paper relations, extend, criticize, compare, features, machine learning INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/49322 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
This dissertation discusses the identification of relations between scientific papers
based on rhetorical citation, obtained by analyzing the citation context contained in
a citation sentence. This is known as the citation context-based approach,
and it is more detailed than the two earlier approaches,
namely the content-based and citation analysis-based approaches. Those two
approaches can only be used to identify similarity relations between
papers.
At present, a schema of scientific paper relations based on citation
context has been developed explicitly only by Wang et al., whose schema comprises
the extend, criticize, and compare relations. The main feature Wang uses to
identify these relations is quite simple, namely the cue phrase feature. The focus
of this dissertation is to develop a feature extraction method and produce
a feature set that identifies Wang's paper relations better. Paper
relations are identified by classifying each sentence using a
supervised machine learning approach. The features are developed
in stages, starting with the extend relation, then the criticize relation, and finally the
compare relation.
The results show that each type of paper relation has its own distinct
features. For the extend relation, several important features were obtained, namely
the phrase combination feature and the n-gram feature with top-N correlation. For
the criticize relation, there are five groups of important features: the adaptation
of the extend relation features, the combination of cue phrases with a citation, the
combination of cue phrases with a previous citation, the combination of cue phrases,
and the conjunction of some basic features. For the compare relation, three
important groups of features were produced: the proportionWord feature, the
probabilityWord feature, and the cuephraseWord feature. The features were
developed by observing the patterns that appear in the sentences of each relation.
Although the proposed features perform better than the baseline,
some problems remain, such as false prediction values that are still high and
citation context sentences that are missed (co-reference).
The increase in F-measure ranges from 15% to 40% compared to the baseline
feature |
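As a rough illustration of two feature types named in the description (the cue phrase feature and the n-gram feature with top-N correlation), the following minimal Python sketch may help. Everything in it is an assumption made for illustration: the cue phrase lists, the example sentences, the use of bigrams, and the choice of the phi coefficient (Pearson correlation on binary values) as the correlation measure are hypothetical and are not taken from the dissertation.

```python
from math import sqrt

# Hypothetical cue phrases per relation; the dissertation's actual lists are not given here.
CUE_PHRASES = {
    "extend": ["we extend", "builds on", "based on"],
    "criticize": ["however", "fails to", "suffers from"],
    "compare": ["compared to", "outperforms", "in contrast to"],
}


def cue_phrase_features(sentence):
    """Return one binary feature per relation: 1 if any of its cue phrases occurs."""
    text = sentence.lower()
    return {
        relation: int(any(phrase in text for phrase in phrases))
        for relation, phrases in CUE_PHRASES.items()
    }


def ngrams(sentence, n=2):
    """Return the set of word n-grams in a sentence (whitespace tokenization)."""
    tokens = sentence.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def phi(xs, ys):
    """Pearson correlation of two binary sequences (assumed correlation measure)."""
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def top_n_ngrams(sentences, labels, top_n=3, n=2):
    """Keep the top_n n-grams whose presence correlates most with the positive label."""
    grams_per_sentence = [ngrams(s, n) for s in sentences]
    vocabulary = sorted(set().union(*grams_per_sentence))
    scored = []
    for gram in vocabulary:
        presence = [int(gram in grams) for grams in grams_per_sentence]
        scored.append((phi(presence, labels), gram))
    # Highest correlation first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gram for _, gram in scored[:top_n]]


# Hypothetical citation sentences; label 1 marks the extend relation.
SENTENCES = [
    "We extend the model of [1]",
    "Our method builds on [2]",
    "We extend the parser of [3]",
    "However, [4] fails to converge",
    "Results are compared to [5]",
]
LABELS = [1, 1, 1, 0, 0]

if __name__ == "__main__":
    print(cue_phrase_features(SENTENCES[3]))
    print(top_n_ngrams(SENTENCES, LABELS, top_n=2))
```

In a supervised setting, the selected n-grams and the cue phrase indicators would be combined into one feature vector per citation sentence and passed to a classifier; which classifier the dissertation uses is not stated in this record.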
format |
Dissertations |
author |
Sibaroni, Yuliant |
title |
FEATURES DEVELOPMENT AND FEATURE EXTRACTION METHODS FOR IDENTIFICATION OF SCIENTIFIC RELATION SCHEMES BASED ON RHETORICAL CITATION |
url |
https://digilib.itb.ac.id/gdl/view/49322 |
_version_ |
1822272004456185856 |