DEVELOPMENT OF SENTIMENT ANALYSIS AND INTENT CLASSIFICATION OF SCIENTIFIC JOURNAL'S CITATION MODEL

Bibliographic Details
Main Author: Mahendra Guntara Harsono, Rayza
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/56324
Institution: Institut Teknologi Bandung
id id-itb.:56324
spelling id-itb.:56324 2021-06-22T06:38:31Z citation, sentiment, intent, transformer model
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description A citation, or quote, is the borrowing of one or more sentences from another written work. A citation can reveal the citing author's opinion, either as positive credit or as negative criticism. It can also reveal what the author is citing, such as a paper's background, methods, or experimental results. The positive or negative opinion is called the sentiment, while the purpose of the citation is called the intent. Knowing both helps readers understand the context of a scientific work, and can particularly assist medical researchers compiling research material on the COVID-19 pandemic from the CORD-19 dataset. This final project looks for the best-performing model for sentiment and intent classification of citation sentences among current state-of-the-art (SOTA) Transformer models for NLP, such as SciBERT and XLNet. SciBERT is a variant of BERT, which has been the highest-performing model on various NLP tasks since 2018. SciBERT is pretrained on more than 1,000,000 scientific papers so that it captures the context of the scientific domain. XLNet, a newer Transformer model, was reported as SOTA on these two classification tasks (Mercier et al., 2020). The experiments fine-tune the models on the SciCite and ACL-ARC datasets, which contain more than 11,020 and 7,000 samples respectively, vary hyperparameters such as the number of epochs and the learning rate, and use the F1 metric to select the best model. The system consists of two Transformer models that classify citation text into sentiment and intent classes, with the results visualized as a network graph. Experiments and analysis show that SciBERT achieves the highest macro-averaged F1 on both tasks: 0.87 for sentiment classification and 0.83 for intent classification.
Even though the F1 scores are high, both models still have difficulty recognizing the medical terms and pathogens that appear in the CORD-19 dataset.
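The macro-averaged F1 metric named in the description can be illustrated with a minimal sketch. This is not the project's evaluation code: the function name macro_f1 and the toy label lists are assumptions made for illustration, and the labels simply reuse the intent categories mentioned in the abstract (background, method, result).

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not taken from SciCite or ACL-ARC.
y_true = ["background", "background", "method", "method", "result", "result"]
y_pred = ["background", "method", "method", "method", "result", "background"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class contributes equally to the average, a rare class that the model handles poorly lowers the score as much as a frequent one, which is why macro averaging is a common choice for imbalanced label sets.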