GENERATIVE SENTIMENT ANALYSIS ON INDONESIA'S ONLINE NEWS USING DIRECT QUOTATION SENTENCES
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/82483 |
Institution: | Institut Teknologi Bandung |
Summary: | Sentiment analysis of direct quote sentences aims to extract public figures'
opinions on a particular matter using direct quotes from news. Direct quotes are
direct speech of an individual and can be used as direct opinions. Traditional
sentiment analysis methods cannot be directly applied to direct quote sentences.
There are three stages in the sentiment analysis process of direct quote sentences:
extraction, attribution, and polarity analysis. These stages aim to extract direct
quote sentences, identify the speaker of the quote, determine the target of the
quote, and analyze the polarity of the quote.
In previous studies, sentiment analysis of direct quote sentences was performed
using a regex system and Named Entity Recognition (NER). However, this
approach did not perform well because the system could not understand the entire
news context. To address this issue, a generative approach can be used to process
the entire news context. The model used takes news documents as input and
outputs direct quote sentences, speaker identification, quote targets, and quote
polarities. A variant of this approach uses regex in the direct quote
extraction stage, which can reduce the resources used by the generative model.
The dataset was built using GPT-4 to increase the quantity of
data, resulting in 1,000 training documents. Fifty test news documents were
annotated by an annotator. The generative model was fine-tuned on the training
data and evaluated on the test data. Experimental results show
that the system aided by regex achieves the best performance. The system using
the IndoT5-base-paraphrase model with regex assistance achieves F1 scores of
0.99 for quote extraction, 0.99 for speaker extraction, 0.74 for quote targets, and
0.81 for polarity analysis. |
---|
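
The regex-assisted quote extraction stage and the F1 evaluation mentioned in the summary can be illustrated with a minimal sketch. This is not the thesis's implementation: the pattern `QUOTE_RE`, the helper names `extract_direct_quotes` and `f1_score`, the `min_words` filter, the exact-match scoring rule, and the sample sentence are all assumptions made for illustration.

```python
import re

# Assumed pattern: text enclosed in straight or curly double quotes.
# Quote marks are paired naively, so unbalanced quotes may mis-pair.
QUOTE_RE = re.compile('["\u201c]([^"\u201c\u201d]+)["\u201d]')


def extract_direct_quotes(text, min_words=3):
    """Return quoted spans that contain at least `min_words` words.

    The word-count filter (an assumption) drops short quoted terms
    that are unlikely to be direct speech.
    """
    quotes = []
    for match in QUOTE_RE.finditer(text):
        quote = match.group(1).strip().rstrip(",")
        if len(quote.split()) >= min_words:
            quotes.append(quote)
    return quotes


def f1_score(predicted, gold):
    """Exact-match F1 between predicted and gold items (assumed metric)."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical news snippet, written for this example only.
news = (
    '"Kami akan menurunkan harga beras bulan depan," kata Menteri Perdagangan. '
    'Program "pasar murah" juga akan diperluas.'
)
quotes = extract_direct_quotes(news)
score = f1_score(quotes, ["Kami akan menurunkan harga beras bulan depan"])
```

In a pipeline like the one described, quotes found this way would be passed to the generative model, which would then only need to produce the speaker, target, and polarity for each quote.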