GENERATIVE SENTIMENT ANALYSIS ON INDONESIA’S ONLINE NEWS USING DIRECT QUOTATION SENTENCES

Bibliographic Details
Main Author: Alexander Audino, Rio
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/82483
Institution: Institut Teknologi Bandung
id id-itb.:82483
spelling id-itb.:82483 | 2024-07-08T14:26:06Z | sentiment analysis, direct quotes, generative approach
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Sentiment analysis of direct quotation sentences aims to extract public figures' opinions on a particular matter from direct quotes in news articles. A direct quote reproduces an individual's own words and can therefore be treated as a direct expression of opinion. Traditional sentiment analysis methods cannot be applied directly to direct quotation sentences. The process has three stages: extraction, attribution, and polarity analysis. Together, these stages extract the direct quotation sentences, identify the speaker of each quote, determine its target, and analyze its polarity. Previous studies performed this task with a regex-based system and Named Entity Recognition (NER), but that approach did not perform well because the system could not take the full news context into account. To address this, a generative approach can be used to process the entire news article: the model takes a news document as input and outputs the direct quotation sentences, their speakers, their targets, and their polarities. A variant of the approach uses regex for the quote extraction stage, which reduces the resources required by the generative model. The dataset was constructed using GPT-4 to increase the quantity of data, yielding 1,000 training documents; fifty test news documents are annotated by an annotator. The generative model is fine-tuned on the training data and evaluated on the test data. Experimental results show that the regex-aided system performs best: using the IndoT5-base-paraphrase model with regex assistance, it achieves F1 scores of 0.99 for quote extraction, 0.99 for speaker extraction, 0.74 for quote targets, and 0.81 for polarity analysis.
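The regex-assisted extraction stage mentioned in the description could look roughly like the sketch below. This is an illustration only, not the thesis's actual implementation: the pattern, the function name `extract_direct_quotes`, and the `min_words` filter are all assumptions introduced here.

```python
import re
from typing import List

# Assumed pattern: text enclosed in straight or curly double quotes.
# The real system's regex is not given in the record.
QUOTE_PATTERN = re.compile(r'["“]([^"“”]+)["”]')


def extract_direct_quotes(text: str, min_words: int = 3) -> List[str]:
    """Return quoted spans that are long enough to be plausible statements.

    min_words is a hypothetical heuristic for skipping short quoted
    terms (e.g. program names) that are not direct speech.
    """
    quotes = []
    for match in QUOTE_PATTERN.finditer(text):
        # Drop surrounding spaces and the comma that often precedes
        # the closing quotation mark in news writing.
        quote = match.group(1).strip(" ,")
        if len(quote.split()) >= min_words:
            quotes.append(quote)
    return quotes


if __name__ == "__main__":
    article = (
        '"Kami akan menurunkan harga beras bulan depan," kata Menteri '
        'Perdagangan. Program "Pasar Murah" tetap berjalan.'
    )
    print(extract_direct_quotes(article))
```

Under this assumed design, each extracted quote would then be passed to the generative model together with the article, so the model only has to produce the speaker, target, and polarity rather than regenerate the quote itself.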