LEARNING THROUGH DISAGREEMENTS IN TEXT CLASSIFICATION: ANNOTATOR WEIGHTING AND LARGE LANGUAGE MODEL ASSISTED PREDICTION
Main Author: | Chandrasaputra, Christopher |
---|---|
Format: | Final Project |
Language: | Indonesia |
Subjects: | Natural Language Processing, Text Classification, Disagreement, Large Language Models, Annotator Weighting |
Online Access: | https://digilib.itb.ac.id/gdl/view/87586 |
Institution: | Institut Teknologi Bandung |
Description:
The progress in Natural Language Processing (NLP) has brought about challenges in managing disagreements within annotated datasets, particularly in text classification tasks. This final project explores innovative methods to tackle annotation discrepancies by employing multi-annotator modeling and predictions supported by Large Language Models (LLMs). The main objective is to enhance prediction accuracy by integrating annotator-specific weighting and leveraging LLMs to address conflicts.
The research centers on datasets from SemEval 2023, which span multiple domains and exhibit varying degrees of annotator disagreement. Two primary strategies were developed: (1) an annotator weighting mechanism that scores and adjusts each annotator's contribution according to their level of agreement, and (2) an LLM-assisted prediction system that supports decision-making on instances where annotators disagree. Experiments were carried out on resampled datasets with pre-trained language models to improve computational efficiency and robustness to ambiguous data.
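The two strategies can be pictured with a minimal sketch. The code below is illustrative only: the toy data, the agreement-based weighting formula, the decision margin, and the `ask_llm` placeholder are assumptions standing in for the project's actual implementation, which works on resampled SemEval 2023 data with pre-trained language models.

```python
# Illustrative sketch of the two strategies described above (toy data, not the
# project's actual code).
#  (1) Annotator weighting: weight each annotator by their agreement with the
#      per-item majority vote, then form weighted soft labels.
#  (2) LLM-assisted prediction: items whose soft label is still contested are
#      routed to an LLM; `ask_llm` is a hypothetical placeholder for that call.
from collections import Counter

# annotations[item_id][annotator_id] = label
annotations = {
    "t1": {"a1": 1, "a2": 1, "a3": 0},
    "t2": {"a1": 0, "a2": 0, "a3": 0},
    "t3": {"a1": 1, "a2": 0, "a3": 0},
}
LABELS = (0, 1)

# Per-item majority vote, used only to score annotators.
majority = {item: Counter(v.values()).most_common(1)[0][0]
            for item, v in annotations.items()}

# Annotator weight = share of annotated items matching the majority vote.
weights = {}
for a in sorted({a for v in annotations.values() for a in v}):
    pairs = [(item, v[a]) for item, v in annotations.items() if a in v]
    weights[a] = sum(lab == majority[item] for item, lab in pairs) / len(pairs)

def soft_label(votes):
    """Weighted, normalised label distribution for one item."""
    mass = {c: 0.0 for c in LABELS}
    for annotator, label in votes.items():
        mass[label] += weights[annotator]
    total = sum(mass.values()) or 1.0
    return {c: m / total for c, m in mass.items()}

def ask_llm(item_id, labels):
    """Hypothetical stand-in for prompting an LLM to pick one of `labels`."""
    return labels[0]

def predict(item_id, margin=0.5):
    """Keep the weighted vote when agreement is clear, else defer to the LLM.

    The 0.5 margin is an arbitrary illustrative threshold.
    """
    soft = soft_label(annotations[item_id])
    ranked = sorted(soft.items(), key=lambda kv: kv[1], reverse=True)
    if ranked[0][1] - ranked[1][1] >= margin:   # clear agreement
        return ranked[0][0]
    return ask_llm(item_id, LABELS)             # disagreement: ask the LLM

for item in annotations:
    print(item, soft_label(annotations[item]), "->", predict(item))
```

In this toy run, items with a dominant weighted vote keep that label, while closely split items are handed to the placeholder LLM call, mirroring the division of labour between the two strategies.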
The results indicate that the combined strategy of annotator weighting and LLM-assisted prediction improves performance by up to 0.13 in F1-Micro and 0.059 in Cross Entropy relative to the baseline: the annotator weighting method highlights the influence of individual annotators, while the LLM-assisted method resolves predictions on items where annotators disagree. These insights contribute to a deeper understanding of disagreement in NLP tasks and support more accurate text classification.
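For reference, the two reported metrics can be sketched generically as micro-averaged F1 over hard labels and mean cross entropy against annotator-derived soft labels. The toy numbers below are illustrative assumptions; the project's actual scoring presumably follows the SemEval 2023 evaluation setup, which may differ in detail.

```python
# A minimal sketch of the two reported metrics, assuming hard labels for
# F1-Micro and soft (probabilistic) labels for Cross Entropy.
import math

def f1_micro(y_true, y_pred):
    """Micro-averaged F1 for single-label classification."""
    # Micro F1 pools TP/FP/FN over all classes; with exactly one predicted and
    # one gold label per item, FP == FN, so the score reduces to accuracy.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def cross_entropy(soft_true, soft_pred, eps=1e-12):
    """Mean cross entropy between gold soft labels and predicted distributions."""
    total = 0.0
    for p_true, p_pred in zip(soft_true, soft_pred):
        total += -sum(t * math.log(max(q, eps)) for t, q in zip(p_true, p_pred))
    return total / len(soft_true)

# Toy example: 3 items, binary labels.
gold_hard = [1, 0, 1]
pred_hard = [1, 0, 0]
gold_soft = [[0.3, 0.7], [1.0, 0.0], [0.4, 0.6]]   # annotator-derived
pred_soft = [[0.2, 0.8], [0.9, 0.1], [0.6, 0.4]]   # model output

print("F1-Micro:", round(f1_micro(gold_hard, pred_hard), 3))
print("Cross Entropy:", round(cross_entropy(gold_soft, pred_soft), 3))
```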