SEMANTIC TEXTUAL SIMILARITY (STS) FOR INDONESIAN SENTENCE USING SIAMESE NEURAL NETWORK

Bibliographic Details
Main Author: Baptiso Sorlawan, Agung
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/54226
Institution: Institut Teknologi Bandung
id id-itb.:54226
date 2021-03-15T14:24:50Z
keywords Semantic Textual Similarity, Siamese Neural Network, encoder, pooling, objective function
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Semantic Textual Similarity (STS) is a natural language processing task that determines how similar two sentences are. STS is an important component of other natural language processing tasks such as semantic search, summarization, question answering, plagiarism detection, and information extraction. One architecture that can be used to model STS, and the focus of this research, is the Siamese Neural Network (SNN). One of the most important components of an STS model is the encoder, which maps sentences to numerical vectors. This research experiments with several kinds of SNN encoder, as well as with two other SNN components: the pooling layer and the objective function. The dataset used in these experiments was obtained from Prosa.ai and contains frequently asked questions (FAQ) sentences. The best STS model achieves an F1 score of 0.9723, which is better than the baseline. That model is an SNN with an IndoBERT encoder, MEAN + CLS pooling, and a regression objective function.
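The record does not give implementation details, so the following is only a minimal sketch of how the pieces named in the description could fit together. It assumes that "MEAN + CLS pooling" means concatenating the masked mean of the token vectors with the first ([CLS]) token vector, and that the regression objective is a squared error between the cosine similarity of the two sentence embeddings and a gold similarity score, as is common in Sentence-BERT-style models. Random arrays stand in for the IndoBERT encoder output, and all function names are hypothetical.

```python
import numpy as np


def mean_cls_pool(token_vecs, mask):
    """Pool one sentence into a single vector (assumed MEAN + CLS scheme).

    token_vecs: (seq_len, dim) encoder outputs; row 0 is taken to be [CLS].
    mask:       (seq_len,) with 1 for real tokens and 0 for padding.
    """
    weights = mask.astype(float)[:, None]
    # Masked mean over real tokens only; guard against an all-zero mask.
    mean_vec = (token_vecs * weights).sum(axis=0) / np.maximum(weights.sum(), 1.0)
    cls_vec = token_vecs[0]
    return np.concatenate([mean_vec, cls_vec])


def cosine(u, v):
    """Cosine similarity between two sentence embeddings."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))


def regression_loss(predicted, gold):
    """Squared error between predicted and gold similarity (assumed objective)."""
    return (predicted - gold) ** 2


# Toy stand-in for encoder outputs; a real model would use IndoBERT here.
rng = np.random.default_rng(0)
tokens_a = rng.normal(size=(5, 8))   # sentence A: 5 positions, last one is padding
tokens_b = rng.normal(size=(7, 8))   # sentence B: 7 real tokens
mask_a = np.array([1, 1, 1, 1, 0])
mask_b = np.ones(7, dtype=int)

# Siamese setup: the same pooling (and, in practice, the same encoder weights)
# is applied to both sentences.
emb_a = mean_cls_pool(tokens_a, mask_a)
emb_b = mean_cls_pool(tokens_b, mask_b)

score = cosine(emb_a, emb_b)
loss = regression_loss(score, gold=1.0)
```

In a trained system the loss would be backpropagated through the shared encoder; this sketch shows only the forward computation of the pooled embeddings, the similarity score, and the loss.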