AUTOMATIC PARAPHRASING FOR INDONESIAN LANGUAGE USING SIMULATED ANNEALING

Paraphrasing is a technique for processing information by changing the form of a text without changing its meaning. The existing system for automatic paraphrase generation in Indonesian uses a rule-based approach, but its use is still limited to sentences with defined r...

Bibliographic Details
Main Author: Afra Sabrina, Nada
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/56232
Institution: Institut Teknologi Bandung
id id-itb.:56232
spelling id-itb.:56232
timestamp 2021-06-21T16:08:51Z
keywords paraphrasing, Simulated Annealing, local editing
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Paraphrasing is a technique for processing information by changing the form of a text without changing its meaning. The existing automatic paraphrase generation system for Indonesian uses a rule-based approach, so its use is limited to sentences covered by its defined rules. This study instead uses an unsupervised approach based on the Simulated Annealing algorithm, adapted from the Unsupervised Paraphrasing by Simulated Annealing (UPSA) system. Paraphrase candidates are generated by local editing. The acceptance probability of a candidate is based on the value of an objective function, which is a linear combination of a semantic preservation score, an expression diversity score, and a fluency score. Adaptation to Indonesian is done by replacing the language-specific resources: a language model for calculating the fluency score, a dictionary, word embeddings, and a stopword list used to extract keywords. In addition, this study implements two modifications: changing how Hill Climbing is implemented when choosing candidate words during candidate generation, and using an Indonesian thesaurus to obtain synonyms for word replacement. The experimental results show that the modified algorithm using the Indonesian thesaurus obtained the best results, both in the number of sentences successfully paraphrased and in similarity to the original sentence, compared with the original UPSA adaptation and the algorithm with the modified Hill Climbing implementation.
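The search loop described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the equal weights, the linear cooling schedule, the Metropolis-style acceptance rule, and the choice of replace, insert, and delete as the local edits are all assumptions made for illustration. The three scores are passed in as plain numbers, where a real system would compute them from the language model, embeddings, and other resources the abstract lists.

```python
import math
import random


def objective(semantic, diversity, fluency, weights=(1.0, 1.0, 1.0)):
    """Linear combination of the three scores (weights are hypothetical)."""
    w_sem, w_div, w_flu = weights
    return w_sem * semantic + w_div * diversity + w_flu * fluency


def acceptance_probability(f_current, f_candidate, temperature):
    """Always accept improvements; accept worse candidates with a
    probability that shrinks as the temperature drops (assumed rule)."""
    if f_candidate >= f_current:
        return 1.0
    if temperature <= 0:
        return 0.0
    return math.exp((f_candidate - f_current) / temperature)


def local_edit(words, synonyms, rng):
    """Apply one random word-level edit: replace, insert, or delete."""
    words = list(words)
    op = rng.choice(["replace", "insert", "delete"])
    pos = rng.randrange(len(words))
    if op == "replace":
        # Fall back to the word itself when no synonym is known.
        words[pos] = rng.choice(synonyms.get(words[pos], [words[pos]]))
    elif op == "insert":
        vocab = sorted(synonyms) or words
        words.insert(pos, rng.choice(vocab))
    elif len(words) > 1:
        # Delete, but never empty the sentence.
        words.pop(pos)
    return words


def anneal(sentence, propose, score, steps=100, t_init=1.0, rng=None):
    """Simulated annealing over word lists with linear cooling (assumed)."""
    rng = rng if rng is not None else random.Random(0)
    current = list(sentence)
    f_current = score(current)
    for step in range(steps):
        temperature = t_init * (1 - step / steps)
        candidate = propose(current, rng)
        f_candidate = score(candidate)
        p = acceptance_probability(f_current, f_candidate, temperature)
        if rng.random() < p:
            current, f_current = candidate, f_candidate
    return current
```

Here `propose` would wrap `local_edit` with a synonym table (for example one drawn from an Indonesian thesaurus), and `score` would call `objective` with scores computed for the candidate sentence.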
format Final Project
author Afra Sabrina, Nada
title AUTOMATIC PARAPHRASING FOR INDONESIAN LANGUAGE USING SIMULATED ANNEALING
url https://digilib.itb.ac.id/gdl/view/56232