AUTOMATIC PARAPHRASING FOR INDONESIAN LANGUAGE USING SIMULATED ANNEALING
Main Author: | Afra Sabrina, Nada |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/56232 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:56232 |
---|---|
spelling |
id-itb.:56232; 2021-06-21T16:08:51Z; AUTOMATIC PARAPHRASING FOR INDONESIAN LANGUAGE USING SIMULATED ANNEALING; Afra Sabrina, Nada; Indonesia; Final Project; paraphrasing, Simulated Annealing, local editing; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/56232; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Paraphrasing is a technique for processing information by changing the form of a
text without changing its meaning. The existing automatic paraphrase generation
system for Indonesian uses a rule-based approach, but its use is still limited to
sentences covered by the defined rules. This study uses an unsupervised approach
based on the Simulated Annealing algorithm, adapted from the Unsupervised
Paraphrasing by Simulated Annealing (UPSA) system. Paraphrase candidates are
generated by local editing. The acceptance probability of a candidate is based on
the value of an objective function, which is a linear combination of a semantic
preservation score, an expression diversity score, and a fluency score.
Adaptation to Indonesian is done by replacing the language-specific resources: a
language model for fluency score calculation, a dictionary, word embeddings, and
a stopword list used for keyword extraction. In addition, this study implements
two modifications: changing the Hill Climbing implementation used to select word
candidates during candidate generation, and using an Indonesian thesaurus to
obtain synonyms for word replacement.
The experimental results show that the modified algorithm using the Indonesian
thesaurus obtained the best results, both in the number of sentences successfully
paraphrased and in similarity to the original sentence, compared with the
original adaptation of UPSA and the algorithm with the modified Hill Climbing
implementation. |
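The search procedure summarized in the description can be illustrated with a minimal Python sketch. It is not the thesis implementation: the three scoring functions are toy stand-ins (the study uses word embeddings, a language model, and other Indonesian resources), and the thesaurus entries, weights, cooling schedule, and all function names are assumptions chosen for illustration. It only shows how local edits, a linear-combination objective, and a temperature-dependent acceptance probability fit together.

```python
import math
import random

# Hypothetical toy thesaurus; a real system would load an Indonesian thesaurus.
THESAURUS = {
    "besar": ["agung", "raya"],
    "rumah": ["kediaman", "griya"],
    "indah": ["elok", "cantik"],
}

def semantic_score(cand, orig):
    """Toy stand-in: share of original words kept or replaced by a listed synonym."""
    kept = sum(1 for w in orig
               if w in cand or any(s in cand for s in THESAURUS.get(w, [])))
    return kept / len(orig)

def diversity_score(cand, orig):
    """Toy stand-in: share of candidate words that do not occur in the original."""
    return sum(1 for w in cand if w not in orig) / len(cand)

def fluency_score(cand):
    """Toy stand-in for a language model: penalise adjacent repeated words."""
    repeats = sum(1 for a, b in zip(cand, cand[1:]) if a == b)
    return 1.0 / (1.0 + repeats)

def objective(cand, orig, weights=(0.5, 0.3, 0.2)):
    """Linear combination of the three scores; the weights are assumed values."""
    w_sem, w_div, w_flu = weights
    return (w_sem * semantic_score(cand, orig)
            + w_div * diversity_score(cand, orig)
            + w_flu * fluency_score(cand))

def propose(sentence, rng):
    """Local edit: replace, insert, or delete one word at a random position."""
    cand = list(sentence)
    pos = rng.randrange(len(cand))
    op = rng.choice(["replace", "insert", "delete"])
    if op == "replace" and cand[pos] in THESAURUS:
        cand[pos] = rng.choice(THESAURUS[cand[pos]])
    elif op == "insert":
        cand.insert(pos, rng.choice(sorted(THESAURUS)))
    elif op == "delete" and len(cand) > 1:
        del cand[pos]
    return cand  # unchanged if the chosen edit was not applicable

def acceptance_probability(f_new, f_old, temperature):
    """Always accept improvements; accept worse candidates with exp(delta / T)."""
    if f_new >= f_old:
        return 1.0
    if temperature <= 0:
        return 0.0
    return math.exp((f_new - f_old) / temperature)

def anneal(original, steps=200, t_init=0.05, seed=0):
    rng = random.Random(seed)
    current = list(original)
    f_cur = objective(current, original)
    for step in range(steps):
        # Linear cooling towards zero (an assumed schedule).
        temperature = max(0.0, t_init * (1 - step / steps))
        cand = propose(current, rng)
        f_new = objective(cand, original)
        if rng.random() < acceptance_probability(f_new, f_cur, temperature):
            current, f_cur = cand, f_new
    return current, f_cur

if __name__ == "__main__":
    sentence = "rumah besar itu sangat indah".split()
    paraphrase, score = anneal(sentence)
    print(" ".join(paraphrase), round(score, 3))
```

Because the temperature falls over time, early steps may accept slightly worse candidates to escape local optima, while late steps behave like greedy search. Swapping any toy scorer for a model-based one leaves the loop unchanged.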
format |
Final Project |
author |
Afra Sabrina, Nada |
title |
AUTOMATIC PARAPHRASING FOR INDONESIAN LANGUAGE USING SIMULATED ANNEALING |
url |
https://digilib.itb.ac.id/gdl/view/56232 |