AUTOMATIC PARAPHRASING FOR INDONESIAN LANGUAGE USING SIMULATED ANNEALING

Bibliographic Details
Main Author: Afra Sabrina, Nada
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/56232
Institution: Institut Teknologi Bandung
Description
Summary: Paraphrasing is a technique for processing information by changing the form of a text without changing its meaning. The existing system for automatic paraphrase generation in Indonesian uses a rule-based approach, so its use is limited to sentences covered by its defined rules. This study uses an unsupervised approach based on the Simulated Annealing algorithm, adapted from the Unsupervised Paraphrasing by Simulated Annealing (UPSA) system. Paraphrase candidates are generated by local editing. The acceptance probability of a candidate is based on the value of an objective function, which is a linear combination of a semantic preservation score, an expression diversity score, and a fluency score. Adaptation to Indonesian is done by replacing the language-specific resources: a language model for the fluency score calculation, a dictionary, word embeddings, and a stopword list used to extract keywords. In addition, this study implements two modifications: changing the implementation of Hill Climbing used to determine word candidates in the candidate generation process, and using an Indonesian thesaurus to obtain synonyms for words in the replacement operation. The experimental results show that the modified algorithm using the Indonesian thesaurus obtained the best results, both in the number of sentences successfully paraphrased and in similarity to the original sentence, compared with the original UPSA adaptation and the algorithm with the modified Hill Climbing implementation.
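
The search procedure described in the summary can be illustrated with a minimal Python sketch. It is not the thesis implementation: the weights, the geometric cooling schedule, the Metropolis-style acceptance rule, and the names `objective`, `acceptance_probability`, `anneal`, `propose`, and `score` are all assumptions introduced here for illustration. The real system would compute the three scores from the Indonesian language model, word embeddings, and keyword list.

```python
import math
import random


def objective(semantic, diversity, fluency, weights=(1.0, 1.0, 1.0)):
    """Linear combination of the three scores named in the summary.

    The weights are illustrative assumptions, not values from the thesis.
    """
    w_sem, w_div, w_flu = weights
    return w_sem * semantic + w_div * diversity + w_flu * fluency


def acceptance_probability(f_current, f_candidate, temperature):
    """Metropolis-style rule (assumed): always accept an improvement,
    accept a worse candidate with probability exp(delta / T)."""
    delta = f_candidate - f_current
    if delta >= 0:
        return 1.0
    if temperature <= 0:
        return 0.0  # at zero temperature only improvements are accepted
    return math.exp(delta / temperature)


def anneal(sentence, propose, score, steps=100, t_init=1.0, cooling=0.95, rng=None):
    """Generic simulated annealing loop over sentences.

    `propose(sentence, rng)` stands in for local editing (hypothetical callback);
    `score(sentence)` stands in for the objective function (hypothetical callback).
    The geometric cooling schedule is an assumption.
    """
    rng = rng or random.Random(0)
    current, f_current = sentence, score(sentence)
    temperature = t_init
    for _ in range(steps):
        candidate = propose(current, rng)
        f_candidate = score(candidate)
        # Accept or reject the candidate based on the change in objective value.
        if rng.random() < acceptance_probability(f_current, f_candidate, temperature):
            current, f_current = candidate, f_candidate
        temperature *= cooling
    return current
```

A worse candidate is accepted more often early on, when the temperature is high, which lets the search escape local optima. As the temperature falls, the loop behaves increasingly like plain hill climbing.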