NON-FORMAL AND NON-SHORTENED WORD NORMALIZATION WITH EDIT DISTANCE
Voice assistant technology is growing rapidly and is becoming part of daily life. However, voice assistants are still limited to standard, formal language, while Indonesian speakers commonly use non-formal language in everyday conversation. The execution...
Main Author: | Dwi Rizqullah, Rafi |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/35827 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:35827 |
---|---|
timestamp |
2019-03-04T09:49:11Z |
keywords |
voice assistant, dictionary, non-formal word, normalization, Levenshtein distance, Jaro-Winkler distance |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Voice assistant technology is growing rapidly and is becoming part of daily life.
However, voice assistants are still limited to standard, formal language, while
Indonesian speakers commonly use non-formal language in everyday conversation.
This Final Project proposes a solution to the problem voice assistants have with
non-formal words, or words that are not included in the formal word dictionary.
The approach is to normalize the text using Levenshtein distance and Jaro-Winkler
distance. Test results show that normalization using Levenshtein distance
outperforms normalization using LCS distance, with an accuracy difference of
8.34 percent. |
format |
Final Project |
author |
Dwi Rizqullah, Rafi |
title |
NON-FORMAL AND NON-SHORTENED WORD NORMALIZATION WITH EDIT DISTANCE |
url |
https://digilib.itb.ac.id/gdl/view/35827 |
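
The description above names dictionary-based normalization with Levenshtein distance and Jaro-Winkler distance but gives no implementation details. The following is a minimal Python sketch of that general idea, not the author's method. The word list `FORMAL_WORDS`, the function names, and the tie-breaking rule (first dictionary entry wins) are all assumptions made for illustration, and a real system would use a full formal Indonesian dictionary.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def jaro(a: str, b: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order, k = 0, 0
    for i, ca in enumerate(a):
        if a_matched[i]:
            while not b_matched[k]:
                k += 1
            if ca != b[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len(a) + matches / len(b) + (matches - t) / matches) / 3


def jaro_winkler(a: str, b: str, p: float = 0.1) -> float:
    """Jaro similarity boosted by a shared prefix of up to four characters."""
    base = jaro(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):
        if ca != cb:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def normalize(word: str, dictionary: Sequence[str],
              metric: str = "levenshtein") -> str:
    """Return the dictionary entry closest to `word` (hypothetical helper)."""
    if word in dictionary:
        return word  # already a formal word
    if metric == "levenshtein":
        return min(dictionary, key=lambda w: levenshtein(word, w))
    if metric == "jaro_winkler":
        return max(dictionary, key=lambda w: jaro_winkler(word, w))
    raise ValueError(f"unknown metric: {metric}")


# Toy formal dictionary, assumed for illustration only.
FORMAL_WORDS = ["sudah", "saja", "tahu", "tidak", "bagaimana"]

if __name__ == "__main__":
    for informal in ["udah", "aja", "tau"]:
        print(informal, "->", normalize(informal, FORMAL_WORDS))
```

Levenshtein distance is minimized while Jaro-Winkler similarity is maximized, which is why the sketch uses `min` for one metric and `max` for the other. In practice a distance or similarity threshold would likely be needed so that words with no close formal counterpart are left unchanged.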