NON-FORMAL AND NON-SHORTENED WORD NORMALIZATION WITH EDIT DISTANCE

Voice assistant technology is growing rapidly, and its use has begun to spread into daily life. However, voice assistants are still limited to standard conversational language, while Indonesian speakers are accustomed to using non-formal language in everyday conversation. The execution...

Bibliographic Details
Main Author: Dwi Rizqullah, Rafi
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/35827
Institution: Institut Teknologi Bandung
id id-itb.:35827
timestamp 2019-03-04T09:49:11Z
keywords voice assistant, dictionary, non-formal word, normalization, Levenshtein distance, Jaro-Winkler distance
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Voice assistant technology is growing rapidly, and its use has begun to spread into daily life. However, voice assistants are still limited to standard conversational language, while Indonesian speakers are accustomed to using non-formal language in everyday conversation. This Final Project presents a solution to the problem voice assistants have with words that are non-formal or not included in the formal word dictionary. The approach is to normalize the text using Levenshtein distance and Jaro-Winkler distance. Test results show that normalization using Levenshtein distance outperforms normalization using LCS distance, with an accuracy difference of 8.34 percent.
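The dictionary-based normalization described above can be illustrated with a minimal sketch. This is not the author's implementation: the word list FORMAL_WORDS, the max_distance threshold of 2, and the alphabetical tie-break are all hypothetical choices made only for illustration. It maps an out-of-dictionary word to the formal dictionary entry with the smallest Levenshtein distance, and leaves the word unchanged when no entry is close enough.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical formal-word dictionary, for illustration only.
FORMAL_WORDS = {"tidak", "sudah", "bagaimana", "terima", "kasih"}


def normalize(word: str, dictionary: set[str], max_distance: int = 2) -> str:
    """Return the closest dictionary entry, or the word itself if none is near.

    max_distance is an assumed cut-off, not a value taken from the thesis.
    """
    if word in dictionary:
        return word
    # Ties are broken alphabetically so the result is deterministic.
    best = min(dictionary, key=lambda entry: (levenshtein(word, entry), entry))
    return best if levenshtein(word, best) <= max_distance else word


if __name__ == "__main__":
    for token in ["udah", "kasih", "xyzxyz"]:
        print(token, "->", normalize(token, FORMAL_WORDS))
```

A Jaro-Winkler variant would follow the same structure, swapping the distance function for a similarity score and picking the entry with the highest score instead of the lowest distance.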
format Final Project
author Dwi Rizqullah, Rafi
title NON-FORMAL AND NON-SHORTENED WORD NORMALIZATION WITH EDIT DISTANCE
url https://digilib.itb.ac.id/gdl/view/35827
_version_ 1822924502618603520