Sentiment analysis of the Burmese language using the distributed representation of n-gram-based words

Thesis (M.Sc., Information Technology)--Prince of Songkla University, 2018

Bibliographic Details
Main Author: Myat lay phyu
Other Authors: Kiyota Hashimoto
Format: Theses and Dissertations
Language: English
Published: Prince of Songkla University 2023
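Subjects: Natural language processing (Computer science); Computational linguistics; Sentiment analysis; Burmese language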
Online Access: http://kb.psu.ac.th/psukb/handle/2016/19011
Institution: Prince of Songkla University
id th-psu.2016-19011
record_format dspace
spelling th-psu.2016-19011 2023-11-02T02:52:21Z Sentiment analysis of the Burmese language using the distributed representation of n-gram-based words Myat lay phyu Kiyota Hashimoto College of Computing (Information Technology) Natural language processing (Computer science) Computational linguistics Sentiment analysis Burmese language Thesis (M.Sc., Information Technology)--Prince of Songkla University, 2018

Abstract: As people's opinions and customer reviews have become widely available, the need to analyze such texts has grown. Sentiment analysis, or opinion mining, estimates their polarity, that is, whether they are positive or negative, using machine learning techniques. Many methods have been proposed, but they assume basic preprocessing of the text data, including word segmentation and word sentiment values. Such preprocessing is not easily available for low-resource languages such as Burmese, Khmer and Lao, because large annotated corpora and basic natural language processing tools are lacking. The objective of this research is to solve these difficulties of low-resource language processing. The goal is to propose an effective and efficient method that enables sentiment analysis without considering language-specific characteristics. The scope of the research is languages written without word boundaries, specifically Burmese. The methodology consists of two proposals: a character-based variable-length n-gram word model, and a word grouping method in which word similarities are calculated with distributed word representation models. The proposed method is compared with a Conditional Random Field (CRF) baseline approach, which is also newly proposed in this thesis, and achieves results similar to CRF-based word segmentation with a small amount of supervised data. The proposed method is also validated on a larger dataset of Amazon product reviews. Thus, the methods proposed in this thesis provide an effective and efficient way to process low-resource languages without focusing on language-specific characteristics.

2023-11-02T02:51:02Z 2023-11-02T02:51:02Z 2018 Thesis http://kb.psu.ac.th/psukb/handle/2016/19011 en Attribution-NonCommercial-NoDerivs 3.0 Thailand http://creativecommons.org/licenses/by-nc-nd/3.0/th/ application/pdf Prince of Songkla University
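The abstract mentions a character-based variable-length n-gram word model but gives no algorithmic detail in this record. As a rough illustration only, the following Python sketch shows one generic way to collect character n-grams of several lengths from text that has no word boundaries. The function names (`char_ngrams`, `build_vocab`) and the parameters (`min_n`, `max_n`, `min_count`) are hypothetical and are not taken from the thesis. The sketch also assumes that Python's code-point indexing is an acceptable unit; a real Burmese pipeline would more likely operate on syllables or grapheme clusters.

```python
from collections import Counter


def char_ngrams(text, min_n=1, max_n=3):
    """Return every character n-gram of length min_n..max_n.

    Hypothetical helper for illustration; n-grams that contain
    whitespace are skipped. Units are Unicode code points.
    """
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(text) - n + 1):
            gram = text[i:i + n]
            if not any(ch.isspace() for ch in gram):
                grams.append(gram)
    return grams


def build_vocab(texts, min_n=1, max_n=3, min_count=2):
    """Count n-grams over a corpus and keep those seen at least min_count times.

    The frequency threshold is an assumed, generic filtering step,
    not a procedure described in this record.
    """
    counts = Counter()
    for text in texts:
        counts.update(char_ngrams(text, min_n, max_n))
    return {gram: c for gram, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    # Toy Latin-script input, used only to keep the example easy to read.
    print(char_ngrams("abc", 1, 2))
    print(build_vocab(["abab", "ab"], 1, 2, 2))
```

The retained n-grams could then be treated as pseudo-words and passed to a distributed representation model, which is the general direction the abstract describes.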
institution Prince of Songkla University
building Khunying Long Athakravi Sunthorn Learning Resources Center
continent Asia
country Thailand
content_provider Khunying Long Athakravi Sunthorn Learning Resources Center
collection PSU Knowledge Bank
topic Natural language processing (Computer science)
Computational linguistics
Sentiment analysis
Burmese language
_version_ 1781797473923104768