Development of a word segmentation algorithm for Myanmar language

Bibliographic Details
Main Author: U Tun Thura Thet
Other Authors: Na, Jin Cheon
Format: Theses and Dissertations
Published: 2008
Online Access: http://hdl.handle.net/10356/1939
Institution: Nanyang Technological University
Description
Summary: This study develops a word segmentation algorithm and solution for the Myanmar language. It is the first of its kind for Myanmar word segmentation using the Unicode Standard version 5.1. The Unicode standard for the Myanmar character set had not been very stable in the past, and version 5.1 includes some significant changes that address the major issues faced in previous versions. The literature review covers studies not only of the Myanmar script but also of similar scripts such as Thai, Khmer and Lao. Some word segmentation approaches for Thai, Vietnamese and Chinese that are relevant to the study are also reviewed to understand how other solutions were developed and evaluated.