Developing a new statistical method for Chinese text segmentation

A new statistical formula for Chinese text segmentation, the Contextual Information Formula (CIF), was developed empirically for identifying 2- and 3-character words. It was derived by stepwise logistic regression on a sample of manually segmented sentences. 300 sentence...

Bibliographic Details
Main Author: Dai, Yubin
Other Authors: Khoo, Christopher Soo Guan
Format: Theses and Dissertations
Published: 2008
Online Access: http://hdl.handle.net/10356/2614
Institution: Nanyang Technological University