Chinese text segmentation for information retrieval

Bibliographic Details
Main Author: Li, Hui
Other Authors: Foo, Schubert Shou Boon
Format: Theses and Dissertations
Published: 2008
Online Access: http://hdl.handle.net/10356/2353
Institution: Nanyang Technological University
Description
Summary: Chinese word segmentation is a prerequisite step in Chinese information retrieval (IR): it divides the originally continuous sentences in documents into linguistic units (normally words) that can then be indexed for retrieval. Despite the many word segmentation approaches and Chinese information systems that exist today, little work has been done to correlate segmentation performance with IR effectiveness. Consequently, it remains uncertain how word segmentation affects IR effectiveness.
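
To make the segment-then-index step in the summary concrete, here is a minimal sketch in Python. It assumes a dictionary-based forward maximum matching segmenter, which is one common baseline and not necessarily an approach examined in the thesis. The lexicon, the two sample documents, and the function names `segment` and `build_index` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def segment(text, lexicon, max_len=4):
    """Forward maximum matching (an assumed baseline, not the thesis's method).

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character if nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def build_index(docs, lexicon):
    """Build an inverted index mapping each segmented word to document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in segment(text, lexicon):
            index[word].add(doc_id)
    return index


# Hypothetical lexicon and documents, for illustration only.
lexicon = {"中文", "信息", "检索", "信息检索", "分词"}
docs = {1: "中文信息检索", 2: "中文分词"}
index = build_index(docs, lexicon)
```

With this toy lexicon, document 1 is segmented as 中文 / 信息检索, so the index holds no separate entry for 检索 and an exact-match lookup on that term alone would miss the document. This is a small instance of how a segmentation decision can change what a retrieval system finds.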