Satellite Workshop On Language, Artificial Intelligence and Computer Science for Natural Language Processing Applications (LAICS-NLP): Discovery of Meaning from Text

Bibliographic Details
Main Authors: Ong, Siou Chin; Kulathuramaiyer, Narayanan; Yeo, Alvin Wee
Format: Conference or Workshop Item
Language: English
Published: Faculty of Engineering, Kasetsart University, Bangkok, Thailand, 2006
Online Access: http://ir.unimas.my/id/eprint/525/1/discovery_of_meaning_from_text.pdf
http://ir.unimas.my/id/eprint/525/
http://naist.cpe.ku.ac.th/LAICS-NLP/
Institution: Universiti Malaysia Sarawak
Description
Summary: This paper proposes a novel method to disambiguate important words from a collection of documents. The underlying hypothesis is that a minimal set of senses is significant in characterizing a context. We extend Yarowsky’s one sense per discourse [13] from a single document to a collection of related documents. We perform distributed clustering on a set of features representing each of the top ten categories of documents in the Reuters-21578 dataset, identifying groups of terms that have similar distributional patterns across documents. A WordNet-based similarity measure is then computed for the terms within each cluster. Aggregating the WordNet associations used to ascertain term similarity within clusters provides a means of identifying each cluster’s root senses.
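
The grouping step described in the summary (terms with similar distributional patterns across documents end up in the same cluster) can be illustrated with a small sketch. This is not the authors' clustering algorithm and does not use Reuters-21578. The toy documents, the cosine measure, the greedy grouping rule, the 0.9 threshold, and the function names are all assumptions chosen for illustration. The WordNet-based similarity step is left out because it depends on an external lexical database.

```python
import math
from collections import Counter


def term_document_vectors(docs):
    """Map each term to its vector of counts across the documents."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return {term: [c[term] for c in counts] for term in vocab}


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_terms(vectors, threshold=0.9):
    """Greedy grouping (an assumed stand-in for the paper's clustering):
    a term joins the first cluster whose seed term has a similar
    distribution across documents, otherwise it starts a new cluster."""
    clusters = []
    for term, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters


# Hypothetical toy corpus: two "oil" documents and two "grain" documents.
docs = [
    "oil crude barrel oil",
    "crude oil barrel",
    "wheat grain harvest",
    "grain wheat harvest wheat",
]
clusters = group_terms(term_document_vectors(docs))
print(clusters)
```

On this toy corpus the sketch yields two groups, one containing barrel, crude and oil and the other containing grain, harvest and wheat. In the approach the summary describes, a WordNet-based similarity measure would then be computed among the terms inside each such group.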