Satellite Workshop On Language, Artificial Intelligence and Computer Science for Natural Language Processing Applications (LAICS-NLP): Discovery of Meaning from Text
Main Authors: | , , |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | Faculty of Engineering, Kasetsart University, Bangkok, Thailand, 2006 |
Subjects: | |
Online Access: | http://ir.unimas.my/id/eprint/525/1/discovery_of_meaning_from_text.pdf http://ir.unimas.my/id/eprint/525/ http://naist.cpe.ku.ac.th/LAICS-NLP/ |
Institution: | Universiti Malaysia Sarawak |
Summary: | This paper proposes a novel method to disambiguate important words from a collection of documents. The hypothesis that underlies this approach is that there is a minimal set of senses that are significant in characterizing a context. We extend Yarowsky’s one sense per discourse [13] further to a collection of related documents rather than a single document. We perform distributed clustering on a set of features representing each of the top ten categories of documents in the Reuters-21578 dataset. Groups of terms that have a similar term distributional pattern across documents were identified. WordNet-based similarity measurement was then computed for terms within each cluster. An aggregation of the associations in WordNet that was employed to ascertain term similarity within clusters has provided a means of identifying clusters’ root senses. |
---|