A reexamination of MRD-based word sense disambiguation
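This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods.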
Main Authors: | Baldwin, Timothy; Kim, Su Nam; Bond, Francis; Fujita, Sanae; Martinez, David; Tanaka, Takaaki |
---|---|
Other Authors: | School of Humanities and Social Sciences |
Format: | Article |
Language: | English |
Published: | 2011 |
Subjects: | DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics |
Online Access: | https://hdl.handle.net/10356/79580 http://hdl.handle.net/10220/6834 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-79580 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-795802020-04-27T10:05:33Z A reexamination of MRD-based word sense disambiguation Baldwin, Timothy Kim, Su Nam Bond, Francis Fujita, Sanae Martinez, David Tanaka, Takaaki School of Humanities and Social Sciences DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods. Accepted version 2011-06-29T09:08:14Z 2019-12-06T13:28:38Z 2011-06-29T09:08:14Z 2019-12-06T13:28:38Z 2010 2010 Journal Article 1530-0226 https://hdl.handle.net/10356/79580 http://hdl.handle.net/10220/6834 10.1145/1731035.1731039 155494 en ACM transactions on Asian language information processing © 2010 Association for Computing Machinery. This is the author created version of a work that has been peer reviewed and accepted for publication by ACM Transactions on Asian Language Information Processing, Association for Computing Machinery. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [DOI: http://dx.doi.org/10.1145/1731035.1731039]. 21 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics |
spellingShingle |
DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics Baldwin, Timothy Kim, Su Nam Bond, Francis Fujita, Sanae Martinez, David Tanaka, Takaaki A reexamination of MRD-based word sense disambiguation |
description |
This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods. |
author2 |
School of Humanities and Social Sciences |
author_facet |
School of Humanities and Social Sciences Baldwin, Timothy Kim, Su Nam Bond, Francis Fujita, Sanae Martinez, David Tanaka, Takaaki |
format |
Article |
author |
Baldwin, Timothy Kim, Su Nam Bond, Francis Fujita, Sanae Martinez, David Tanaka, Takaaki |
author_sort |
Baldwin, Timothy |
title |
A reexamination of MRD-based word sense disambiguation |
title_sort |
reexamination of mrd-based word sense disambiguation |
publishDate |
2011 |
url |
https://hdl.handle.net/10356/79580 http://hdl.handle.net/10220/6834 |
_version_ |
1681058729978494976 |