A reexamination of MRD-based word sense disambiguation

This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 d...

Bibliographic Details
Main Authors: Baldwin, Timothy, Kim, Su Nam, Bond, Francis, Fujita, Sanae, Martinez, David, Tanaka, Takaaki
Other Authors: School of Humanities and Social Sciences
Format: Article
Language: English
Published: 2011
Subjects: DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics
Online Access: https://hdl.handle.net/10356/79580
               http://hdl.handle.net/10220/6834
Institution: Nanyang Technological University
id sg-ntu-dr.10356-79580
record_format dspace
spelling sg-ntu-dr.10356-79580 (2020-04-27T10:05:33Z)
Version: Accepted version
Record dates: 2011-06-29T09:08:14Z; 2019-12-06T13:28:38Z
Issued: 2010
Type: Journal Article
ISSN: 1530-0226
DOI: 10.1145/1731035.1731039
Other identifier: 155494
Language code: en
Journal: ACM transactions on Asian language information processing
Rights: © 2010 Association for Computing Machinery. This is the author created version of a work that has been peer reviewed and accepted for publication by ACM Transactions on Asian Language Information Processing, Association for Computing Machinery. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [DOI: http://dx.doi.org/10.1145/1731035.1731039].
Extent: 21 p.
File format: application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics
description This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods.
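The idea of Lesk-style disambiguation with definition extension, as summarised in the description above, can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokeniser, the toy English glosses, the sense identifiers and the RELATED map (standing in for hypernym, hyponym and synonym links) are all hypothetical, and the paper itself works with Japanese data and compares several tokenisation schemes.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy dictionary: sense id -> definition (gloss).
GLOSSES: Dict[str, str] = {
    "bank_1": "a financial institution that accepts deposits",
    "bank_2": "sloping land beside a body of water",
    "money_1": "a medium of exchange",
    "river_1": "a large natural stream of water",
}

# Hypothetical ontology links (e.g. hypernyms, hyponyms, synonyms).
RELATED: Dict[str, List[str]] = {
    "bank_1": ["money_1"],
    "bank_2": ["river_1"],
}


def tokenise(text: str) -> Set[str]:
    """Assumed tokeniser: lowercase and split on whitespace."""
    return set(text.lower().split())


def extended_gloss(sense: str,
                   glosses: Dict[str, str],
                   related: Dict[str, List[str]]) -> Set[str]:
    """Tokens of a sense's gloss plus the glosses of its related senses."""
    tokens = tokenise(glosses[sense])
    for other in related.get(sense, []):
        tokens |= tokenise(glosses[other])
    return tokens


def lesk(context: str,
         candidates: List[str],
         glosses: Dict[str, str],
         related: Optional[Dict[str, List[str]]] = None) -> str:
    """Pick the candidate whose (extended) gloss overlaps the context most.

    Ties are resolved in favour of the earliest candidate in the list.
    """
    related = related or {}
    ctx = tokenise(context)
    return max(candidates,
               key=lambda s: len(ctx & extended_gloss(s, glosses, related)))


if __name__ == "__main__":
    senses = ["bank_1", "bank_2"]
    sentence = "fishing in the stream"
    print(lesk(sentence, senses, GLOSSES))           # plain glosses only
    print(lesk(sentence, senses, GLOSSES, RELATED))  # with definition extension
```

In this toy case the plain glosses share no token with the context, so the first-listed sense is returned by default; adding the gloss of the related sense contributes the overlapping token "stream" and changes the decision.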
title A reexamination of MRD-based word sense disambiguation