Link type based pre-cluster pair model for coreference resolution

This paper presents our participation in the CoNLL-2011 shared task, Modeling Unrestricted Coreference in OntoNotes. Coreference resolution, as a difficult and challenging problem in NLP, has attracted a lot of attention in the research community for a long time. Its objective is to determine whethe...

Bibliographic Details
Main Authors: SONG, Yang, WANG, Houfeng, JIANG, Jing
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2011
Subjects: Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/sis_research/6950
https://ink.library.smu.edu.sg/context/sis_research/article/7953/viewcontent/W11_1922_pvoa_cc_by.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7953
record_format dspace
spelling sg-smu-ink.sis_research-7953 2022-03-04T09:05:34Z Link type based pre-cluster pair model for coreference resolution SONG, Yang WANG, Houfeng JIANG, Jing This paper presents our participation in the CoNLL-2011 shared task, Modeling Unrestricted Coreference in OntoNotes. Coreference resolution, as a difficult and challenging problem in NLP, has attracted a lot of attention in the research community for a long time. Its objective is to determine whether two mentions in a piece of text refer to the same entity. In our system, we implement mention detection and coreference resolution separately. For mention detection, a simple classification-based method combined with several effective features is developed. For coreference resolution, we propose a link type based pre-cluster pair model. In this model, pre-clustering of all the mentions in a single document is first performed. Then, for different link types, different classification models are trained to determine whether two pre-clusters refer to the same entity. The final clustering results are generated by the closest-first clustering method. Official test results for the closed track reveal that our method gives a MUC F-score of 59.95%, a B-cubed F-score of 63.23%, and a CEAF F-score of 35.96% on the development dataset. When using gold standard mention boundaries, we achieve a MUC F-score of 55.48%, a B-cubed F-score of 61.29%, and a CEAF F-score of 32.53%. 2011-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6950 https://ink.library.smu.edu.sg/context/sis_research/article/7953/viewcontent/W11_1922_pvoa_cc_by.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Numerical Analysis and Scientific Computing
spellingShingle Databases and Information Systems
Numerical Analysis and Scientific Computing
SONG, Yang
WANG, Houfeng
JIANG, Jing
Link type based pre-cluster pair model for coreference resolution
description This paper presents our participation in the CoNLL-2011 shared task, Modeling Unrestricted Coreference in OntoNotes. Coreference resolution, as a difficult and challenging problem in NLP, has attracted a lot of attention in the research community for a long time. Its objective is to determine whether two mentions in a piece of text refer to the same entity. In our system, we implement mention detection and coreference resolution separately. For mention detection, a simple classification-based method combined with several effective features is developed. For coreference resolution, we propose a link type based pre-cluster pair model. In this model, pre-clustering of all the mentions in a single document is first performed. Then, for different link types, different classification models are trained to determine whether two pre-clusters refer to the same entity. The final clustering results are generated by the closest-first clustering method. Official test results for the closed track reveal that our method gives a MUC F-score of 59.95%, a B-cubed F-score of 63.23%, and a CEAF F-score of 35.96% on the development dataset. When using gold standard mention boundaries, we achieve a MUC F-score of 55.48%, a B-cubed F-score of 61.29%, and a CEAF F-score of 32.53%.
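The closest-first clustering step named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the string representation of pre-clusters, and the toy `same_last_token` rule are all assumptions made for illustration. In the described system, the pairwise decision would come from the classifier trained for the relevant link type, not from a hand-written rule.

```python
from typing import Callable, Dict, List, Sequence


def closest_first_clustering(
    items: Sequence[str],
    is_coreferent: Callable[[str, str], bool],
) -> List[List[str]]:
    """Link each item to the nearest preceding item the classifier accepts.

    `items` are in document order; `is_coreferent` is a hypothetical
    stand-in for a trained pairwise classifier.
    """
    parent = list(range(len(items)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for j in range(1, len(items)):
        # Scan candidate antecedents from closest to farthest.
        for i in range(j - 1, -1, -1):
            if is_coreferent(items[i], items[j]):
                parent[find(j)] = find(i)
                break  # closest-first: stop at the first accepted antecedent

    groups: Dict[int, List[str]] = {}
    for idx, item in enumerate(items):
        groups.setdefault(find(idx), []).append(item)
    return list(groups.values())


def same_last_token(a: str, b: str) -> bool:
    """Toy decision rule (an assumption): match on the final token."""
    return a.lower().split()[-1] == b.lower().split()[-1]


if __name__ == "__main__":
    mentions = ["Barack Obama", "the president", "Obama", "the company", "Mr. Obama"]
    print(closest_first_clustering(mentions, same_last_token))
```

Because each item links to at most one antecedent and the links are merged transitively, the result is a partition of the input. Swapping `same_last_token` for a per-link-type model only requires passing a different callable.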
format text
author SONG, Yang
WANG, Houfeng
JIANG, Jing
author_facet SONG, Yang
WANG, Houfeng
JIANG, Jing
author_sort SONG, Yang
title Link type based pre-cluster pair model for coreference resolution
title_short Link type based pre-cluster pair model for coreference resolution
title_full Link type based pre-cluster pair model for coreference resolution
title_fullStr Link type based pre-cluster pair model for coreference resolution
title_full_unstemmed Link type based pre-cluster pair model for coreference resolution
title_sort link type based pre-cluster pair model for coreference resolution
publisher Institutional Knowledge at Singapore Management University
publishDate 2011
url https://ink.library.smu.edu.sg/sis_research/6950
https://ink.library.smu.edu.sg/context/sis_research/article/7953/viewcontent/W11_1922_pvoa_cc_by.pdf
_version_ 1770576164983668736