A plethora of methods for learning English countability

This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns. We propose two basic approaches to feature representation: distribution-based representation, whi...

Full description

Saved in:
Bibliographic Details
Main Authors: Baldwin, Timothy, Bond, Francis
Other Authors: School of Humanities and Social Sciences
Format: Conference or Workshop Item
Language:English
Published: 2011
Subjects:
Online Access:https://hdl.handle.net/10356/93871
http://hdl.handle.net/10220/6822
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns. We propose two basic approaches to feature representation: distribution-based representation, which simply looks at the distribution of features in the corpus data, and agreement-based representation which analyses the level of tokenwise agreement between multiple preprocessor systems. We additionally compare a single multiclass classifier architecture with a suite of binary classifiers, and combine analyses from multiple preprocessors. Finally, we present and evaluate a feature selection method.