High precision treebanking - blazing useful trees using POS information

In this paper we present a quantitative and qualitative analysis of annotation in the Hinoki treebank of Japanese, and investigate a method of speeding annotation by using part...

Full description

Saved in:
Bibliographic Details
Main Authors: Tanaka, Takaaki, Bond, Francis, Oepen, Stephan, Fujita, Sanae
Other Authors: School of Humanities and Social Sciences
Format: Conference or Workshop Item
Language:English
Published: 2011
Subjects:
Online Access:https://hdl.handle.net/10356/79569
http://hdl.handle.net/10220/6820
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:In this paper we present a quantitative and qualitative analysis of annotation in the Hinoki treebank of Japanese, and investigate a method of speeding annotation by using part-of-speech tags. The Hinoki treebank is a Redwoods-style treebank of Japanese dictionary de nition sentences. 5,000 sentences are annotated by three different annotators and the agreement evaluated. An average agreement of 65.4% was found using strict agreement, and 83.5% using labeled precision. Exploiting POS tags allowed the annotators to choose the best parse with 19.5% fewer decisions.