Semantic annotation of a Japanese speech corpus

This paper describes the semantic annotations we are performing on the CallHome Japanese corpus of spontaneous, unscripted telephone conversations (LDC, 1996). Our annotations include (i) semantic classes for all nouns and verbs; (ii) verb senses for all main verbs; and (iii) relations between main...

Full description

Saved in:
Bibliographic Details
Main Authors: Bond, Francis, Fry, John.
Other Authors: School of Humanities and Social Sciences
Format: Conference or Workshop Item
Language:English
Published: 2010
Subjects:
Online Access:https://hdl.handle.net/10356/83172
http://hdl.handle.net/10220/6434
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:This paper describes the semantic annotations we are performing on the CallHome Japanese corpus of spontaneous, unscripted telephone conversations (LDC, 1996). Our annotations include (i) semantic classes for all nouns and verbs; (ii) verb senses for all main verbs; and (iii) relations between main verbs and their complements in the same utterance. Our semantic tagset is taken from NTT's Goi-Taikei semantic lexicon and ontology (Ikehara et al., 1997). A pilot study demonstrates that the verb sense tagging can be efficiently performed by native Japanese speakers using computergenerated HTML forms, and that good interannotator reliability can be obtained in the right conditions.