Semantic annotation of a Japanese speech corpus
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2010 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/83172 http://hdl.handle.net/10220/6434 |
Institution: | Nanyang Technological University |
Summary: | This paper describes the semantic annotations we are performing on the CallHome Japanese corpus of spontaneous, unscripted telephone conversations (LDC, 1996). Our annotations include (i) semantic classes for all nouns and verbs; (ii) verb senses for all main verbs; and (iii) relations between main verbs and their complements in the same utterance. Our semantic tagset is taken from NTT's Goi-Taikei semantic lexicon and ontology (Ikehara et al., 1997). A pilot study demonstrates that the verb sense tagging can be efficiently performed by native Japanese speakers using computer-generated HTML forms, and that good inter-annotator reliability can be obtained under the right conditions. |
---|