Enhancing an English/Korean dictionary

Bibliographic Details
Main Authors: Bond, Francis; Paik, Kyonghee
Other Authors: School of Humanities and Social Sciences
Format: Conference or Workshop Item
Language: English
Published: 2011
Online Access: https://hdl.handle.net/10356/93843
http://hdl.handle.net/10220/7274
Institution: Nanyang Technological University
Description
Summary: In this paper, we introduce a machine-tractable Korean/English lexicon. We use engdic, an open source English-Korean dictionary intended for human use. Its formatting is sometimes inconsistent, and some information is missing or duplicated, so it is not ready for machine use. We reorganize the disorganized format and improve the content, which makes the dictionary easier to use bidirectionally. Our main purpose is to develop and document clear syntactic and semantic features useful for NLP applications such as machine translation. The original lexicon contains about 98,000 English lemmas and about 210,000 English-Korean pairs. Each entry consists of three parts: English lemma form, part-of-speech codes, and Korean translation/explanation. We transformed this into a more structured format consisting of eight fields.
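
As a rough illustration of the kind of cleanup the summary describes, the following Python sketch parses three-part entries, drops malformed and duplicated lines, and indexes the result in both directions. It is not the authors' implementation. The tab-separated line layout, the comma-separated part-of-speech codes, the sample entries, and the names `Entry`, `parse_line`, and `build_indexes` are all assumptions made for this example. The eight-field target format is not specified in this record, so it is not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One three-part entry (field names are hypothetical)."""
    lemma: str         # English lemma form
    pos_codes: tuple   # part-of-speech codes
    korean: str        # Korean translation/explanation


def parse_line(line, sep="\t"):
    """Parse one raw line, assuming a tab-separated three-part layout.

    Returns None for lines that do not fit the assumed layout.
    """
    parts = [p.strip() for p in line.split(sep)]
    if len(parts) != 3 or not parts[0] or not parts[2]:
        return None
    lemma, pos, korean = parts
    # Assumption: multiple POS codes are comma-separated.
    codes = tuple(c.strip() for c in pos.split(",") if c.strip())
    return Entry(lemma, codes, korean)


def build_indexes(lines):
    """Drop malformed and duplicated entries, then index both directions."""
    seen = set()
    en_to_ko = defaultdict(list)
    ko_to_en = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry in seen:
            continue
        seen.add(entry)
        en_to_ko[entry.lemma].append(entry)
        ko_to_en[entry.korean].append(entry)
    return dict(en_to_ko), dict(ko_to_en)


# Made-up sample data, including one duplicate and one malformed line.
sample = ["dog\tn\t개", "dog\tn\t개", "run\tv,n\t달리다", "broken line"]
en_to_ko, ko_to_en = build_indexes(sample)
```

Keeping a reverse index keyed on the Korean side is one simple way to make a human-oriented English-Korean list usable bidirectionally; a real conversion would also need to normalize the Korean explanations, which this sketch leaves untouched.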