Automatic resolution of target word ambiguity
Selecting the right word translation among several options in the lexicon is a core problem for machine translation. It is not enough that a word in context is translated, but an appropriate translation must be considered. An automated approach is presented here for resolving target word selection,...
Main Author: | Domingo, Ebony C. |
---|---|
Format: | text |
Language: | English |
Published: | Animo Repository 2005 |
Subjects: | Machine translating; Information theory; Computer Sciences |
Online Access: | https://animorepository.dlsu.edu.ph/etd_masteral/3306 https://animorepository.dlsu.edu.ph/cgi/viewcontent.cgi?article=10144&context=etd_masteral |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:etd_masteral-10144 |
---|---|
record_format |
eprints |
spelling |
oai:animorepository.dlsu.edu.ph:etd_masteral-10144; 2022-06-01T02:34:30Z; Automatic resolution of target word ambiguity; Domingo, Ebony C.; 2005-12-01T08:00:00Z; text; application/pdf; https://animorepository.dlsu.edu.ph/etd_masteral/3306; https://animorepository.dlsu.edu.ph/cgi/viewcontent.cgi?article=10144&context=etd_masteral; Master's Theses; English; Animo Repository; Machine translating; Information theory; Computer Sciences |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
language |
English |
topic |
Machine translating; Information theory; Computer Sciences |
spellingShingle |
Machine translating; Information theory; Computer Sciences; Domingo, Ebony C.; Automatic resolution of target word ambiguity |
description |
Selecting the right translation of a word from among several options in the lexicon is a core problem in machine translation. It is not enough that a word in context is translated; an appropriate translation must be chosen. An automated approach is presented here for target word selection, based on the word-to-sense and sense-to-word relationships between source words and their translations, utilizing syntactic relationships (subject-verb, verb-object, adjective-noun). Translation selection proceeds in two steps: sense disambiguation of source words, based on knowledge from a bilingual dictionary and word similarity measures from WordNet, followed by selection of a target word using statistics from a target language corpus. The system was tested on 145,746 word pairs in syntactic relationships, extracted from target corpora gathered from various online editorials, Tagalog readings, and the Tagalog New Testament, with a total of 317,113 words. A sense profile with 2,681 entries for source words was built from an existing bilingual dictionary that includes clues for disambiguation and target translations. A test on 200 sentences with ambiguous words (an average of 4 senses) in three categories (nouns, verbs, and adjectives) produced an overall accuracy of 63.89% for selecting word translations, with a standardized precision of at least 80% for generating expected translations across those categories. Adding reliable clues for sense disambiguation and applying smoothing techniques could further improve the overall performance of the method. The words produced by the system are root words; integrating morphological generation into a machine translation system could produce more fluent translations. In addition, the method developed here can be extended to accommodate the translation of other content words as well as other syntactic categories, and it can be improved to support bidirectional translation (Tagalog to English). |
format |
text |
author |
Domingo, Ebony C. |
title |
Automatic resolution of target word ambiguity |
publisher |
Animo Repository |
publishDate |
2005 |
url |
https://animorepository.dlsu.edu.ph/etd_masteral/3306 https://animorepository.dlsu.edu.ph/cgi/viewcontent.cgi?article=10144&context=etd_masteral |
_version_ |
1736864002347106304 |
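
The two-step selection outlined in the description (sense disambiguation of the source word, then target word selection from corpus statistics) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the sense profile entries, the similarity scores (a hand-set table standing in for a WordNet similarity measure), the pair counts, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Toy sense profile (hypothetical entries): each sense of a source word lists
# disambiguation clues and its candidate target (Tagalog) translations.
SENSE_PROFILE = {
    "light": [
        {"clues": {"lamp", "sun"}, "translations": ["maliwanag", "maningning"]},
        {"clues": {"feather", "weight"}, "translations": ["magaan"]},
    ],
}

# Stand-in for a WordNet similarity measure: hand-set scores in [0, 1].
# These values are assumptions, not output from WordNet.
TOY_SIMILARITY = {
    frozenset({"room", "lamp"}): 0.6,
    frozenset({"room", "sun"}): 0.3,
    frozenset({"bag", "weight"}): 0.7,
    frozenset({"bag", "feather"}): 0.2,
}

# Made-up counts of (adjective, noun) pairs in a target-language corpus.
TARGET_PAIR_COUNTS = Counter({
    ("maliwanag", "silid"): 12,
    ("maningning", "silid"): 2,
    ("magaan", "bag"): 9,
})


def similarity(word_a, word_b):
    """Return the toy similarity of two words (1.0 if identical, 0.0 if unknown)."""
    if word_a == word_b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset({word_a, word_b}), 0.0)


def disambiguate(source_word, context_word):
    """Step 1: pick the sense whose best clue is most similar to the context word."""
    senses = SENSE_PROFILE[source_word]
    return max(
        senses,
        key=lambda sense: max(similarity(context_word, clue) for clue in sense["clues"]),
    )


def select_translation(source_word, context_word, context_translation):
    """Step 2: among the chosen sense's translations, pick the one that co-occurs
    most often with the translated context word in the target corpus counts."""
    sense = disambiguate(source_word, context_word)
    return max(
        sense["translations"],
        key=lambda target: TARGET_PAIR_COUNTS[(target, context_translation)],
    )


if __name__ == "__main__":
    print(select_translation("light", "room", "silid"))
    print(select_translation("light", "bag", "bag"))
```

With these toy tables, "light" in the adjective-noun pair with "room" resolves to the brightness sense and then to the translation that pairs most often with "silid", while "light" with "bag" resolves to the weight sense. A real system would replace the similarity table with an actual WordNet measure and would likely smooth the corpus counts, as the description itself suggests.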