Text translation: Template extraction for a bidirectional English-Filipino example-based machine translation

A bidirectional English-Filipino Example-based Machine Translation System that learns and uses templates is presented. The system uses machine learning techniques to initially extract templates from a given bilingual corpus. These templates are subsequently used for translating English input text in...

Bibliographic Details
Main Authors: Go, Kathleen L., Morga, Manimin R., Nunez, Vince Andrew D., Veto, Francis Germiline S.
Format: text
Language: English
Published: Animo Repository 2006
Subjects: Computer Sciences
Online Access: https://animorepository.dlsu.edu.ph/etd_bachelors/14396
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:etd_bachelors-15038
record_format eprints
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
language English
topic Computer Sciences
description A bidirectional English-Filipino example-based machine translation system that learns and uses templates is presented. The system uses machine learning techniques to first extract templates from a given bilingual corpus. These templates are subsequently used to translate English input text into Filipino and vice versa. The system implements the similarity template learning algorithm of Cicekli et al. (2001) but goes further by introducing template refinement and the derivation of templates from learned chunks. To improve translation quality, new chunk alignment and splitting algorithms are introduced into the training process, while a flexible template and chunk matching scheme is established for translation. Test results verify that a strict chunk alignment scheme is needed in training and that specific words, such as commonly occurring words, need to be filtered out to produce better templates. These measures improve overall quality by assuring complete template and chunk correctness in training and by reducing word and sentence error rates in translation by as much as half. Tests also show that the translation with the highest score among the various candidates is consistently the best choice, as checked against automatic evaluation methods. Still, much of the system implementation is limited by the quality and coverage of the lexicon and morphological references, which are patterned after those of TWiRL, a rule-based machine translator. This research is part of a three-year project on hybrid machine translation funded by the Philippine Council for Advanced Science and Technology Research and Development of the Department of Science and Technology (DOST-PCASTRD).
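The similarity-based template learning idea mentioned in the description can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes that two translation examples differ in exactly one contiguous span on each language side, replaces that span with a variable, and keeps the differing parts as aligned chunks. The function names (`single_difference`, `learn_template`), the variable name `X1`, and the two example sentence pairs are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def single_difference(a, b):
    """Return (i1, i2, j1, j2) if token lists a and b differ in exactly one
    non-empty span on each side; otherwise return None."""
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    diffs = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    # Only the simplest case is handled: one "replace" block.
    if len(diffs) != 1 or diffs[0][0] != "replace":
        return None
    _, i1, i2, j1, j2 = diffs[0]
    return i1, i2, j1, j2


def learn_template(example1, example2, var="X1"):
    """Derive one template and two aligned chunks from two translation
    examples, each given as (source_tokens, target_tokens).

    Assumption: the single differing source span corresponds to the single
    differing target span. Returns None when that assumption cannot hold.
    """
    (src1, tgt1), (src2, tgt2) = example1, example2
    src_diff = single_difference(src1, src2)
    tgt_diff = single_difference(tgt1, tgt2)
    if src_diff is None or tgt_diff is None:
        return None
    si1, si2, sj1, sj2 = src_diff
    ti1, ti2, tj1, tj2 = tgt_diff
    # Shared tokens stay literal; the differing span becomes a variable.
    template = (
        src1[:si1] + [var] + src1[si2:],
        tgt1[:ti1] + [var] + tgt1[ti2:],
    )
    # The differing spans are kept as aligned source/target chunks.
    chunks = [
        (src1[si1:si2], tgt1[ti1:ti2]),
        (src2[sj1:sj2], tgt2[tj1:tj2]),
    ]
    return template, chunks


# Invented example pairs (English, Filipino), for illustration only.
ex1 = ("I bought a book".split(), "Bumili ako ng libro".split())
ex2 = ("I bought a pencil".split(), "Bumili ako ng lapis".split())
result = learn_template(ex1, ex2)
print(result)
```

For these two invented pairs, the sketch produces the template "I bought a X1" paired with "Bumili ako ng X1", together with the chunks book/libro and pencil/lapis. The template refinement, chunk splitting, and common-word filtering described in the abstract are not modeled here.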
format text
author Go, Kathleen L.
Morga, Manimin R.
Nunez, Vince Andrew D.
Veto, Francis Germiline S.
title Text translation: Template extraction for a bidirectional English-Filipino example-based machine translation
publisher Animo Repository
publishDate 2006
url https://animorepository.dlsu.edu.ph/etd_bachelors/14396
_version_ 1718383308911411200