Integrated linguistic to Statistical Machine Translation

In the field of Natural Language Processing, machine translation is an attractive application that helps users translate sentences from one language into another. Today, phrase-based Statistical Machine Translation is the state of the art, with benefits in word choice, distor...

Bibliographic Details
Main Author:	Vương, Hoài Thu
Format:	Theses and Dissertations
Language:	other
Published:	Đại học Quốc gia Hà Nội 2016
Subjects:	Computer science; Natural language processing; Linguistic information; Machine translation
Online Access:	http://repository.vnu.edu.vn/handle/VNU_123/8256
Institution:	Vietnam National University, Hanoi

id	oai:112.137.131.14:VNU_123-8256
record_format	dspace
spelling	oai:112.137.131.14:VNU_123-8256 2016-04-13T20:02:03Z 2016-04-13T07:17:58Z 2016-04-13T07:17:58Z 2012 Thesis 7 pages http://repository.vnu.edu.vn/handle/VNU_123/8256 other application/pdf Đại học Quốc gia Hà Nội
institution	Vietnam National University, Hanoi
building	VNU Library & Information Center
country	Vietnam
collection	VNU Digital Repository
language	other
topic	Computer science; Natural language processing; Linguistic information; Machine translation
description	In the field of Natural Language Processing, machine translation is an attractive application that helps users translate sentences from one language into another. Today, phrase-based Statistical Machine Translation is the state of the art, with benefits in word choice and in distortion based on the distance between words. However, its global distortion model still has problems with language pairs whose word orders differ over long distances. Some previous studies have used linguistic information such as syntax trees, morphological information, or hierarchical phrases. Similarly, we use a syntax tree to help the distortion model. However, instead of a full parse tree, we use a shallow syntax tree (one whose height is limited). Using transformation rules, we rearrange some nodes of the shallow syntax tree and thereby reorder the words in the sentence. The distinguishing point of our study is that the transformation rules are applied to the source-language sentence to produce a new sentence whose word order is similar to that of the target language, as a preprocessing step before training the translation model or decoding with beam search and a log-linear model. Experimental results on an English-Vietnamese pair show that our approach achieves significant improvements over MOSES, a state-of-the-art phrase-based system.
format	Theses and Dissertations
author	Vương, Hoài Thu
title	Integrated linguistic to Statistical Machine Translation
publisher	Đại học Quốc gia Hà Nội
publishDate	2016
url	http://repository.vnu.edu.vn/handle/VNU_123/8256
_version_	1680967905553940480
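
The preprocessing idea in the description (reorder the source sentence with transformation rules over a shallow syntax tree before training or decoding) can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual rule set: the chunk representation, the function names `reorder_np` and `reorder_sentence`, and the single rule (move adjectives after the noun inside a noun phrase, as Vietnamese word order usually requires) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Token = Tuple[str, str]            # (word, part-of-speech tag)
Chunk = Tuple[str, List[Token]]    # (chunk label, tokens): one node of a height-limited tree


def reorder_np(tokens: List[Token]) -> List[Token]:
    """Hypothetical rule: inside an NP, place adjectives (JJ) after the last noun."""
    adjectives = [t for t in tokens if t[1] == "JJ"]
    others = [t for t in tokens if t[1] != "JJ"]
    last_noun = max(
        (i for i, t in enumerate(others) if t[1].startswith("NN")),
        default=None,
    )
    if last_noun is None or not adjectives:
        return list(tokens)  # rule does not apply; keep the original order
    return others[: last_noun + 1] + adjectives + others[last_noun + 1 :]


# Assumed rule table: chunk label -> transformation applied to that chunk.
RULES: Dict[str, Callable[[List[Token]], List[Token]]] = {"NP": reorder_np}


def reorder_sentence(chunks: List[Chunk]) -> List[str]:
    """Apply the rules chunk by chunk and return the reordered source words."""
    words: List[str] = []
    for label, tokens in chunks:
        rule = RULES.get(label)
        new_tokens = rule(tokens) if rule else tokens
        words.extend(word for word, _ in new_tokens)
    return words


example: List[Chunk] = [
    ("NP", [("I", "PRP")]),
    ("VP", [("bought", "VBD")]),
    ("NP", [("a", "DT"), ("red", "JJ"), ("book", "NN")]),
]
print(" ".join(reorder_sentence(example)))  # I bought a book red
```

In a pipeline of this kind, the reordered source text would replace the original source side of the parallel corpus before the phrase-based system is trained, and the same rules would be applied to each input sentence before decoding.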

Similar Items