Improving the quality of statistical machine translation systems by using monolingual corpora in the source language
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Published: | ĐHCN, 2017 |
Subjects: | |
Online Access: | http://repository.vnu.edu.vn/handle/VNU_123/43272 |
Institution: | Vietnam National University, Hanoi |
Summary: | Statistical machine translation has attracted broad interest from researchers thanks to its advantages. However, statistical approaches constantly run up against a shortage of parallel and domain-specific corpora. Building these corpora requires intensive human effort and the availability of experts. Unfortunately, only a few widely used languages receive continuous financial support and research attention for the development of machine translation systems. For most of the remaining languages, very little funding or interest is available, which makes applying statistical approaches to them a major obstacle. The purpose of this thesis is to propose a method for utilizing unannotated corpora to address this impediment. |