Neural machine translation for discourse phenomena
In recent years, Neural Machine Translation (NMT) has received increasing interest in the natural language processing (NLP) field and has achieved state-of-the-art results on numerous tasks. In this final year project, we explore how NMT models can be adapted to handle discourse phenomena in machine t...
Main Author: | Shen, Youlin |
---|---|
Other Authors: | Joty Shafiq Rayhan |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/144540 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-144540 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-144540 2020-11-19T00:34:09Z Neural machine translation for discourse phenomena Shen, Youlin Joty Shafiq Rayhan School of Computer Science and Engineering srjoty@ntu.edu.sg Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2020-11-11T07:33:26Z 2020-11-11T07:33:26Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/144540 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering |
spellingShingle |
Engineering::Computer science and engineering Shen, Youlin Neural machine translation for discourse phenomena |
description |
In recent years, Neural Machine Translation (NMT) has received increasing interest in the natural language processing (NLP) field and has achieved state-of-the-art results on numerous tasks.
In this final year project, we explore how NMT models can be adapted to handle discourse phenomena in machine translation and how to properly evaluate a model’s performance.
Current NMT models are mainly sentence-level systems, and their performance has improved to the point of reported human parity [2]. However, when model outputs are evaluated together at the document level rather than as individual sentences, human evaluators show a strong preference for professionally translated text over machine translation [3]. This indicates that sentence-level NMT systems may produce good translations of isolated sentences, but when put into context, these individual translations can contradict each other. In other words, sentence-level models perform poorly at maintaining discourse phenomena.
In this project, we apply Docrepair [1], a context-aware NMT model, to tackle the discourse phenomena problem. Docrepair is a monolingual document-level model for correcting the ‘translationese’ produced in the translation process; that is, it performs automatic post-editing on a sequence of sentence-level translations and refines them using one another as context. The model aims to map inconsistent groups of sentences into more natural ones from a human point of view.
Furthermore, most NMT models are evaluated with standard automatic metrics such as BLEU. These metrics are poorly suited to evaluating a model’s performance on discourse phenomena. To give researchers in this field more evidence about the quality of system output, Jwala et al. proposed a comprehensive benchmark framework for evaluating discourse phenomena [4]. The benchmarking framework checks four discourse phenomena: anaphora, lexical consistency, coherence, and readability.
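The point about n-gram overlap metrics can be illustrated with a minimal sketch. The code below is not the benchmark from [4] and not a full BLEU implementation; it computes only clipped n-gram precision, the per-order building block of BLEU, on a made-up two-sentence example. The sentences and the function names `ngrams` and `clipped_precision` are assumptions introduced purely for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Clipped n-gram precision of a hypothesis against one reference.

    Each hypothesis n-gram is credited at most as many times as it
    occurs in the reference. Inputs are whitespace-tokenised strings.
    """
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return matched / total


# Hypothetical two-sentence "document" (illustrative data only).
reference = "the actress arrived late . she apologised to the crew ."
# Harmless synonym swap: the discourse stays coherent.
consistent = "the actress arrived late . she apologised to the team ."
# Pronoun error: the anaphoric link to "the actress" is broken.
wrong_pronoun = "the actress arrived late . he apologised to the crew ."

for n in (1, 2):
    print(n,
          clipped_precision(consistent, reference, n),
          clipped_precision(wrong_pronoun, reference, n))
```

In this toy case both hypotheses differ from the reference by a single token, so they receive identical unigram and bigram precision, even though only one of them contains an anaphora error. This is the kind of blind spot that motivates a dedicated discourse benchmark.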
In this project, we build an online application, DiscourseGym, that makes the benchmark framework available to all researchers in the NLP field. The application features test set downloads with various filters and options, automatic evaluation of model output, and enhanced visualizations that give users more information about their model’s performance. DiscourseGym also features a leaderboard on which researchers can compare model performance, and it allows community contributions through user-driven test sets. |
author2 |
Joty Shafiq Rayhan |
author_facet |
Joty Shafiq Rayhan Shen, Youlin |
format |
Final Year Project |
author |
Shen, Youlin |
author_sort |
Shen, Youlin |
title |
Neural machine translation for discourse phenomena |
publisher |
Nanyang Technological University |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/144540 |
_version_ |
1688665590103801856 |