Order-agnostic cross entropy for non-autoregressive machine translation
We propose a new training objective named order-agnostic cross entropy (OAXE) for fully non-autoregressive translation (NAT) models. OAXE improves the standard cross-entropy loss to ameliorate the effect of word reordering, which is a common source of the critical multimodality problem in NAT. Concret...
Main Authors: | DU, Cunxiao; TU, Zhaopeng; JIANG, Jing |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University 2021 |
Subjects: | Artificial Intelligence and Robotics |
Online Access: | https://ink.library.smu.edu.sg/sis_research/6660 https://ink.library.smu.edu.sg/context/sis_research/article/7663/viewcontent/du21c.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-7663 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-7663 2022-01-13T09:36:34Z 2021-07-01T07:00:00Z text application/pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
topic |
Artificial Intelligence and Robotics |
description |
We propose a new training objective named order-agnostic cross entropy (OAXE) for fully non-autoregressive translation (NAT) models. OAXE improves the standard cross-entropy loss to ameliorate the effect of word reordering, which is a common source of the critical multimodality problem in NAT. Concretely, OAXE removes the penalty for word order errors, and computes the cross-entropy loss based on the best possible alignment between model predictions and target tokens. Since the log loss is very sensitive to invalid references, we leverage cross-entropy initialization and loss truncation to ensure the model focuses on a good part of the search space. Extensive experiments on major WMT benchmarks show that OAXE substantially improves translation performance, setting a new state of the art for fully NAT models. Further analyses show that OAXE alleviates the multimodality problem by reducing token repetitions and increasing prediction confidence. Our code, data, and trained models are available at https://github.com/tencent-ailab/ICML21_OAXE. |
format |
text |
author |
DU, Cunxiao TU, Zhaopeng JIANG, Jing |
author_sort |
DU, Cunxiao |
title |
Order-agnostic cross entropy for non-autoregressive machine translation |
_version_ |
1770576018865651712 |
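
The description field states that OAXE computes the cross-entropy loss on the best possible alignment between model predictions and target tokens. The toy sketch below illustrates that idea only; it is not the authors' implementation (their code is at the repository linked in the description). The function names, the three-token vocabulary, and the probabilities are all hypothetical, and the brute-force search over permutations is an assumption made for readability: realistic sentence lengths would need a polynomial-time assignment solver. Cross-entropy initialization and loss truncation, also mentioned in the description, are left out.

```python
import itertools
import math


def cross_entropy(log_probs, target):
    """Standard cross entropy: position i is scored against target[i]."""
    return -sum(log_probs[i][tok] for i, tok in enumerate(target))


def order_agnostic_cross_entropy(log_probs, target):
    """Lowest cross entropy over all one-to-one alignments (brute force).

    Returns (loss, order), where order[i] is the index of the target
    token assigned to output position i.
    """
    n = len(target)
    best_loss, best_order = math.inf, None
    for order in itertools.permutations(range(n)):
        loss = -sum(log_probs[i][target[order[i]]] for i in range(n))
        if loss < best_loss:
            best_loss, best_order = loss, order
    return best_loss, best_order


# Hypothetical model output over a 3-token vocabulary {0, 1, 2}.
# The model is confident, but emits the target tokens in a rotated order.
probs = [
    [0.1, 0.8, 0.1],  # position 0 prefers token 1
    [0.1, 0.1, 0.8],  # position 1 prefers token 2
    [0.8, 0.1, 0.1],  # position 2 prefers token 0
]
log_probs = [[math.log(p) for p in row] for row in probs]
target = [0, 1, 2]

xe_loss = cross_entropy(log_probs, target)
oaxe_loss, best_order = order_agnostic_cross_entropy(log_probs, target)

print(f"standard XE: {xe_loss:.4f}")
print(f"order-agnostic XE: {oaxe_loss:.4f} with alignment {best_order}")
```

In this example the standard loss penalizes every position, while the order-agnostic loss picks the alignment that matches each position to the token it already predicts with high probability, so the word-order mismatch itself carries no penalty.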