Order-agnostic cross entropy for non-autoregressive machine translation

We propose a new training objective named order-agnostic cross entropy (OAXE) for fully non-autoregressive translation (NAT) models. OAXE improves the standard cross-entropy loss to ameliorate the effect of word reordering, which is a common source of the critical multimodality problem in NAT. Concret...

Bibliographic Details
Main Authors: DU, Cunxiao, TU, Zhaopeng, JIANG, Jing
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects: Artificial Intelligence and Robotics
Online Access: https://ink.library.smu.edu.sg/sis_research/6660
https://ink.library.smu.edu.sg/context/sis_research/article/7663/viewcontent/du21c.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7663
record_format dspace
spelling sg-smu-ink.sis_research-7663 2022-01-13T09:36:34Z 2021-07-01T07:00:00Z text application/pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
description We propose a new training objective named order-agnostic cross entropy (OAXE) for fully non-autoregressive translation (NAT) models. OAXE improves the standard cross-entropy loss to ameliorate the effect of word reordering, which is a common source of the critical multimodality problem in NAT. Concretely, OAXE removes the penalty for word order errors, and computes the cross-entropy loss based on the best possible alignment between model predictions and target tokens. Since the log loss is very sensitive to invalid references, we leverage cross-entropy initialization and loss truncation to ensure the model focuses on a good part of the search space. Extensive experiments on major WMT benchmarks show that OAXE substantially improves translation performance, setting a new state of the art for fully NAT models. Further analyses show that OAXE alleviates the multimodality problem by reducing token repetitions and increasing prediction confidence. Our code, data, and trained models are available at https://github.com/tencent-ailab/ICML21_OAXE.
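
The idea of scoring predictions against the best possible alignment with the target tokens can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy probabilities, and the brute-force search over permutations are assumptions made for illustration, and the cross-entropy initialization and loss truncation mentioned in the abstract are not shown. Brute force is only feasible for very short sequences; a practical version would need an efficient assignment solver such as the Hungarian algorithm.

```python
import itertools
import math


def cross_entropy(log_probs, target):
    """Standard cross entropy: target[i] is scored at position i."""
    return -sum(log_probs[pos][tok] for pos, tok in enumerate(target))


def order_agnostic_cross_entropy(log_probs, target):
    """Lowest cross entropy over all orderings of the target tokens.

    Hypothetical helper: tries every permutation, so it is only
    usable for toy examples with a handful of tokens.
    """
    best_loss = math.inf
    best_perm = None
    for perm in itertools.permutations(range(len(target))):
        # Position `pos` is matched with target token target[j].
        loss = -sum(log_probs[pos][target[j]] for pos, j in enumerate(perm))
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, [target[j] for j in best_perm]


# Toy model output over a 3-token vocabulary (made-up numbers).
# The model predicts the right words, but with the first two swapped.
probs = [
    [0.1, 0.8, 0.1],  # position 0 favours token 1
    [0.8, 0.1, 0.1],  # position 1 favours token 0
    [0.1, 0.1, 0.8],  # position 2 favours token 2
]
log_probs = [[math.log(p) for p in row] for row in probs]
target = [0, 1, 2]

xe = cross_entropy(log_probs, target)
oaxe, order = order_agnostic_cross_entropy(log_probs, target)
print(f"XE = {xe:.3f}, OAXE = {oaxe:.3f}, best ordering = {order}")
```

In this toy case the position-wise loss penalizes the swapped words heavily, while the order-agnostic loss matches each position to the token it already favours and so drops the word-order penalty.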
format text
author DU, Cunxiao
TU, Zhaopeng
JIANG, Jing
title Order-agnostic cross entropy for non-autoregressive machine translation
publisher Institutional Knowledge at Singapore Management University
publishDate 2021
url https://ink.library.smu.edu.sg/sis_research/6660
https://ink.library.smu.edu.sg/context/sis_research/article/7663/viewcontent/du21c.pdf
_version_ 1770576018865651712