ADVERSARIAL ATTACK ON NEURAL MACHINE TRANSLATION MODELS AS DISCRETE OPTIMIZATION

Bibliographic Details
Main Author: Kuwanto, Garry
Format: Final Project
Language: Indonesian
Online Access:https://digilib.itb.ac.id/gdl/view/65291
Institution: Institut Teknologi Bandung
Description
Summary: Machine translation models are commonly trained only on corpora with perfect grammar. Training exclusively on such corpora may predispose a model to be biased against minorities from non-standard linguistic backgrounds. Adversarial attacks are one way to protect machine learning models from this bias. In this final project, we perform an adversarial attack by perturbing words into different morphological forms. This perturbation crafts adversarial examples with imperfect grammar for English sentences. Because choosing a morphological form for each word amounts to finding an optimal choice in a discrete search space, we frame the attack as a discrete optimization problem. We focus on three discrete optimization methods: a Genetic Algorithm, Simulated Annealing, and a modified version of a reinforcement learning algorithm, REINFORCE. Compared with the greedy algorithm, a commonly used baseline, our results show little difference between the sentences crafted by these methods. In terms of efficiency, Simulated Annealing is significantly faster: about three times faster than the slowest algorithm (the Genetic Algorithm) and about twice as fast as the second-best algorithm (REINFORCE).
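
For illustration, the sketch below shows how simulated annealing, the fastest of the compared methods, could search the discrete space of per-word morphological choices. This is a minimal sketch, not the project's actual implementation: the candidates table, the attack_score objective (a stand-in for querying the target translation model and measuring how much its output degrades), and all parameter values are assumptions introduced here.

import math
import random

# Hypothetical table of morphological variants per word position.
# In practice these would come from a morphological analyzer/generator.
candidates = [
    ["walk", "walks", "walked", "walking"],
    ["quick", "quickly", "quicker"],
]

def attack_score(choice):
    """Placeholder objective: how strongly the perturbed sentence degrades
    the target translation model (e.g., drop in translation quality).
    Replace with a real model query and metric."""
    return random.random()  # stand-in for querying the NMT model

def simulated_annealing(candidates, steps=1000, t0=1.0, cooling=0.995):
    # Start from an arbitrary assignment: the first form at each position.
    current = [0] * len(candidates)
    current_score = attack_score(current)
    best, best_score = current[:], current_score
    t = t0
    for _ in range(steps):
        # Propose a neighbor: swap the form of one randomly chosen word.
        i = random.randrange(len(candidates))
        proposal = current[:]
        proposal[i] = random.randrange(len(candidates[i]))
        score = attack_score(proposal)
        # Always accept improvements; accept worse moves with
        # probability exp(delta / t), so the search can escape local optima.
        delta = score - current_score
        if delta > 0 or random.random() < math.exp(delta / t):
            current, current_score = proposal, score
            if current_score > best_score:
                best, best_score = current[:], current_score
        t *= cooling  # geometric cooling schedule
    # Map the best index assignment back to surface word forms.
    return [candidates[i][j] for i, j in enumerate(best)], best_score

The acceptance rule is what distinguishes this search from a greedy one: early on, while the temperature is high, occasionally worse perturbations are accepted, letting the search move past local optima that would trap a method which only ever takes the best immediate swap.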