Adversarial training using meta-learning for BERT

Bibliographic Details
Main Author: Low, Timothy Jing Haen
Other Authors: Joty Shafiq Rayhan
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2022
Subjects:
Online Access: https://hdl.handle.net/10356/156635
Description
Abstract: Deep learning is currently the most successful method of semantic analysis in natural language processing. However, in recent years, many variants of carefully crafted inputs designed to cause misclassification, known as adversarial attacks, have been engineered with tremendous success. One well-known, efficient method for making models robust against adversarial attacks is adversarial training, in which models are iteratively trained on samples produced by a specific attack algorithm. However, adversarial training only works when the model has access to the attack generation algorithm or a large dataset of attack samples, and so it cannot defend against attacks for which only a few samples are available. This project proposes to overcome this challenge using meta-learning, which uses a large number of similar tasks from a different domain to train a classifier to learn another task for which only a small number of labelled samples are available. We show that by using the Model-Agnostic Meta-Learning algorithm in adversarial training, a model trained on a large number of different adversarial attacks can become more robust to an adversarial attack of which it has seen only a few samples. This project also explores augmenting the training set with a large number of non-adversarial perturbations, in order to potentially mitigate adversarial attacks more effectively.
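
The abstract describes applying Model-Agnostic Meta-Learning (MAML) to adversarial training, treating each adversarial attack type as one meta-learning task. As a rough illustration only, the following PyTorch sketch shows a first-order MAML (FOMAML) meta-update in that spirit; the toy classifier standing in for BERT, the task construction, and all hyperparameters are assumptions made for this sketch, not the project's actual setup.

import copy
import torch
import torch.nn as nn

def fomaml_adversarial_step(model, tasks, inner_lr=1e-2, inner_steps=1):
    # One first-order MAML meta-gradient over a batch of "tasks", where
    # each task holds a few samples of one adversarial attack:
    # (support_x, support_y, query_x, query_y).
    criterion = nn.CrossEntropyLoss()
    for p in model.parameters():
        p.grad = torch.zeros_like(p)

    for support_x, support_y, query_x, query_y in tasks:
        # Clone the shared initialization so inner-loop adaptation
        # does not modify it directly.
        learner = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)

        # Inner loop: adapt to this attack's few support samples.
        for _ in range(inner_steps):
            inner_opt.zero_grad()
            criterion(learner(support_x), support_y).backward()
            inner_opt.step()

        # Outer loss on held-out query samples of the same attack; in
        # first-order MAML the query-set gradient at the adapted weights
        # approximates the meta-gradient w.r.t. the initialization.
        learner.zero_grad()
        criterion(learner(query_x), query_y).backward()
        for p, lp in zip(model.parameters(), learner.parameters()):
            p.grad += lp.grad / len(tasks)

# Toy usage with a small classifier standing in for BERT.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def toy_task():
    # Random stand-in for (support, query) samples of one attack type.
    return (torch.randn(8, 16), torch.randint(0, 2, (8,)),
            torch.randn(8, 16), torch.randint(0, 2, (8,)))

for _ in range(10):  # meta-training iterations
    fomaml_adversarial_step(model, [toy_task() for _ in range(4)])
    meta_opt.step()

The first-order variant is used here only to keep the sketch short and free of second-order gradients; the project itself may well use full MAML, which backpropagates the query loss through the inner-loop updates.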