Learning transferable perturbations for image captioning
Recent studies have shown that state-of-the-art deep learning models can be attacked by small but carefully designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and the adversarial examples they generate transfer poorly to other models. To generate...
Main Authors: | WU, Hanjie; LIU, Yongtuo; CAI, Hongmin; HE, Shengfeng |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2022 |
Subjects: | Adversarial example; Generative model; Image caption; Image captioning; Image features; Learn+; Learning models; Neural-networks; Robustness of neural network; State of the art; Databases and Information Systems; Theory and Algorithms |
Online Access: | https://ink.library.smu.edu.sg/sis_research/8371 https://ink.library.smu.edu.sg/context/sis_research/article/9374/viewcontent/Learning_Transferable_Perturbations_for_Image_Captioning.pdf |
Institution: | Singapore Management University |
id | sg-smu-ink.sis_research-9374 |
---|---|
record_format | dspace |
spelling | sg-smu-ink.sis_research-9374 2023-12-13T02:51:05Z Learning transferable perturbations for image captioning WU, Hanjie; LIU, Yongtuo; CAI, Hongmin; HE, Shengfeng Recent studies have shown that state-of-the-art deep learning models can be attacked by small but carefully designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and the adversarial examples they generate transfer poorly to other models. To generate adversarial examples faster and make them stronger, we propose to learn the perturbations with a generative model governed by three novel loss functions. In the image domain, an image feature distortion loss maximizes the distance between the encoded features of the original images and those of the corresponding adversarial examples. Across the image and caption domains, a local-global mismatching loss pushes the encoded representations of the adversarial images and of the ground-truth captions as far apart as possible in the common semantic space, from both a local and a global perspective. In the language domain, a language diversity loss makes the captions generated for the adversarial examples as different as possible from the correct captions. Extensive experiments show that the proposed generative model efficiently generates adversarial examples that generalize to attack image captioning models trained on unseen large-scale datasets or with different architectures, and even a commercial image captioning service. 2022-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8371 info:doi/10.1145/3478024 https://ink.library.smu.edu.sg/context/sis_research/article/9374/viewcontent/Learning_Transferable_Perturbations_for_Image_Captioning.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Adversarial example; Generative model; Image caption; Image captioning; Image features; Learn+; Learning models; Neural-networks; Robustness of neural network; State of the art; Databases and Information Systems; Theory and Algorithms |
institution | Singapore Management University |
building | SMU Libraries |
continent | Asia |
country | Singapore |
content_provider | SMU Libraries |
collection | InK@SMU |
language | English |
topic | Adversarial example; Generative model; Image caption; Image captioning; Image features; Learn+; Learning models; Neural-networks; Robustness of neural network; State of the art; Databases and Information Systems; Theory and Algorithms |
spellingShingle | Adversarial example; Generative model; Image caption; Image captioning; Image features; Learn+; Learning models; Neural-networks; Robustness of neural network; State of the art; Databases and Information Systems; Theory and Algorithms; WU, Hanjie; LIU, Yongtuo; CAI, Hongmin; HE, Shengfeng; Learning transferable perturbations for image captioning |
description | Recent studies have shown that state-of-the-art deep learning models can be attacked by small but carefully designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and the adversarial examples they generate transfer poorly to other models. To generate adversarial examples faster and make them stronger, we propose to learn the perturbations with a generative model governed by three novel loss functions. In the image domain, an image feature distortion loss maximizes the distance between the encoded features of the original images and those of the corresponding adversarial examples. Across the image and caption domains, a local-global mismatching loss pushes the encoded representations of the adversarial images and of the ground-truth captions as far apart as possible in the common semantic space, from both a local and a global perspective. In the language domain, a language diversity loss makes the captions generated for the adversarial examples as different as possible from the correct captions. Extensive experiments show that the proposed generative model efficiently generates adversarial examples that generalize to attack image captioning models trained on unseen large-scale datasets or with different architectures, and even a commercial image captioning service. |
format | text |
author | WU, Hanjie; LIU, Yongtuo; CAI, Hongmin; HE, Shengfeng |
author_facet | WU, Hanjie; LIU, Yongtuo; CAI, Hongmin; HE, Shengfeng |
author_sort | WU, Hanjie |
title | Learning transferable perturbations for image captioning |
title_short | Learning transferable perturbations for image captioning |
title_full | Learning transferable perturbations for image captioning |
title_fullStr | Learning transferable perturbations for image captioning |
title_full_unstemmed | Learning transferable perturbations for image captioning |
title_sort | learning transferable perturbations for image captioning |
publisher | Institutional Knowledge at Singapore Management University |
publishDate | 2022 |
url | https://ink.library.smu.edu.sg/sis_research/8371 https://ink.library.smu.edu.sg/context/sis_research/article/9374/viewcontent/Learning_Transferable_Perturbations_for_Image_Captioning.pdf |
_version_ | 1787136844886966272 |
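
The description above characterizes the attack through three loss terms: an image feature distortion loss in the image domain, a local-global mismatching loss across the image and caption domains, and a language diversity loss in the language domain. The snippet below is a minimal, illustrative sketch of how such terms could be written in PyTorch; every function name, tensor shape, and encoder output used here is an assumption made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the three loss terms named in the abstract.
# All shapes and names are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F


def image_feature_distortion_loss(feat_clean: torch.Tensor, feat_adv: torch.Tensor) -> torch.Tensor:
    # Image domain: push adversarial image features away from the clean ones.
    # Returning the negative MSE means minimizing this loss maximizes the distance.
    return -F.mse_loss(feat_adv, feat_clean)


def local_global_mismatching_loss(adv_region_feats: torch.Tensor,
                                  adv_global_feat: torch.Tensor,
                                  cap_word_embs: torch.Tensor,
                                  cap_global_emb: torch.Tensor) -> torch.Tensor:
    # Cross domain: separate adversarial-image and ground-truth-caption
    # representations in a shared semantic space, locally and globally.
    # Assumed shapes: region features (B, R, D), word embeddings (B, T, D),
    # global vectors (B, D), all already projected into the same space.
    local_sim = F.cosine_similarity(adv_region_feats.mean(dim=1),
                                    cap_word_embs.mean(dim=1), dim=-1)
    global_sim = F.cosine_similarity(adv_global_feat, cap_global_emb, dim=-1)
    # Minimizing the summed similarities drives the two modalities apart.
    return (local_sim + global_sim).mean()


def language_diversity_loss(adv_caption_logits: torch.Tensor,
                            gt_caption_ids: torch.Tensor) -> torch.Tensor:
    # Language domain: make the caption predicted for the adversarial image
    # differ from the ground truth by negating the usual likelihood objective.
    # adv_caption_logits: (B, T, V) token logits; gt_caption_ids: (B, T) ids.
    return -F.cross_entropy(adv_caption_logits.flatten(0, 1),
                            gt_caption_ids.flatten())
```

Per the abstract, a generative model producing the perturbations would then be trained against a combination of these three terms; the weighting scheme and the specific image, caption, and cross-modal encoders are not specified in this record.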