TY - GEN
T1 - Detection and Correction of Adversarial Examples Based on IPEG-Compression-Derived Distortion
AU - Tsunomori, Kenta
AU - Yamasaki, Yuma
AU - Kuribayashi, Minoru
AU - Funabiki, Nobuo
AU - Echizen, Isao
N1 - Funding Information:
This research was supported by the JSPS KAKENHI Grant Number 19K22846 and 22K19777, JST SICORP Grant Number JPMJSC20C3, JST CREST Grant Number JPMJCR20D3, and ROIS NII Open Collaborative Research 2022-22S1402, Japan.
Publisher Copyright:
© 2022 Asia-Pacific of Signal and Information Processing Association (APSIPA).
PY - 2022
Y1 - 2022
N2 - An effective way to defend against adversarial examples (AEs), which are used, for example, to attack applications such as face recognition, is to detect in advance whether an input image is an AE. Some AE defense methods focus on the response characteristics of image classifiers when denoising filters are applied to the input image. However, several filters are required, which results in a large amount of computation. Because JPEG compression of AEs effectively removes adversarial perturbations, the difference between the image before and after JPEG compression should be highly correlated with the perturbations. However, the difference should not be completely consistent with adversarial perturbations. We have developed a filtering operation that modulates this difference by varying their magnitude and positive/negative sign and adding them to an image so that adversarial perturbations can be effectively removed. We consider that adversarial perturbations that could not be removed by simply applying JPEG compression can be removed by modulating this difference. Furthermore, applying a resizing process to the image after adding these distortions enables us to remove perturbations that could not be removed otherwise. The filtering operation will successfully remove the adversarial noise and reconstruct the corrected samples from AEs. We also consider a simple but effective reconstruction method based on the filtering operations. Experiments in which the adversarial attack used was not known to the detector demonstrated that the proposed method could achieve better performance in terms of accuracy with reasonable computational complexity. In addition, the percentage of correct classification results after applying the proposed filter for non-targeted attacks was higher than that of JPEG compression and scaling. These results suggest that the proposed method effectively removes adversarial perturbations and is an effective filter for detecting AEs.
AB - An effective way to defend against adversarial examples (AEs), which are used, for example, to attack applications such as face recognition, is to detect in advance whether an input image is an AE. Some AE defense methods focus on the response characteristics of image classifiers when denoising filters are applied to the input image. However, several filters are required, which results in a large amount of computation. Because JPEG compression of AEs effectively removes adversarial perturbations, the difference between the image before and after JPEG compression should be highly correlated with the perturbations. However, the difference should not be completely consistent with adversarial perturbations. We have developed a filtering operation that modulates this difference by varying their magnitude and positive/negative sign and adding them to an image so that adversarial perturbations can be effectively removed. We consider that adversarial perturbations that could not be removed by simply applying JPEG compression can be removed by modulating this difference. Furthermore, applying a resizing process to the image after adding these distortions enables us to remove perturbations that could not be removed otherwise. The filtering operation will successfully remove the adversarial noise and reconstruct the corrected samples from AEs. We also consider a simple but effective reconstruction method based on the filtering operations. Experiments in which the adversarial attack used was not known to the detector demonstrated that the proposed method could achieve better performance in terms of accuracy with reasonable computational complexity. In addition, the percentage of correct classification results after applying the proposed filter for non-targeted attacks was higher than that of JPEG compression and scaling. These results suggest that the proposed method effectively removes adversarial perturbations and is an effective filter for detecting AEs.
UR - http://www.scopus.com/inward/record.url?scp=85146277897&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146277897&partnerID=8YFLogxK
U2 - 10.23919/APSIPAASC55919.2022.9980147
DO - 10.23919/APSIPAASC55919.2022.9980147
M3 - Conference contribution
AN - SCOPUS:85146277897
T3 - Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
SP - 1831
EP - 1836
BT - Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
Y2 - 7 November 2022 through 10 November 2022
ER -