TY - GEN
T1 - Detection of Adversarial Examples Based on Sensitivities to Noise Removal Filter
AU - Higashi, Akinori
AU - Kuribayashi, Minoru
AU - Funabiki, Nobuo
AU - Nguyen, Huy H.
AU - Echizen, Isao
N1 - Publisher Copyright:
© 2020 APSIPA.
PY - 2020/12/7
Y1 - 2020/12/7
N2 - An injection of malicious noise causes a serious problem in machine learning systems. Due to the uncertainty of the system, the noise may mislead the system into producing a wrong output determined by a malicious party. The resulting images, videos, and speech signals are called adversarial examples. Studies on fooling image classifiers have been reported as a potential threat to CNN-based systems. The noise is carefully designed so that its presence in an image is hidden from human eyes as well as from computer-based classifiers. In this paper, we propose a novel method for detecting adversarial images by using the sensitivities of image classifiers. As adversarial images are created by adding noise, we focus on the behavior of the classifier's outputs for differently filtered images. Our idea is to observe the outputs while changing the strength of a noise removal filtering operation, which we call operation-oriented characteristics. As the strength increases, the output of the softmax function in the image classifier changes drastically for adversarial images, while it remains rather stable for normal images. We investigate the operation-oriented characteristics for several noise removal operations and then propose a simple detector of adversarial images. The performance is quantitatively evaluated through experiments on several typical attacks.
AB - An injection of malicious noise causes a serious problem in machine learning systems. Due to the uncertainty of the system, the noise may mislead the system into producing a wrong output determined by a malicious party. The resulting images, videos, and speech signals are called adversarial examples. Studies on fooling image classifiers have been reported as a potential threat to CNN-based systems. The noise is carefully designed so that its presence in an image is hidden from human eyes as well as from computer-based classifiers. In this paper, we propose a novel method for detecting adversarial images by using the sensitivities of image classifiers. As adversarial images are created by adding noise, we focus on the behavior of the classifier's outputs for differently filtered images. Our idea is to observe the outputs while changing the strength of a noise removal filtering operation, which we call operation-oriented characteristics. As the strength increases, the output of the softmax function in the image classifier changes drastically for adversarial images, while it remains rather stable for normal images. We investigate the operation-oriented characteristics for several noise removal operations and then propose a simple detector of adversarial images. The performance is quantitatively evaluated through experiments on several typical attacks.
UR - http://www.scopus.com/inward/record.url?scp=85100939763&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100939763&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85100939763
T3 - 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings
SP - 1386
EP - 1391
BT - 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2020
Y2 - 7 December 2020 through 10 December 2020
ER -