TY - GEN
T1 - White-Box Watermarking Scheme for Fully-Connected Layers in Fine-Tuning Model
AU - Kuribayashi, Minoru
AU - Tanaka, Takuro
AU - Suzuki, Shunta
AU - Yasui, Tatsuya
AU - Funabiki, Nobuo
N1 - Funding Information:
This research was supported by the JSPS KAKENHI Grant Number 19K22846, JST SICORP Grant Number JPMJSC20C3, and JST CREST Grant Number JPMJCR20D3, Japan.
Publisher Copyright:
© 2021 Owner/Author.
PY - 2021/6/17
Y1 - 2021/6/17
N2 - For the protection of trained deep neural network (DNN) models, embedding watermarks into the weights of the DNN model has been considered. However, the amount of change in the weights is large in conventional methods, and it has been reported that the existence of a hidden watermark can be detected by analyzing the weight variance. This helps attackers modify the watermark by effectively adding noise to the weights. In this paper, we focus on the fully-connected layers of fine-tuning models and apply a quantization-based watermarking method to the weights sampled from these layers. The advantage of the proposed method is that the change caused by watermark embedding is much smaller and the distortion converges gradually without using any loss function. The validity of the proposed method was evaluated by varying the conditions during the training of the DNN model. The results show the impact of training on the DNN model, the effectiveness of the embedding method, and high robustness against pruning attacks.
AB - For the protection of trained deep neural network (DNN) models, embedding watermarks into the weights of the DNN model has been considered. However, the amount of change in the weights is large in conventional methods, and it has been reported that the existence of a hidden watermark can be detected by analyzing the weight variance. This helps attackers modify the watermark by effectively adding noise to the weights. In this paper, we focus on the fully-connected layers of fine-tuning models and apply a quantization-based watermarking method to the weights sampled from these layers. The advantage of the proposed method is that the change caused by watermark embedding is much smaller and the distortion converges gradually without using any loss function. The validity of the proposed method was evaluated by varying the conditions during the training of the DNN model. The results show the impact of training on the DNN model, the effectiveness of the embedding method, and high robustness against pruning attacks.
KW - convergence
KW - fine-tuning
KW - local minima
KW - QIM
KW - watermark
UR - http://www.scopus.com/inward/record.url?scp=85109675864&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85109675864&partnerID=8YFLogxK
U2 - 10.1145/3437880.3460402
DO - 10.1145/3437880.3460402
M3 - Conference contribution
AN - SCOPUS:85109675864
T3 - IH and MMSec 2021 - Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security
SP - 165
EP - 170
BT - IH and MMSec 2021 - Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security
PB - Association for Computing Machinery, Inc
T2 - 2021 ACM Workshop on Information Hiding and Multimedia Security, IH and MMSec 2021
Y2 - 22 June 2021 through 25 June 2021
ER -