TY - GEN
T1 - Prediction of Software Defects Using Automated Machine Learning
AU - Tanaka, Kazuya
AU - Monden, Akito
AU - Yucel, Zeynep
N1 - Funding Information:
Part of this research was supported by JSPS KAKENHI Grant number 17H00731.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - The effectiveness of defect prediction depends on modeling techniques as well as on their parameter optimization, data preprocessing and ensemble development. This paper focuses on auto-sklearn, a recently developed software library for automated machine learning that can automatically select appropriate prediction models, hyperparameters and data preprocessing techniques for a given data set and build an ensemble of them with optimized weights. We empirically evaluate the effectiveness of auto-sklearn in predicting the number of defects in software modules. In the experiment, we used software metrics of 20 OSS projects for cross-release defect prediction and compared auto-sklearn with random forest, decision tree and linear discriminant analysis, using Norm(Popt) as the performance measure. As a result, auto-sklearn showed prediction performance similar to that of random forest, which was one of the best-performing models for defect prediction in past studies. This indicates that auto-sklearn can achieve good defect prediction performance without requiring knowledge of machine learning techniques and models.
AB - The effectiveness of defect prediction depends on modeling techniques as well as on their parameter optimization, data preprocessing and ensemble development. This paper focuses on auto-sklearn, a recently developed software library for automated machine learning that can automatically select appropriate prediction models, hyperparameters and data preprocessing techniques for a given data set and build an ensemble of them with optimized weights. We empirically evaluate the effectiveness of auto-sklearn in predicting the number of defects in software modules. In the experiment, we used software metrics of 20 OSS projects for cross-release defect prediction and compared auto-sklearn with random forest, decision tree and linear discriminant analysis, using Norm(Popt) as the performance measure. As a result, auto-sklearn showed prediction performance similar to that of random forest, which was one of the best-performing models for defect prediction in past studies. This indicates that auto-sklearn can achieve good defect prediction performance without requiring knowledge of machine learning techniques and models.
KW - auto-sklearn
KW - cross-release prediction
KW - defect prediction
KW - meta-learning
KW - software quality
UR - http://www.scopus.com/inward/record.url?scp=85077969906&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85077969906&partnerID=8YFLogxK
U2 - 10.1109/SNPD.2019.8935839
DO - 10.1109/SNPD.2019.8935839
M3 - Conference contribution
AN - SCOPUS:85077969906
T3 - Proceedings - 20th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2019
SP - 490
EP - 494
BT - Proceedings - 20th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2019
A2 - Nakamura, Masahide
A2 - Hirata, Hiroaki
A2 - Ito, Takayuki
A2 - Otsuka, Takanobu
A2 - Okuhara, Shun
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 20th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2019
Y2 - 8 July 2019 through 11 July 2019
ER -