An Adversarial Training Based Machine Learning Approach to Malware Classification under Adversarial Conditions
Proceedings of the 54th Hawaii International Conference on System Sciences
The use of machine learning (ML) has become established practice in malware classification and other areas of cybersecurity. Characteristic of contemporary intelligent malware classification is the threat of adversarial ML: adversaries target the underlying data and/or models that drive malware classification, either to map a classifier's behavior or to corrupt its functionality, with the goals of bypassing cybersecurity measures and increasing malware effectiveness. We develop an adversarial training based ML approach for malware classification under adversarial conditions that leverages a stacking ensemble method, comparing the performance of 10 base ML models when adversarially trained on three data sets with varying data perturbation schemes. This comparison reveals the best-performing model per data set, which includes random forest, bagging, and gradient boosting. Experimentation also includes stacking a mixture of ML models in both the first and second levels of the stack; a first-level stack across all 10 ML models with a second-level support vector machine performs best. Overall, this work shows that a malware classifier can be developed to account for potential forms of training-data perturbation with minimal effect on performance.
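The stacking design the abstract describes can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: the synthetic data stands in for the perturbed malware feature sets, and only three of the paper's 10 base learners (random forest, bagging, gradient boosting) are shown, with an SVM as the second-level meta-learner as in the top-performing stack.

```python
# Hedged sketch of a stacking ensemble with an SVM meta-learner.
# Data, hyperparameters, and the base-model subset are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a (possibly perturbed) malware feature set.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# First level: a subset of base ML models like those compared in the paper.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("bag", BaggingClassifier(random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# Second level: support vector machine meta-learner trained on the
# cross-validated predictions of the first-level models.
stack = StackingClassifier(estimators=base_learners, final_estimator=SVC())
stack.fit(X_train, y_train)
print(round(stack.score(X_test, y_test), 3))
```

In the paper's setup, adversarial training enters by perturbing the training data itself before fitting; here the fit step would simply be repeated on each perturbed data set.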
Conference presentations, papers, posters
Accountability, Evaluation, and Obscurity of AI Algorithms, adversarial training, AI system assurance, cybersecurity, machine learning, malware detection
Devine, Sean M., and Nathaniel D. Bastian. 2021. “An Adversarial Training Based Machine Learning Approach to Malware Classification under Adversarial Conditions.” Proceedings of the ... Annual Hawaii International Conference on System Sciences, January. https://doi.org/10.24251/hicss.2021.102.