An Adversarial Training Based Machine Learning Approach to Malware Classification under Adversarial Conditions

Date

2021

Publisher

Proceedings of the 54th Hawaii International Conference on System Sciences

Abstract

Machine learning (ML) has become an established practice in malware classification and other areas of cybersecurity. A defining threat to contemporary intelligent malware classification is adversarial ML. Adversaries target the underlying data and/or models that a malware classifier depends on, either to map its behavior or to corrupt its functionality. Their aim is to bypass cybersecurity measures and increase malware effectiveness. We develop an adversarial training-based ML approach for malware classification under adversarial conditions that leverages a stacking ensemble method. We compare the performance of 10 base ML models when adversarially trained on three data sets with varying data perturbation schemes. This comparison reveals the best performing model per data set; these include random forest, bagging, and gradient boosting. Experimentation also includes stacking a mixture of ML models in both the first and second levels of the stack. A first-level stack across all 10 ML models with a second-level support vector machine performs best. Overall, this work shows that a malware classifier can be developed to account for potential forms of training data perturbation with minimal effect on performance.
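
The stacking arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of malware features, stacks only three of the ten base models, and uses a hypothetical Gaussian-noise perturbation (the `perturb` helper) because the abstract does not specify the perturbation schemes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def perturb(X, rng, fraction=0.2, scale=0.5):
    """Hypothetical perturbation scheme: add Gaussian noise to a random subset of rows."""
    X_pert = X.copy()
    n_rows = int(fraction * len(X))
    rows = rng.choice(len(X), size=n_rows, replace=False)
    X_pert[rows] += rng.normal(0.0, scale, size=(n_rows, X.shape[1]))
    return X_pert


def build_stack(seed=0):
    """First level: three base models. Second level: a support vector machine."""
    base_models = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("bag", BaggingClassifier(n_estimators=20, random_state=seed)),
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=seed)),
    ]
    return StackingClassifier(estimators=base_models, final_estimator=SVC(), cv=3)


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for a labeled malware feature set (assumption).
    X, y = make_classification(
        n_samples=600, n_features=20, n_informative=8, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    # Train on a perturbed copy of the training data, then score the model
    # on both the clean test set and a fully perturbed test set.
    model = build_stack(seed).fit(perturb(X_train, rng), y_train)
    clean_acc = model.score(X_test, y_test)
    perturbed_acc = model.score(perturb(X_test, rng, fraction=1.0), y_test)
    return clean_acc, perturbed_acc


if __name__ == "__main__":
    clean_acc, perturbed_acc = run_demo()
    print(f"clean accuracy: {clean_acc:.3f}, perturbed accuracy: {perturbed_acc:.3f}")
```

Comparing the two accuracies gives a rough indication of how much a given perturbation costs the stacked classifier. Swapping the `final_estimator` or the list of base models is how different first-level and second-level mixtures could be tried.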

Keywords

Accountability, Evaluation, Obscurity of AI Algorithms, Adversarial Training, AI System Assurance, Cyber Security, Machine Learning, Malware

Citation

Sean M. Devine and Nathaniel D. Bastian. "An Adversarial Training Based Machine Learning Approach to Malware Classification under Adversarial Conditions". Proceedings of the 54th Hawaii International Conference on System Sciences, 2021.
