Malware Task Identification: A data driven approach

Abstract

Identifying the tasks a given piece of malware was designed to perform (e.g., logging keystrokes, recording video, or establishing remote access) is a difficult and time-consuming operation that is largely human-driven in practice. In this paper, we present an automated method to identify malware tasks. Using two different malware collections, we evaluate the method under several conditions for each collection: cases where the training data differs significantly from the test data, cases where the malware being evaluated employs packing to thwart analytical techniques, and cases where training data is sparse. We find that this approach consistently outperforms the current state-of-the-art software for malware task identification as well as standard machine learning approaches, often achieving an unbiased F1 score of over 0.9. In the near future, we plan to deploy our approach for use by analysts in an operational cyber-security environment.

Keywords

malware identification, cognitive modeling

Citation

Nunes, Eric, Casey Buto, Paulo Shakarian, Christian Lebiere, Stefano Bennati, Robert Thomson, and Holger Jaenisch. "Malware task identification: A data driven approach." In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, pp. 978-985. 2015.