Payload-Byte: A Tool for Extracting and Labeling Packet Capture Files of Modern Network Intrusion Detection Datasets

Authors

Farrukh, Yasir
Khan, Irfan
Wali, Syed
Bierbrauer, David A.
Pavlik, John
Bastian, Nathaniel D.

Issue Date

2022

Type

Journal articles

Keywords

Artificial Intelligence, Machine Learning, Computer Network Security

Abstract

Adapting modern approaches for network intrusion detection is becoming critical, given the rapid pace of technological advancement and of adversarial attacks. Packet-based methods that use payload data are consequently gaining popularity because they are effective at detecting certain attacks. However, packet-based approaches suffer from a lack of standardization, which leads to comparability and reproducibility issues. Unlike for flow-based approaches, no standard labeled dataset exists for packet-based approaches, forcing researchers to build a bespoke labeling pipeline for each approach. Without a standardized baseline, proposed approaches cannot be compared and evaluated against each other, and one cannot gauge whether a proposed approach is a methodological advancement or merely benefits from a proprietary interpretation of the dataset. To address these comparability and reproducibility issues, in this work we introduce Payload-Byte, an open-source tool for extracting and labeling network packets. Payload-Byte uses metadata information to label the raw traffic captures of modern intrusion detection datasets in a generalized manner. Moreover, we transform the labeled data into byte-wise feature vectors that can be used for training machine learning models. The whole cycle of processing and labeling is explicitly stated in this work. Furthermore, the source code and processed data are made publicly available so that they may act as a standardized baseline for future research. Lastly, we present a brief comparative analysis of machine learning models trained on packet-based and flow-based data.

Citation

Farrukh, Yasir; Khan, Irfan; Wali, Syed; Bierbrauer, David A.; Pavlik, John; and Bastian, Nathaniel D., "Payload-Byte: A Tool for Extracting and Labeling Packet Capture Files of Modern Network Intrusion Detection Datasets" (2022).

Publisher

IEEE
