CHALLENGES AND OPPORTUNITIES FOR GENERATIVE METHODS IN THE CYBER DOMAIN

Authors

Chalé, Marc
Bastian, Nathaniel D.

Issue Date

2021-12-15

Type

Other

Language

en_US

Abstract

Large, high-quality data sets are essential for training machine learning models to perform their tasks accurately. The lack of such training data has constrained machine learning research in the cyber domain. This work explores how Markov Chain Monte Carlo (MCMC) methods can be used to generate realistic synthetic data and compares them with existing generative machine learning techniques. Specifically, MCMC is compared with generative adversarial network (GAN) and variational autoencoder (VAE) methods at estimating the joint probability distribution of network intrusion detection system data. A statistical analysis of the synthetically generated cyber data determines the goodness of fit, with the aim of improving cyber threat detection. The experimental results suggest that data generated by MCMC fits the true distribution approximately as well as data generated by GAN and VAE; however, MCMC requires a significantly longer training period and is unproven for higher-dimensional cyber data.
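
To illustrate the general idea of drawing synthetic samples with MCMC, the following is a minimal sketch of a random-walk Metropolis sampler. It is not the authors' implementation: the one-dimensional two-component Gaussian mixture in `log_target` is an assumed toy stand-in for a data distribution, and the function names, step size, and burn-in length are hypothetical choices made for illustration only.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of an assumed toy target:
    an equal-weight mixture of N(-2, 1) and N(2, 1)."""
    a = -0.5 * (x + 2.0) ** 2
    b = -0.5 * (x - 2.0) ** 2
    m = max(a, b)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(n_samples, step=2.5, burn_in=1000, seed=0):
    """Random-walk Metropolis sampler; returns n_samples draws after burn-in."""
    rng = random.Random(seed)
    x = 0.0
    lp = log_target(x)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = x + rng.gauss(0.0, step)
        lp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
        if i >= burn_in:
            samples.append(x)
    return samples


# Draw synthetic samples and compute simple summary statistics.
synthetic = metropolis(20000)
mean = sum(synthetic) / len(synthetic)
var = sum((s - mean) ** 2 for s in synthetic) / len(synthetic)
```

For this toy target the true mean is 0 and the true variance is 5, so comparing `mean` and `var` against those values gives a crude check of fit. A fuller goodness-of-fit analysis, such as the statistical comparison the abstract describes, would use formal tests on real data rather than two moments.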

Citation

M. Chalé and N. D. Bastian, "CHALLENGES AND OPPORTUNITIES FOR GENERATIVE METHODS IN THE CYBER DOMAIN," 2021 Winter Simulation Conference (WSC), Phoenix, AZ, USA, 2021, pp. 1-12, doi: 10.1109/WSC52266.2021.9715504.

Publisher

2021 Winter Simulation Conference (WSC)
