The Unbalanced Classification Problem: Detecting Breaches in Security

dc.contributor.author: Evangelista, Paul
dc.date.accessioned: 2023-08-24T12:24:27Z
dc.date.available: 2023-08-24T12:24:27Z
dc.date.issued: 2006
dc.description.abstract: This research proposes several methods designed to improve solutions for security classification problems. The security classification problem involves unbalanced, high-dimensional, binary classification problems that are prevalent today. The imbalance within this data involves a significant majority of the negative class and a minority positive class. Any system that needs protection from malicious activity, intruders, theft, or other types of breaches in security must address this problem. These breaches in security are considered instances of the positive class. Given numerical data that represent observations or instances which require classification, state-of-the-art machine learning algorithms can be applied. However, the unbalanced and high-dimensional structure of the data must be considered prior to applying these learning methods. High-dimensional data poses a “curse of dimensionality” which can be overcome through the analysis of subspaces. Exploration of intelligent subspace modeling and the fusion of subspace models is proposed. A detailed analysis of the one-class support vector machine is included, along with its weaknesses and proposals to overcome them. A fundamental method for evaluation of the binary classification model is the receiver operating characteristic (ROC) curve and the area under the curve (AUC). This work details the underlying statistics involved with ROC curves, contributing a comprehensive review of ROC curve construction and analysis techniques, including a novel graphic for illustrating the connection between ROC curves and classifier decision values. The major innovations of this work include synergistic classifier fusion through the analysis of ROC curves and rankings, insight into the statistical behavior of the Gaussian kernel, and novel methods for applying machine learning techniques to computer intrusion detection.
The primary empirical vehicle for this research is computer intrusion detection data, and both host-based intrusion detection systems (HIDS) and network-based intrusion detection systems (NIDS) are addressed. Empirical studies also include military tactical scenarios.
dc.description.sponsorship: Department of Systems Engineering
dc.identifier.citation: Evangelista, Paul, "The Unbalanced Classification Problem: Detecting Breaches in Security" (2006).
dc.identifier.other: NA
dc.identifier.uri: https://hdl.handle.net/20.500.14216/447
dc.publisher: Rensselaer Polytechnic Institute
dc.subject: Security Classification Problems
dc.title: The Unbalanced Classification Problem: Detecting Breaches in Security
dc.type: Theses or dissertations
local.peerReviewed: Yes

Files

Original bundle

Name: The Unbalanced Classification Problem_ Detecting Breaches in Sec.pdf
Size: 3.21 MB
Format: Adobe Portable Document Format