Robustness Evaluation of Deep Unsupervised Learning Algorithms for Intrusion Detection Systems

This repository collects different unsupervised machine learning algorithms to detect anomalies.

Implemented models

We have implemented the following models. Our implementation of ALAD closely follows the original implementation already available on GitHub.

Dependencies

A complete dependency list is available in requirements.txt. We list here the most important ones:

[email protected] with CUDA 11.3
numpy
pandas
scikit-learn

Installation

These steps assume that the latest version of Anaconda is installed.

$ conda create --name [ENV_NAME] python=3.8
$ conda activate [ENV_NAME]
$ pip install -r requirements.txt

Replace [ENV_NAME] with the name of your environment.
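
After installation, it can be useful to confirm that the main dependencies are visible from the active environment. The snippet below is a minimal sketch rather than part of this repository: the helper name `missing_modules` is hypothetical, and the module list covers only the packages named explicitly above (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the packages listed under Dependencies.
# Note: scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ("numpy", "pandas", "sklearn")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found in the active environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

An empty result only means the packages can be located; it does not check their versions against requirements.txt.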

Usage

Run the following from the root of the project:

$ python -m src.main \
    -m [model_name] \
    -d [/path/to/dataset/file.{npz,mat}] \
    --dataset [dataset_name] \
    --batch-size [batch_size]

The command accepts the following parameters:

-m: selected machine learning model (required)
-d: path to the dataset (required)
--batch-size: size of a training batch (required)
--dataset: name of the selected dataset. Choices are Arrhythmia, KDD10, IDS2018, NSLKDD, USBIDS, Thyroid (required).
-e: number of training epochs (default=200)
--n-runs: number of times the experiment is repeated (default=1)
--lr: learning rate used during optimization (default=1e-4)
--pct: percentage of the original data to keep (useful for large datasets, default=1.)
--results-path: path where the results are stored (default="../results")
--model-path: path where models will be stored (default="../models")
--test-mode: loads models from --model-path and tests them (default=False)
--hold_out: percentage of anomalous data to hold out for possible contamination of the training set (default=0)
--rho: contamination ratio, i.e. the anomaly ratio within the training set (default=0)
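
The options above can be summarized as an argparse definition. This is an illustrative sketch built only from the list in this README; it is not the parser in src/main.py. The `dest` names, the argument types, and the treatment of --test-mode as a boolean switch are assumptions.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented options (not the repository's code)."""
    p = argparse.ArgumentParser(prog="src.main")
    # Required arguments
    p.add_argument("-m", dest="model", required=True, help="selected model")
    p.add_argument("-d", dest="dataset_path", required=True,
                   help="path to the .npz or .mat dataset file")
    p.add_argument("--batch-size", type=int, required=True)
    p.add_argument("--dataset", required=True,
                   choices=["Arrhythmia", "KDD10", "IDS2018", "NSLKDD", "USBIDS", "Thyroid"])
    # Optional arguments with the defaults documented above
    p.add_argument("-e", dest="epochs", type=int, default=200)
    p.add_argument("--n-runs", type=int, default=1)
    p.add_argument("--lr", type=float, default=1e-4)
    p.add_argument("--pct", type=float, default=1.0)
    p.add_argument("--results-path", default="../results")
    p.add_argument("--model-path", default="../models")
    p.add_argument("--test-mode", action="store_true")  # assumption: boolean switch
    p.add_argument("--hold_out", type=float, default=0.0)
    p.add_argument("--rho", type=float, default=0.0)
    return p


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-m", "DAGMM", "-d", "data.npz", "--dataset", "KDD10", "--batch-size", "1024"]
    )
    print(args)
```

Parsing only the four required arguments, as in the usage lines at the bottom, leaves every other option at its documented default.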

Please note that datasets must be stored in .npz or .mat files. Use the preprocessing scripts within data_process to generate these files.
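
To inspect a generated file before training, both formats can be read with standard tooling. The following is a minimal sketch, assuming NumPy is installed and SciPy is available for .mat files; the function name `load_arrays` is hypothetical, and the array names stored inside the files depend on the preprocessing scripts and are not specified here.

```python
from pathlib import Path

import numpy as np


def load_arrays(path):
    """Load every array stored in a .npz or .mat file into a plain dict."""
    suffix = Path(path).suffix.lower()
    if suffix == ".npz":
        # NpzFile is a context manager; `files` lists the stored array names
        with np.load(path) as archive:
            return {name: archive[name] for name in archive.files}
    if suffix == ".mat":
        from scipy.io import loadmat  # imported lazily; only needed for .mat files
        contents = loadmat(path)
        # loadmat adds metadata entries such as "__header__"; keep only the variables
        return {k: v for k, v in contents.items() if not k.startswith("__")}
    raise ValueError(f"Unsupported dataset format: {suffix!r}")
```

Printing the keys and shapes of the returned dict is a quick way to check that a preprocessing script produced what you expect. Note that `scipy.io.loadmat` does not read MATLAB v7.3 files, which use an HDF5 container.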

Example

To train a DAGMM on the KDD 10 percent dataset with the default parameters described in the original paper:

$ python -m src.main -m DAGMM -d [/path/to/dataset.npz] --dataset KDD10 --batch-size 1024 --results-path ./results/KDD10 --model-path ./models/KDD10

Replace [/path/to/dataset.npz] with the path to the dataset in a numpy-friendly format.

