This is an implementation of a Seq2Seq model for web attack detection. Seq2Seq models are commonly used in neural machine translation. The main goal of this project is to demonstrate that an NLP approach is relevant to web security.
Web attack detection is framed here as an anomaly detection problem. During training, the model is given only benign HTTP requests. During testing, the model determines whether each received request is anomalous.
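The anomaly-detection framing can be illustrated with a minimal sketch. It assumes the trained model yields one numeric score per request (for example, a reconstruction loss) and that a threshold is chosen from the scores of held-out benign requests. The function names, the quantile rule, and the sample scores are all hypothetical and are not taken from the notebook.

```python
def fit_threshold(benign_scores, quantile=0.99):
    """Pick a threshold from scores of held-out benign requests (assumed rule)."""
    ordered = sorted(benign_scores)
    # Index of the requested quantile, clamped to the last element.
    idx = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[idx]


def is_anomalous(score, threshold):
    """Flag a request whose score is higher than anything typical for benign traffic."""
    return score > threshold


# Hypothetical scores for benign validation requests (illustrative numbers only).
benign_validation_scores = [0.10, 0.12, 0.08, 0.15, 0.11,
                            0.09, 0.13, 0.14, 0.10, 0.12]
threshold = fit_threshold(benign_validation_scores, quantile=0.9)

# A request the model reconstructs poorly gets a high score and is flagged.
suspicious = is_anomalous(0.95, threshold)
# A request that looks like the training data stays below the threshold.
ordinary = is_anomalous(0.05, threshold)
```

Because only benign requests are needed to set the threshold, this kind of rule fits a setup where attack samples are unavailable at training time.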
Check out our slides and a post at AI Village (DEFCON 26).
The step-by-step solution is presented in seq2seq.ipynb, which covers the main stages: model initialization, training, validation, prediction, and results.
Unfortunately, the GitHub UI does not correctly render cell output in which the malicious parts of requests are highlighted in color. We suggest downloading the notebook or using this link to display the cell outputs correctly.
The dataset contains 21991 benign and 1097 anomalous HTTP requests from a banking application.
Please make sure that you have the same requirements installed and are using Python 2.7.*.
This repository contains environment.yml, so it can be dockerized using jupyter/repo2docker. We have already built an image for you, and you can run the notebook with:
docker run -it -p 8888:8888 montekki/seq2seq-web-attack-detection:latest jupyter notebook --ip=0.0.0.0