Invoke-Evasion

This repository contains the datasets, Jupyter notebooks, and machine learning models that accompany the "Learning Machine Learning" series of blog posts.

Structure

./notebooks/

  • Feature Selection.ipynb - code for performing the various types of feature selection
  • LogisticRegression.ipynb - training a tuned Logistic Regression model on the augmented obfuscated PowerShell dataset
  • TreeModels.ipynb - training various tree ensemble models on the augmented obfuscated PowerShell dataset
  • NeuralNetworks.ipynb - training various Neural Network models on the augmented obfuscated PowerShell dataset
  • WhiteBox.ipynb - white box attacks against the trained Logistic Regression and LightGBM classifiers
  • WhiteBox-NeutalNetwork.ipynb - white box attacks against the trained Neural Network
  • BlackBox.ipynb - black box attacks against the trained models
  • BlackBox-Model3.ipynb - optimization attacks against model 3, the trained Neural Network

./models/

  • tuned_ridge.bin - Pickled tuned L2 (Ridge) regularized Logistic Regression model pipeline trained on the augmented obfuscated PowerShell dataset
  • tuned_lgbm.bin - Pickled tuned LightGBM classifier model trained on the augmented obfuscated PowerShell dataset
  • ./neural_network/ - Saved model weights for a 4-layer, 192-neuron Neural Network with a dropout rate of 0.5
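
The two `.bin` files are described as pickled models, so they can presumably be loaded back into Python before running the attack notebooks. The following is a minimal sketch, assuming the files were written with the standard `pickle` module (not confirmed here; they may have been saved with a compatible tool such as joblib) and that the libraries the models depend on (scikit-learn, LightGBM) are installed. The helper name `load_pickled_model` is hypothetical and not part of this repository.

```python
import pickle
from pathlib import Path


def load_pickled_model(path):
    """Load a pickled object from disk.

    Hypothetical helper, not part of the repository. Unpickling can run
    arbitrary code, so only use this on files from a source you trust.
    """
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


# Example usage (path taken from the listing above; assumes the file is a
# plain pickle and that scikit-learn is importable):
# model = load_pickled_model("models/tuned_ridge.bin")
# print(type(model))
```

Because unpickling needs the original classes to be importable, a version mismatch between the installed libraries and the ones used for training may raise errors or warnings.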

./datasets/

  • PowerShellCorpus.ast.csv.7z - compressed CSV of AST features extracted from an augmented PowerShell corpus dataset of 14,702 samples
  • BlackBoxData.ast.csv.7z - compressed CSV of AST features extracted from a subset of the PowerShell corpus (3,000 samples)
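
Both datasets ship as 7z archives, so they need to be extracted (for example with `7z x`) before use. Once extracted, the CSV can be inspected with nothing but the standard library. The sketch below assumes the first row is a header and that the file is UTF-8 encoded; neither is stated in this README, and the helper name `read_feature_csv` is hypothetical.

```python
import csv


def read_feature_csv(path):
    """Return (header, rows) from an extracted feature CSV.

    Hypothetical helper. Assumes the first row holds column names and
    that the file is UTF-8 encoded; adjust if the real files differ.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)          # first row treated as column names
        rows = [row for row in reader]  # remaining rows, one per sample
    return header, rows


# Example usage after extracting the archive (extracted filename assumed):
# header, rows = read_feature_csv("datasets/PowerShellCorpus.ast.csv")
# print(len(header), "columns,", len(rows), "samples")
```

Comparing the row count against the sample counts listed above is a quick check that the archive extracted completely.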

./PS-AST/

  • C# project that integrates the checks from Revoke-Obfuscation (by Daniel Bohannon & Lee Holmes, Apache License 2.0) for AST file generation. Also contains SplitScriptFunctions, which writes every function in a script to a separate file and is used for data augmentation.

./samples/

  • Various adversarial samples generated by the white box and black box evasion methods