Decision Stream

1. Overview [paper]

This repository provides a basic implementation of the Decision Stream algorithm for regression and classification. Unlike a classical decision tree, this method builds a directed acyclic graph with a high degree of connectivity by merging statistically indistinguishable nodes at each iteration.
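
To make the merging idea concrete, here is a minimal sketch of how statistically indistinguishable nodes could be merged. It is an illustration only, not the code in this repository: the function names `p_value` and `merge_indistinguishable` are hypothetical, each node is reduced to a plain list of target values, and the two-sample z-test with a normal approximation is an assumed stand-in for whatever test the actual implementation uses.

```python
import math
from statistics import mean, variance


def p_value(a, b):
    """Two-sided p-value for 'a and b have equal means'.

    Assumption: a two-sample z-test with a normal approximation,
    chosen for illustration; the repository may use a different test.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 1.0 if mean(a) == mean(b) else 0.0
    z = abs(mean(a) - mean(b)) / se
    return math.erfc(z / math.sqrt(2))


def merge_indistinguishable(nodes, threshold):
    """Greedily merge the most similar pair of nodes until every
    remaining pair differs significantly (p-value <= threshold)."""
    nodes = [list(n) for n in nodes]
    while len(nodes) > 1:
        # Find the pair with the highest p-value, i.e. the least distinguishable pair.
        p, i, j = max(
            (p_value(nodes[i], nodes[j]), i, j)
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))
        )
        if p <= threshold:
            break
        nodes[i] = nodes[i] + nodes[j]
        del nodes[j]
    return nodes


# Hypothetical target values held by three nodes after a split.
nodes = [
    [1.0, 1.1, 0.9, 1.0],
    [1.05, 0.95, 1.0, 1.1],
    [5.0, 5.2, 4.9, 5.1],
]
merged = merge_indistinguishable(nodes, threshold=0.02)
```

In this toy input the first two nodes have nearly identical means, so they are merged into one node, while the third stays separate. In a tree the two similar nodes would remain distinct children; merging them is what turns the structure into a graph.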

2. Prerequisites and Dependencies

The dependencies are configured in the pom.xml file.

3. First steps

  • Extract the archive data.gz with training data by running tar -xvzf data.gz
  • Optional: rebuild decision-stream.jar with Leiningen (lein uberjar) or Maven (mvn package).

4. Train the model

java -jar decision-stream.jar base-directory train-data train-answers test-data test-answers learning_mode significance-threshold

The program takes 7 input parameters:

base-directory   -   path to the dataset
train-data   -   file with training data
train-answers   -   file with training answers
test-data   -   file with test data
test-answers   -   file with test answers
learning_mode   -   problem type: classification or regression
significance-threshold   -   threshold for merging/splitting operations

Example:

java -jar decision-stream.jar data/ailerons/ train_data.csv train_answ.csv test_data.csv test_answ.csv regression 0.02
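
When the jar is launched from a script, it can help to assemble and check the seven parameters before starting the JVM. The sketch below is one possible wrapper, not part of this repository: `build_command` and `LEARNING_MODES` are hypothetical names, and the checks (known learning mode, numeric threshold) are assumptions about what counts as valid input.

```python
import shlex

# The two learning modes listed in the parameter description above.
LEARNING_MODES = ("classification", "regression")


def build_command(base_directory, train_data, train_answers,
                  test_data, test_answers, learning_mode,
                  significance_threshold, jar="decision-stream.jar"):
    """Return the argument list for one decision-stream.jar run."""
    if learning_mode not in LEARNING_MODES:
        raise ValueError(f"learning_mode must be one of {LEARNING_MODES}")
    # Assumption: the threshold must parse as a number; float() raises ValueError otherwise.
    float(significance_threshold)
    return [
        "java", "-jar", jar,
        base_directory, train_data, train_answers,
        test_data, test_answers,
        learning_mode, str(significance_threshold),
    ]


# Reproduces the example invocation shown above.
cmd = build_command("data/ailerons/", "train_data.csv", "train_answ.csv",
                    "test_data.csv", "test_answ.csv", "regression", 0.02)
command_line = shlex.join(cmd)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to start training; `shlex.join` is used here only to show the equivalent shell command line.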

5. Provided datasets

The following datasets in the data folder are prepared for training: