Name		Name	Last commit message	Last commit date
parent directory ..
modules		modules
README.md		README.md
model_manager.js		model_manager.js
pipeline_runner.js		pipeline_runner.js

README.md

Model building pipeline

Model building pipeline combines feature extraction and model building into a single pipeline. Each pipeline is described with a configuration file in JSON format.

Pipeline can perform feature transformation, model training and prediction step.

Install

In the ./ew-shopp-public/ directory run

npm install
npm rebuild qminer --update-binary

to install all the necessary dependencies.

Note: Execute all node scripts from the ./ew-shopp-public/ directory.

Usage

Example of execution:

node analytics/pipeline/pipeline_runner.js -m {transform,fit,predict} -d PATH_TO_CONFIG

Description of arguments:

Argument	Type	Description
-d	String	Path to the JSON configuration file.
-m	String	Pipeline execution mode (fit, predict or transform).

API

Feature transformation

node analytics/pipeline/pipeline_runner.js -m transform -d PATH_TO_CONFIG

Currently supports weather feature transformation and media attention feature transformation. However, you can also provide your script to transform data and define a JSON configuration file structured as follows:

{
    "name": "<transformation_name>",
    "version": "<transformation_version>",
    "description": "<transformation_short_description>",
    "transformation": [
        {
            "module": "<path_to_module_script_from_pipeline_runner>",
            "params": {
                "input_db": "<path_to_input_database>",
                "output_db": "<path_to_output_database>",
                "...": "..."
            }
        }
    ]
}

Custom module

The script analytics/pipeline/pipeline_runner.js with mode -m transform will try to execute function function exec(params) {...} inside your custom script module and pass all the parameters defined within params parameter. As an example of such script please see weather_builder.js

Fit/Predict

Pipeline is specified with a single file in JSON format. The main building block of the pipeline is a module (see below).

A configuration file is specified in a JSON format and structured as follows:

{
    "name": "<model_name>",
    "version": "<model_version>",
    "description": "<model_short_description>",
    "pipeline": {
        "input": {
            "primary_key": ["Timestamp"]
        },
        "extraction": [
            {},{}, "..."
        ],
        "model": {
            
        }
    },
    "input_extraction": {
        
    }
}

Parameters

Parameter	Type	Required	Description
name	String	Yes	Name (unique identifier) of the pipeline.
version	String	No	Version of the pipeline (user specified).
description	String	No	Short description.
pipeline	Object	Yes	Body of the pipeline specifying input handling, feature extraction and modelling.
pipeline.input.primary_key	List	Yes	Primary key (unique identification fields) of each record from the input.
pipeline.extraction	List	Yes	List of modules performing feature extraction.
pipeline.model	Object	Yes	Module performing data modelling.
input_extraction	Object	Yes	Module populating pipeline's input.

Modules

Module is the main building block of the pipeline. It is a custom JavaScript class that executes a specific operation. Input to the module should be stored in a QMiner database and after module finishes with the execution it stores the result in the specified output database (to a fixed store with predefined name). Each module has its own set of configurable parameters.

List of modules

Modules are used for input extraction and model fitting and everything in between.

Weather

See weather module documentation.

Media attention

See media attention module documentation.

Generic feature selector

Module name: generic_feature_selector

Description: Used for selecting prebuilt features from given QMiner Database.

Parameters:

Parameter	Type	Required	Description
input_db	String	Yes	Location of the QMiner database containing prebuilt features.
input_store	String	Yes, if `search_query` is not defined	Name of the store containing features.
search_query	List	No	QMiner search query to filter records from feature store.
features	List	No	Names of the feature fields. Strings in the list can also be given as regex expressions (i.e. "features": ["span_.*"]).
not_features	List	No	Names of the fields not consider as features. Strings in the list can also be given as regex expressions (i.e. "not_features": ["Time.*"]).

Example:

{
    "module": "generic_feature_selector",
    "params": {
        "input_db": "sales_db",
        "input_store": "discounts",
        "features": [
            "Seller", "Brand", "Price", "Discount"
        ]
    }
}

Parameter search_query is set to filter out only the relevant records from feature stores, while the features parameter sets which fields (features) to use.

Support vector regression model (SVR)

Module name: SVR

Description: Support vector regression model

Parameters:

Parameter	Type	Description
SVM_param	Object	Hyperparameters of the model described in detail in QMiner's documentation.
score	String	Model validation method (currently only "cv" - cross-validation).
model_filename	String	File to save model to.

Example:

{
    "module": "SVR",
    "params": {
        "SVM_param": {
            "algorithm": "LIBSVM",
            "c": 10,
            "j": 3,
            "kernel": "POLY"
        },
        "score": "cv",
        "verbose": true,
        "model_filename": "regressor.model"
    }
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pipeline

pipeline

README.md

Model building pipeline

Install

Usage

API

Feature transformation

Custom module

Fit/Predict

Parameters

Modules

List of modules

Weather

Media attention

Generic feature selector

Example:

Support vector regression model (SVR)

Example:

Files

pipeline

Directory actions

More options

Directory actions

More options

Latest commit

History

pipeline

Folders and files

parent directory

README.md

Model building pipeline

Install

Usage

API

Feature transformation

Custom module

Fit/Predict

Parameters

Modules

List of modules

Weather

Media attention

Generic feature selector

Example:

Support vector regression model (SVR)

Example: