Merge branch 'feature/forecasting' of https://github.com/oracle/accelerated-data-science into feature/forecasting
mrDzurb committed Oct 12, 2023
2 parents 2563bf1 + 983a58b commit 42b0202
Showing 3 changed files with 88 additions and 79 deletions.
@@ -4,31 +4,33 @@ Advanced Use Cases

**Documentation: Forecasting Science and Model Parameterization**

## The Science of Forecasting
**The Science of Forecasting**

Forecasting is a complex yet essential discipline that involves predicting future values or events based on historical data and various mathematical and statistical techniques. To achieve accurate forecasts, it is crucial to understand some fundamental concepts:

### Seasonality
**Seasonality**

Seasonality refers to patterns in data that repeat at regular intervals, typically within a year. For example, retail sales often exhibit seasonality with spikes during holidays or specific seasons. Seasonal components can be daily, weekly, monthly, or yearly, and understanding them is vital for capturing and predicting such patterns accurately.
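
As a rough illustration of how a repeating pattern can be detected, the sketch below estimates the dominant seasonal period of a series from its sample autocorrelation. It is a minimal, standard-library-only example on synthetic data. The function names ``autocorrelation`` and ``dominant_period`` are hypothetical helpers and are not part of the forecast operator.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return 0.0  # a constant series has no measurable correlation structure
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


def dominant_period(series, max_lag):
    """Return the lag in [2, max_lag] with the highest autocorrelation.

    Lag 1 is skipped because smooth series are almost always strongly
    correlated with their immediate predecessor, which is not seasonality.
    """
    return max(range(2, max_lag + 1), key=lambda lag: autocorrelation(series, lag))


# Synthetic monthly data: three years with a 12-month cycle (illustrative only).
monthly = [10 + 5 * math.sin(2 * math.pi * t / 12) for t in range(36)]
period = dominant_period(monthly, max_lag=18)
print(period)
```

On real data the autocorrelation peak is usually less clean, so the estimated period is best treated as a hint to be checked against domain knowledge such as weekly or yearly cycles.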

### Stationarity
**Stationarity**

Stationarity is a critical property of time series data. A time series is considered stationary when its statistical properties, such as mean, variance, and autocorrelation, remain constant over time. Stationary data simplifies forecasting since it allows models to assume that future patterns will resemble past patterns.
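
The idea can be sketched with a crude check: compare the mean of the first and second halves of a series, then repeat the comparison after first-order differencing. This is only a heuristic on synthetic data, not a formal test such as the augmented Dickey-Fuller test. The helper names ``half_means`` and ``difference`` are assumptions made for illustration.

```python
def half_means(series):
    """Return the means of the first and second halves of `series`."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return sum(first) / len(first), sum(second) / len(second)


def difference(series, lag=1):
    """Differencing: replace each value with its change over `lag` steps."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Synthetic series with a linear trend plus a small alternating wiggle.
trend = [2.0 * t + (1 if t % 2 else -1) for t in range(40)]

raw_first, raw_second = half_means(trend)
diff_first, diff_second = half_means(difference(trend))

# The raw halves have very different means (non-stationary in the mean),
# while the differenced halves are close to each other.
print(raw_first, raw_second)
print(diff_first, diff_second)
```

Differencing removes the trend here because the change between consecutive points is roughly constant, which is why it is a common preprocessing step before fitting models that assume stationarity.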

### Cold Start
**Cold Start**

The "cold start" problem arises when you have limited historical data for a new product, service, or entity. Traditional forecasting models may struggle to make accurate predictions in these cases due to insufficient historical context.

## Passing Parameters to Models
**Passing Parameters to Models**

To enhance the accuracy and adaptability of forecasting models, our system allows you to pass parameters directly. Here's how to do it:


**Forecast Configuration YAML File:**

In your ``forecast.yaml`` configuration file, you can specify various model parameters under the ``model_params`` section. For instance:

.. code-block:: yaml
```yaml
kind: operator
type: forecast
version: v1
@@ -40,10 +42,10 @@ To enhance the accuracy and adaptability of forecasting models, our system allow
num_trees: 100
max_depth: 5
learning_rate: 0.01
```
## When Models Perform Poorly and the "Auto" Method
**When Models Perform Poorly and the "Auto" Method**

Forecasting models are not one-size-fits-all, and some models may perform poorly under certain conditions. Common scenarios where models might struggle include:

114 changes: 58 additions & 56 deletions docs/source/user_guide/operators/forecasting_operator/examples.rst
@@ -6,64 +6,66 @@ Examples

The simplest YAML file is generated by ``ads operator init --type forecast`` and looks like the following:

```
kind: operator
type: forecast
version: v1
spec:
    datetime_column:
        name: Date
    historical_data:
        url: data.csv
    horizon:
        interval_unit: M
        periods: 3
    model: auto
    target_column: target
```
.. code-block:: yaml

    kind: operator
    type: forecast
    version: v1
    spec:
        datetime_column:
            name: Date
        historical_data:
            url: data.csv
        horizon:
            interval_unit: M
            periods: 3
        model: auto
        target_column: target

**Complex Example**

A fully specified YAML file can look like the following:

```
kind: operator
type: forecast
version: v1
spec:
historical_data:
columns:
- Date
- target
- Series
format: "csv"
url: historical_data.csv
additional_data:
url: additional_data.csv
test_data:
url: test_data.csv
output_directory:
url: oci://<bucket>@<namespace>/results/
target_category_columns:
- Series
target_column: target
confidence_interval_width: 0.8
datetime_column:
format: %dd%mm%yy
name: Date
forecast_filename: forecast.csv
horizon:
interval: 1
interval_unit: M
periods: 3
metric: smape
metrics_filename: metrics.csv
model: automlx
model_kwargs:
preprocessing: true
report_file_name: report.html
report_theme: light
report_title: report
tuning:
n_trials: 5
```
.. code-block:: yaml
kind: operator
type: forecast
version: v1
spec:
historical_data:
columns:
- Date
- target
- Series
format: "csv"
url: historical_data.csv
additional_data:
url: additional_data.csv
test_data:
url: test_data.csv
output_directory:
url: oci://<bucket>@<namespace>/results/
target_category_columns:
- Series
target_column: target
confidence_interval_width: 0.8
datetime_column:
format: %dd%mm%yy
name: Date
forecast_filename: forecast.csv
horizon:
interval: 1
interval_unit: M
periods: 3
metric: smape
metrics_filename: metrics.csv
model: automlx
model_kwargs:
preprocessing: true
report_file_name: report.html
report_theme: light
report_title: report
tuning:
n_trials: 5
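
The ``metric: smape`` setting in the example above refers to the symmetric mean absolute percentage error. As a minimal sketch, the function below computes one common variant of SMAPE, which divides each absolute error by the average of the absolute actual and forecast values. Several definitions are in use, so the operator's own computation may differ. The function name and the sample numbers are illustrative assumptions.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent, using (|a| + |f|) / 2 as the denominator."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        # Treat a point where both values are zero as a perfect forecast.
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100 * sum(terms) / len(terms)


# Illustrative values only, not output from the operator.
score = smape([100, 200], [110, 180])
print(round(score, 3))
```

Lower values indicate a closer fit, and with this variant the score is bounded between 0 and 200 percent.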
@@ -16,19 +16,21 @@ After having set up ``ads opctl`` on your desired machine using ``ads opctl conf
These details exactly match the initial forecast.yaml file generated by running ``ads operator init --type forecast``:

```
kind: operator
type: forecast
version: v1
spec:
    datetime_column:
        name: Date
    historical_data:
        url: data.csv
    horizon:
        interval_unit: M
        periods: 3
    model: auto
    target_column: target

kind: operator
type: forecast
version: v1
spec:
    datetime_column:
        name: Date
    historical_data:
        url: data.csv
    horizon:
        interval_unit: M
        periods: 3
    model: auto
    target_column: target

```

Optionally, you are able to specify much more. The most common additions are:
@@ -46,7 +48,10 @@ Run

After you have your forecast.yaml written, you simply run the forecast using:

``ads operator run -f forecast.yaml``
.. code-block:: bash

    ads operator run -f forecast.yaml

Interpret Results
-----------------