breaking: raise error for gaps in series (#504)
jmoralez authored Oct 31, 2024
1 parent 08a9f13 commit 8b0660a
Showing 17 changed files with 425 additions and 1,951 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/lint.yaml
@@ -19,7 +19,7 @@ jobs:
 python-version: '3.10'

 - name: Install dependencies
-run: pip install black nbdev pre-commit
+run: pip install black nbdev==2.3.25 pre-commit

 - name: Run pre-commit
-run: pre-commit run --files nixtla/*
+run: pre-commit run --show-diff-on-failure --files nixtla/*
6 changes: 4 additions & 2 deletions .pre-commit-config.yaml
@@ -6,10 +6,12 @@ repos:
 entry: sh -c 'nbdev_clean && nbdev_clean --fname nbs/src --clear_all'
 language: system

-- repo: https://github.com/fastai/nbdev
-rev: 2.2.10
+- repo: local
 hooks:
 - id: nbdev_export
+name: nbdev_export
+entry: sh -c 'nbdev_export'
+language: system

 - repo: https://github.com/astral-sh/ruff-pre-commit
 rev: v0.2.1
2 changes: 1 addition & 1 deletion README.md
@@ -70,7 +70,7 @@ nixtla_client.plot(df, fcst_df, level=[80, 90])
 nixtla_client = NixtlaClient(api_key = 'YOUR API KEY HERE')

 # 2. Read Data # Wikipedia visits of NFL Star (
-df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/peyton_manning.csv')
+df = pd.read_csv('https://datasets-nixtla.s3.amazonaws.com/peyton-manning.csv')


 # 3. Detect Anomalies
27 changes: 6 additions & 21 deletions nbs/docs/capabilities/anomaly-detection/01_quickstart.ipynb

Large diffs are not rendered by default.

23 changes: 6 additions & 17 deletions nbs/docs/capabilities/anomaly-detection/04_confidence_levels.ipynb

Large diffs are not rendered by default.

84 changes: 68 additions & 16 deletions nbs/docs/capabilities/forecast/11_irregular_timestamps.ipynb
@@ -48,7 +48,10 @@
 "source": [
 "# Irregular timestamps\n",
 "\n",
-"TimeGPT can handle data with irregular timestamps. Simply specify the `freq` parameter with the right frequency of your data."
+"* For pandas dataframes TimeGPT will try to infer the frequency of your timestamps. If you don't want TimeGPT to infer the frequency for you, you have to set the `freq` argument to a valid [pandas frequency string](https://pandas.pydata.org/docs/user_guide/timeseries.html#dateoffset-objects), e.g. 'MS', 'YE'.\n",
+"* For polars dataframes you have to set the `freq` argument to a valid [polars offset](https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.offset_by.html), e.g. '1d', '2w'.\n",
+"\n",
+"If your data has missing timestamps please refer to our [tutorial on missing values](https://docs.nixtla.io/docs/tutorials-missing_values)."
 ]
 },
 {
@@ -83,7 +86,8 @@
 "outputs": [],
 "source": [
 "import pandas as pd\n",
-"from nixtla import NixtlaClient"
+"from nixtla import NixtlaClient\n",
+"from utilsforecast.data import generate_series"
 ]
 },
 {
@@ -120,6 +124,24 @@
 " nixtla_client = NixtlaClient()"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [
+{
+"name": "stderr",
+"output_type": "stream",
+"text": [
+"INFO:nixtla.nixtla_client:Querying model metadata...\n"
+]
+}
+],
+"source": [
+"#| hide\n",
+"_ = nixtla_client._get_model_params('timegpt-1', 'B')"
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -130,9 +152,8 @@
 "output_type": "stream",
 "text": [
 "INFO:nixtla.nixtla_client:Validating inputs...\n",
+"INFO:nixtla.nixtla_client:Inferred freq: B\n",
 "INFO:nixtla.nixtla_client:Preprocessing dataframes...\n",
-"WARNING:nixtla.nixtla_client:You did not provide X_df. Exogenous variables in df are ignored. To surpress this warning, please add X_df with exogenous variables: Open, High, Low, Adj Close, Volume, Dividends, Stock Splits\n",
-"WARNING:nixtla.nixtla_client:The specified horizon \"h\" exceeds the model horizon. This may lead to less accurate forecasts. Please consider using a smaller horizon.\n",
 "INFO:nixtla.nixtla_client:Restricting input...\n",
 "INFO:nixtla.nixtla_client:Calling Forecast Endpoint...\n"
 ]
@@ -141,19 +162,57 @@
 "source": [
 "# Read data\n",
 "# Dates for the weekends are missing\n",
-"df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/openbb/pltr.csv')\n",
+"df = pd.read_csv(\n",
+" 'https://datasets-nixtla.s3.amazonaws.com/pltr.csv',\n",
+" usecols=['date', 'Close'],\n",
+")\n",
 "\n",
 "# Forecast\n",
-"# We use B for the freq, as only business days are represented in the dataset\n",
+"# the frequency is inferred as B, as only business days are represented in the dataset\n",
 "forecast_df = nixtla_client.forecast(\n",
-" df=df, \n",
-" h=14, \n",
+" df=df,\n",
+" h=5,\n",
 " time_col='date', \n",
 " target_col='Close',\n",
 ")"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [
+{
+"name": "stderr",
+"output_type": "stream",
+"text": [
+"INFO:nixtla.nixtla_client:Validating inputs...\n",
+"INFO:nixtla.nixtla_client:Preprocessing dataframes...\n",
+"INFO:nixtla.nixtla_client:Restricting input...\n",
+"INFO:nixtla.nixtla_client:Calling Forecast Endpoint...\n"
+]
+}
+],
+"source": [
+"# manually set the frequency\n",
+"forecast_df2 = nixtla_client.forecast(\n",
+" df=df,\n",
+" freq='B',\n",
+" h=5,\n",
+" time_col='date', \n",
+" target_col='Close',\n",
+")"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"pd.testing.assert_frame_equal(forecast_df, forecast_df2)"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -168,13 +227,6 @@
 "> \n",
 "> By default, `timegpt-1` is used. Please see [this tutorial](https://docs.nixtla.io/docs/tutorials-long_horizon_forecasting) on how and when to use `timegpt-1-long-horizon`."
 ]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"For more details on handling datasets with irregular timesteps, check out our [tutorial](https://docs.nixtla.io/docs/tutorials-irregular_timestamps). "
-]
 }
 ],
 "metadata": {
@@ -185,5 +237,5 @@
 }
 },
 "nbformat": 4,
-"nbformat_minor": 2
+"nbformat_minor": 4
 }
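
Reviewer note, not part of the commit: since the commit title says gaps in a series now raise an error, it may help to check for missing timestamps before calling the client. The sketch below is a minimal illustration using plain pandas only. The helper name `find_missing_timestamps` and the toy data are hypothetical, and it does not reproduce the client's own validation.

```python
import pandas as pd


def find_missing_timestamps(df, time_col, freq):
    """Return the timestamps expected at `freq` that are absent from `df`.

    Hypothetical helper for illustration; not part of the nixtla package.
    """
    times = pd.DatetimeIndex(pd.to_datetime(df[time_col]))
    # Full grid between the first and last observed timestamps.
    expected = pd.date_range(times.min(), times.max(), freq=freq)
    return expected.difference(times)


# Toy business-day series (Mon 2024-01-01 to Fri 2024-01-12) with one day dropped.
dates = pd.bdate_range("2024-01-01", periods=10)
df = pd.DataFrame({"date": dates.delete(3), "Close": range(9)})

# With freq='B' only the dropped business day counts as a gap.
missing_business = find_missing_timestamps(df, "date", "B")
# With freq='D' the weekend days are reported as gaps as well.
missing_daily = find_missing_timestamps(df, "date", "D")

print(list(missing_business))
print(len(missing_daily))
```

The same weekend-free data looks complete under `'B'` and gappy under `'D'`, which is why the frequency passed (or inferred) matters in the notebook above.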
12 changes: 9 additions & 3 deletions nbs/docs/getting-started/1_introduction.ipynb
@@ -32,7 +32,7 @@
 "\n",
 "* **[Prediction Intervals](https://docs.nixtla.io/docs/tutorials-prediction_intervals)**: Provide intervals in your predictions to quantify uncertainty effectively.\n",
 "\n",
-"* **[Irregular Timestamps](https://docs.nixtla.io/docs/tutorials-irregular_timestamps)**: Handle data with irregular timestamps, accommodating non-uniform interval series without preprocessing.\n",
+"* **[Irregular Timestamps](https://docs.nixtla.io/docs/capabilities-forecast-irregular_timestamps)**: Handle data with irregular timestamps, accommodating non-uniform interval series without preprocessing.\n",
 "\n",
 "* **[Anomaly Detection](https://docs.nixtla.io/docs/tutorials-anomaly_detection)**: Automatically detect anomalies in time series, and use exogenous features for enhanced performance."
 ]
@@ -90,7 +90,13 @@
 "source": []
 }
 ],
-"metadata": {},
+"metadata": {
+"kernelspec": {
+"display_name": "python3",
+"language": "python",
+"name": "python3"
+}
+},
 "nbformat": 4,
-"nbformat_minor": 2
+"nbformat_minor": 4
 }
2 changes: 1 addition & 1 deletion nbs/docs/getting-started/5_faq.ipynb
@@ -324,7 +324,7 @@
 "<details>\n",
 " <summary>Can TimeGPT handle missing values?</summary>\n",

-"`TimeGPT` cannot handle missing values or series with irregular timestamps. For more information, see the [Forecasting Time Series with Irregular Timestamps](https://docs.nixtla.io/docs/tutorials-irregular_timestamps) and the [Dealing with Missing Values](https://docs.nixtla.io/docs/tutorials-dealing_with_missing_values_in_timegpt) tutorial. \n",
+"`TimeGPT` cannot handle missing values or series with irregular timestamps. For more information, see the [Forecasting Time Series with Irregular Timestamps](https://docs.nixtla.io/docs/capabilities-forecast-irregular_timestamps) and the [Dealing with Missing Values](https://docs.nixtla.io/docs/tutorials-dealing_with_missing_values_in_timegpt) tutorial. \n",
 "\n",
 "</details>"
 ]
323 changes: 159 additions & 164 deletions nbs/docs/tutorials/08_cross_validation.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion nbs/docs/tutorials/120_special_topics.ipynb
@@ -19,7 +19,7 @@
 "\n",
 "### What You Will Learn\n",
 "\n",
-"1. **[Irregular Timestamps](https://docs.nixtla.io/docs/tutorials-irregular_timestamps)**\n",
+"1. **[Irregular Timestamps](https://docs.nixtla.io/docs/capabilities-forecast-irregular_timestamps)**\n",
 "\n",
 " - Learn how to deal with irregular timestamps for correct usage of `TimeGPT`.\n",
 "\n",