Click the above button to launch this repository as a notebook in your browser.
README last updated: 03-01-2019
This repository is still under construction and is not yet finished.
The notebooks in this repository look best when viewed in Jupyter.
This repository contains a thorough explanation of how to create different deep learning models in Keras for multivariate (tabular) time-series prediction. The data used in this repository comes from the KB-74 OPSCHALER project. The goal of this project is to predict the gas consumption of houses at an hourly resolution, for the minor Applied Data Science at The Hague University of Applied Sciences.
Jargon used in this repository:
- Dwelling: An individual house
- EDA: Exploratory Data Analysis
- MVLR: Multivariate Linear Regression
- DNN: Deep Neural Network
- CNN: Convolutional Neural Network
- RNN: Recurrent Neural Network
- LSTM: Long Short-Term Memory
- GRU: Gated Recurrent Unit
The original data is as follows.
Parameter | Unit | Sample rate | Description |
---|---|---|---|
Timestamp | - | 10 s | Timestamp of data telegram (set by smart meter) in local time |
eMeter | kWh | 10 s | Meter reading electricity delivered to client, normal tariff |
eMeterReturn | kWh | 10 s | Meter reading electricity delivered by client, normal tariff |
eMeterLow | kWh | 10 s | Meter reading electricity delivered to client, low tariff |
eMeterLowReturn | kWh | 10 s | Meter reading electricity delivered by client, low tariff |
ePower | kWh | 10 s | Actual electricity power delivered to client |
ePowerReturn | kWh | 10 s | Actual electricity power delivered by client |
gasTimestamp | - | 1 h | Timestamp of the gasMeter reading (set by smart meter) in local time |
gasMeter | m3 | 1 h | Last hourly value (temperature converted), gas delivered to client |
The weather data comes from the KNMI weather station in Rotterdam and has a sample rate of 15 minutes.
According to a representative from OPSCHALER, this weather station is the closest one to most of the dwellings; the exact dwelling locations, however, are unknown.
The dwelling furthest away from the weather station is 103 km to the northeast.
When this weather data is used to make predictions on the validation and test datasets (which are future data for the model), it is assumed to be the 'prediction' of the weather at the given timestamp.
In reality, the weather predictions made by climate models should be used instead.
Parameter | Unit | Description |
---|---|---|
DD | degrees | Wind direction |
DR | s | Precipitation time |
FX | m/s | Maximum gust of wind at 10 m |
FF | m/s | Windspeed at 10 m |
N | okta | Cloud coverage |
P | hPa | Outside pressure |
Q | W/m2 | Global radiation |
RG | mm/h | Rain intensity |
SQ | min | Sunshine duration |
T | deg C | Temperature at 1.5 m (1-minute mean) |
T10 | deg C | Minimum temperature at 10 cm |
TD | deg C | Dew point temperature |
U | % | Relative humidity at 1.5 m |
VV | m | Horizontal visibility |
WW | - | Weather and station code |
The original data has been resampled to an hourly resolution; this resampled data is what is available in this repository.
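As a rough illustration of what such resampling can look like, here is a minimal pandas sketch. It assumes a datetime-indexed frame and aggregates with the mean, which is an assumption and not necessarily what was done for this repository. The helper name `resample_hourly` and the synthetic values are hypothetical.

```python
import numpy as np
import pandas as pd


def resample_hourly(df: pd.DataFrame) -> pd.DataFrame:
    """Down-sample a datetime-indexed frame to one row per hour.

    Assumption: the mean is used as the aggregate. Cumulative meter
    readings would likely need a different aggregate (e.g. the last value).
    """
    return df.resample("60min").mean()


# Synthetic 10-second ePower samples covering two hours (not real OPSCHALER data).
index = pd.date_range("2019-01-01 00:00:00", periods=720, freq="10s")
raw = pd.DataFrame({"ePower": np.arange(720, dtype=float)}, index=index)

hourly = resample_hourly(raw)
print(hourly)
```

The same call would also work for the 15-minute weather data, provided it has a datetime index.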
Features:
- Electrical power consumption (ePower)
- Wind speed (FF)
- Rain intensity (RG)
- Temperature (T)
- Timestamp YYYY:MM:DD HH:MM:SS (datetime)
Target:
- Gas consumption (gasPower)
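The feature and target selection listed above could be sketched in pandas as follows. The column names `ePower`, `FF`, `RG`, `T`, `datetime` and `gasPower` come from the lists above; the helper `split_features_target`, the hour-of-day encoding of the timestamp and the sample values are assumptions made purely for illustration.

```python
import pandas as pd

FEATURE_COLUMNS = ["ePower", "FF", "RG", "T"]
TARGET_COLUMN = "gasPower"


def split_features_target(df: pd.DataFrame):
    """Return (X, y) from an hourly frame that has a 'datetime' column."""
    X = df[FEATURE_COLUMNS].copy()
    # Assumption: encode the timestamp as hour of day so a model can consume it.
    X["hour"] = df["datetime"].dt.hour
    y = df[TARGET_COLUMN]
    return X, y


# Made-up values, only to show the expected shape of the data.
sample = pd.DataFrame({
    "datetime": pd.date_range("2019-01-01 00:00:00", periods=3, freq="60min"),
    "ePower": [0.4, 0.3, 0.5],
    "FF": [3.1, 2.8, 4.0],
    "RG": [0.0, 0.2, 0.0],
    "T": [5.5, 5.1, 4.9],
    "gasPower": [0.12, 0.10, 0.15],
})

X, y = split_features_target(sample)
print(X.columns.tolist())
```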
The notebooks are meant to be read in order.
As a result, information explained in notebook 1 may not be repeated in notebook 2, and so on.
The hyperas MODEL.py files contain the Python scripts that use Hyperas for hyperparameter optimization.
The MODEL.py files contain the .py versions of the notebooks; these train faster than training from within Jupyter (e.g. 50 epochs/s instead of 2 epochs/s for the DNN).