Home
Operators:
- SGC (l + nl), GC (l + nl), AGC, ASGC, ASGCP, GlobalSLC
Hyperparameters:
- batch-size: 16
- bottleneck-channels: 128
- spatial-channels: 96
- dropout: 0.1
- dropout-att: 0.5 (TBD)
- forecast-horizon: 3, 6 separately
Datasets:
- METR-LA
- PEMS-BAY
- Context
- Spatio-temporal learning on traffic networks
- Inter- & intra-series correlation; link these to spatial and temporal correlations/dependencies
- ARIMA --(fails on highly non-linear dynamics & inter-series correlations)--> GNNs --> GNNs that learn the graph structure
- Inter-series correlation:
- prior graph based on the road network; nodes connected by roads influence each other
- learned from patterns in the data (not expressed in the road network)
- Issue
- Misrepresentation of capabilities of certain mechanisms
- Problem
- Zhang et al. and Chao et al. both learn a Laplacian, but with different mechanisms, and the impact of these mechanisms is not clear
- Others
- They usually propose ablation studies to show effectiveness
- Compare their architecture to competing architectures
- Only whole architectures are compared, not components
- Effectiveness of components is only shown within their own temporal model
- TODO: find out how temporal modeling is done in Chao et al. (maybe with DFT)
- Novelty
- Contrast to others
- Experiment with convolutions in fixed temporal framework
- Compare components rather than models
- Challenges
- Temporal modeling --> main cause for lack of performance; convolution operations have a fixed mathematical definition and therefore require no engineering or tuning. If implemented correctly, there is nothing to tune about convolution operators.
- Structuring the 8 convolution kernels
- Approach
- Similar to the abstract
- Contributions
- list all novel implementations: GC-l, ASGCP, ASGC, AGC
- one sentence referring to the repo --> gives one place for all implementations
- Benefits from our approach:
- Insight into which mechanisms for learning a graph structure might be well suited for spatial modelling, without discussing temporal modelling
- For example: researchers know how to build their convolution kernels for spatio-temporal problems
- Central GitHub repo for lots of kernels and temporal models.
- Dataset:
- Descriptive metrics: #Sensors (nodes), sampling interval (e.g. aggregated over 5 min intervals), time frame (Mar 1st 2012 - Jun 30th 2012), source: Caltrans PeMS (Performance Measurement System)
- Features: signal (mph), timestamp (cyclical encoding over 1 day; sketch below)
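A minimal sketch of the cyclical timestamp encoding (sin/cos over one day, so 23:55 and 00:00 end up close together); the function name `encode_time_of_day` is illustrative, not from the repo:

```python
import numpy as np

def encode_time_of_day(minutes_since_midnight):
    # Map minutes onto the unit circle so the feature is continuous
    # across midnight (23:55 is close to 00:00).
    angle = 2 * np.pi * np.asarray(minutes_since_midnight) / (24 * 60)
    return np.stack([np.sin(angle), np.cos(angle)], axis=-1)

# e.g. one day of 5-minute readings: 00:00, 00:05, ..., 23:55
minutes = np.arange(0, 24 * 60, 5)
features = encode_time_of_day(minutes)  # shape (288, 2)
```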
- Graph Convolution: spatial vs. spectral on a high level and some history (Kipf & Welling) (sketch below)
- Spatial Convolution (GC)
- Spectral Convolution (SGC)
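A minimal sketch of a first-order spectral graph convolution in the style of Kipf & Welling (H' = D^-1/2 (A + I) D^-1/2 H W); the class name is illustrative and this is not the repo's GC/SGC implementation:

```python
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    # One GCN layer: symmetrically normalized adjacency times
    # node features, followed by a learned linear map.
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.linear = nn.Linear(in_channels, out_channels, bias=False)

    def forward(self, x, adj):
        # x: (num_nodes, in_channels), adj: (num_nodes, num_nodes)
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)  # self-loops
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt  # D^-1/2 (A+I) D^-1/2
        return a_norm @ self.linear(x)
```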
- Structure Learning with Parameters: Zhang et al. combine SLCs, which learn global and local graph representations. State that we only cover the global view, not the local view.
- Define spectral convolution with a learnable Laplacian (GC-l, SGC-l) (sketch below)
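The wiki does not record which mechanism GC-l/SGC-l use to learn the Laplacian, so the following is only a sketch of one common mechanism (a dense adjacency from two learned node-embedding tables); the normalized Laplacian can then be derived from the learned adjacency as usual:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableAdjacency(nn.Module):
    # Learns a dense adjacency A = softmax(relu(E1 @ E2.T)) from two
    # node-embedding tables; E1, E2 train with the rest of the model.
    def __init__(self, num_nodes, emb_dim=16):
        super().__init__()
        self.e1 = nn.Parameter(torch.randn(num_nodes, emb_dim))
        self.e2 = nn.Parameter(torch.randn(num_nodes, emb_dim))

    def forward(self):
        return F.softmax(F.relu(self.e1 @ self.e2.t()), dim=1)
```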
- Latent Correlation Layer: explain the attention mechanism on a high level
- Define spectral and spatial convolution kernels (AGC, ASGC) (sketch below)
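A hedged sketch of the latent-correlation idea behind AGC/ASGC: attention scores between every pair of nodes form an N x N matrix that is used as a learned adjacency for the downstream spectral or spatial convolution. Class and parameter names are illustrative:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentCorrelationLayer(nn.Module):
    # Scaled dot-product attention over per-node representations;
    # the row-normalized score matrix serves as a learned graph.
    def __init__(self, in_channels, key_dim=32):
        super().__init__()
        self.query = nn.Linear(in_channels, key_dim)
        self.key = nn.Linear(in_channels, key_dim)

    def forward(self, x):
        # x: (num_nodes, in_channels) -- e.g. a node's series representation
        q, k = self.query(x), self.key(x)
        scores = q @ k.t() / math.sqrt(q.size(-1))
        return F.softmax(scores, dim=1)  # (num_nodes, num_nodes)
```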
- Combinations
- Global SLC
- ASGCP
- Temporal Model (image of P3D with substitutable Graph Convolution)
- Table with all Kernels listed.
- External Validity
- Only tested on Traffic Prediction (based on speed readings)
- Model might not generalize to different road types (inner city vs. highway, big city vs. small city)
- Long forecast horizons not tested
- Same country
- Internal Validity
- Hyperparameter tuning on Val Dataset
- Structure Learning vs. #Parameters
- Construct Validity
- RMSE vs. MAE vs. MAPE; these are all valid in traffic forecasting (definitions sketched below)
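The standard definitions, for reference (note: traffic benchmarks often mask zero-speed entries before computing MAPE; the `eps` guard below is a simplification):

```python
import numpy as np

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def mape(y_true, y_pred, eps=1e-8):
    return np.mean(np.abs((y_true - y_pred) / (y_true + eps))) * 100
```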
- Conclusion Validity
- Confirm, not prove
- Performance is shown for P3D, not in general
- Avoid this by having at least 2 samples for each concept
- 2 Datasets
- Highlevel description of the work:
- Measure the effect of a learnable Laplacian vs. one pre-defined based on human knowledge.
- We do the above with spectral convolution and spatial convolution.
- Table with model names and their explanations.
- summary of the next subsections.
- Model:
- start by introducing the architecture
- Linear layer
- Dropout
- P3D:
- Blocks A, B, C; downsample, spatial and temporal convolution, upsample, BatchNorm, ReLU (sketch after this list)
- Upsample
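A rough sketch of a P3D-style bottleneck block with a substitutable graph convolution (names and exact ordering are illustrative, not the repo's implementation):

```python
import torch
import torch.nn as nn

class P3DBlock(nn.Module):
    # 1x1 downsample -> graph (spatial) conv -> 1D temporal conv ->
    # 1x1 upsample, with BatchNorm, ReLU and a residual connection.
    # The graph convolution is injected, so any kernel (GC, SGC, AGC, ...)
    # can be substituted without touching the temporal framework.
    def __init__(self, channels, bottleneck, graph_conv):
        super().__init__()
        self.down = nn.Conv2d(channels, bottleneck, kernel_size=1)
        self.graph_conv = graph_conv  # callable: (x, adj) -> same shape as x
        self.temporal = nn.Conv2d(bottleneck, bottleneck,
                                  kernel_size=(3, 1), padding=(1, 0))
        self.up = nn.Conv2d(bottleneck, channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x, adj):
        # x: (batch, channels, time, num_nodes)
        h = torch.relu(self.down(x))
        h = torch.relu(self.graph_conv(h, adj))  # spatial step
        h = torch.relu(self.temporal(h))         # temporal step
        return torch.relu(self.bn(self.up(h) + x))  # residual
```

For a quick test, `graph_conv` can be as simple as `lambda h, adj: torch.einsum('bctn,nm->bctm', h, adj)`, i.e. a fixed propagation over the node dimension.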
- Graph convolution(s)
- start by introducing the architecture
- Combining convolution operators:
- Recapping Global SLC
- The idea of substituting the dynamic part of SLC by attention
- Recap ASGCP (sketch below)
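A loose sketch of the combination idea (substituting the dynamic part of SLC by attention): one convolution runs over the pre-defined road graph, another over the attention-learned graph, and the results are summed. This is an assumption about the structure, not the actual ASGCP definition:

```python
import torch

def combined_graph_conv(x, a_prior, conv_static, conv_dynamic, corr_layer):
    # x: (num_nodes, channels); a_prior: (num_nodes, num_nodes) road graph.
    # conv_static / conv_dynamic are graph convolutions as sketched above;
    # corr_layer produces an attention-learned adjacency from the data.
    a_learned = corr_layer(x)
    return conv_static(x, a_prior) + conv_dynamic(x, a_learned)
```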
- Experimental Setup:
- Talk about the config files (illustrative sketch at the end of this page)
- Hyperparameters (batch size, bottleneck channels, learning rate, ...)
- Forecast horizon
- Datasets
- Loss function
- Train/val/test split (70/10/20); split in time (explain why)
- Explain that we pick the model parameters by the best validation results
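An illustrative config sketch; the key names are hypothetical, but the values mirror the hyperparameters listed at the top of this page:

```python
config = {
    "dataset": "METR-LA",        # or "PEMS-BAY"
    "forecast_horizon": 3,       # trained separately for 3 and 6
    "split": (0.7, 0.1, 0.2),    # train / val / test, split in time
    "batch_size": 16,
    "bottleneck_channels": 128,
    "spatial_channels": 96,
    "dropout": 0.1,
    "dropout_att": 0.5,          # TBD
    "learning_rate": None,       # value not recorded in these notes
}
```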