diff --git a/_posts/2023-10-05-(CL_paper100)Nonst_Trans.md b/_posts/2023-10-05-(CL_paper100)Nonst_Trans.md
index 21f9cbe97435..e21d7924e137 100644
--- a/_posts/2023-10-05-(CL_paper100)Nonst_Trans.md
+++ b/_posts/2023-10-05-(CL_paper100)Nonst_Trans.md
@@ -17,6 +17,19 @@ https://github.com/thuml/Nonstationary_Transformers.
0. Abstract
0. Introduction
+0. Related Works
+ 0. Deep Models for TSF
+ 0. Stationarization for TSF
+
+0. Non-stationary Transformers
+ 0. Series Stationarization
+ 0. De-stationary Attention
+
+0. Experiments
+ 0. Experimental Setups
+ 0. Main Results
+ 0. Ablation Study
+
@@ -24,7 +37,7 @@ https://github.com/thuml/Nonstationary_Transformers.
Previous studies : use stationarization to attenuate the non-stationarity of TS
-$\rightarrow$ can be less instructive for real-world TS
+$$\rightarrow$$ can be less instructive for real-world TS
@@ -52,7 +65,7 @@ Non-stationarity of data
However, non-stationarity is the inherent property
-$\rightarrow$ also good guidance for discovering temporal dependencies
+$$\rightarrow$$ also good guidance for discovering temporal dependencies
@@ -62,7 +75,7 @@ Example) Figure 1
- ( Figure 1 (b) ) Transformers trained on the stationarized series tend to generate indistinguishable attentions
- $\rightarrow$ ***over-stationarization*** problem
+ $$\rightarrow$$ ***over-stationarization*** problem
- unexpected side-effect ... makes Transformers fail to capture eventful temporal dependencies
@@ -166,27 +179,122 @@ De-stationary Attention mechanism
-Self-Attention: $\operatorname{Attn}(\mathbf{Q}, \mathbf{K}, \mathbf{V})=\operatorname{Softmax}\left(\frac{\mathbf{Q K}^{\top}}{\sqrt{d_k}}\right) \mathbf{V}$.
+Self-Attention: $$\operatorname{Attn}(\mathbf{Q}, \mathbf{K}, \mathbf{V})=\operatorname{Softmax}\left(\frac{\mathbf{Q K}^{\top}}{\sqrt{d_k}}\right) \mathbf{V}$$.
Bring the vanished non-stationary information back to its calculation
- approximate the
- - positive scaling scalar $\tau=\sigma_{\mathbf{x}}^2 \in \mathbb{R}^{+}$
- - shifting vector $\boldsymbol{\Delta}=\mathbf{K} \mu_{\mathbf{Q}} \in \mathbb{R}^{S \times 1}$,
+ - positive scaling scalar $$\tau=\sigma_{\mathbf{x}}^2 \in \mathbb{R}^{+}$$
+ - shifting vector $$\boldsymbol{\Delta}=\mathbf{K} \mu_{\mathbf{Q}} \in \mathbb{R}^{S \times 1}$$,
which are defined as de-stationary factors.
-- try to learn de-stationary factors directly from the statistics of unstationarized $\mathbf{x}, \mathbf{Q}$ and $\mathbf{K}$ by MLP
+- try to learn de-stationary factors directly from the statistics of unstationarized $$\mathbf{x}, \mathbf{Q}$$ and $$\mathbf{K}$$ by MLP
-$\log \tau=\operatorname{MLP}\left(\sigma_{\mathbf{x}}, \mathbf{x}\right)$.
+$$\log \tau=\operatorname{MLP}\left(\sigma_{\mathbf{x}}, \mathbf{x}\right)$$.
-$\boldsymbol{\Delta}=\operatorname{MLP}\left(\mu_{\mathbf{x}}, \mathbf{x}\right)$.
-$\operatorname{Attn}\left(\mathbf{Q}^{\prime}, \mathbf{K}^{\prime}, \mathbf{V}^{\prime}, \tau, \boldsymbol{\Delta}\right)=\operatorname{Softmax}\left(\frac{\tau \mathbf{Q}^{\prime} \mathbf{K}^{\prime}+\mathbf{1} \boldsymbol{\Delta}^{\top}}{\sqrt{d_k}}\right) \mathbf{V}^{\prime}$.
+$$\boldsymbol{\Delta}=\operatorname{MLP}\left(\mu_{\mathbf{x}}, \mathbf{x}\right)$$.
+$$\operatorname{Attn}\left(\mathbf{Q}^{\prime}, \mathbf{K}^{\prime}, \mathbf{V}^{\prime}, \tau, \boldsymbol{\Delta}\right)=\operatorname{Softmax}\left(\frac{\tau \mathbf{Q}^{\prime} \mathbf{K}^{\prime \top}+\mathbf{1} \boldsymbol{\Delta}^{\top}}{\sqrt{d_k}}\right) \mathbf{V}^{\prime}$$.
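The de-stationary attention above can be sketched directly in NumPy. This is a minimal illustration, not the paper's implementation: here the de-stationary factors $$\tau$$ and $$\boldsymbol{\Delta}$$ are taken as given inputs, whereas in the paper they are learned by MLP projectors from the statistics of the unstationarized series.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with max-subtraction for numerical stability
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def destationary_attn(Q, K, V, tau, delta):
    # Softmax((tau * Q K^T + 1 delta^T) / sqrt(d_k)) V
    # Q: (L, d_k), K: (S, d_k), V: (S, d_v), tau: positive scalar, delta: (S, 1)
    d_k = Q.shape[-1]
    scores = (tau * (Q @ K.T) + np.ones((Q.shape[0], 1)) @ delta.T) / np.sqrt(d_k)
    return softmax(scores) @ V
```

With $$\tau = 1$$ and $$\boldsymbol{\Delta} = \mathbf{0}$$ this reduces to vanilla self-attention on the stationarized inputs, which is exactly the over-stationarized case the mechanism is designed to avoid.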
# 4. Experiments
+## (1) Experimental Setups
+
+### a) Datasets
+
+- Electricity
+- ETT datasets
+- Exchange
+- ILI
+- Traffic
+- Weather
+
+
+
+### b) Degree of stationarity
+
+Augmented Dickey-Fuller (ADF) test statistic
+
+- small value = high stationarity
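As a minimal illustration of the ADF idea (a simplified sketch, not the full augmented test the authors likely ran with a statistics library), a Dickey-Fuller-style statistic without augmentation lags can be computed by regressing the series' differences on its lagged values: a stationary series yields a large negative statistic, while a random walk stays near zero.

```python
import numpy as np

def df_stat(y):
    # Simplified Dickey-Fuller t-statistic (no augmentation lags, no intercept):
    # fit dy_t = rho * y_{t-1} + eps and return the t-statistic of rho-hat.
    dy = np.diff(y)
    ylag = y[:-1]
    rho = (ylag @ dy) / (ylag @ ylag)
    resid = dy - rho * ylag
    s2 = (resid @ resid) / (len(dy) - 1)   # residual variance
    se = np.sqrt(s2 / (ylag @ ylag))       # standard error of rho-hat
    return rho / se

rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)            # stationary: strongly negative statistic
walk = np.cumsum(rng.standard_normal(2000))  # non-stationary: statistic near zero
print(df_stat(noise), df_stat(walk))
```

This matches the table's reading: the smaller (more negative) the statistic, the higher the degree of stationarity.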
+
+![figure2](/assets/img/ts/img472.png)
+
+
+
+### c) Baselines
+
+- pass
+
+
+
+## (2) Main Results
+
+### a) Forecasting
+
+MTS Forecasting
+
+![figure2](/assets/img/ts/img473.png)
+
+
+
+UTS Forecasting
+
+![figure2](/assets/img/ts/img474.png)
+
+
+
+### b) Framework Generality
+
+![figure2](/assets/img/ts/img475.png)
+
+Conclusion: Non-stationary Transformer is an **effective and lightweight** framework that can be widely **applied to Transformer-based models** and enhances their non-stationary predictability
+
+
+
+## (3) Ablation Study
+
+### a) Quality evaluation
+
+Dataset: ETTm2
+
+Models:
+
+- vanilla Transformer
+- Transformer with only Series Stationarization
+- Non-stationary Transformer
+
+
+
+![figure2](/assets/img/ts/img476.png)
+
+
+
+### b) Quantitative performance
+
+![figure2](/assets/img/ts/img477.png)
+
+
+
+## (4) Model Analysis
+
+### a) Over-stationarization problem
+
+Transformers with ....
+
+- v1) Transformer + Ours ( = Non-stationary Transformer )
+- v2) Transformer + RevIN
+- v3) Transformer + Series Stationarization
+
+
+
+![figure2](/assets/img/ts/img478.png)
+
+Result
+
+- v2 & v3) tend to output series with an unexpectedly high degree of stationarity
+
diff --git a/_posts/non-stationary_transformer.pdf b/_posts/non-stationary_transformer.pdf
deleted file mode 100644
index efa297187131..000000000000
Binary files a/_posts/non-stationary_transformer.pdf and /dev/null differ
diff --git a/assets/img/ts/img472.png b/assets/img/ts/img472.png
new file mode 100644
index 000000000000..749fd6949fd3
Binary files /dev/null and b/assets/img/ts/img472.png differ
diff --git a/assets/img/ts/img473.png b/assets/img/ts/img473.png
new file mode 100644
index 000000000000..611e9a19e77b
Binary files /dev/null and b/assets/img/ts/img473.png differ
diff --git a/assets/img/ts/img474.png b/assets/img/ts/img474.png
new file mode 100644
index 000000000000..434c23d1c7dc
Binary files /dev/null and b/assets/img/ts/img474.png differ
diff --git a/assets/img/ts/img475.png b/assets/img/ts/img475.png
new file mode 100644
index 000000000000..635b1a6f8b2a
Binary files /dev/null and b/assets/img/ts/img475.png differ
diff --git a/assets/img/ts/img476.png b/assets/img/ts/img476.png
new file mode 100644
index 000000000000..e7a34abacb56
Binary files /dev/null and b/assets/img/ts/img476.png differ
diff --git a/assets/img/ts/img477.png b/assets/img/ts/img477.png
new file mode 100644
index 000000000000..da73e5668b96
Binary files /dev/null and b/assets/img/ts/img477.png differ
diff --git a/assets/img/ts/img478.png b/assets/img/ts/img478.png
new file mode 100644
index 000000000000..38421469e003
Binary files /dev/null and b/assets/img/ts/img478.png differ