
2020/11/02 HOMEWORK 1 - FINAL REPORT

ARCHITECTURE
Architecture




PROBLEM DESCRIPTION

Learning linear regression through Prof. Hung-Yi Lee’s OCW (Open Course Ware) and the homework “PM2.5 prediction”.

  • Data source
    2019 air quality monitoring data for Pingzhen District, Taoyuan City, from the Environmental Protection Administration (環境保護署 2019 年桃園市平鎮區空氣品質監測資料)

  • Choose Model (see the sketch after this list)

    • Let "y'" be the predicted PM2.5,
      "x" the features from 9 hours of sensor readings (e.g. CO, NO, PM10, PM2.5, rainfall),
      "w" the weights of the features
    • Function 1: y' = w * x
    • Function 2: y' = w1 * x + w2 * x^2
  • Loss function
    RMSE (root-mean-square error)

  • Gradient descent
    Vanilla (basic), SGDM, MBSGD, AdaGrad, RMSProp, Adam

  • Flow chart:
    Linear regression flow chart
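
A minimal NumPy sketch of the two candidate functions and the RMSE loss. The array shapes, function names, and the absence of a bias term are assumptions for illustration; the report's actual code is not shown here.

```python
import numpy as np

def predict_linear(x, w):
    """Function 1: y' = w * x, with x a (n_samples, n_features) matrix."""
    return x @ w

def predict_quadratic(x, w1, w2):
    """Function 2: y' = w1 * x + w2 * x^2 (squared features get their own weights)."""
    return x @ w1 + (x ** 2) @ w2

def rmse(y_pred, y_true):
    """Loss: root-mean-square error."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2))
```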


HIGHLIGHT

PRE-PROCESSING

  • Using Python rather than Excel
  • Invalid and null values
    Filled with the mean of the corresponding sensor type
  • Split of training dataset
    First 20 days of each month
  • Split of testing dataset
    Remaining days of each month (8~11 days)
  • Data extraction (see the sketch below)
    Every 9-hour window, with 15 sensor types per hour, forms the features; the PM2.5 of the 10th hour is the prediction target
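
A minimal sketch of the sliding-window extraction described above, assuming each month's data is already a (15 sensor types × hours) NumPy array; the names monthly_data and PM25_ROW and the row index are illustrative assumptions.

```python
import numpy as np

PM25_ROW = 9   # assumed row index of PM2.5 among the 15 sensor types
WINDOW = 9     # 9 consecutive hours form one sample

def extract_samples(monthly_data):
    """monthly_data: shape (15, hours). Slide a 9-hour window over the hours;
    the 15 x 9 readings are the features, the PM2.5 of the 10th hour is the label."""
    features, labels = [], []
    hours = monthly_data.shape[1]
    for start in range(hours - WINDOW):
        window = monthly_data[:, start:start + WINDOW]   # 15 sensor types x 9 hours
        features.append(window.flatten())                # 135 features per sample
        labels.append(monthly_data[PM25_ROW, start + WINDOW])
    return np.array(features), np.array(labels)
```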

ADJUST LEARNING RATE

Plot of adjusting learning rate

  • Model: Power of one
  • Gradient: Vanilla gradient descent
  • Iteration: 1,000 times
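
A minimal sketch of this experiment, assuming vanilla gradient descent on the first-order model with the mean-squared-error gradient; the candidate learning-rate values are illustrative, not the ones actually plotted.

```python
import numpy as np

def train_vanilla(X, y, lr, iterations=1000):
    """Vanilla gradient descent on y' = X @ w, recording the RMSE per iteration."""
    w = np.zeros(X.shape[1])
    history = []
    for _ in range(iterations):
        error = X @ w - y
        grad = 2 * X.T @ error / len(y)               # gradient of the mean squared error
        w -= lr * grad
        history.append(np.sqrt(np.mean(error ** 2)))  # RMSE at this iteration
    return w, history

# Sweep a few candidate learning rates on the same training data
# for lr in (1e-7, 1e-5, 1e-3):
#     _, losses = train_vanilla(X_train, y_train, lr)
```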

FEATURE SELECTION

Plot of feature selection

  • Training set number: 4,521
  • Validation set number: 1,131
  • Iteration: 10,000 times
  • Gradient: AdaGrad
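
A minimal sketch of the feature-selection step, assuming the 135 features are flattened in (15 sensor types × 9 hours) order as in the extraction sketch above; the PM10/PM2.5 row indices are illustrative assumptions.

```python
import numpy as np

SENSOR_TYPES = 15
WINDOW = 9
SELECTED_ROWS = [8, 9]   # assumed row indices of PM10 and PM2.5

def select_features(X):
    """X: shape (n_samples, 135). Keep only the PM10 and PM2.5 readings
    of every hour in the 9-hour window."""
    X_grid = X.reshape(-1, SENSOR_TYPES, WINDOW)
    return X_grid[:, SELECTED_ROWS, :].reshape(len(X), -1)   # (n_samples, 18)
```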

FEATURE SCALING

Plot of feature scaling

  • Z-Score (Standard Score): (x – μ) / σ
  • Max-Min: (x - min(x)) / (max(x) - min(x))
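
A minimal sketch of the two scaling methods, with statistics computed on the training set and reused on the other split (a common convention; the report's exact code may differ).

```python
import numpy as np

def zscore_scale(X_train, X_other):
    """Z-Score: (x - mu) / sigma, statistics taken from the training set."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0] = 1.0   # guard against constant columns
    return (X_train - mu) / sigma, (X_other - mu) / sigma

def minmax_scale(X_train, X_other):
    """Max-Min: (x - min(x)) / (max(x) - min(x)), statistics from the training set."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)
    return (X_train - lo) / span, (X_other - lo) / span
```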

PERFORMANCE IMPROVEMENT

Plot of performance improvement

  • Training set number: 5,652
  • Testing set number: 1,446
  • Iteration: 150,000 times

LISTING


TEST AND RUN

Demo code running

Demo

Comparison of gradient descent

Plot of gradient descent comparison
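
A minimal sketch of three of the compared update rules (vanilla, AdaGrad, Adam) for a single weight vector; the hyperparameter defaults are the usual textbook values, not necessarily those used in the report.

```python
import numpy as np

def vanilla_step(w, grad, lr, state):
    """Plain gradient step: w <- w - lr * grad."""
    return w - lr * grad, state

def adagrad_step(w, grad, lr, state, eps=1e-8):
    """Accumulate squared gradients and shrink the step per dimension."""
    state["g2"] = state.get("g2", 0.0) + grad ** 2
    return w - lr * grad / (np.sqrt(state["g2"]) + eps), state

def adam_step(w, grad, lr, state, beta1=0.9, beta2=0.999, eps=1e-8):
    """Bias-corrected first and second moment estimates (Adam)."""
    t = state.get("t", 0) + 1
    m = beta1 * state.get("m", 0.0) + (1 - beta1) * grad
    v = beta2 * state.get("v", 0.0) + (1 - beta2) * grad ** 2
    state.update(t=t, m=m, v=v)
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), state
```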


DISCUSSION

  1. What happens when adjusting the learning rate?
    • If the learning rate is too large, the loss does not converge (it may even keep growing)
    • If the learning rate is too small, the loss converges inefficiently
    • Only when the learning rate fits just right does the loss converge efficiently
  2. Why does removing some features improve performance?
    • Possibly because the effect of the other features on PM2.5 is not that immediate
  3. Why does Z-Score give better convergence?
    • Z-Score is appropriate when the maximum and minimum are unknown
  4. How to improve the model?
    1. Shuffle the training and testing data
    2. Use AdaGrad rather than vanilla gradient descent
    3. Select only PM10 and PM2.5 as features
  5. Final improvement of the model:
    the converged loss value decreases by 0.0225