Methodology Explainer #114

Open
ReedRodgers opened this issue Dec 7, 2017 · 7 comments
@ReedRodgers
Contributor

Write a high-level explanation of the data manipulation process to include in the internal dashboard.

@ReedRodgers
Contributor Author

ReedRodgers commented Dec 7, 2017

  • Low-level explanation of data aggregation.
  • Summary of low-level.
  • Decide if Bluetooth reader explanation should be included.
  • Add baseline filtering example
  • Add list of filtered baseline dates

@radumas
Member

radumas commented Dec 12, 2017

@ReedRodgers
Contributor Author

ReedRodgers commented Dec 21, 2017

@aharpalaniTO, @radumas, Any critiques?

image

Also available in full on Murmering Waters

@q-schen
Contributor

q-schen commented Dec 21, 2017

Maybe align the text elements with one another? It would also be nice to have the X stand out a bit more.
Is it possible to adjust the transparency of the box?

@radumas radumas self-assigned this Jan 5, 2018
radumas added a commit that referenced this issue Jan 8, 2018
@radumas
Member

radumas commented Jan 8, 2018

The third type of plot was put together to analyze the impact of removing a date from a given baseline. This plot showed the new baseline overlaid on the old baseline to demonstrate the effect of removing the outlier. It was determined that removing dates with outliers from the baseline could have an impact on the quality of the data.

source
"Could have an impact" is an empty statement. Did we do anything following these graphs?

Finally, for each baseline with notable outliers, a scatter plot was produced for the weeks the outliers were found. The percentile band plots were shown for reference, now with the 100th percentile shown as x's, and the last band showing up to the 90th percentile.

Lastly, the baseline comparison graphs were plotted with the outlying dates removed from the new baseline. Each of these sets of figures was analyzed to see if the outlier's impact on the baseline was great enough to warrant its removal.

source
Are these two separate graphs?
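
For context on the percentile band plots quoted above, here is a minimal Python sketch of how the band edges for one 30-minute bin could be computed. It is illustrative only: `percentile_bands` and the sample travel times are made up for this example and are not taken from the scripts in the repository.

```python
from statistics import quantiles


def percentile_bands(travel_times):
    """Return the 10th, 50th, 90th and 100th percentiles of one time bin.

    Hypothetical helper: the thread does not say how the bands are computed.
    """
    # n=10 yields the nine decile cut points (10th, 20th, ..., 90th).
    deciles = quantiles(travel_times, n=10, method="inclusive")
    return {
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "p100": max(travel_times),  # plotted separately as x's
    }


# Made-up travel times (seconds) for one bin, with a single extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]
bands = percentile_bands(sample)
```

In this toy sample the extreme value changes only `p100`, which is the reason for drawing the 100th percentile separately from a band that stops at the 90th.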

Not sure what this following paragraph adds that couldn't be in the numbered list

When looking at the travel time scatterplot for Queen Street University to Yonge, a major change in travel times was noticed at midnight on Saturday, September 30th. The baseline for Saturday was examined using the percentile band plot, and it looked like the event significantly impacted the baseline, pulling it beyond the 10-90 percentile band, and forming a slight upwards trend where no such trend is reflected in the bulk of the data. Because of this, the original baseline was compared to a new baseline with September 30th and October 1st removed, and the new weekend baseline was significantly lower during early morning and midnight. Finally, it was learned that the event occurred during Nuit Blanche, and the Bluetooth readers likely picked up pedestrian phones as there were no cars on the street. Even though this didn't affect the data during peak hours, its impact on the baseline was so large it was excluded from the baseline data.
source
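
A rough sketch of the baseline comparison described in the quoted paragraph, recomputing a baseline with September 30th and October 1st excluded. This assumes the baseline is the mean travel time per time-of-day bin, which the thread does not confirm; the function name `baseline` and all travel times below are invented for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def baseline(observations, excluded_dates=frozenset()):
    """Mean travel time per time-of-day bin, skipping excluded dates.

    observations: iterable of (date, time_bin, travel_time_seconds).
    Assumption: the baseline is a per-bin mean; the real definition may differ.
    """
    by_bin = defaultdict(list)
    for day, time_bin, travel_time in observations:
        if day not in excluded_dates:
            by_bin[time_bin].append(travel_time)
    return {time_bin: mean(values) for time_bin, values in by_bin.items()}


# Made-up Saturday observations; only the midnight bin on Sept 30 is unusual.
observations = [
    (date(2017, 9, 16), "00:00", 120),
    (date(2017, 9, 23), "00:00", 130),
    (date(2017, 9, 30), "00:00", 600),
    (date(2017, 9, 16), "17:00", 300),
    (date(2017, 9, 23), "17:00", 310),
    (date(2017, 9, 30), "17:00", 305),
]

old = baseline(observations)
new = baseline(observations, excluded_dates={date(2017, 9, 30), date(2017, 10, 1)})

# Positive values mean the old baseline was higher than the new one.
shift = {time_bin: old[time_bin] - new[time_bin] for time_bin in old}
```

With these invented numbers the midnight baseline drops sharply once the date is excluded while the 17:00 bin is unchanged, mirroring the pattern the paragraph describes.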

Don't fully understand the relationship between this paragraph and the list above it

Many outliers are single points, which are likely due to pedestrian phones being picked up during low traffic during the nighttime hours. The exceptions to this rule are Nuit Blanche, the three large peaks on Adelaide, and the Sunday on Dufferin, which seemed to have an abnormal slowdown event ongoing at the time of the outlier.
source

  • Make sure to clarify distinction between outlier 30-min observations and outlier days
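
One possible way to make that distinction concrete, sketched in Python. Everything here is an assumption for illustration: the fixed `threshold`, the `min_bins_for_outlier_day` cutoff, and the function name are hypothetical and do not describe how the project actually flags outliers.

```python
from collections import Counter
from datetime import date


def classify_outliers(observations, threshold, min_bins_for_outlier_day=3):
    """Separate isolated outlier observations from outlier days.

    observations: iterable of (date, time_bin, travel_time_seconds).
    An observation is flagged when it exceeds `threshold`; a day counts as an
    outlier day when at least `min_bins_for_outlier_day` of its bins are flagged.
    Both rules are hypothetical.
    """
    flagged = [(day, time_bin) for day, time_bin, tt in observations if tt > threshold]
    flagged_per_day = Counter(day for day, _ in flagged)
    outlier_days = {
        day for day, count in flagged_per_day.items()
        if count >= min_bins_for_outlier_day
    }
    isolated = [(day, time_bin) for day, time_bin in flagged if day not in outlier_days]
    return isolated, outlier_days


# Made-up data: one stray night-time point and one day with a sustained event.
observations = [
    (date(2017, 9, 20), "02:00", 900),   # single stray observation
    (date(2017, 9, 20), "02:30", 110),
    (date(2017, 9, 30), "00:00", 600),   # sustained slowdown
    (date(2017, 9, 30), "00:30", 650),
    (date(2017, 9, 30), "01:00", 620),
]

isolated, outlier_days = classify_outliers(observations, threshold=500)
```

Under these assumptions a single night-time spike stays an outlier observation, while a run of flagged bins on the same date marks the whole day as a candidate for removal from the baseline.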

@radumas
Member

radumas commented Jan 8, 2018

  • README should also explain how the scripts in the folder work
