Radiation Index
Pheipp edited this page Jan 31, 2019 · 4 revisions
The idea behind the radiation index was to display the average, minimum and maximum value for each capital city of the G20 states. Because the data contains several outliers, a more statistical approach was used: the median and the whiskers of a box plot. This page gives a short overview of which files are created or altered and what they are used for.
To get the data out of the database, 3 scripts are needed. All of these files are currently located in the scripts folder.
This script contains all the required SQL queries, one for each city. Each query selects all measurements within a radius of 5000 meters around the coordinates of the city center and exports them into a separate csv file. Depending on the amount of available data, some cities use a larger radius and some a smaller one.
Calculate_radiation_index is a Ruby script whose main task is to read all the csv files, discard the information that is not needed, and save the remaining information in a new csv file.
Because every city has a separate csv file, every file needs to be read. For that, the city_names array is iterated. This array stores the names of all the cities that will be displayed later. Two arrays are used in every iteration. The first is the data array, which holds three sub-arrays: one for the average, one for the minimum and one for the maximum values. The second, city_values, contains all the values read from the current csv file. The city_values array is different in each iteration, while the data array stays the same and only gains additional data each time. To calculate the data for the three sub-arrays of the data array, the calculate_data method is called. First, the average value is calculated. As mentioned earlier, the median is used instead of the arithmetic mean, because it is more robust against outliers.
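The loop and the median step described above could look roughly like the following sketch. It is not the actual Calculate_radiation_index code: the method names median and read_city_values, the "value" column, the file naming and the sample numbers are all assumptions made for illustration.

```ruby
require "csv"
require "tmpdir"

# Median: the middle element of the sorted list, or the mean of the two
# middle elements when the count is even. Assumes a non-empty list.
def median(values)
  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Reads one city's export and returns its measurements as floats.
# Assumes a header row with a "value" column (hypothetical layout).
def read_city_values(path)
  CSV.read(path, headers: true).map { |row| row["value"].to_f }
end

# Demo with throwaway files standing in for the real per-city exports.
city_names = ["Berlin", "Tokyo"] # shortened, hypothetical list
samples = { "Berlin" => [30, 32, 31, 500], "Tokyo" => [40, 42] }
data = [[], [], []] # sub-arrays: averages, minima, maxima

Dir.mktmpdir do |dir|
  city_names.each do |city|
    path = File.join(dir, "#{city}.csv")
    File.write(path, "value\n" + samples[city].join("\n") + "\n")

    city_values = read_city_values(path) # new array in every iteration
    data[0] << median(city_values)       # data array only grows
  end
end

p data[0] # prints [31.5, 41.0]
```

In the Berlin sample the single extreme reading of 500 barely moves the median (31.5), whereas the arithmetic mean would be pulled up to 148.25.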
Before the minimum and maximum values are stored in their respective arrays, outliers need to be removed. For that, the whiskers of a box plot were calculated. Calculating the whiskers requires the first quartile, the third quartile and the interquartile range (IQR). The first and third quartiles are calculated similarly to the median, and subtracting the first quartile from the third quartile gives the IQR. The rule is that every value larger than the third quartile plus 1.5 times the IQR is an outlier. Mirroring that, every value smaller than the first quartile minus 1.5 times the IQR is also an outlier. These outliers are removed from the city_values array. After that, the minimum and maximum values are determined from the city_values array using standard Ruby methods and stored in their respective sub-arrays of the data array.
After all csv files have been read and all the needed data has been calculated, a new csv file needs to be created. This is done inside the create_g20_csv method, which creates the new csv file and stores every sub-array of the data array as a separate line.
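The outlier rule can be sketched as follows. The page does not say which quartile convention the script uses, so this example assumes the "median of each half" method. The names quartiles and remove_outliers and the sample values are hypothetical.

```ruby
# Median of a non-empty list of numbers.
def median(values)
  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# First and third quartile as the medians of the lower and upper half of
# the sorted data; with an odd count the middle element is left out of
# both halves. Assumes at least two values.
def quartiles(values)
  sorted = values.sort
  half = sorted.length / 2
  [median(sorted.first(half)), median(sorted.last(half))]
end

# Keeps only the values inside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
def remove_outliers(values)
  q1, q3 = quartiles(values)
  iqr = q3 - q1
  lower = q1 - 1.5 * iqr
  upper = q3 + 1.5 * iqr
  values.select { |v| v >= lower && v <= upper }
end

city_values = [28.0, 30.0, 31.0, 32.0, 33.0, 35.0, 500.0]
cleaned = remove_outliers(city_values)

puts cleaned.min # prints 28.0
puts cleaned.max # prints 35.0
```

For this sample, Q1 is 30.0 and Q3 is 35.0, so the IQR is 5.0 and the accepted range is 22.5 to 42.5. The reading of 500.0 falls outside it and is dropped before the minimum and maximum are taken.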
The resulting csv file has 4 different lines:
1. Header with the city names
2. Average (median) value for each city
3. Minimum value for each city
4. Maximum value for each city
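A minimal sketch of how such a four-line file could be produced with Ruby's standard CSV library is shown here. The helper name g20_csv_string, its parameters and the sample numbers are assumptions, not the actual create_g20_csv implementation.

```ruby
require "csv"

# Builds the csv text: a header row with the city names, followed by one
# row per sub-array of the data array (averages, minima, maxima).
def g20_csv_string(city_names, data)
  CSV.generate do |csv|
    csv << city_names
    data.each { |row| csv << row }
  end
end

city_names = ["Berlin", "Tokyo"]                   # hypothetical subset
data = [[31.5, 41.0], [28.0, 38.0], [35.0, 44.0]]  # made-up sample values

puts g20_csv_string(city_names, data)
# To write a file instead (the file name is an assumption):
# File.write("g20.csv", g20_csv_string(city_names, data))
```

Each column of the result belongs to one city, so a reader of the file can pick up a city's median, minimum and maximum by column position.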