UK Road Crash Analysis
Period of time: from 2021 to 2022
-
This repository contains the datasource, an EDA SQL script, a data cleaning SQL script and an interactive Tableau workbook for visulization.
-
The datasource has 307.973 rows and 19 columns.
-
Link to the Tableau dashboard
Details
- **Initial requirements**:
Clients want to create a UK Road Crash dashboard for the years from 2021 to 2022 so that they can have the following insights with the hope in minimizing the loss of lives.
- KPIs
📌 Total Casualties, categorized into levels of accident severity, which took place after the accident per year
📌 Total Casualties with respect to level of accident severity and total casualties by type of vehicle per year.
📌 Total Casualties with respect to vehicle type per year.
📌 Monthly trend showing a comparison of casualties for the current year and the previous year.
📌 Total Casualties by Road type.
📌 Distribution of total casualties by road surface.
📌 Relation between Casualties by Area, Location & by Day/ Night.
Details
-
SQL Microsoft Server for EDA and Data Cleaning Process.
-
Tableau Desktop for Data Visualization in parallel with Data Analysis.
Details
Step 1 - Data Cleaning by Excel
The raw csv file contained inconsistent data type within column "accident_date", which need to be fixed by using Excel.
Step 2 - EDA using SQL
I started with exploratory data analysis to deeper understand about the data patterns, inconsistences and potential insights. This step drives effective decision-making, aids in data cleaning, and informs appropriate analysis methods for enhanced project outcomes.
Step 3 - Data Cleaning using SQL
This step involves getting rid of data inconsistencies, errors, and duplicates, ensuring the accuracy and reliability of the data I'm using for subsequent analysis.
Step 4 - Data Processing
This step contains generating additional columns by organizing, sorting, and filtering the data to derive valuable insights.
Step 5 - Data Analyzing and Data Visualizing
I conducted analysis and visualization simultaneously to grasp insights quickly, validate results, communicate effectively, and refine hypotheses iteratively. It aids in comprehensive understanding and facilitates efficient data exploration and interpretation.
Step 6 - Dashboard Building
Finally, I translate all the insights into an interactive Tableau Public dashboard, enabling users to engage with the data.
Details
Note: Audience can easily find the completed EDA Script in the attached file name: 03_UKRoadCrash_SQLQuery_DataCleaningScript.sql
✅ The Total Casualties occurring post the accident amount to 417.883 and was categorized into 3 groups of the level of accident severity including Slight, Serious and Fatal with retrieved numbers are 351.436, 59.312,7.135; respectively.
SELECT SUM(number_of_casualties) AS "Total Casualties", COUNT(DISTINCT accident_index) AS "Total Accidents",
SUM(CASE WHEN accident_severity = 'Slight' THEN number_of_casualties END) AS "Total Slight Casualties",
SUM(CASE WHEN accident_severity = 'Serious' THEN number_of_casualties END) AS "Total Serious Casualties",
SUM(CASE WHEN accident_severity = 'Fatal' THEN number_of_casualties END) AS "Total Fatal Casualties"
FROM [UK-Road-Crash-DB]..uk_road_crashSELECT SUM(number_of_casualties) AS CY_Casualties
FROM [UK-Road-Crash-DB] .. uk_road_crash
✅ Total Casualties, categorized into levels of accident severity, which took place after the accident, per year
SELECT
YEAR(accident_date) AS "Year",
SUM(number_of_casualties) AS "Total Casualties",
COUNT(DISTINCT accident_index) AS "Total Accidents",
SUM(CASE WHEN accident_severity = 'Slight' THEN number_of_casualties END) AS "Total Slight Casualties",
SUM(CASE WHEN accident_severity = 'Serious' THEN number_of_casualties END) AS "Total Serious Casualties",
SUM(CASE WHEN accident_severity = 'Fatal' THEN number_of_casualties END) AS "Total Fatal Casualties"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
- Result
✅ Total Casualties with respect to each accident severity and total casualties by each type of vehicle per year. Severity Level: Slight
SELECT YEAR(accident_date) AS "Year",
SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Slight Casualties by Agricultural Vehicle",
SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Slight Casualties by Bus",
SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Slight Casualties by Car",
SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Slight Casualties by Motorcycle",
SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END) AS "Slight Casualties by Others",
SUM(CASE WHEN accident_severity = 'Slight' AND vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Slight Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
- Result:
Severity Level: Serious
SELECT YEAR(accident_date) AS "Year",
SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Serious Casualties by Agricultural Vehicle",
SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Serious Casualties by Bus",
SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Serious Casualties by Car",
SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Serious Casualties by Motorcycle",
SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END) AS "Serious Casualties by Others",
SUM(CASE WHEN accident_severity = 'Serious' AND vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Serious Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
- Result:
Severity Level: Fatal
SELECT YEAR(accident_date) AS "Year",
SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Fatal Casualties by Agricultural Vehicle",
SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Fatal Casualties by Bus",
SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Fatal Casualties by Car",
SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Fatal Casualties by Motorcycle",
SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END) AS "Fatal Casualties by Others",
SUM(CASE WHEN accident_severity = 'Fatal' AND vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Fatal Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
✅ Total Casualties with respect to vehicle type per year.
SELECT YEAR(accident_date) AS "Year",
SUM(CASE WHEN vehicle_type LIKE 'Agricultural%'THEN number_of_casualties END) AS "Total Casualties by Agricultural Vehicle",
SUM(CASE WHEN vehicle_type LIKE 'Bus%' OR vehicle_type LIKE 'bus%' THEN number_of_casualties END) AS "Total Casualties by Bus",
SUM(CASE WHEN vehicle_type LIKE 'Car%' OR vehicle_type LIKE 'car%' THEN number_of_casualties END) AS "Total Casualties by Car",
SUM(CASE WHEN vehicle_type LIKE 'Motorcycle%' THEN number_of_casualties END) AS "Total Casualties by Motorcycle",
SUM(CASE WHEN vehicle_type LIKE 'Other%' OR vehicle_type LIKE 'Pedal%'OR vehicle_type LIKE 'missing%' OR vehicle_type LIKE 'horse%'THEN number_of_casualties END) AS "Total Casualties by Others",
SUM(CASE WHEN vehicle_type LIKE 'Goods%' THEN number_of_casualties END) AS "Total Casualties by Van"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
✅ Monthly trend showing a comparison of casualties for the current year and the previous year.
SELECT Year(accident_date) AS "Year", DATENAME(MONTH, accident_date) AS Month_Name, SUM(number_of_casualties) AS total_casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY Year(accident_date), DATENAME(MONTH, accident_date)
ORDER BY "Year"
✅ Total Casualties by Road type.
SELECT YEAR(accident_date) AS "Year",
SUM(CASE WHEN road_type LIKE 'Single%' THEN number_of_casualties END) AS "Single Carriagetway",
SUM(CASE WHEN road_type LIKE 'Dual%' THEN number_of_casualties END) AS "Dual Carriagetway",
SUM(CASE WHEN road_type LIKE 'Roundabout%' THEN number_of_casualties END) AS "Roundabout",
SUM(CASE WHEN road_type LIKE 'one%' THEN number_of_casualties END) AS "One way street",
SUM(CASE WHEN road_type LIKE 'Slip%' THEN number_of_casualties END) AS "Slip road",
SUM(CASE WHEN road_type IS NULL THEN number_of_casualties END) AS "Null",
SUM(number_of_casualties) AS Total_Casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
- Result
✅ Distribution of total casualties by road surface condition
SELECT YEAR(accident_date) AS "Year",
ROUND(SUM(CASE WHEN road_surface_conditions LIKE 'Dry%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Dry",
ROUND(SUM(CASE WHEN road_surface_conditions LIKE 'Frost%' OR road_surface_conditions LIKE 'Snow%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Frost/Snow",
ROUND(SUM(CASE WHEN road_surface_conditions LIKE 'Flood%' OR road_surface_conditions LIKE 'Wet%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Wet",
ROUND(SUM(CASE WHEN road_surface_conditions IS NULL THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "Unknown",
SUM(number_of_casualties) AS Total_Casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY YEAR(accident_date)
✅ Relation between Casualties by Area.
SELECT YEAR(accident_date) AS "Year",
ROUND(SUM(CASE WHEN urban_or_rural_area = 'Rural' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Rural",
ROUND(SUM(CASE WHEN urban_or_rural_area = 'Urban' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Urban"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY Year(accident_date)
- Result:
✅ Relation between Casualtiesby by Day/ Night.
SELECT YEAR(accident_date) AS "Year",
ROUND(SUM(CASE WHEN light_conditions LIKE 'Day%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Day",
ROUND(SUM(CASE WHEN light_conditions NOT LIKE 'Day%' THEN number_of_casualties END)/SUM(number_of_casualties)*100,2) AS "PCT_Night"
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY Year(accident_date)
- Result:
✅ Relation between Casualtiesby by Location.
SELECT TOP 10
local_authority, SUM(number_of_casualties) AS Total_Casualties
FROM [UK-Road-Crash-DB]..uk_road_crash
GROUP BY local_authority
ORDER BY Total_Casualties DESC
- Result: