Build a Traffic Sign Recognition Project
The goals / steps of this project are the following:
- Load the data set (see below for links to the project data set)
- Explore, summarize and visualize the data set
- Design, train and test a model architecture
- Use the model to make predictions on new images
- Analyze the softmax probabilities of the new images
- Summarize the results with a written report
Link to project code
I used the numpy library to calculate summary statistics of the traffic signs data set:
- The size of the training set is 34799.
- The size of the validation set is 4410.
- The size of the test set is 12630.
- The shape of a traffic sign image is (32, 32, 3).
- The number of unique classes/labels in the data set is 43.
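A minimal sketch of how these numbers can be computed with numpy, assuming the data set comes as the usual pickled `train.p` / `valid.p` / `test.p` files (the file names and variable names are assumptions, not the exact project code):

```python
import pickle
import numpy as np

# File names follow the standard project download; adjust paths as needed.
with open('train.p', 'rb') as f:
    train = pickle.load(f)
with open('valid.p', 'rb') as f:
    valid = pickle.load(f)
with open('test.p', 'rb') as f:
    test = pickle.load(f)

X_train, y_train = train['features'], train['labels']
X_valid, y_valid = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']

n_train = X_train.shape[0]                # 34799
n_valid = X_valid.shape[0]                # 4410
n_test = X_test.shape[0]                  # 12630
image_shape = X_train[0].shape            # (32, 32, 3)
n_classes = np.unique(y_train).shape[0]   # 43

print(n_train, n_valid, n_test, image_shape, n_classes)
```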
Here are some example images from the training set:
Here is the class distribution of the training set:
As a first step, I decided to convert the images to grayscale. This reduces the dimensionality of the input and makes normalization of the image easier.
Next, I normalize the data in order to decrease complexity and potentially increase the network's accuracy and learning speed.
Raw | Gray | Normalized |
---|---|---|
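A minimal sketch of this preprocessing step, assuming `X_train`, `X_valid` and `X_test` from the loading snippet above; the luminosity weights and the (pixel - 128) / 128 scaling are common choices used here for illustration, not necessarily the exact ones in the project code:

```python
import numpy as np

def preprocess(images):
    """Convert RGB images to grayscale and scale pixel values to roughly [-1, 1]."""
    # Luminosity-style grayscale; a plain channel mean works similarly well here.
    gray = np.dot(images[..., :3].astype(np.float32), [0.299, 0.587, 0.114])
    # Zero-centre the pixels so the optimizer starts from a well-scaled input.
    normalized = (gray - 128.0) / 128.0
    # Keep an explicit channel axis so the network sees shape (N, 32, 32, 1).
    return normalized[..., np.newaxis]

X_train_p = preprocess(X_train)
X_valid_p = preprocess(X_valid)
X_test_p = preprocess(X_test)
```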
My final model consisted of the following layers:
Layer | Description |
---|---|
Input | 32x32x1 grayscale image |
1. Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
RELU | |
Max pooling | 2x2 stride, outputs 14x14x6 |
2. Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x16 |
RELU | |
Max pooling | 2x2 stride, outputs 5x5x16 |
3. Fully connected | input 400 (flattened 5x5x16), output 120 |
RELU | |
4. Fully connected | input 120, output 84 |
RELU | |
5. Fully connected | input 84, output 43 |
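Below is a sketch of this LeNet-style network in the TensorFlow 1.x API used at the time of the project. The kernel sizes and layer dimensions follow the table above; the weight initialization (truncated normal with mu = 0, sigma = 0.1) is an assumption:

```python
import tensorflow as tf  # written against the TensorFlow 1.x API


def conv_relu(x, W, b):
    """5x5 convolution with VALID padding followed by a ReLU."""
    return tf.nn.relu(tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='VALID') + b)


def LeNet(x, mu=0.0, sigma=0.1):
    # Layer 1: 5x5 conv, 32x32x1 -> 28x28x6, then 2x2 max pooling -> 14x14x6
    W1 = tf.Variable(tf.truncated_normal([5, 5, 1, 6], mu, sigma))
    b1 = tf.Variable(tf.zeros(6))
    pool1 = tf.nn.max_pool(conv_relu(x, W1, b1), [1, 2, 2, 1], [1, 2, 2, 1], 'VALID')

    # Layer 2: 5x5 conv, 14x14x6 -> 10x10x16, then 2x2 max pooling -> 5x5x16
    W2 = tf.Variable(tf.truncated_normal([5, 5, 6, 16], mu, sigma))
    b2 = tf.Variable(tf.zeros(16))
    pool2 = tf.nn.max_pool(conv_relu(pool1, W2, b2), [1, 2, 2, 1], [1, 2, 2, 1], 'VALID')

    # Flatten 5x5x16 -> 400, then the fully connected head 400 -> 120 -> 84 -> 43
    flat = tf.reshape(pool2, [-1, 400])
    W3 = tf.Variable(tf.truncated_normal([400, 120], mu, sigma))
    b3 = tf.Variable(tf.zeros(120))
    fc1 = tf.nn.relu(tf.matmul(flat, W3) + b3)
    W4 = tf.Variable(tf.truncated_normal([120, 84], mu, sigma))
    b4 = tf.Variable(tf.zeros(84))
    fc2 = tf.nn.relu(tf.matmul(fc1, W4) + b4)
    W5 = tf.Variable(tf.truncated_normal([84, 43], mu, sigma))
    b5 = tf.Variable(tf.zeros(43))
    return tf.matmul(fc2, W5) + b5  # logits over the 43 sign classes
```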
To train the model, I used an `AdamOptimizer` with `rate = 0.001`, `EPOCHS = 30` and `BATCH_SIZE = 128`.
`AdamOptimizer` is known as a computationally efficient, memory-light and easy-to-tune optimizer. `rate` and `BATCH_SIZE` were selected by trial and error: a smaller rate makes learning too slow, while a higher one does not converge well. `EPOCHS` was chosen to achieve the required accuracy.
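A sketch of the training loop with these hyperparameters, reusing the `LeNet` function and the preprocessed arrays from the sketches above (TensorFlow 1.x style; not the exact project code):

```python
import tensorflow as tf
from sklearn.utils import shuffle

EPOCHS = 30
BATCH_SIZE = 128
RATE = 0.001

x = tf.placeholder(tf.float32, (None, 32, 32, 1))
y = tf.placeholder(tf.int32, (None,))
one_hot_y = tf.one_hot(y, 43)

logits = LeNet(x)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y, logits=logits)
loss_op = tf.reduce_mean(cross_entropy)
training_op = tf.train.AdamOptimizer(learning_rate=RATE).minimize(loss_op)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(EPOCHS):
        # Reshuffle every epoch so the batches differ between passes over the data.
        X_shuf, y_shuf = shuffle(X_train_p, y_train)
        for offset in range(0, len(X_shuf), BATCH_SIZE):
            batch_x = X_shuf[offset:offset + BATCH_SIZE]
            batch_y = y_shuf[offset:offset + BATCH_SIZE]
            sess.run(training_op, feed_dict={x: batch_x, y: batch_y})
```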
My final model results were:
- training set accuracy of 0.999
- validation set accuracy of 0.945
- test set accuracy of 0.933
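These accuracies can be computed with an evaluation helper along these lines, reusing `x`, `y`, `one_hot_y` and `logits` from the training sketch (a sketch, not the exact code):

```python
correct = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_op = tf.reduce_mean(tf.cast(correct, tf.float32))

def evaluate(X_data, y_data, sess, batch_size=128):
    """Average the per-batch accuracy, weighted by batch size."""
    total = 0.0
    for offset in range(0, len(X_data), batch_size):
        batch_x = X_data[offset:offset + batch_size]
        batch_y = y_data[offset:offset + batch_size]
        acc = sess.run(accuracy_op, feed_dict={x: batch_x, y: batch_y})
        total += acc * len(batch_x)
    return total / len(X_data)
```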
Some thoughts about the architecture:
- The current LeNet architecture was one I knew from previous course lessons, and it did its job.
- For this architecture it was quite tricky to find proper hyperparameters and to normalize the input.
- I tuned the `EPOCHS` parameter, increasing it until the model reached its learning threshold.
Here are eight German traffic signs that I found on the web:
Sign | Top 5 softmax probabilities |
---|---|
Speed limit (30km/h) | [9.9999928e-01 4.6627065e-07 1.8073200e-07 1.1348734e-08 4.4676702e-12] |
General caution | [1.0000000e+00 1.7135681e-08 4.0559920e-09 3.0214271e-09 1.0853046e-09] |
Priority road | [1.0000000e+00 7.0696413e-24 1.4433812e-24 1.0934178e-25 5.1488289e-26] |
No entry | [1.0000000e+00 5.7582330e-20 2.2768636e-24 8.4445566e-30 7.0737169e-31] |
Road work | [1.0000000e+00 3.8328948e-15 6.1613616e-16 3.8493330e-16 7.0987985e-17] |
End of all speed and passing limits | [9.9997044e-01 2.9554330e-05 2.5264612e-08 2.8319382e-11 8.3975883e-14] |
Stop | [1.0000000e+00 9.8190428e-17 3.0260198e-19 1.7365903e-19 1.4812312e-19] |
Keep right | [1.0000000e+00 8.9166329e-25 7.2660056e-32 4.0964350e-32 1.1183832e-32] |
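The top-5 probabilities in the table can be produced with `tf.nn.top_k`. This sketch assumes the model has been saved to a checkpoint (`'./lenet'` is a placeholder path) and that `X_new` holds the preprocessed web images (an illustrative name):

```python
softmax = tf.nn.softmax(logits)
top5_op = tf.nn.top_k(softmax, k=5)

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, './lenet')  # checkpoint path is an assumption
    top5 = sess.run(top5_op, feed_dict={x: X_new})
    for probs, classes in zip(top5.values, top5.indices):
        # Print the five most likely class ids and their probabilities per image.
        print(classes, probs)
```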
The model was able to correctly guess all of the signs, which gives an accuracy of 100%. This is the expected result, because the newly found images are clear, free of noise and well lit. Taking into account that the model has shown > 93% accuracy on the test set, there should be no problem classifying these images.
Visualization of the feature maps of the two convolutional layers:
Layer | Image |
---|---|
Original image | |
Conv1 | |
Conv2 | |
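A sketch of how the Conv1/Conv2 feature maps can be plotted, assuming the convolutional layer tensors are exposed from the network (e.g. returned alongside the logits) and `x` is the input placeholder from the training sketch:

```python
import numpy as np
import matplotlib.pyplot as plt

def show_feature_maps(sess, layer_tensor, image, cols=8):
    """Run one preprocessed image through the graph and plot each activation map."""
    activations = sess.run(layer_tensor, feed_dict={x: image[np.newaxis, ...]})
    n_maps = activations.shape[-1]
    rows = int(np.ceil(n_maps / cols))
    plt.figure(figsize=(cols * 1.5, rows * 1.5))
    for i in range(n_maps):
        plt.subplot(rows, cols, i + 1)
        plt.imshow(activations[0, :, :, i], cmap='gray')
        plt.axis('off')
    plt.show()
```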