This repository is my from-scratch implementation of semantic segmentation with TensorFlow and Keras. As a first step, I started the project with some basic deep learning models for semantic segmentation, such as FCN (Fully Convolutional Network) and U-Net. These models are trained on the Kitti Road Dataset (download here) and the ISPRS Dataset (link dataset).
-
29 June 2019: The first version of this project can train an FCN-AlexNet model on the Kitti Road Dataset. However, restoring the trained model for prediction is still quite difficult. This limitation will be addressed in a following update.
-
5 July 2019: This version implements FCN-AlexNet, U-Net and FCN-8s (based on VGG16-Net), trained on the Kitti Road Dataset. Transfer learning and fine-tuning are applied when using FCN-8s: the weights are restored from the file vgg16.npy (an alternative, to be developed in the future, is to use the checkpoint vgg_16.ckpt). The idea of "caching the frozen layers" has been tested but does not really work yet; it will also be developed in the future.
-
9 July 2019: This version updates the demos for training and prediction on the Kitti Road Dataset using FCN-AlexNet, U-Net and FCN-8s (see the .ipynb notebook files). However, there are still some bugs in saving and restoring the model: the model cannot be restored with tf.train.Saver, but restoring works with tf.saved_model. Any ideas?
-
18 July 2019: This version adds support for the ISPRS Dataset (the Vaihingen dataset), along with an implementation of the base FCN for the semantic labeling task. The model is tested with only 3 classes: road, building and background.
1. Training step
- Create a .json configuration file
"exp_name" : Folder in which to save the checkpoints and the summaries for TensorBoard
"num_epochs" : Number of training epochs; pay attention to overfitting!
"num_iter_per_epoch" : Number of iterations executed in each epoch
"learning_rate" : Learning rate used by the optimizer. What is the best rate, and how should it be chosen?
"batch_size" : Number of samples used for training in each iteration
"max_to_keep" : Maximum number of checkpoints to keep
"data_path" : Path to the dataset
"image_size" : Input image size in the format [height,width,channels]
"loss" : Name of the loss function to use
"accuracy" : Name of the accuracy function to use
- Read the config file
from utils.config import process_config
config = process_config("PATH/TO/CONFIG/FILE")
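The two steps above (writing the .json file and reading it back) can be sketched in plain Python. This is only an illustration under stated assumptions: every value in `EXAMPLE_CONFIG` is a placeholder, and `write_config` and `load_config` are hypothetical helpers, not the project's `process_config`. The attribute-style access and the derived `summary_dir` and `checkpoint_dir` folders are guesses based on how this guide uses `config.final_model_dir` and an `experiments/<exp_name>/summary/` log directory.

```python
import json
import os
from types import SimpleNamespace

# Every value below is an illustrative placeholder, not a project default.
EXAMPLE_CONFIG = {
    "exp_name": "fcn_alexnet_kittiroad",
    "num_epochs": 20,
    "num_iter_per_epoch": 100,
    "learning_rate": 1e-4,
    "batch_size": 4,
    "max_to_keep": 5,
    "data_path": "PATH/TO/DATASET",
    "image_size": [160, 576, 3],   # [height, width, channels]
    "loss": "cross_entropy",       # hypothetical loss name
    "accuracy": "pixel_accuracy",  # hypothetical accuracy name
}


def write_config(path, config=EXAMPLE_CONFIG):
    """Save a configuration dictionary as a .json file."""
    with open(path, "w") as f:
        json.dump(config, f, indent=2)


def load_config(path, experiments_root="experiments"):
    """Hypothetical stand-in for process_config: JSON -> attribute-style object."""
    with open(path) as f:
        config = SimpleNamespace(**json.load(f))
    # Assumed folder layout: experiments/<exp_name>/summary and .../checkpoint
    config.summary_dir = os.path.join(experiments_root, config.exp_name, "summary")
    config.checkpoint_dir = os.path.join(experiments_root, config.exp_name, "checkpoint")
    return config
```

Calling `write_config("example_config.json")` and then `load_config("example_config.json")` returns an object whose settings are read as attributes, for example `config.batch_size` or `config.summary_dir`.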
- Create your data generator
from data_loader.kitti_road_data_loader import KittiRoadLoader
data = KittiRoadLoader(config)
- Create and build an instance of the model
from models.fcn_alexnet_model import FcnAlexnetModel
model = FcnAlexnetModel(config)
model.build()
- (Optional) Create a builder for saving the model
import tensorflow as tf
builder = tf.saved_model.builder.SavedModelBuilder(config.final_model_dir)
- Create a session
import tensorflow as tf
sess = tf.Session()
- Create an instance of logger for saving checkpoints and summaries.
from utils.logger import Logger
logger = Logger(sess,config)
- Create a trainer to train the model on the dataset created above
from trainers.road_trainer import RoadTrainer
trainer = RoadTrainer(sess,model,data,config,logger)
- Train your model by the trainer
trainer.train()
- (Optional) Load your model if it exists, then save the final model as binary files. These files can be used for predicting results or for deploying with TensorFlow Serving.
model.load(sess)
print("Saving the final model..")
builder.add_meta_graph_and_variables(sess,
                                     [tf.saved_model.tag_constants.TRAINING],
                                     signature_def_map=None,
                                     assets_collection=None)
builder.save()
print("Final model saved")
- Close the session when you finish
sess.close()
2. Predict results with trained model
- If you have not closed the training session yet, you can predict results by inserting these lines directly before closing the session:
model.load(sess)
test = [data.get_data_element("test_data",i) for i in range(5)]
for item in test:
    img = item[0]
    mask = item[1]
    model.predict(sess, img, mask)
sess.close()
- Or if you want to predict the results in another session
with tf.Session() as sess:
    print("Loading final model")
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.TRAINING], config.final_model_dir)
    print("Final model loaded")
    test = [data.get_data_element("test_data", i) for i in range(5)]
    for item in test:
        img = item[0]
        mask = item[1]
        model.predict(sess, img, mask)
1. Start the training
python road_segmentation.py -c configs/unet_KittiRoadDataset_config.json
2. Start TensorBoard visualization
tensorboard --logdir=experiments/unet_kittiroad/summary/
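The `-c` option in the training command takes the path to the configuration file. How `road_segmentation.py` actually parses it is not shown here; a minimal sketch with the standard `argparse` module, where the function name `get_args` and the long option `--config` are assumptions, could look like this:

```python
import argparse


def get_args(argv=None):
    """Parse the command line; argv=None means argparse reads sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Train a segmentation model")
    parser.add_argument(
        "-c", "--config",
        required=True,
        metavar="PATH",
        help="path to the .json configuration file",
    )
    return parser.parse_args(argv)
```

A script would then call `args = get_args()` and pass `args.config` to the configuration loader. Passing an explicit list such as `get_args(["-c", "configs/unet_KittiRoadDataset_config.json"])` is convenient for testing.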
This template is inspired by the TensorFlow Project Template.