Semantic segmentation using architectures such as UNET, UNET-Attention and UNET-ASPP (Atrous Spatial Pyramid Pooling), along with a web interface. Trained models are provided for quick prediction. In addition to the provided models, the repo facilitates training new models on pre-supplied or user-supplied training data and provides a web UI to tune the hyper-parameters. Web deployment is crucial nowadays for completing the MLOps lifecycle, so this repo attempts to provide a web interface to control training and to evaluate already trained models by predicting on an uploaded, unseen image.
Access PDF File in main repository or Here.
- Src - Source code
- in - Uploaded files to predict
- out - Predicted image
- models - Model directories that contain the learned weights for each model. Models can be accessed through the Google Drive link.
Download and extract the zip in a Python 3.9 environment with the dependencies installed. The major dependencies are:
pip install keras tifffile rasterio pillow streamlit
streamlit run web.py
Models are developed using the UNET and UNET-Attention architectures.
UNET is used to perform semantic segmentation of satellite imagery. The architecture contains an encoder path and a decoder path. The first is the contracting path (also called the encoder), which captures the context in the image. The encoder is a stack of convolutional and max pooling layers. The second is the symmetric expanding path (also called the decoder), which provides precise localization using transposed convolutions. It is an end-to-end fully convolutional network (FCN). Labelled masks are generated from the satellite imagery and supplied to the model together with the images for training.
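To make the encoder/decoder symmetry concrete, the following sketch traces the spatial size and channel count through a UNET-like network. It is only an illustration under stated assumptions: the function name `unet_shape_trace`, the 256-pixel tile size, "same" padding, the doubling of filters at each level, and the reading of "depth" as the number of resolution levels including the bottleneck are all hypothetical and may differ from the models in this repo.

```python
def unet_shape_trace(tile_size=256, depth=5, base_filters=32):
    """Return (stage, height_or_width, channels) for each stage of a UNET-like net.

    Assumptions (not taken from the repo): 'same' padding, 2x2 max pooling,
    filters doubling at every encoder level, depth counted with the bottleneck.
    """
    trace = []
    skips = []
    size, filters = tile_size, base_filters

    # Contracting path: a conv block, then max pooling halves the resolution.
    for level in range(depth - 1):
        trace.append((f"enc{level}", size, filters))
        skips.append((size, filters))
        size //= 2
        filters *= 2

    trace.append(("bottleneck", size, filters))

    # Expanding path: a transposed convolution doubles the resolution, and the
    # result is concatenated with the encoder output of the same level.
    for level in reversed(range(depth - 1)):
        size *= 2
        filters //= 2
        # The skip connection must match the upsampled feature map spatially.
        assert skips[level] == (size, filters)
        trace.append((f"dec{level}", size, filters))

    return trace


for stage, size, channels in unet_shape_trace():
    print(f"{stage:<10} {size:>4} x {size:<4} channels={channels}")
```

The trace shows why the input tile size should be divisible by 2 raised to the number of pooling steps: otherwise the upsampled decoder maps no longer line up with the skip connections.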
The Atrous Spatial Pyramid Pooling or ASPP layer is applied at the bottleneck, i.e. between the encoder and decoder parts of the UNET. This layer captures multi-scale features by applying multiple parallel filters at different dilation rates, thus increasing the effective Field of View (EFoV). The outputs of these filters are concatenated, convolved, and then passed to the decoder path.
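The idea of parallel dilated filters can be sketched with NumPy alone. This is a simplified, single-channel illustration, not the layer used in this repo: the helper names, the dilation rates (1, 6, 12), the averaging kernels and the fusion weights are all assumptions chosen for the example.

```python
import numpy as np


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation):
    """Naive single-channel, zero-padded ('same') dilated convolution.

    Assumes a square kernel with odd side length.
    """
    k = kernel.shape[0]
    pad = dilation * (k // 2)
    padded = np.pad(image, pad, mode="constant")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            # Each kernel tap reads the input shifted by a multiple of the dilation.
            rows = slice(i * dilation, i * dilation + h)
            cols = slice(j * dilation, j * dilation + w)
            out += kernel[i, j] * padded[rows, cols]
    return out


def aspp(feature_map, kernels, rates, fuse_weights):
    """Run parallel dilated branches, concatenate them, then fuse with a 1x1 conv."""
    branches = [dilated_conv2d(feature_map, k, r) for k, r in zip(kernels, rates)]
    stacked = np.stack(branches, axis=-1)   # (H, W, number_of_branches)
    fused = stacked @ fuse_weights          # 1x1 convolution over the branch axis
    return stacked, fused


rates = (1, 6, 12)                          # hypothetical dilation rates
kernels = [np.ones((3, 3)) for _ in rates]  # placeholder (untrained) kernels
feature_map = np.ones((32, 32))
stacked, fused = aspp(feature_map, kernels, rates, np.full(len(rates), 1 / len(rates)))

print([effective_kernel_size(3, r) for r in rates])  # [3, 13, 25]
print(stacked.shape, fused.shape)                    # (32, 32, 3) (32, 32)
```

A 3x3 kernel with dilation 12 covers a 25x25 window while still using only nine weights, which is how the layer widens the field of view without adding parameters or pooling.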
While the traditional UNET takes skip connections directly as input, UNET-Attention (attention-aware UNET) applies attention, or ‘weights based on importance’, over the skip connections. The weighted skip connections are then concatenated with the decoder outputs at each depth of the decoder path.
Use of the DEM channel improved the Jaccard coefficient by ~10%. The Rooftop/Built-Up class is misclassified as the ‘road’ and ‘open area’ classes. This could be due to improper labelling of pixels in the input mask. Some houses have open roofs that expose the floor underneath. Such floors and roads have similar tone and texture, which makes it hard to distinguish between them using only the RGB and DEM bands.
Location: Bibipur
Model: UNET (Depth=5, Filters=32)
Jaccard Coef (Val) in % = 76.11%
Location: Bibipur
Model: UNET-ASPP (Depth=4, Filters=32)
Jaccard Coef (Val) in % = 73.54%
Location: Bibipur
Model: UNET-Attention (Depth=4, Filters=32)
Jaccard Coef (Val) in % = 76.61%
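For reference, the Jaccard coefficient (intersection over union) can be computed per class on hard label maps as sketched below. This is an illustrative version only: the function names are hypothetical, and the metric used during training in this repo may be a soft variant computed on one-hot probabilities rather than on integer labels.

```python
import numpy as np


def jaccard_per_class(y_true, y_pred, num_classes):
    """Intersection over union for each class of two integer label maps."""
    scores = []
    for c in range(num_classes):
        true_c = y_true == c
        pred_c = y_pred == c
        union = np.logical_or(true_c, pred_c).sum()
        intersection = np.logical_and(true_c, pred_c).sum()
        # A class absent from both maps has no defined score.
        scores.append(intersection / union if union else np.nan)
    return np.array(scores)


def mean_jaccard(y_true, y_pred, num_classes):
    """Average the per-class scores, ignoring classes absent from both maps."""
    return float(np.nanmean(jaccard_per_class(y_true, y_pred, num_classes)))


y_true = np.array([[0, 0, 1],
                   [1, 2, 2]])
y_pred = np.array([[0, 1, 1],
                   [1, 2, 0]])

print(jaccard_per_class(y_true, y_pred, num_classes=3))  # approx. [0.333 0.667 0.5]
print(mean_jaccard(y_true, y_pred, num_classes=3))       # approx. 0.5
```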
Customizable and data-independent deployment strategy.
- Training
  - Data preprocessing
    - Automatic data normalization
    - Augmentation
    - Image tiling
    - Random data splitting into train and validation sets
  - Custom data generators
  - Live feedback from the training process via plots
  - Automatic saving of the best model
- Prediction
  - Selection of trained models with training accuracy
  - Facility to upload a dataset for on-the-fly prediction
  - Results visualization
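The preprocessing steps named above (normalization, tiling, and random train/validation splitting) can be sketched roughly as follows. All function names, the per-band min-max scaling, the non-overlapping 32-pixel tiles and the 20% validation fraction are assumptions made for illustration; the repo's own preprocessing may differ in each of these choices.

```python
import numpy as np


def normalize_per_band(image):
    """Min-max scale each band of an (H, W, C) array to the range [0, 1]."""
    image = image.astype(np.float32)
    lo = image.min(axis=(0, 1), keepdims=True)
    hi = image.max(axis=(0, 1), keepdims=True)
    return (image - lo) / np.maximum(hi - lo, 1e-8)


def tile_image(image, tile_size):
    """Cut non-overlapping square tiles; partial tiles at the edges are dropped."""
    h, w = image.shape[:2]
    tiles = [
        image[r:r + tile_size, c:c + tile_size]
        for r in range(0, h - tile_size + 1, tile_size)
        for c in range(0, w - tile_size + 1, tile_size)
    ]
    return np.stack(tiles)


def split_train_val(images, masks, val_fraction=0.2, seed=0):
    """Shuffle tile indices once and split images and masks consistently."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(round(len(images) * val_fraction))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], masks[train_idx]), (images[val_idx], masks[val_idx])


# Synthetic stand-in for a 4-band (e.g. RGB + DEM) scene and its label mask.
rng = np.random.default_rng(42)
scene = rng.integers(0, 255, size=(100, 130, 4))
mask = rng.integers(0, 3, size=(100, 130))

image_tiles = tile_image(normalize_per_band(scene), tile_size=32)
mask_tiles = tile_image(mask, tile_size=32)
(train_x, train_y), (val_x, val_y) = split_train_val(image_tiles, mask_tiles)

print(image_tiles.shape, mask_tiles.shape)  # (12, 32, 32, 4) (12, 32, 32)
print(len(train_x), len(val_x))             # 10 2
```

Tiling the image and the mask with the same function and indexing both with the same shuffled order keeps every image tile paired with its own mask tile.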
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.