This page walks through the steps required to generate Cityscapes data for DeepLab2. DeepLab2 uses sharded TFRecords for efficient processing of the data.
Before running any DeepLab2 scripts, you should:

1.  Register on the
    [Cityscapes dataset website](https://www.cityscapes-dataset.com/) to
    download the dataset (`gtFine_trainvaltest.zip` and
    `leftImg8bit_trainvaltest.zip`).

2.  Install `cityscapesscripts` via pip:

    ```bash
    # This will install the cityscapes scripts and its stand-alone tools.
    pip install cityscapesscripts
    ```

3.  Run the tools provided by Cityscapes to generate the training groundtruth.
    See sample command lines below:
```bash
# Set CITYSCAPES_DATASET to your dataset root.

# Create train ID label images.
CITYSCAPES_DATASET='.' csCreateTrainIdLabelImgs

# To generate panoptic groundtruth, run the following command.
CITYSCAPES_DATASET='.' csCreatePanopticImgs --use-train-id

# [Optional] Generate panoptic groundtruth with EvalId to match evaluation
# on the server. This step is not required for generating TFRecords.
CITYSCAPES_DATASET='.' csCreatePanopticImgs
```
After running the above command lines, the expected directory structure is as follows:
```
cityscapes
+-- gtFine
|   |
|   +-- train
|   |   |
|   |   +-- aachen
|   |   |   |
|   |   |   +-- *_color.png
|   |   |   +-- *_instanceIds.png
|   |   |   +-- *_labelIds.png
|   |   |   +-- *_polygons.json
|   |   |   +-- *_labelTrainIds.png
|   |   ...
|   +-- val
|   +-- test
|   +-- cityscapes_panoptic_{train|val|test}_trainId.json
|   +-- cityscapes_panoptic_{train|val|test}_trainId
|   |   |
|   |   +-- *_panoptic.png
|   +-- cityscapes_panoptic_{train|val|test}.json
|   +-- cityscapes_panoptic_{train|val|test}
|       |
|       +-- *_panoptic.png
|
+-- leftImg8bit
    |
    +-- train
    +-- val
    +-- test
```
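To verify that the groundtruth generation succeeded, you can sanity-check the generated files against the tree above. The sketch below is illustrative only; the root path is a placeholder:

```python
import os

CITYSCAPES_ROOT = '/path/to/cityscapes'  # placeholder; use your dataset root

# Check that the panoptic groundtruth created by csCreatePanopticImgs exists.
for split in ('train', 'val', 'test'):
  json_path = os.path.join(
      CITYSCAPES_ROOT, 'gtFine',
      'cityscapes_panoptic_%s_trainId.json' % split)
  print(split, 'panoptic json found:', os.path.isfile(json_path))
```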
Note: the rest of this doc and the released DeepLab2 models use `TrainId`
instead of `EvalId` (which is used on the evaluation server). For evaluation on
the server, you would need to convert the predicted labels to `EvalId`.
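One possible way to do that conversion is via the label definitions shipped with `cityscapesscripts`. The sketch below is ours, not part of the released tooling; the function name is illustrative, and crowd/ignore handling may need adjustment for an actual submission:

```python
import numpy as np
from cityscapesscripts.helpers.labels import trainId2label

def train_id_to_eval_id(semantic_map):
  """Maps a TrainId semantic map to EvalIds (illustrative sketch)."""
  eval_map = np.zeros_like(semantic_map)
  for train_id, label in trainId2label.items():
    eval_map[semantic_map == train_id] = label.id
  return eval_map
```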
Use the following command lines to generate the Cityscapes TFRecords:
```bash
# Assuming we are under the folder where deeplab2 is cloned to:

# For generating data for the semantic segmentation task only.
python deeplab2/data/build_cityscapes_data.py \
  --cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
  --output_dir=${OUTPUT_PATH_FOR_SEMANTIC} \
  --create_panoptic_data=false

# For generating data for the panoptic segmentation task.
python deeplab2/data/build_cityscapes_data.py \
  --cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
  --output_dir=${OUTPUT_PATH_FOR_PANOPTIC}
```
The command lines above will output three sharded TFRecord files:
`{train|val|test}@10.tfrecord`. The TFRecords for the `train` and `val` sets
contain both the RGB image pixels and the corresponding annotations, while
those for the `test` set contain RGB images only. These files will be used as
the input for model training and evaluation.
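As a quick sanity check, you can count the examples written for a split. The glob pattern below is an assumption; adjust it to the shard names actually produced under your `--output_dir`:

```python
import tensorflow as tf

# Assumed shard pattern; check the file names under your output directory.
files = tf.io.gfile.glob('/path/to/output/train*.tfrecord*')
count = sum(1 for _ in tf.data.TFRecordDataset(files))
print('%d examples across %d shards' % (count, len(files)))
```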
The Example proto contains the following fields:

*   `image/encoded`: encoded image content.
*   `image/filename`: image filename.
*   `image/format`: image file format.
*   `image/height`: image height.
*   `image/width`: image width.
*   `image/channels`: image channels.
*   `image/segmentation/class/encoded`: encoded segmentation content.
*   `image/segmentation/class/format`: segmentation encoding format.
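A minimal sketch of parsing one record with these fields; the feature dtypes follow the usual TF Example conventions and are an assumption here, as is the file path:

```python
import tensorflow as tf

features = {
    'image/encoded': tf.io.FixedLenFeature([], tf.string),
    'image/filename': tf.io.FixedLenFeature([], tf.string),
    'image/format': tf.io.FixedLenFeature([], tf.string),
    'image/height': tf.io.FixedLenFeature([], tf.int64),
    'image/width': tf.io.FixedLenFeature([], tf.int64),
    'image/channels': tf.io.FixedLenFeature([], tf.int64),
    'image/segmentation/class/encoded': tf.io.FixedLenFeature([], tf.string),
    'image/segmentation/class/format': tf.io.FixedLenFeature([], tf.string),
}

dataset = tf.data.TFRecordDataset(
    tf.io.gfile.glob('/path/to/output/train*.tfrecord*'))  # placeholder path
for raw_record in dataset.take(1):
  example = tf.io.parse_single_example(raw_record, features)
  image = tf.io.decode_png(example['image/encoded'])
  print(example['image/filename'].numpy(), image.shape)
```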
For semantic segmentation (`--create_panoptic_data=false`), the encoded
segmentation map will be the same as the PNG files created by
`createTrainIdLabelImgs.py`.
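Continuing the parsing sketch above, the semantic map can then be recovered with a standard PNG decode:

```python
# `example` comes from the parsing sketch above.
semantic = tf.io.decode_png(
    example['image/segmentation/class/encoded'], channels=1)
# A (height, width, 1) uint8 map of TrainIds, with 255 as the ignore label.
```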
For panoptic segmentation, the encoded segmentation map will be the raw bytes
of an int32 panoptic map, where each pixel is assigned a panoptic ID. Unlike
the ID used in the Cityscapes scripts (`json2instanceImg.py`), this panoptic ID
is computed by:

```
panoptic ID = semantic ID * label divisor + instance ID
```
where the semantic ID will be:

*   the ignore label (255) for pixels not belonging to any segment,
*   for segments associated with the `iscrowd` label:
    *   (default): the ignore label (255),
    *   (if `--treat_crowd_as_ignore=false` is set while running
        `build_cityscapes_data.py`): the `category_id` (using `TrainId`),
*   the `category_id` (using `TrainId`) for all other segments.
The instance ID will be 0 for pixels belonging to:

*   a `stuff` class,
*   a `thing` class with the `iscrowd` label, or
*   the ignore label,

and will lie in [1, label divisor) otherwise.
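To make the arithmetic concrete, here is a sketch of decoding a panoptic map from the raw bytes and splitting it back into semantic and instance IDs. The label divisor value below is an assumption; check the dataset definitions in DeepLab2 (`deeplab2/data/dataset.py`) for the value actually used:

```python
import numpy as np

LABEL_DIVISOR = 1000  # assumed value; see the DeepLab2 dataset definitions

def decode_panoptic(encoded_bytes, height, width):
  """Splits a raw int32 panoptic map into semantic and instance IDs."""
  panoptic = np.frombuffer(encoded_bytes, dtype=np.int32)
  panoptic = panoptic.reshape((height, width))
  semantic = panoptic // LABEL_DIVISOR  # TrainId (255 = ignore)
  instance = panoptic % LABEL_DIVISOR   # 0 for stuff, crowd, and ignore
  return semantic, instance
```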