A fast C++ implementation of TensorFlow Lite Unet on a Jetson Nano.
Once overclocked to 2015 MHz, the app runs at 11 FPS.
Specially made for a Jetson Nano; see the Q-engineering deep learning examples.
Paper: https://arxiv.org/abs/1606.00915
Training set: VOC2017
Size: 257x257
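
The model works on 257x257 images. As a minimal sketch of the post-processing step, assume the network returns one score per class for every pixel, stored channels-last (for a DeepLab v3 model trained on VOC this would typically be 21 classes, but neither the class count nor the layout is stated in this README). The helper name `argmaxLabels` is hypothetical and is not taken from Unet.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: turns per-pixel class scores into a label map.
// Assumed layout: scores[(y * width + x) * classes + c] (row-major, channels last).
std::vector<std::uint8_t> argmaxLabels(const std::vector<float>& scores,
                                       std::size_t width, std::size_t height,
                                       std::size_t classes) {
    assert(classes > 0 && classes <= 256);           // labels must fit in one byte
    assert(scores.size() == width * height * classes);

    std::vector<std::uint8_t> labels(width * height);
    for (std::size_t p = 0; p < width * height; ++p) {
        const float* px = &scores[p * classes];      // scores of pixel p
        std::size_t best = 0;
        for (std::size_t c = 1; c < classes; ++c) {
            if (px[c] > px[best]) best = c;          // keep the highest score
        }
        labels[p] = static_cast<std::uint8_t>(best);
    }
    return labels;
}
```

For a 257x257 output with 21 classes, the call would be `argmaxLabels(scores, 257, 257, 21)`; the resulting label map can then be colour-coded and blended with the input frame.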
| CPU 2015 MHz | GPU 2015 MHz | CPU 1479 MHz | GPU 1479 MHz | RPi 4 64-bit OS 1950 MHz |
|---|---|---|---|---|
| 11 FPS | 9.1 FPS | 9 FPS | 8.3 FPS | 7.2 FPS |
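
Frame rates like the ones above are usually averages over many frames. The following is a minimal sketch of how such a figure could be measured with `std::chrono`; the `FpsMeter` class and the `timeFrame` helper are hypothetical and do not describe how Unet.cpp actually computes its FPS.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: accumulates frame times and reports the average FPS.
class FpsMeter {
public:
    // Record the duration of one processed frame.
    void addFrame(std::chrono::duration<double, std::milli> frameTime) {
        assert(frameTime.count() >= 0.0);
        totalMs_ += frameTime.count();
        ++frames_;
    }

    // Average frames per second over all recorded frames (0 if nothing measurable yet).
    double fps() const {
        return totalMs_ > 0.0 ? 1000.0 * frames_ / totalMs_ : 0.0;
    }

private:
    double totalMs_ = 0.0;
    long frames_ = 0;
};

// Time one call of `work` (e.g. grab frame + inference + overlay) with a steady clock.
template <typename Work>
void timeFrame(FpsMeter& meter, Work&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    meter.addFrame(stop - start);
}
```

Wrapping the whole per-frame loop body in `timeFrame` gives an end-to-end figure; timing only the inference call would give a higher number that leaves out image capture and drawing.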
To run the application, you need:
- The TensorFlow Lite framework installed. See Install TensorFlow Lite
- Optionally, OpenCV installed. See Install OpenCV 4.5
- Code::Blocks installed:
$ sudo apt-get install codeblocks
To extract and run the network in Code::Blocks:
$ mkdir MyDir
$ cd MyDir
$ wget https://github.com/Qengineering/TensorFlow_Lite_Segmentation_Jetson-Nano/archive/refs/heads/main.zip
$ unzip -j main.zip
Remove main.zip, LICENSE and README.md as they are no longer needed.
$ rm main.zip
$ rm LICENSE
$ rm README.md
Your MyDir folder must now look like this:
cat.jpg.mp4
deeplabv3_257_mv_gpu.tflite
TestUnet.cbp
Unet.cpp
Run TestUnet.cbp with Code::Blocks.
You may need to adapt the library locations specified in TestUnet.cbp to match your directory structure.
With #define GPU_DELEGATE uncommented, TensorFlow Lite will deploy GPU delegates, provided you have the appropriate libraries compiled by bazel. See Install GPU delegates
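
The GPU_DELEGATE switch is an ordinary compile-time toggle. The sketch below shows only the preprocessor pattern and is deliberately free of TensorFlow Lite code so that it compiles on its own; the function `backendName` is hypothetical and is not part of Unet.cpp.

```cpp
#include <string>

// Uncomment the next line to compile the GPU path (same macro name as in the README).
// #define GPU_DELEGATE

// Hypothetical helper: reports which backend this build was compiled for.
std::string backendName() {
#ifdef GPU_DELEGATE
    // In a real build, this branch is where a TensorFlow Lite GPU delegate
    // would be created and handed to the interpreter before inference.
    return "GPU delegate";
#else
    // Without the macro, inference stays on the CPU.
    return "CPU";
#endif
}
```

Because the choice is made by the preprocessor, switching between CPU and GPU requires a rebuild, and the GPU branch only links if the delegate libraries are present.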
See the RPi 4 movie at: https://www.youtube.com/watch?v=Kh9DLMgCIIE