Real-time object detection Android application using OpenCV 4.1 and YOLO.
Author: Matteo Medioli
YOLO: https://pjreddie.com/darknet/yolo/
- Download the OpenCV SDK from https://sourceforge.net/projects/opencvlibrary/files/4.1.0/opencv-4.1.0-android-sdk.zip/download
- Clone this project.
- Open Android Studio and import this project.
- Build project.
- From the Android Studio top menu, select File -> New -> Import Module, choose the path to your OpenCV SDK folder (e.g. /where_opencv_saved/OpenCV-android-sdk/sdk), and rename the module to OpenCV.
- After the OpenCV module has loaded, rebuild the project.
After OpenCV module import:
- From the Android Studio top menu, select File -> Project Structure.
- Navigate to Dependencies and click on app. In the right panel, click the plus button (+) to add a dependency and choose Module Dependency.
- Select the OpenCV module imported earlier.
- Click OK and apply the changes.
- Build project.
This activity is the core of the application and implements org.opencv.android.CameraBridgeViewBase.CvCameraViewListener2. It has two main private instance variables: a net (org.opencv.dnn.Net) and a cameraView (org.opencv.android.CameraBridgeViewBase). It has three main features:
Load the convolutional network from the *.cfg and *.weights files with Dnn.readNetFromDarknet(String path_cfg, String path_weights), and read the label names (COCO dataset) from the assets folder. Both happen when onCameraViewStarted() is called.
NOTE: this repo doesn't contain the weights file. You have to download it from the YOLO site.
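The label-loading step can be sketched in plain Java. This is a minimal illustration, not the project's actual code: the class LabelReader and the method readLabels are hypothetical names, and the sketch assumes the labels file holds one class name per line, as Darknet's coco.names does. On Android the Reader would typically wrap the stream returned by getAssets().open(...), with whatever file name the labels were saved under.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of this project.
// Assumption: the labels file has one class name per line.
final class LabelReader {

    // Returns the non-empty lines, trimmed, in file order, so that the
    // list index lines up with the classId produced by the network.
    static List<String> readLabels(Reader source) {
        List<String> labels = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String name = line.trim();
                if (!name.isEmpty()) {
                    labels.add(name);
                }
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers inside camera callbacks stay simple.
            throw new UncheckedIOException(e);
        }
        return labels;
    }

    public static void main(String[] args) {
        // Stand-in for the real asset stream: the first three COCO names.
        List<String> labels = readLabels(new StringReader("person\nbicycle\ncar\n"));
        System.out.println(labels.get(2)); // label for classId 2
    }
}
```

Keeping the list in file order matters because the network reports classes by index only.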
Iteratively grab a frame from the CameraBridgeViewBase preview and analyze it as an image. Real-time detection and frame generation are handled by onCameraFrame(CvCameraViewFrame inputFrame). Each preview frame is converted to a Mat and passed to Dnn.blobFromImage(frame, scaleFactor, frame_size, mean, true, false) for preprocessing. Note that frame_size is 416x416 for the YOLO model (the input dimensions are in the *.cfg file). The size can be changed, but each side must remain a multiple of 32. Reducing the frame size speeds up detection but lowers accuracy. Detection is performed by net.forward(List<Mat> results, List<String> outNames), which runs a forward pass to compute the outputs of the layers listed in outNames. The method writes all detections for the preview frame into results as Mat objects. These Mat instances contain all the detection information, such as the positions and labels of the detected objects.
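The sizing rule can be read as "each side of the network input must be a multiple of 32". The sketch below shows one way to check a candidate size or snap it to the nearest valid one before building the Size passed to Dnn.blobFromImage. The class InputSize and its methods are hypothetical helpers for illustration, not part of this project or of OpenCV.

```java
// Hypothetical helper, not part of this project or of OpenCV.
final class InputSize {

    // Assumption taken from the text above: valid sides are multiples of 32.
    static final int STEP = 32;

    // True when the side can be used as a network input dimension.
    static boolean isValidSide(int side) {
        return side > 0 && side % STEP == 0;
    }

    // Rounds the requested side to the nearest multiple of 32, never below 32.
    static int nearestValidSide(int requested) {
        int rounded = Math.round(requested / (float) STEP) * STEP;
        return Math.max(STEP, rounded);
    }

    public static void main(String[] args) {
        System.out.println(nearestValidSide(400)); // 416
        System.out.println(isValidSide(320));      // true
    }
}
```

A smaller valid side such as 320 trades accuracy for speed, while a larger one such as 608 does the opposite.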
After YOLO performs non-maximum suppression, List<Mat> results holds the coordinates of the best bounding boxes. In each row, the first 4 numbers are [center_x, center_y, width, height]; in OpenCV's Darknet output these are followed by an objectness score and then the class probabilities. classId is the index of the detection's label in the COCO dataset list className.
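Reading one output row can be sketched in plain Java, with the row already copied out of the Mat into a float[]. This is an illustration under stated assumptions, not the project's code: DetectionDecoder and Detection are hypothetical names, the box values are assumed to be normalized to the range 0..1 (as in OpenCV's own detection samples), and the index of the first class score is passed in as scoresOffset. Use 4 if class scores directly follow the box, or 5 if an objectness score sits at index 4.

```java
// Hypothetical helper and value type, not part of this project.
final class DetectionDecoder {

    // One decoded detection, in pixel coordinates of the preview frame.
    static final class Detection {
        final int classId;
        final float confidence;
        final int left, top, width, height;

        Detection(int classId, float confidence, int left, int top, int width, int height) {
            this.classId = classId;
            this.confidence = confidence;
            this.left = left;
            this.top = top;
            this.width = width;
            this.height = height;
        }
    }

    // Decodes one row, or returns null when no class score reaches the threshold.
    // Assumption: row[0..3] = [center_x, center_y, width, height], normalized to 0..1.
    static Detection decode(float[] row, int scoresOffset,
                            int frameWidth, int frameHeight, float threshold) {
        // Arg-max over the class scores gives the index into the label list.
        int classId = -1;
        float best = Float.NEGATIVE_INFINITY;
        for (int i = scoresOffset; i < row.length; i++) {
            if (row[i] > best) {
                best = row[i];
                classId = i - scoresOffset;
            }
        }
        if (classId < 0 || best < threshold) {
            return null;
        }
        // Scale to pixels and convert the box center to a top-left corner.
        int width = Math.round(row[2] * frameWidth);
        int height = Math.round(row[3] * frameHeight);
        int left = Math.round(row[0] * frameWidth - width / 2f);
        int top = Math.round(row[1] * frameHeight - height / 2f);
        return new Detection(classId, best, left, top, width, height);
    }

    public static void main(String[] args) {
        // Made-up row: box, objectness, then three class scores.
        float[] row = {0.5f, 0.5f, 0.25f, 0.5f, 0.9f, 0.1f, 0.8f, 0.05f};
        Detection d = decode(row, 5, 640, 480, 0.5f);
        System.out.println(d.classId + " at (" + d.left + ", " + d.top + ")");
    }
}
```

The returned classId can then be used to look up the label name, and the left/top/width/height values to draw the rectangle on the preview frame.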
- Full screen JavaCameraView portrait mode
- Speed up JavaCameraView
- Add model choice