Gesture Recognition, Hand Tracking, REST API communications
A personal project from the lockdown period, using an ABB IRB 14000 (also known as YuMi).
-
The server computer running the Python files should be connected to the robot's LAN port (service port) with an Ethernet cable.
-
The robot's controller should be loaded with a robot-control command file similar to the one at Here.
-
Run "python main_starter.py 192.168.125.1" on cmd, and start tracking!
-
If a physical robot is not available, a virtual robot in RobotStudio can be used. In this case, the host IP should be 127.0.0.1.
-
Any robot running RobotWare 6.x can be controlled (earlier versions do not support REST API communication; in that case, socket communication is advised).
-
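For RobotWare 6.x, a minimal Python sketch of a Robot Web Services (RWS) call might look like the following. The default credentials and the controller-state endpoint are assumptions to verify against your controller's configuration.

```python
# Minimal sketch: query the controller state over Robot Web Services (RobotWare 6.x).
# The credentials and endpoint below are assumptions -- check them on your controller.
import requests
from requests.auth import HTTPDigestAuth

HOST = "http://192.168.125.1"                      # 127.0.0.1 for a RobotStudio virtual controller
AUTH = HTTPDigestAuth("Default User", "robotics")  # common RWS default login

# Ask for JSON instead of the default XML response
resp = requests.get(f"{HOST}/rw/panel/ctrlstate?json=1", auth=AUTH, timeout=5)
resp.raise_for_status()
print(resp.json())
```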
For newer robots running RobotWare 7.x, the REST API methods described at https://developercenter.robotstudio.com/api/RWS have to be used; some call conventions differ from the 6.x version.
-
Hand gesture recognition (palm, OK sign, V sign, index finger)
-
A palm recognized by the vision camera is tracked continuously in real time. The palm's position is sent to the ABB robot over the REST API for synchronization.
-
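One way this can be done is by writing the tracked position into a RAPID variable that the robot program polls. The sketch below assumes hypothetical task/module/variable names (T_ROB_R, TrackModule, palm_pos); depending on the controller configuration, writing RAPID data may also require requesting mastership first.

```python
# Minimal sketch: push the tracked palm position into a RAPID variable via RWS.
# Task/module/variable names are hypothetical placeholders, not the project's real ones.
import requests
from requests.auth import HTTPDigestAuth

HOST = "http://192.168.125.1"
AUTH = HTTPDigestAuth("Default User", "robotics")

def send_palm_position(x_mm, y_mm, z_mm):
    symbol_url = f"{HOST}/rw/rapid/symbol/data/RAPID/T_ROB_R/TrackModule/palm_pos"
    resp = requests.post(symbol_url,
                         params={"action": "set"},
                         data={"value": f"[{x_mm:.1f},{y_mm:.1f},{z_mm:.1f}]"},  # RAPID 'pos' literal
                         auth=AUTH, timeout=5)
    resp.raise_for_status()

send_palm_position(300.0, 0.0, 250.0)
```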
The V sign is interpreted as the "open the gripper" signal, which is also sent to the ABB robot over the REST API.
-
An index finger pointing upward is recognized as the "close the gripper" signal, which is likewise sent to the ABB robot over the REST API.
-
The OK sign starts real-time robot tracking.
-
After some palm-tracking fun, a second OK sign ends real-time robot tracking, and the robot starts replaying the path traced by the palm.
-
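As a rough illustration of the gesture-to-command mapping described above, a small Python dispatcher might look like this. The label strings and command names are hypothetical, not the project's actual identifiers.

```python
# Hypothetical gesture-to-command mapping: V opens the gripper, index finger closes it,
# OK toggles tracking, and palm positions are streamed only while tracking is active.
GESTURE_COMMANDS = {
    "v_sign": "open_gripper",
    "index_finger": "close_gripper",
    "ok_sign": "toggle_tracking",
    "palm": "track_position",
}

class GestureDispatcher:
    def __init__(self):
        self.tracking = False

    def handle(self, gesture):
        command = GESTURE_COMMANDS.get(gesture)
        if command == "toggle_tracking":
            self.tracking = not self.tracking
            return "start_tracking" if self.tracking else "replay_path"
        if command == "track_position" and not self.tracking:
            return None  # ignore palm positions until tracking has been started
        return command
```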
The robot's learned path points and gripper open/close signals are saved sequentially in .txt format (this requires ABB's RobotStudio), serving as the "taught" motion instructions for the robot.
-
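The exact layout of that .txt file is not shown here, so the following is only a hypothetical sketch of how sequential path points and gripper events could be logged from Python; the record format and file name are assumptions.

```python
# Hypothetical sketch of logging the taught path; the project's actual field layout may differ.
def log_point(f, x, y, z):
    f.write(f"POINT {x:.1f} {y:.1f} {z:.1f}\n")

def log_gripper(f, state):  # state: "OPEN" or "CLOSE"
    f.write(f"GRIPPER {state}\n")

with open("taught_path.txt", "w") as f:
    log_point(f, 312.4, -88.1, 245.0)
    log_gripper(f, "CLOSE")
    log_point(f, 250.3, 15.6, 260.8)
    log_gripper(f, "OPEN")
```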
Training initially started with the Oxford Hand Dataset, but the hand shapes it covers were too limited for my purpose. I therefore collected data personally, drew bounding boxes, and annotated labels for every picture.
-
Pictures of people posing different hand gestures at 0.2~1.2m from the camera (the hand covering roughly 2.5~20% of the frame) were collected intentionally.
-
1,100 such pictures were collected.
-
A YOLOv3 model was trained for bounding-box detection of general human hands at 0.2~1.2m from the camera.
-
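A minimal sketch of running such a trained YOLOv3 hand detector on a camera frame with OpenCV's DNN module is shown below; the .cfg/.weights file names and the single-class assumption are placeholders, not the project's actual files.

```python
# Sketch: run a trained single-class (hand) YOLOv3 model on one webcam frame.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3_hand.cfg", "yolov3_hand.weights")  # assumed file names
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    h, w = frame.shape[:2]
    for detection in np.vstack(outputs):           # rows: [cx, cy, bw, bh, obj, class scores...]
        confidence = detection[5:].max()           # single "hand" class score
        if confidence > 0.5:
            cx, cy, bw, bh = detection[:4] * np.array([w, h, w, h])
            print(f"hand at ({cx:.0f}, {cy:.0f}), box {bw:.0f}x{bh:.0f}, conf {confidence:.2f}")
cap.release()
```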
Then, for each of the four intended gestures, 200 pictures were collected for the training dataset and 20 for the validation dataset.
-
These data were used to fine-tune InceptionV3, starting from its ImageNet weights.
-
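A minimal Keras sketch of this transfer-learning step is shown below. The directory layout (gestures/train, gestures/val) and the hyperparameters are assumptions, not the project's exact training script.

```python
# Sketch: fine-tune InceptionV3 (ImageNet weights) to classify the four gestures.
import tensorflow as tf

base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                          input_shape=(299, 299, 3))
base.trainable = False  # freeze the ImageNet backbone; train only the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),  # palm, OK, V, index finger
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

preprocess = tf.keras.applications.inception_v3.preprocess_input

def load(split):
    ds = tf.keras.preprocessing.image_dataset_from_directory(
        f"gestures/{split}", image_size=(299, 299), label_mode="categorical")
    return ds.map(lambda x, y: (preprocess(x), y))  # scale inputs to [-1, 1]

model.fit(load("train"), validation_data=load("val"), epochs=10)
```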
Recognized gestures are occasionally inaccurate depending on the background. This could be mitigated by a better data augmentation scheme or simply by collecting more data.
-
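One possible augmentation scheme, sketched with Keras' ImageDataGenerator; the specific transforms and ranges are assumptions, not the project's current setup.

```python
# Sketch: on-the-fly augmentation aimed at varying backgrounds and lighting.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.15,
    brightness_range=(0.7, 1.3),
    horizontal_flip=True,
)
train_gen = augmenter.flow_from_directory(
    "gestures/train", target_size=(299, 299), class_mode="categorical", batch_size=32)
```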
Data latency (the delay between gesture recognition and the robot receiving the data) is currently in the range of a few milliseconds. Switching to socket communication could reduce it slightly.
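A minimal sketch of that socket alternative: stream palm positions over raw TCP to a socket server implemented in RAPID on the controller. The port number and message format are assumptions and must match the RAPID-side code.

```python
# Sketch: send one palm position over TCP instead of REST.
import socket

ROBOT_IP, ROBOT_PORT = "192.168.125.1", 1025  # port is an assumption

with socket.create_connection((ROBOT_IP, ROBOT_PORT), timeout=5) as sock:
    x, y, z = 300.0, 0.0, 250.0
    sock.sendall(f"{x:.1f},{y:.1f},{z:.1f}\n".encode("ascii"))
    reply = sock.recv(64)  # e.g. an acknowledgement from the RAPID side
    print(reply.decode("ascii", errors="replace"))
```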