Merge pull request NVIDIA-AI-IOT#244 from tokk-nv/dev-clickable

Use jupyter_clickable_image_widget for Road Following data collection
jaybdub · Aug 26, 2020 · 0f3b79d
2 parents d51c395 + 44a48c2
commit 0f3b79d
Showing 4 changed files with 429 additions and 129 deletions.
diff --git a/notebooks/road_following/data_collection.ipynb b/notebooks/road_following/data_collection.ipynb
@@ -83,6 +83,7 @@
    "outputs": [],
    "source": [
     "# IPython Libraries for display and widgets\n",
+    "import ipywidgets\n",
     "import traitlets\n",
     "import ipywidgets.widgets as widgets\n",
     "from IPython.display import display\n",
@@ -105,114 +106,26 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Display Live Camera Feed"
+    "### Data Collection"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "First, let's initialize and display our camera like we did in the teleoperation notebook. \n",
+    "Let's display our camera like we did in the teleoperation notebook, but this time using a special ipywidget called `jupyter_clickable_image_widget` that lets you click on the image and capture the click coordinates for data annotation.\n",
+    "This eliminates the need for a gamepad during data annotation.\n",
     "\n",
-    "We use Camera Class from JetBot to enable CSI MIPI camera. Our neural network takes a 224x224 pixel image as input. We'll set our camera to that size to minimize the filesize of our dataset (we've tested that it works for this task). In some scenarios it may be better to collect data in a larger image size and downscale to the desired size later."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "camera = Camera()\n",
-    "\n",
-    "image_widget = widgets.Image(format='jpeg', width=224, height=224)\n",
-    "target_widget = widgets.Image(format='jpeg', width=224, height=224)\n",
-    "\n",
-    "x_slider = widgets.FloatSlider(min=-1.0, max=1.0, step=0.001, description='x')\n",
-    "y_slider = widgets.FloatSlider(min=-1.0, max=1.0, step=0.001, description='y')\n",
-    "\n",
-    "def display_xy(camera_image):\n",
-    "    image = np.copy(camera_image)\n",
-    "    x = x_slider.value\n",
-    "    y = y_slider.value\n",
-    "    x = int(x * 224 / 2 + 112)\n",
-    "    y = int(y * 224 / 2 + 112)\n",
-    "    image = cv2.circle(image, (x, y), 8, (0, 255, 0), 3)\n",
-    "    image = cv2.circle(image, (112, 224), 8, (0, 0,255), 3)\n",
-    "    image = cv2.line(image, (x,y), (112,224), (255,0,0), 3)\n",
-    "    jpeg_image = bgr8_to_jpeg(image)\n",
-    "    return jpeg_image\n",
-    "\n",
-    "time.sleep(1)\n",
-    "traitlets.dlink((camera, 'value'), (image_widget, 'value'), transform=bgr8_to_jpeg)\n",
-    "traitlets.dlink((camera, 'value'), (target_widget, 'value'), transform=display_xy)\n",
-    "\n",
-    "display(widgets.HBox([image_widget, target_widget]), x_slider, y_slider)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Create Gamepad Controller\n",
-    "\n",
-    "This step is similar to \"Teleoperation\" task. In this task, we will use gamepad controller to label images.\n",
-    "\n",
-    "The first thing we want to do is create an instance of the Controller widget, which we'll use to label images with \"x\" and \"y\" values as mentioned in introduction. The Controller widget takes a index parameter, which specifies the number of the controller. This is useful in case you have multiple controllers attached, or some gamepads appear as multiple controllers. To determine the index of the controller you're using,\n",
-    "\n",
-    "Visit http://html5gamepad.com.\n",
-    "Press buttons on the gamepad you're using\n",
-    "Remember the index of the gamepad that is responding to the button presses\n",
-    "Next, we'll create and display our controller using that index."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "controller = widgets.Controller(index=0)\n",
-    "\n",
-    "display(controller)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Connect Gamepad Controller to Label Images\n",
-    "\n",
-    "Now, even though we've connected our gamepad, we haven't yet attached the controller to label images! We'll connect that to the left and right vertical axes using the dlink function. The dlink function, unlike the link function, allows us to attach a transform between the source and target. "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "widgets.jsdlink((controller.axes[2], 'value'), (x_slider, 'value'))\n",
-    "widgets.jsdlink((controller.axes[3], 'value'), (y_slider, 'value'))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Collect data\n",
+    "We use the Camera class from JetBot to enable the CSI MIPI camera. Our neural network takes a 224x224 pixel image as input. We'll set our camera to that size to minimize the file size of our dataset (we've tested that it works for this task). In some scenarios it may be better to collect data at a larger image size and downscale to the desired size later.\n",
     "\n",
-    "The following block of code will display the live image feed, as well as the number of images we've saved.  We store\n",
-    "the target X, Y values by\n",
+    "The following block of code displays the live image feed on the left for you to click on for annotation, and a snapshot of the last annotated image (with a green circle showing where you clicked) on the right.\n",
+    "Below them, it shows the number of images we've saved.  \n",
     "\n",
-    "1. Place the green dot on the target\n",
-    "2. Press 'down' on the DPAD to save\n",
-    "\n",
-    "This will store a file in the ``dataset_xy`` folder with files named\n",
+    "When you click on the live image on the left, it stores a file in the ``dataset_xy`` folder, named\n",
     "\n",
     "``xy_<x value>_<y value>_<uuid>.jpg``\n",
     "\n",
-    "When we train, we load the images and parse the x, y values from the filename"
+    "When we train, we load the images and parse the x, y values from the filename"
    ]
   },
   {
@@ -221,6 +134,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "from jupyter_clickable_image_widget import ClickableImageWidget\n",
+    "\n",
     "DATASET_DIR = 'dataset_xy'\n",
     "\n",
     "# we have this \"try/except\" statement because these next functions can throw an error if the directories exist already\n",
@@ -229,28 +144,45 @@
     "except FileExistsError:\n",
     "    print('Directories not created because they already exist')\n",
     "\n",
-    "for b in controller.buttons:\n",
-    "    b.unobserve_all()\n",
-    "\n",
-    "count_widget = widgets.IntText(description='count', value=len(glob.glob(os.path.join(DATASET_DIR, '*.jpg'))))\n",
-    "\n",
-    "def xy_uuid(x, y):\n",
-    "    return 'xy_%03d_%03d_%s' % (x * 50 + 50, y * 50 + 50, uuid1())\n",
+    "camera = Camera()\n",
     "\n",
-    "def save_snapshot(change):\n",
-    "    if change['new']:\n",
-    "        uuid = xy_uuid(x_slider.value, y_slider.value)\n",
+    "# create image preview\n",
+    "camera_widget = ClickableImageWidget(width=camera.width, height=camera.height)\n",
+    "snapshot_widget = ipywidgets.Image(width=camera.width, height=camera.height)\n",
+    "traitlets.dlink((camera, 'value'), (camera_widget, 'value'), transform=bgr8_to_jpeg)\n",
+    "\n",
+    "# create widgets\n",
+    "count_widget = ipywidgets.IntText(description='count')\n",
+    "# manually update counts at initialization\n",
+    "count_widget.value = len(glob.glob(os.path.join(DATASET_DIR, '*.jpg')))\n",
+    "\n",
+    "def save_snapshot(_, content, msg):\n",
+    "    if content['event'] == 'click':\n",
+    "        data = content['eventData']\n",
+    "        x = data['offsetX']\n",
+    "        y = data['offsetY']\n",
+    "        \n",
+    "        # save to disk\n",
+    "        #dataset.save_entry(category_widget.value, camera.value, x, y)\n",
+    "        uuid = 'xy_%03d_%03d_%s' % (x, y, uuid1())\n",
     "        image_path = os.path.join(DATASET_DIR, uuid + '.jpg')\n",
     "        with open(image_path, 'wb') as f:\n",
-    "            f.write(image_widget.value)\n",
+    "            f.write(camera_widget.value)\n",
+    "        \n",
+    "        # display saved snapshot\n",
+    "        snapshot = camera.value.copy()\n",
+    "        snapshot = cv2.circle(snapshot, (x, y), 8, (0, 255, 0), 3)\n",
+    "        snapshot_widget.value = bgr8_to_jpeg(snapshot)\n",
     "        count_widget.value = len(glob.glob(os.path.join(DATASET_DIR, '*.jpg')))\n",
+    "        \n",
+    "camera_widget.on_msg(save_snapshot)\n",
     "\n",
-    "controller.buttons[13].observe(save_snapshot, names='value')\n",
-    "\n",
-    "display(widgets.VBox([\n",
-    "    target_widget,\n",
+    "data_collection_widget = ipywidgets.VBox([\n",
+    "    ipywidgets.HBox([camera_widget, snapshot_widget]),\n",
     "    count_widget\n",
-    "]))"
+    "])\n",
+    "\n",
+    "display(data_collection_widget)"
    ]
   },
   {
@@ -314,9 +246,9 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.8"
+   "version": "3.6.9"
   }
  },
  "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
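
The notebook text above says the x, y values are parsed back out of the filename at training time. A minimal sketch of that step follows, assuming the `xy_<x value>_<y value>_<uuid>.jpg` naming produced by `save_snapshot` in this diff, where x and y are the click's pixel offsets. The names `XY_PATTERN`, `parse_xy`, and `normalize_xy` are hypothetical helpers introduced for illustration, and the 224x224 default is taken from the notebook's stated input size; none of this is the training notebook's actual code.

```python
import os
import re
from uuid import uuid1

# Matches names such as xy_050_120_<uuid>.jpg (the pattern written by save_snapshot).
XY_PATTERN = re.compile(r'^xy_(\d+)_(\d+)_(.+)\.jpg$')


def parse_xy(path):
    """Return the (x, y) pixel coordinates encoded in a dataset filename."""
    match = XY_PATTERN.match(os.path.basename(path))
    if match is None:
        raise ValueError('unexpected filename: %s' % path)
    return int(match.group(1)), int(match.group(2))


def normalize_xy(x, y, width=224, height=224):
    """Map pixel coordinates to the range [-1, 1].

    Hypothetical helper: assumes the image size matches the 224x224
    input size mentioned in the notebook.
    """
    return (x - width / 2) / (width / 2), (y - height / 2) / (height / 2)


# Build a filename the same way save_snapshot does, then parse it back.
name = 'xy_%03d_%03d_%s.jpg' % (50, 120, uuid1())
x, y = parse_xy(os.path.join('dataset_xy', name))
```

Because `uuid1()` renders with hyphens rather than underscores, the two numeric fields can be recovered unambiguously from the underscore-separated prefix.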