Lane Detection

Basheer Subei edited this page Jul 2, 2015 · 5 revisions

List of line detection steps or tools that could be used as part of a line detection algorithm

  1. Robust Eigenvalue Analysis
  2. Hough Transforms (in many forms)
  3. Skeletonizing (erosion and dilation)
  4. Histogram backprojection to extract colors
  5. RANSAC and other related robust model-fitting methods
  6. Reconstructing a disparity image from pointcloud (to remove pixels from the image which are not on the ground e.g. barrels or sawhorses with white stripes on them)
  7. Edge detection/Canny/Contours
  8. Gabor filter
  9. Dynamic/Directional brightest-pixel threshold (per row or column). Here's how the brightest-pixel threshold normally works: for every row of pixels in the image, you find the brightest pixel and zero out all the other pixels in that row (the result resembles a binary threshold). This works great, but only if the lanes run roughly vertically in the image, so that each row crosses a lane about once. So my idea is to change the direction in which you take the brightest pixel (instead of looking at every row, you could look at every column, or look diagonally). I will try doing all 3 at the same time (row, column, diagonal) and use the three resulting images to find lines (maybe average the lines, take the best one, or merge the images).
  10. Subtraction filter for bright top of image (similar to histogram equalization?)
  11. Look at Hough transform + clustering in order to get the actual line (instead of the many overlaid but not collinear lines). Also consider plain line-segment clustering on the Hough results, for example using Eigenclustering.
  12. Contrast Limited Adaptive Histogram Equalization (CLAHE), which is only available in OpenCV 3.0.
  13. Since the direction of the lanes doesn't change too drastically, we should look for lanes that are similar in slope to current ones. Based on current slope of the lanes, set the direction of the kernels when filtering (like in Gabor) to be similar.
  14. Applying histogram equalization AFTER initial thresholding (such as with backprojection) was expected to spread the values around so that a second thresholding stage would be less sensitive. Apparently this may not help: equalizing the histogram after backprojection seems to make the bright lines a bit faded.
  15. In order to solve the problem of lineFit not working for multiple lines, we can use clustering (think PCA) to group the lines into separate clusters. This could also work well for removing noise (by keeping only the largest 1 or 2 clusters and throwing away the small ones).
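The directional brightest-pixel idea from item 9 can be sketched as follows. This is a minimal illustration, not the project's implementation: plain nested lists stand in for a grayscale image, the function names are made up for this example, and the handling of ties (first occurrence wins), all-black rows (left empty), and merging (pixel-wise maximum) are assumptions. The diagonal scan is left out.

```python
def brightest_per_row(image):
    """Keep only the brightest pixel of each row (set to 255) and zero the rest."""
    out = []
    for row in image:
        new_row = [0] * len(row)
        peak = max(row)
        if peak > 0:  # assumption: an all-black row contributes nothing
            new_row[row.index(peak)] = 255  # assumption: on ties, first pixel wins
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a rectangular nested list."""
    return [list(col) for col in zip(*image)]


def brightest_per_column(image):
    """Same threshold, but scanning columns instead of rows."""
    return transpose(brightest_per_row(transpose(image)))


def merge_masks(a, b):
    """Pixel-wise maximum of two masks (one possible way to merge them)."""
    return [[max(pa, pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


# A horizontal lane: the row scan keeps a single pixel of it,
# while the column scan keeps the whole lane.
image = [
    [0,   0,   0,   0],
    [200, 210, 220, 205],
    [0,   0,   0,   0],
]
row_mask = brightest_per_row(image)
col_mask = brightest_per_column(image)
merged = merge_masks(row_mask, col_mask)
```

On a real image the same logic would normally be written with array operations; the nested-list version is only meant to show why the scan direction matters.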

Going from Image coordinates (pixels) to World coordinates (x,y,z relative to camera)

Using the image_geometry library function projectPixelTo3dRay(uv), we calculate a 3D ray (starting at the center of the camera) that represents the pixel in the 3D world. This is done for every pixel, and the rays are given in the camera_optical frame.
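For intuition, the pinhole-model arithmetic behind turning a pixel into a ray can be sketched without ROS. This is not image_geometry's source code: the function name pixel_to_ray, the intrinsics values, and the choice to normalize the ray to unit length are all assumptions made for this example, and the pixel is assumed to be already rectified.

```python
import math


def pixel_to_ray(u, v, fx, fy, cx, cy):
    """Return a unit-length ray in the optical frame for rectified pixel (u, v).

    fx, fy are the focal lengths in pixels; cx, cy is the principal point.
    """
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


# Hypothetical intrinsics for a 640x480 camera (not taken from the project).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

center_ray = pixel_to_ray(320, 240, FX, FY, CX, CY)  # straight along +Z
right_ray = pixel_to_ray(820, 240, FX, FY, CX, CY)   # 45 degrees towards +X
```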

The camera_optical frame has the Z axis pointing away from the camera into the image, the X axis pointing to the right, and the Y axis pointing down (the X and Y axes correspond to the image frame). This is the frame image_geometry works with, and it is different from the camera frame (the camera as mounted in real life, with Z pointing up, X pointing forwards, and Y pointing left). Set up a static tf between camera_optical and camera to account for this.

Figure: camera_optical vs camera frame
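The axis conventions described above amount to a fixed rotation between the two frames. A minimal sketch of that mapping, assuming the two frames share an origin and differ only by this axis relabeling (the function name is made up for this example; in the actual system the static tf does this work):

```python
def optical_to_camera(vector):
    """Re-express a vector from the camera_optical frame in the camera frame.

    optical: X right, Y down, Z forward (into the image)
    camera:  X forward, Y left, Z up
    """
    x_opt, y_opt, z_opt = vector
    return (z_opt, -x_opt, -y_opt)


forward = optical_to_camera((0.0, 0.0, 1.0))  # optical +Z becomes camera +X
right = optical_to_camera((1.0, 0.0, 0.0))    # optical +X becomes camera -Y
down = optical_to_camera((0.0, 1.0, 0.0))     # optical +Y becomes camera -Z
```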

Now, since these 3D rays are in the camera_optical frame, we have to transform them into the base_link frame (the base of the robot, whose Z origin lies on the ground plane). Once they are transformed, you simply find the intersection of each ray with the ground plane defined by z=0. The point of intersection gives the x and y coordinates in real life where the pixel lies (Z is assumed to be zero). The node pixel_to_coordinate_calculator is run once; it uses the camera intrinsics and the TFs (for the extrinsics) to calculate these x and y intersections for each pixel, and writes them to a file.
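The ray and ground-plane intersection reduces to solving for one parameter along the ray. A minimal sketch, assuming the ray origin (the camera position) and direction have already been transformed into base_link and the camera sits above the ground; the function name and the example numbers are hypothetical:

```python
def intersect_ground(origin, direction, eps=1e-9):
    """Intersect the ray origin + t * direction (t > 0) with the plane z = 0.

    Returns (x, y) in base_link, or None if the ray never reaches the ground
    (it is parallel to the ground or points upwards).
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz >= -eps:
        return None
    t = -oz / dz  # solves oz + t * dz = 0
    return (ox + t * dx, oy + t * dy)


camera_origin = (0.0, 0.0, 1.0)  # hypothetical: camera 1 m above base_link

hit = intersect_ground(camera_origin, (1.0, 0.0, -1.0))   # forward and down
miss = intersect_ground(camera_origin, (1.0, 0.0, 0.2))   # above the horizon
```

Pixels above the horizon have no ground intersection, so whatever is written to the file needs some way to mark them; returning None here is just one option.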

This file is used as a look-up table for every frame during an actual run. For every image output by line detection (usually a black image with a few white pixels where the lines should be), the pointcloud_publisher node looks up the x and y coordinates of each white pixel and populates a pointcloud in which each point represents one white pixel. This pointcloud is published and placed on the costmap.
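The per-frame look-up step can be sketched as below. The source does not describe the file format, so the in-memory layout here (a nested list indexed by row and column, with None for pixels that have no ground intersection) is an assumption, as are the function name and the sample values. Building and publishing the actual ROS pointcloud message is omitted.

```python
def white_pixels_to_points(mask, lookup):
    """Collect one (x, y, 0.0) point for every non-zero pixel in the mask.

    mask:   nested list of pixel values, 0 for black
    lookup: nested list of (x, y) tuples, or None where no ground point exists
    """
    points = []
    for row, mask_row in enumerate(mask):
        for col, value in enumerate(mask_row):
            if value == 0:
                continue
            xy = lookup[row][col]
            if xy is None:  # assumption: pixel above the horizon
                continue
            points.append((xy[0], xy[1], 0.0))
    return points


# Hypothetical 3x2 look-up table and line-detection output.
lookup = [
    [None, None],
    [(2.0, 0.5), (2.0, -0.5)],
    [(1.0, 0.5), (1.0, -0.5)],
]
mask = [
    [255, 0],
    [0, 255],
    [255, 0],
]
points = white_pixels_to_points(mask, lookup)
```

Because the table is computed once, the per-frame cost is just one look-up per white pixel, with no projection math at run time.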

Stuff to look into

TODOs

  • Look into clustering.
  • Look into ML/statistical-learning methods for lane detection.
  • Varying intensities of line points in the pointcloud (not binary). Also look at Bayesian inference and the Bayes++ library