
Advanced Lane Finding

Python 3.6 · Udacity - Self-Driving Car NanoDegree

Lane line detection is a critical technique in the design of systems that allow a car to drive itself. Detecting these lines lets the car stay on the right path and follow it, much as we use our eyes to stay in our lane. This project takes an image-processing approach, using Python as the main language and OpenCV as a complementary framework for image analysis and processing.


1. Approach

The approach consisted of 6 steps:

  1. Distortion correction: Light rays often bend a little too much or too little at the edges of the curved lenses in real cameras, which creates radial distortion; if the camera's lens is not aligned perfectly parallel to the imaging plane, tangential distortion also occurs. In this project the distortion coefficients were calculated to correct radial distortion using 9 × 6 corner chessboard images; a code sketch follows the equations below.
    • Radial distortion:

      x_{distorted} = x_{ideal} (1 + k_1r^2 + k_2r^4 + k_3r^6)

      y_{distorted} = y_{ideal} (1 + k_1r^2 + k_2r^4 + k_3r^6)

    • Tangential distortion:

      x_{distorted} = x + [2p_1xy + p_2(r^2 + 2x^2)]

      y_{distorted} = y + [p_1(r^2 + 2y^2) + 2p_2xy]
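
A minimal sketch of this calibration step with OpenCV (file paths and variable names are illustrative, not necessarily those of the project):

```python
import glob
import cv2
import numpy as np

# Chessboard corner grid in board coordinates: (0,0,0), (1,0,0), ..., (8,5,0).
objp = np.zeros((9 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:9, 0:6].T.reshape(-1, 2)

objpoints, imgpoints = [], []  # 3D board points and 2D image points
for fname in glob.glob('camera_cal/calibration*.jpg'):  # hypothetical path
    gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (9, 6), None)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# Solve for the camera matrix and the distortion coefficients
# (k1, k2, p1, p2, k3) that appear in the equations above.
_, mtx, dist, _, _ = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

img = cv2.imread('test_images/test1.jpg')  # hypothetical path
undistorted = cv2.undistort(img, mtx, dist, None, mtx)
```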


  2. Perspective transform: A perspective transform maps the points in a given image to different, desired image points with a new perspective. In this project a bird's-eye view transform that lets us view the lane from above was used (it is useful for calculating curvature in later steps); a code sketch follows the points below.
    • Calculate the transformation and inverse matrices from four pairs of points:

      \begin{bmatrix} t_i x'_i \\ t_i y'_i \\ t_i \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}

       \texttt{where:} \,\, dst(i)=(x'_i,y'_i), src(i)=(x_i, y_i), i=0,1,2,3

    • Apply perspective transform using the matrix:

      \texttt{dst} (x,y) = \texttt{src} \left ( \frac{M_{11} x + M_{12} y + M_{13}}{M_{31} x + M_{32} y + M_{33}} , \frac{M_{21} x + M_{22} y + M_{23}}{M_{31} x + M_{32} y + M_{33}} \right )

    • Points used in this project:

      \texttt{src} = \begin{bmatrix} 258 & 682\\ 575  & 464 \\ 707 & 464 \\ 1049 & 682 \end{bmatrix} \;\;\;\; 
\texttt{dst} = \begin{bmatrix} 450 & img_h \\ 450 & 0 \\ img_w - 450 & 0 \\ img_w - 450 & img_h\end{bmatrix}
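
A sketch of this step with the points above, assuming 1280 × 720 frames:

```python
import cv2
import numpy as np

img_h, img_w = 720, 1280  # assumed frame size

src = np.float32([[258, 682], [575, 464], [707, 464], [1049, 682]])
dst = np.float32([[450, img_h], [450, 0],
                  [img_w - 450, 0], [img_w - 450, img_h]])

# Transformation matrix and its inverse (the inverse maps the detected
# lane back onto the original perspective at the end of the pipeline).
M = cv2.getPerspectiveTransform(src, dst)
Minv = cv2.getPerspectiveTransform(dst, src)

# undistorted: a frame corrected in the previous step
warped = cv2.warpPerspective(undistorted, M, (img_w, img_h),
                             flags=cv2.INTER_LINEAR)
```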


  3. Thresholding: Thresholding is a method of segmenting an image. In this project we use a combined threshold (Sobel and colorspaces) to create a binary image where the lane line pixels are activated; a code sketch follows the equations below.
    • Convolve a Sobel kernel of size 5 in the x direction with the image:

      
Sob = s_x \circledast img

    • Get S channel from HLS colorspace transform:

      
V_{max} \leftarrow \max(R,G,B) \;\;\;\;
V_{min} \leftarrow \min(R,G,B)

      
L \leftarrow \frac{V_{max} + V_{min}}{2} \;\;\;\;\;\;\;\;
S \leftarrow \begin{cases}
\dfrac{V_{max} - V_{min}}{V_{max} + V_{min}} & \text{if } L < 0.5 \\ \\
\dfrac{V_{max} - V_{min}}{2 - (V_{max} + V_{min})} & \text{if } L \ge 0.5
\end{cases}

    • Get B channel from LAB colorspace transform:

      
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
\leftarrow
\begin{bmatrix} 0.412453 & 0.357580 & 0.180423 \\ 0.212671 & 0.715160 & 0.072169 \\ 0.019334 & 0.119193 & 0.950227 \end{bmatrix}
\cdot \begin{bmatrix} R \\ G \\ B \end{bmatrix}

      
X \leftarrow X/X_n, \text{ where } X_n = 0.950456 \;\;\;\;\;\;\;\;\;
Z \leftarrow Z/Z_n, \text{ where } Z_n = 1.088754

      
L \leftarrow \begin{cases} 116 \, Y^{1/3} - 16 & \text{for } Y > 0.008856 \\
903.3 \, Y & \text{for } Y \le 0.008856 \end{cases}

      
A \leftarrow 500\,(f(X) - f(Y)) + \delta \;\;\;\;\;\;\;\;\;
B \leftarrow 200\,(f(Y) - f(Z)) + \delta

      
f(t) = \begin{cases} t^{1/3} & \text{for } t > 0.008856 \\
7.787\,t + 16/116 & \text{for } t \le 0.008856 \end{cases}

      
\delta = \begin{cases} 128 & \text{for 8-bit images} \\
0 & \text{for floating-point images} \end{cases}

    • Every channel is then binarized using the following criteria:

      
Sob \leftarrow \begin{cases}
1 & \text{if } 40 \le Sob \le 255 \\
0 & \text{otherwise} \end{cases}
\ \ \
S \leftarrow \begin{cases}
1 & \text{if } 70 \le S \le 255 \\
0 & \text{otherwise} \end{cases}
\ \ \
B \leftarrow \begin{cases}
1 & \text{if } 140 \le B \le 255 \\
0 & \text{otherwise} \end{cases}

    • Finally, the binary image is made by the union of the 3 channels:

      
binary = Sob \cup S \cup B
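
A sketch of the combined threshold, following the ranges above (the function name is illustrative):

```python
import cv2
import numpy as np

def combined_threshold(img):
    """Union of Sobel-x, HLS S-channel and LAB B-channel thresholds."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Sobel in the x direction, kernel size 5, scaled to the 0-255 range.
    sobel = np.absolute(cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=5))
    sobel = np.uint8(255 * sobel / np.max(sobel))

    s_chan = cv2.cvtColor(img, cv2.COLOR_BGR2HLS)[:, :, 2]  # S of HLS
    b_chan = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)[:, :, 2]  # B of LAB

    # Union of the three binarized channels (the 255 upper bound is
    # implicit for 8-bit channels).
    binary = np.zeros_like(gray)
    binary[(sobel >= 40) | (s_chan >= 70) | (b_chan >= 140)] = 1
    return binary
```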


  4. Polynomial fit: After thresholding, we fit a quadratic polynomial to each of the lane lines; this step allows us to detect curves in the road. A code sketch follows this step.
    • First we calculate a histogram of the bottom half of the image by summing the pixels of each column along the y axis; the two peaks detected become the bases of the left and right lines. To avoid noise, margins (\alpha, defined below) were added at the left and right edges.

      
(Diagram: warped image of height h; a margin \alpha is excluded at each side edge, and b_i marks the base of the lane lines.)

      
\texttt{where } \alpha = \frac{img_w}{4}

    • Next we use the sliding-window algorithm, fitting 15 windows on each line from the base of the line to the top of the image, to collect all the pixels belonging to each line. Once all pixels for each line are detected, we fit a second-degree polynomial to each set by minimizing the squared error.

      
y_k = p_0 x_k^n + \dots + p_{n-1} x_k + p_n

      
E = \sum_{j=0}^{k} \left| p(x_j) - y_j \right|^2

    • As an optimization, on the next frame we don't need to search blindly for the lines again; we can use the previous information. By adding a margin to the polynomial equation we create a region of interest, customized for every frame, that follows the line found in the previous frame.

      
ROI = \begin{cases}
ay^2 + by + c - margin \\
ay^2 + by + c + margin
\end{cases}
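
A sketch of the histogram search, sliding windows and quadratic fit (parameter names and default values such as margin=100 are illustrative assumptions):

```python
import numpy as np

def sliding_window_fit(binary_warped, nwindows=15, margin=100, minpix=50):
    """Sliding-window search and quadratic fit of x = a*y^2 + b*y + c."""
    h, w = binary_warped.shape
    alpha = w // 4  # edge margin used when reading the histogram

    # Histogram of the bottom half: column-wise sum of active pixels.
    histogram = np.sum(binary_warped[h // 2:, :], axis=0)
    midpoint = w // 2
    leftx_base = np.argmax(histogram[alpha:midpoint]) + alpha
    rightx_base = np.argmax(histogram[midpoint:w - alpha]) + midpoint

    nonzeroy, nonzerox = binary_warped.nonzero()
    window_height = h // nwindows
    fits = []
    for x_base in (leftx_base, rightx_base):
        x_current, lane_inds = x_base, []
        for win in range(nwindows):
            y_low = h - (win + 1) * window_height
            y_high = h - win * window_height
            good = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                    (nonzerox >= x_current - margin) &
                    (nonzerox < x_current + margin)).nonzero()[0]
            lane_inds.append(good)
            if len(good) > minpix:  # re-center the next window
                x_current = int(nonzerox[good].mean())
        lane_inds = np.concatenate(lane_inds)
        # Least-squares quadratic fit of x as a function of y
        # (the lines are near-vertical in the warped image).
        fits.append(np.polyfit(nonzeroy[lane_inds], nonzerox[lane_inds], 2))
    return fits  # [left_fit, right_fit], each (a, b, c)
```

On subsequent frames, the same idea applies without the windows: select only the pixels within ±margin of x = a·y² + b·y + c from the previous fit, then refit.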


  5. Polynomial fit validation: To determine whether a new fit is valid, it must meet some criteria; a code sketch follows this step.
    • If a previous fit exists, we compare the difference between coefficients; if the difference is too big, the new fit is not valid (line coefficients shouldn't vary that much from frame to frame).

      
fit \leftarrow
\begin{cases}
invalid & \text{if } \ \Delta a > 0.001 \ \vee \ \Delta b > 1 \ \vee \ \Delta c > 100
\\
valid & \text{otherwise}
\end{cases}

    • To make a smooth transition, the fit used in the current frame is the average of the previous 5 fits.

      
current\_fit = \frac{\sum_{i=0}^{4} fit_i}{5}
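
A sketch of both checks, assuming fits are kept as NumPy coefficient arrays (names are illustrative):

```python
import numpy as np

def validate_and_smooth(new_fit, recent_fits):
    """Reject fits whose coefficients jump too far from the last accepted
    fit, then return the average of the last 5 accepted fits."""
    if recent_fits:
        delta = np.abs(np.asarray(new_fit) - np.asarray(recent_fits[-1]))
        if delta[0] > 0.001 or delta[1] > 1 or delta[2] > 100:
            return None, recent_fits  # invalid fit: keep previous history
    recent_fits = (recent_fits + [np.asarray(new_fit)])[-5:]
    return np.mean(recent_fits, axis=0), recent_fits
```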


  6. Measuring curvature: Having the coefficients, we can calculate the radius of curvature at any point of the fitted function x = f(y); a code sketch follows the equations.


R_{curve} = \frac{\left( 1 + \left( \frac{dx}{dy} \right)^2 \right)^{3/2}}
{\left| \frac{d^2x}{dy^2} \right|}


f'(y) = \frac{dx}{dy} = 2ay + b \ \ \ \ \ \ \ \ \
f''(y) = \frac{d^2x}{dy^2} = 2a


R_{curve} = \frac{\left( 1 + \left( 2ay + b \right)^2 \right)^\frac{3}{2} }
{\left| 2a \right|}
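
A sketch of the calculation in pixel units (for a real-world radius, scale x and y by meters-per-pixel factors before fitting; names are illustrative):

```python
def radius_of_curvature(fit, y_eval):
    """Radius of curvature of x = a*y**2 + b*y + c at y = y_eval."""
    a, b = fit[0], fit[1]
    return (1 + (2 * a * y_eval + b) ** 2) ** 1.5 / abs(2 * a)

# Example: curvature at the bottom of a 720-pixel-high warped image.
# radius = radius_of_curvature(left_fit, y_eval=719)
```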


2. Shortcomings

There are some flaws in this approach:

  • It is not illumination invariant, so a lack of contrast and light results in a malfunction of the pipeline.

  • Thresholding is without a doubt the core of the problem; combining multiple thresholds makes the model more robust, but it can also pick up undesired patterns and objects that lead to a wrong lane detection.

  • A fixed camera position and consistent image size are assumed, so positioning the camera at a different angle and handling different image sizes would have undesirable results.


3. Improvements

Some points of the pipeline can be improved.

  1. Perspective Transform: The images and videos have a fixed size and perspective, and the 4 points selected for the transform are currently fixed, so a way to select them automatically based on image features would be necessary for a more robust approach that handles multiple sizes and camera angles.

  2. Thresholding: Without a doubt thresholding is the core of the algorithm; a method more robust to illumination, rotation and translation changes should be developed for a real application. This approach showed good performance on the project_video and fairly average performance on the challenge_video, but it is completely useless on the harder_challenge_video.

  3. Polynomial fit: The sliding-window algorithm and its optimization of searching around the previous polynomial are fairly robust methods for lane line detection, so only minor improvements could be made here. However, these methods depend strongly on the thresholding to find a good fit, i.e. we are only as strong as the weakest link.
