1. Lane Detection and Tracking
There has been a significant amount of research on vision-based road lane detection and tracking. Vision-based localization of the lane boundaries can be divided into two sub-tasks: lane detection and lane tracking. Lane detection is the problem of locating lane boundaries without prior knowledge of the road geometry. Most lane detection methods are edge-based. After an edge detection step, the edge-based methods organize the detected edges into meaningful structures (lane markings) or fit a lane model to the detected edges. Most of the edge-based methods use straight lines to model the lane boundaries. Others employ more complex models such as B-splines, parabolas, and hyperbolas. With its ability to detect imperfect instances of regular shapes, the Hough Transform [5] is one of the most common techniques used for lane detection. The Hough Transform is a method for detecting lines, curves and ellipses, but in the lane detection literature it is preferred for its line detection capability. It is mostly employed after an edge detection step on grayscale images. Besides the Hough Transform, many other techniques have been applied to lane detection, such as neural networks, dynamic programming and deformable template matching. Lane tracking, on the other hand, is the problem of tracking the lane edges from frame to frame given an existing model of the road geometry. Many techniques have been used for lane tracking, among them Kalman filtering and particle filtering.
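To make the edge-then-vote idea concrete, the following is a minimal sketch of a classical line Hough Transform on a binary edge image. It is not taken from any of the surveyed papers; the function names, the one-degree angular resolution and the rounding of rho to integer bins are illustrative assumptions.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Vote in (rho, theta) space for every non-zero pixel of a binary edge image.

    Lines are parameterized as rho = x*cos(theta) + y*sin(theta).
    The angular resolution (n_theta) and integer rho bins are assumptions.
    """
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))           # largest possible |rho|
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    rhos = np.arange(-diag, diag + 1)
    acc = np.zeros((len(rhos), n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    for t_idx, theta in enumerate(thetas):
        rho = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        # Shift by diag so that negative rho values map to valid bins.
        acc[:, t_idx] = np.bincount(rho + diag, minlength=len(rhos))
    return acc, rhos, thetas

def strongest_line(edges):
    """Return (rho, theta in degrees) of the accumulator cell with most votes."""
    acc, rhos, thetas = hough_lines(edges)
    r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape)
    return int(rhos[r_idx]), float(np.rad2deg(thetas[t_idx]))

# Usage: a synthetic edge image containing one vertical line at x = 20.
edges = np.zeros((50, 50), dtype=np.uint8)
edges[:, 20] = 1
rho, theta_deg = strongest_line(edges)
```

A practical lane detector would add peak search over several maxima and restrict the angle range to plausible lane orientations; both are omitted here.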
In [6] Li et al. have proposed a model that uses an adaptive Hough Transform. The images are first converted into grayscale using only the R and G channels of the color image. The B channel is ignored, relying on the good contrast of the red and green channels with respect to the white and yellow lane markings. The grayscale image is passed through a Sobel edge detector with a very low threshold. Afterwards they apply a special HT which they call RHT (Randomized HT). The pixels for the RHT are sampled randomly according to their gradient magnitudes. This method ensures robust and accurate detection of lane markings, especially for noisy images. The 3-D Hough space is reduced to two dimensions to simplify the problem and reduce the high computational cost of the HT. The reported results are better than those of GA-based lane detection.
In [7] Yu et al. also use the Hough Transform to detect the lane boundaries. This work also considers the pavements at the sides of the road. Since the pavement boundaries are another source of continuous lines, the paper pays special attention to them. The HT is used to detect lane boundaries with a parabolic model. Road pavement types, lane structures and weather conditions have been carefully investigated. The 3-D Hough space is separated into two sub-domains: a 2-D domain of parameters shared by all the edge types, and a 1-D domain of the remaining distinctive parameters. This study uses the Canny edge detector to obtain two images: a binary image denoting the edges and a gradient image denoting the ratio of vertical and horizontal gradients. The HT is applied several times, from a low-resolution image up to the desired resolution. They call this method multiresolution HT, and they show that it reduces the computational cost of the classical HT while preserving the accuracy. The proposed system is only tested with 34 grayscale images of size 256 x 240. The experiments show that the system is capable of handling images of different qualities, paved and unpaved roads, marked and unmarked roads, shadows, and poor illumination conditions.
McCall and Trivedi [8] have designed a system (called VioLET) using steerable filters [9] for robust and accurate lane detection. Steerable filters are especially useful for detecting circular reflector markings, segmented-line markings, and solid-line markings. They are insensitive to varying lighting and road conditions, hence providing robustness to complex shadowing, lighting changes from overpasses and tunnels, and road-surface variations. By computing only three separable convolutions, a wide variety of lane markings can be detected. This study also has an improved curvature detection methodology. They have combined the visual cues of the road (lane markings and lane texture) with the vehicle-state information. The work is one of the most comprehensive ones in the lane detection scope. It contains a detailed literature survey and comparison of the previous work. The proposed system is tested with various quantitative metrics on a long test path using a specially equipped vehicle. By providing different metrics for evaluating lane conditions, the system is made ready to integrate with various driver-assistance systems. Lane keeping, lane changing and special conditions like tunnel entrance and tunnel exit are all tested in detail.
In [10] Pomerleau proposes a learning vision-based autonomous driving system called ALVINN. The neural network training and learning scheme allows the system to drive in varying environments. Single-lane paved and unpaved roads, multilane lined and unlined roads, and roads full of obstacles are among the test environments. Depending on the road conditions, the vehicle moves autonomously at speeds of up to 55 miles per hour. A single-hidden-layer feedforward neural network takes a 30x32 unit "retina" as input. The "retina" image is created either from a video camera or a scanning laser rangefinder. The output layer has 30 units. Each unit is a value representing how sharply to steer to the left or right in order to follow the road or to avoid colliding with nearby obstacles. The steering directions are distributed linearly. A 4-unit hidden layer connects the input layer to the output layer. The training is done on-the-fly. As the vehicle navigates, the live video sequence is fed into the NN, which is trained to steer in the same direction as the human driver. Since proper driving alone may not give sufficient diversity of real-time cases, the video sequence is also transformed to create additional training data. This makes the system capable of handling improper driving and road conditions. A buffering technique is used to increase the diversity of sampling. The on-the-fly training scheme is a novel approach that allows ALVINN to train easily in various environments. The use of laser range images and laser reflectance images adds the capability of following roads in total darkness and avoiding obstacles ahead. The system is able to process 15 frames per second, allowing it to drive at 55 MPH. The learning capability of the system takes ALVINN one step ahead of competing systems. This provides high flexibility across driving situations, which cannot be achieved with hand-programmed systems.
The experiments have shown that, instead of training a single network that deals with all road conditions, the system yields better results if exclusive networks are trained for each of the candidate conditions.
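The layer sizes described above (a 30x32 retina, 4 hidden units, 30 linearly spaced steering outputs) can be illustrated with a small forward pass. This is only a sketch of the topology: the tanh activations, the random weight initialization, and decoding the steering value from the single most active output unit are assumptions made for illustration, not details taken from [10].

```python
import numpy as np

def init_alvinn_like(rng, n_in=30 * 32, n_hidden=4, n_out=30):
    """Random weights for a 960-4-30 network (initialization scale is an assumption)."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def forward(params, retina):
    """Map a 30x32 retina image to 30 output activations (tanh is an assumption)."""
    x = np.asarray(retina, dtype=float).reshape(-1)
    hidden = np.tanh(params["W1"] @ x + params["b1"])
    return np.tanh(params["W2"] @ hidden + params["b2"])

def steering_from_output(out, max_steer=1.0):
    """Decode a steering value in [-max_steer, max_steer].

    Output units are spread linearly from hard left to hard right; picking
    the most active unit is a simplification chosen for this sketch.
    """
    directions = np.linspace(-max_steer, max_steer, out.size)
    return float(directions[int(np.argmax(out))])

# Usage with an untrained network and a random retina.
rng = np.random.default_rng(0)
params = init_alvinn_like(rng)
activations = forward(params, rng.random((30, 32)))
steer = steering_from_output(activations)
```

Training one such network per road type, as the experiments suggest, would amount to keeping several `params` dictionaries and selecting among them at run time.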
The work in [11] presents a method to find the lane boundaries by combining a local line extraction method and dynamic programming. Initially the positions of the lane boundaries are detected by the line extractor, which runs on a Sobel edge-detected image. To do this, the line extractor clusters similar edge directions obtained from the gradient direction of the edges. Next, dynamic programming is used to improve the line extractor results. Image frames are divided into horizontal sub-frames to which local edge detection is applied. Dynamic programming calculates the most prominent lines by minimizing the deviation from a virtual straight line. The reason the HT is not used in this work is also discussed in detail: the HT detects a single line at a time, whereas the authors try to extract the two side lines of the white mark. In addition, the HT requires a peak search process to find the maximum voting value. The threshold value for edge detection has a big impact on the overall performance, and no dynamic solution to this problem is proposed. The comparison of experimental results with an HT solution has shown that the proposed method yields better results. Also, the computation time of the solution is strongly correlated with the number of lines in the frames.
In [12] Wang et al. have proposed an algorithm based on the B-Snake. The algorithm is able to detect a wider range of lanes, especially curved ones. The B-Snake is basically a B-spline implementation, and can therefore form any arbitrary shape from a set of control points. The system aims to find both sides of the lane markings. This is achieved by detecting the mid-line of the lane, followed by calculating the perspective parallel lines. The initial position of the B-Snake is decided by an algorithm called CHEVP (Canny/Hough Estimation of Vanishing Points). The control points are detected by a minimum energy method. Snakes [13], or active contours, are curves defined within an image which can move under the influence of internal forces from the curve itself and external forces from the image data. This study introduces a novel B-spline lane model with dual external forces. This has two advantages: first, the computation time is reduced since two deformation problems are reduced to one; second, the B-Snake model is more robust against shadows, noise, and other lighting variations. The overall system is tested against 50 pre-captured road images with different road conditions. The system is observed to be robust against noise, shadows, and lighting variations. It has also yielded good results for both marked and unmarked roads, and for roads with dashed and solid paint lines.
In [14] Kluge and Lakshmanan have introduced the well-known LOIS (Likelihood of Image Shape) lane detection algorithm for the first time. Instead of using a thresholding method they have proposed a deformable template model. Thresholding is not used because edge-based lane detectors mostly suffer from non-deterministic gradient magnitude thresholds. Shadows, puddles, tire skid marks and oil stains may create undesired edges that require varying threshold values to be filtered out. LOIS also does not require a strict classification into edge and non-edge points. The likelihood function permits the algorithm to locate lane edges even when contrast is poor or there are many noise edges. LOIS uses the Metropolis algorithm to perform likelihood optimization (to identify the optimal set of template deformation parameters). The authors have found a set of system parameters that perform well in various road environments. The proposed system is shown to perform well in situations where the lane edges have relatively weak local contrast, or where there are strong distracting edges due to shadows, puddles and pavement cracks. The deformable template model appears well suited to the problem, although the Metropolis algorithm may need to be replaced with alternative methods.
Another study from Kreucher et al. [15] uses the LOIS [14] Lane Detection Algorithm [16] to track the lanes. The system emits warning messages if a lane crossing is detected. The vehicle's location with respect to the lane markings is detected by LOIS, which uses a deformable template approach. This approach has a parametric set of shapes that describes all possible ways that the object can appear in the image. A likelihood function is used to measure how well a particular detected object matches the given image. Previous articles on LOIS focus solely on lane detection where the vehicle is located around the center of two lanes. This paper's contribution is using a Kalman filter to predict the future values of the vehicle's location from the previously observed ones. The location is measured in terms of offset values with respect to the right and left lane markings detected by LOIS. If the vehicle is detected to be within one meter of either the left or the right lane marking, and if the vehicle's path as predicted by the Kalman filter will lead it to be within 0.8 meters of either lane marking in less than one second, then a lane crossing warning is emitted.
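The warning rule at the end of this paragraph can be written down directly. The sketch below keeps the thresholds quoted above (1 m, 0.8 m, 1 s) but replaces the Kalman-filter path prediction with a constant-rate extrapolation of each offset, which is an assumption made only to keep the example short; the function and parameter names are hypothetical.

```python
def time_to_reach(offset_m, rate_mps, limit_m):
    """Seconds until an offset shrinks to limit_m at a constant rate.

    rate_mps is the time derivative of the offset (negative when the
    vehicle approaches the marking). Constant-rate extrapolation stands
    in for the Kalman-filter prediction used in the paper.
    """
    if offset_m <= limit_m:
        return 0.0
    if rate_mps >= 0.0:              # moving away from, or parallel to, the marking
        return float("inf")
    return (offset_m - limit_m) / -rate_mps

def lane_crossing_warning(left_m, right_m, left_rate, right_rate,
                          near_m=1.0, predicted_m=0.8, horizon_s=1.0):
    """True when both conditions of the rule hold.

    1) the vehicle is currently within near_m of either marking, and
    2) it is predicted to be within predicted_m of either marking
       in less than horizon_s seconds.
    """
    near = min(left_m, right_m) < near_m
    soon = any(
        time_to_reach(offset, rate, predicted_m) < horizon_s
        for offset, rate in ((left_m, left_rate), (right_m, right_rate))
    )
    return near and soon

# Usage: drifting left at 0.2 m/s while 0.9 m from the left marking.
warn = lane_crossing_warning(0.9, 1.8, -0.2, 0.2)
```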
In [17] Apostoloff and Zelinsky present the first results from a study in which a lane tracker was developed using particle filtering and visual cue fusion technology. This is part of a project at the Australian National University. Several cameras (passive, active, near-field and far-field coverage) and sensors are mounted on the vehicle. This research introduces the first use of particle filtering in a road vehicle application. Another contribution of this study is its ability to automatically adapt to variations in road conditions by using a novel Distillation Algorithm, which combines a particle filter with a cue fusion engine. This is a notable enhancement compared to previous research, which relies on only one or two fixed cues for lane detection that are used regardless of how well they are performing. The Distillation Algorithm, on the other hand, changes the cues dynamically according to variations in the environment. It is based on Bayesian statistics and is self-optimized to produce the best statistical result. The particle filter is also used to track the detected lanes. The lane tracker uses two different sets of cues: image-based cues (lane marker cue, road edge cue, road color cue, non-road color cue) and state-based cues (road width cue, elastic lane cue). Experiments have shown that the particle filter gives impressive results for target detection and tracking. While other studies use separate procedures for detection and tracking, using the particle filter for both tasks has exhibited good results in this study. It also removes the need for additional computations.
Similar to LOIS [14], [15], [16], the lane detection approach proposed in [18] uses a deformable template model. The aim of this study is to overcome the problems of Kalman-filter-based lane trackers. The problem with Kalman-filter-based lane tracking is that it cannot recover after a tracking failure occurs. That is because the Kalman filter is based on Gaussian densities, which cannot represent simultaneous alternative hypotheses. In the proposed method the lane boundaries are assumed to be parabolas in the ground plane. Lane detection is formulated as a maximum a posteriori (MAP) estimation problem. The Tabu search algorithm is used to obtain the global maximum of the posterior density. The detected lanes are tracked using a particle filter that recursively estimates the lane shape and the vehicle position. The proposed model outputs many useful parameters: the position of the vehicle inside the lane, its heading direction, and the local structure of the lane.
The General Obstacle and Lane Detection system (GOLD [19]) used in the ARGO vehicle at the University of Parma transforms stereo-vision images into a common bird's eye view. It uses a pattern matching technique to detect lane markings on the road. A horizontal search is performed for dark-bright-dark regions of a certain width. The effect of illumination conditions, shadows or sunny blobs is reduced by considering each pixel not globally but rather with respect to its left and right horizontal neighbors. The road marking pixels mostly have a higher brightness value than their horizontal neighbors. After the brightness analysis step, a gray-level image is computed that represents horizontal brightness transitions. This allows the use of an adaptive threshold for image binarization. The proposed system is limited to roads with lane markings, as they form the very basis of the search method.
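The dark-bright-dark search can be sketched as a per-pixel comparison with the horizontal neighbors at a fixed distance. The neighbor distance `m`, the way the two differences are summed, and the border handling below are assumptions for illustration and are not taken from [19].

```python
import numpy as np

def dark_bright_dark(gray, m=3):
    """Response image for horizontal dark-bright-dark transitions.

    A pixel gets a non-zero response only if it is brighter than both the
    pixel m columns to its left and the pixel m columns to its right.
    Requires 1 <= m < image width. Border pixels are compared with
    themselves, so their response is zero (an assumption of this sketch).
    """
    g = np.asarray(gray).astype(np.int32)
    left = g.copy()
    right = g.copy()
    left[:, m:] = g[:, :-m]          # value m columns to the left
    right[:, :-m] = g[:, m:]         # value m columns to the right
    d_left = g - left
    d_right = g - right
    return np.where((d_left > 0) & (d_right > 0), d_left + d_right, 0)

# Usage: one image row with a 3-pixel-wide bright marking on dark asphalt.
row = np.full((1, 20), 50, dtype=np.uint8)
row[0, 8:11] = 200
response = dark_bright_dark(row, m=3)
marking_columns = np.nonzero(response[0])[0]
```

Because the response depends only on local differences, a uniform brightness offset such as a shadow covering the whole neighborhood leaves it unchanged, which is the property the paragraph attributes to the method.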
In [20] Bellino et al. present the lane detection techniques used in the SPARC (Secure Propulsion using Advanced Redundant Control) project financed by the EU. This study introduces two new approaches. First, the noise due to the vibration of the vehicle can be exploited through Stochastic Resonance. While traditional methods try to avoid the noise, this study uses it to reveal useful information such as the contours of objects and lanes. Second, this study utilizes several sensors (camera, radar, laser) for lane detection, choosing whichever provides reliable data under the current external conditions (shadows, fog, rain, darkness).
W. Enkelmann et al. [21] have built a real-time lane tracking system which handles unmarked lane borders as well as marked lane borders. Kalman filter is used for horizontal and vertical lane curvature estimation. If lane borders are partially occluded by cars or other obstacles, the results of a completely separate obstacle detection module are used to increase the robustness of the lane tracking module. They have also given an algorithm to classify lane types. The illustrated lane tracking system has two subtasks: departure warning and lane change assistant. While the lane departure warning system evaluates images from a front looking camera, the lane change assistant receives signals from back looking cameras and radar sensors.
A recent study from Zhu et al. [22] presents a novel approach for lane detection problem. Instead of using one single method to calculate all parameters in the lane model, the Adaptive Random Hough Transformation (ARHT) and the Tabu Search algorithm are used cooperatively to calculate the different parameters. ARHT is an efficient approach to detect curves, which determines n parameters of the curve by sampling n pixels in the edge image. Tabu Search algorithm is based on a maximum a posteriori (MAP) estimate problem similar to [18]. A multiresolution strategy is employed to reduce the execution time and provide more accurate results, similar to [7]. The proposed system uses a hyperbolic lane model, and therefore is able to detect both straight and curved lanes. ARHT and Tabu Search are used to calculate the parameters of the hyperbolic model. Lane tracking is accomplished by a particle filter. First frame is used by the detection algorithm. The result of the detection algorithm is delivered to the particle filter for tracking. Therefore, tracking starts with the second frame and continues as long as a confidence threshold is satisfied. When confidence threshold is violated, the detection algorithm is called again to generate new initial particles for the tracking algorithm.
Another recent study by Bai et al. [23] uses a different approach for road and lane detection. An extended hyperbola model is used to represent the road. A nonlinear term is integrated into the model to handle transitions between the straight and the curved road segments. The parameters of the model are estimated by multiple vanishing points located on road segments. This paper is primarily focused on road detection rather than lane detection. But it uses lane information to do so, and presents useful techniques for our intentions.
In [23] M. Mandalia and D. Salvucci present an SVM-based method for lane-change detection. The aim of the proposed system is to detect drivers' lane change intentions. The technique uses both behavioral and environmental data, but is primarily focused on behavioral data. Several features are used for SVM training: acceleration, near-field lane position, far-side lane position, heading, lead car distance, and steering angle. All SVM kernels have been tested, but the linear kernel has given the best results. The system was able to detect about 87% of all true positives within the first 0.3 seconds from the start of the maneuver. The use of lead-car velocity and eye movements is mentioned as a future enhancement for the system.
2. Sign Detection and Classification
There are numerous methods for the detection and recognition of traffic signs. Similar to the lane detection algorithms, vision-based sign detection systems also mostly suffer from adverse weather and lighting conditions. A sign recognition system can be decomposed into two separate parts: detection and classification. Researchers have proposed various techniques for detection and classification. Among the commonly used techniques we can mention genetic algorithms, neural networks, the Kalman filter, radial symmetry, AdaBoost and LDA.
One of the early studies on the topic was introduced by Escalera et al. [24] in 1997. Detection is achieved by a shape analysis on a color-thresholded image, whereas classification is done by neural networks. Although HSI is largely invariant to lighting changes, RGB is preferred in this study. That is because the HSI formulation is nonlinear and therefore requires much processing power. The proposed approach applies a red-color threshold, followed by a corner detector for triangular signs and a circumference detector for circular signs. The detectors are basically a set of masks used for convolution. Two separate multilayer perceptron NNs have been trained for triangular and circular signs. The size of the input layer corresponds to an image of 30x30 pixels, and the output layer is of size ten, i.e., nine sign types plus one output that shows that the sign is not one of the nine. Ideal signs were used for training. 1620 training patterns are created out of them by rotating, adding Gaussian noise and displacing by 3 pixels.
In [25] Fang et al. have additionally focused on the tracking of the signs through the image sequence. Prior to tracking phase, they have used two NNs for detecting the signs: one for color features and one for shape features. A fuzzy approach is used to create an integration map of the shape and color features, which in turn is used to detect the signs. To reduce the complexity of detection operations, the system can only detect signs at particular size (8-pixel radius). Once the location of the sign is detected in the current frame, the size and location in the following frame is predicted by a Kalman filter. This significantly reduces the search space and increases the accuracy. Nevertheless, the detection technique proposed in this paper requires a large search space due to the complexity of the integration map.
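The idea of predicting where a detected sign will appear in the following frame can be illustrated with a one-dimensional constant-velocity Kalman filter applied to a single image coordinate. The state layout, the noise values `q` and `r`, and the class name are assumptions chosen for this sketch; [25] also predicts the sign size, which is omitted here.

```python
import numpy as np

class ConstantVelocityKalman:
    """Tracks one coordinate with state [position, velocity], dt = 1 frame."""

    def __init__(self, pos0, q=1e-3, r=1.0):
        self.x = np.array([pos0, 0.0])                 # initial state estimate
        self.P = np.eye(2) * 100.0                     # large initial uncertainty
        self.F = np.array([[1.0, 1.0], [0.0, 1.0]])    # constant-velocity motion
        self.H = np.array([[1.0, 0.0]])                # only position is measured
        self.Q = np.eye(2) * q                         # process noise (assumed)
        self.R = np.array([[r]])                       # measurement noise (assumed)

    def predict(self):
        """Advance one frame and return the predicted position."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return float(self.x[0])

    def update(self, z):
        """Correct the prediction with a measured position z."""
        y = z - self.H @ self.x                        # innovation
        S = self.H @ self.P @ self.H.T + self.R        # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P

# Usage: a sign whose x coordinate moves 5 pixels per frame.
kf = ConstantVelocityKalman(pos0=100.0)
for k in range(1, 31):
    kf.predict()
    kf.update(100.0 + 5.0 * k)
next_x = kf.predict()   # centre of the search window for the next frame
```

The predicted position, together with the diagonal of `P`, could be used to define a small search window instead of scanning the whole frame.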
Bahlmann et al. [26] suggest the use of AdaBoost and Haar wavelet features for detection, and a Gaussian probability density model for classification. Traditional object detection approaches generally apply color and shape detection separately, one after the other. Regions that have been falsely rejected by the color segmentation cannot be recovered in further processing. With this motivation, the main contribution of this paper is a joint color and shape modeling within the AdaBoost framework. AdaBoost is mostly used to select gray-scale wavelet features specified by their position, width and height parameters. This study, on the other hand, requires wavelets to be applied to RGB images. Therefore, instead of gray-scale images, they have proposed a method to use RGB color images in the AdaBoost framework. The overall system is measured to perform with an error rate of 15%.
Hsu and Huang [27] also use a two-fold approach for traffic signs: detection and recognition. The detection phase also has three stages. In the first stage, a region in the captured image where the road sign is more likely to be found is selected. Here, either the color information or other heuristics (such as possible locations of road signs, geometrical characteristics of the signs) are used. In the second stage, the region of interest is searched to find the possible location of the triangular or circular shape regions. Then, a closer view image is captured focusing the identified regions. In the third stage, template-matching is applied to detect the road signs. In the recognition phase, matching pursuit (MP) filter [28] is used to recognize the road signs effectively. Matching pursuit (MP) algorithm uses greedy heuristic to iteratively decompose any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions. Matching pursuits are general procedures to compute adaptive signal representations. MP based recognition proposed in this paper is unfortunately too costly. While the computation time of the detection phase is 100 ms, the recognition operation using matching pursuit method requires about 250 ms.
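The greedy decomposition performed by matching pursuit can be sketched in a few lines. This shows only the generic algorithm on a toy dictionary with unit-norm atoms; the dictionary, the stopping rule and the function name are illustrative assumptions and do not reproduce the MP filter design of [27] or [28].

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter=10, tol=1e-10):
    """Greedy expansion of `signal` over the rows of `dictionary`.

    Each row of `dictionary` is assumed to be a unit-norm atom. At every
    step the atom most correlated with the residual is selected, its
    contribution is subtracted, and its coefficient is accumulated.
    """
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[0])
    for _ in range(n_iter):
        corr = dictionary @ residual             # inner product with every atom
        k = int(np.argmax(np.abs(corr)))         # best-matching atom
        if abs(corr[k]) < tol:                   # nothing left to explain
            break
        coeffs[k] += corr[k]
        residual -= corr[k] * dictionary[k]
    return coeffs, residual

# Usage: a redundant dictionary (4 basis vectors plus one extra atom).
atoms = np.vstack([np.eye(4), np.array([1.0, 1.0, 0.0, 0.0]) / np.sqrt(2.0)])
coeffs, residual = matching_pursuit(np.array([0.0, 0.0, 3.0, 2.0]), atoms)
```

Each iteration costs one pass over the whole dictionary, which gives some intuition for why MP-based recognition is comparatively expensive when the dictionary is large.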
Loay and Barnes [29] have developed a time-efficient, rotation-invariant and shape-based road sign detection technique. It can detect triangular, square and octagonal road signs. The method uses the symmetric nature of these shapes. Regular polygons are equiangular i.e., their sides are separated by a regular angular spacing. To utilize this regularity, they introduce a rotationally invariant measure. However, the algorithm has an important limitation such that, for each image frame the algorithm only seeks for predefined radii. Regarding the performance, for a 320x240 image, the algorithm was able to be run at 20Hz. The approach has strong robustness to changing illumination as it detects shapes based on edges, and will efficiently reduce the search for a road sign from the whole image to a small number of pixels. It can detect (without classification) the signs with a success rate of 95%.
An SVM-based study introduced by Maldonado et al. [30] can recognize circular, rectangular, triangular, and octagonal signs. They have used SVM for both detection and classification purposes. Linear SVMs are used as geometric shape classifiers at detection phase. They operate on the color-segmented image (red, blue, yellow, white, or combinations of these colors). After the color segmentation blobs of interest (BoI) are detected. Linear SVM executes on these blobs using the distance to borders (DtBs) as input vectors. For the sign classification phase, on the other hand, Gaussian-kernel SVMs are used. The input to the recognition stage is a block of 31x31 pixels in grayscale image for every candidate blob. In order to reduce the feature vectors, only those pixels that must be part of the sign (pixels of interest) are used. Results show a high success rate and a very low amount of false positives in the final recognition stage. The results show that, the proposed algorithm is invariant to translation, rotation, scale, and, in many situations, even to partial occlusions. This study does not suggest a tracking method. The overall recognition performance of the system is acceptable, and can detect different geometric shapes, i.e., circular and octagonal, and triangular and rectangular. But it requires several performance enhancements in order to be applicable in real-time. The current computation time is 1.77 seconds per frame.
Another SVM-based solution by Kiran et al. [31] introduces an SVM learning technique for traffic sign classification. Similar to many other studies, they have preferred color segmentation for detection. Hue and saturation channels are used. Shape classification is performed using a linear support vector machine. Better shape classification performance is obtained by training the SVM using novel features called distance from center (DfC) and distance to borders (DtB). DfC is defined as the distance from the centre of the blob to the external edge of the blob, whereas DtB is the distance from the external edge of the blob to its bounding box. Each segmented blob has four DtB vectors and four DfC vectors for left, right, top and bottom. These vectors make the system invariant to translation, rotation and scale. Classification is tested by using DtB alone, and also by combining the DtB and DfC feature vectors. Circular sign classification is more successful than triangular sign classification. Also, using the joint features yields slightly better results. The classification success rate is around 90%, and the true positive rate is around 96%.
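As an illustration of the DtB feature defined above, the following sketch computes the four distance-to-border vectors for a binary blob that has already been cropped to its bounding box. The dictionary keys, the handling of empty rows or columns, and the absence of resampling to a fixed vector length are assumptions of this example, not details from [30] or [31].

```python
import numpy as np

def distance_to_borders(mask):
    """Four DtB vectors for a blob mask cropped to its bounding box.

    For every row, 'left' and 'right' hold the number of background pixels
    between the bounding-box side and the first blob pixel; 'top' and
    'bottom' do the same per column. Rows or columns without any blob
    pixel get the full width or height (an assumption of this sketch).
    """
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    rows_any = mask.any(axis=1)
    cols_any = mask.any(axis=0)
    left = np.where(rows_any, mask.argmax(axis=1), w)
    right = np.where(rows_any, mask[:, ::-1].argmax(axis=1), w)
    top = np.where(cols_any, mask.argmax(axis=0), h)
    bottom = np.where(cols_any, mask[::-1, :].argmax(axis=0), h)
    return {"left": left, "right": right, "top": top, "bottom": bottom}

# Usage: a small upward-pointing triangle.
triangle = np.array([
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
])
dtb = distance_to_borders(triangle)
```

For a filled rectangle all four vectors are zero, while the triangle above produces sloped left, right and top profiles; differences of this kind are what a linear SVM could separate.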
In [32] Jimenez et al. focus just on the sign detection problem, dividing it into two sub-blocks that perform shape classification and localization of the sign. This work is a follow-up to [30], which used two different SVMs for detection and classification. The main contribution of this work is the improvement of the detection block, where the new method is reported to be more successful than the distance to borders (DtB) method defined in their previous work [30]. The classification of the shape is achieved by means of connected components. Object rotations are handled with the use of the FFT. The signature of each blob is used for the classification of the shape of the traffic sign. The normalization of the energy of the signature makes the algorithm invariant to image scaling, and the use of the absolute value of the FFT of the normalized signature makes the algorithm invariant to object rotations. Experimental results, evaluated using a large set of randomly generated synthetic images, are also given, showing strong robustness to object scaling, rotation, projective deformation, partial occlusions and noise.
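The two invariances described above follow from basic properties of the discrete Fourier transform and can be checked numerically. In this sketch the signature is assumed to be a one-dimensional sequence sampled at equal angular steps around the blob (for example centroid-to-boundary distances), so that rotating the object circularly shifts the sequence; that interpretation and the function name are assumptions, since the exact signature definition of [32] is not given here.

```python
import numpy as np

def rotation_scale_invariant_descriptor(signature):
    """Unit-energy normalization followed by the FFT magnitude.

    Dividing by the square root of the energy removes a global scale
    factor; taking the absolute value of the FFT removes the phase, which
    is the only part affected by a circular shift of the signature.
    The signature must have non-zero energy.
    """
    sig = np.asarray(signature, dtype=float)
    sig = sig / np.sqrt(np.sum(sig ** 2))
    return np.abs(np.fft.fft(sig))

# Usage: a synthetic three-lobed signature, then rotated and scaled.
angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
signature = 1.0 + 0.3 * np.cos(3.0 * angles)
transformed = 2.5 * np.roll(signature, 7)       # rotated and enlarged object
same = np.allclose(rotation_scale_invariant_descriptor(signature),
                   rotation_scale_invariant_descriptor(transformed))
```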
A more recent study by Escalera et al. [33] uses a genetic algorithm for detection and a neural network for classification. The proposed system not only recognises the traffic sign but also provides information about its condition or state. Traffic signs are detected through color and shape analysis. First the hue and saturation components of the image are analysed, and the regions in the image that fulfill some color restrictions are detected. If the area of one of these regions is big enough, a possible sign can be located in the image. The perimeters of the regions are obtained and a global search for possible signs is performed with an elitist GA. The initial population of the GA is not random, but rather is created according to the color analysis results. A thresholding of the color analysis image is performed, and the number and position of the blobs are obtained. The fitness function is basically the proportion of the number of points whose distance is less than a threshold value. For NN training, RGB is preferred over HSI because the hue value of gray colors is unstable in HSI. Some researchers have used the I component, but the color information would be lost because a dark red pixel (belonging to the sign border) would have the same value as a dark gray one. The NN is finally followed by an additional sign state analysis step. This lets the algorithm report not only the detected sign, but also the confidence in its detection.
Soetedjo and Yamada [34] have focused on traffic sign classification using the Ring Partitioned Method on grayscale images. In contrast to the previously discussed methods, this study does not require many carefully prepared samples for training. In the pre-processing stage, a special method is used to convert the RGB image into a grayscale format which is invariant to illumination changes (called the "specified grayscale image"). First, color thresholding is applied for each of the red, blue, white and black colors. This produces four grayscale images corresponding to the four mentioned colors. These grayscale images are combined by the histogram specification method, a technique to convert an image into one with a particular histogram specified in advance. The method divides a rectangular "specified grayscale image" into several rings, which constitute the ring-partitioned image. A fuzzy histogram value is calculated for each ring, providing better smoothed values. The Euclidean distance is used for matching. It measures the distance between the target image and the reference images. The proposed system has a matching rate of around 95%. However, the circular nature of the rings makes the system applicable only to circular signs.
Another approach is to represent the sign features by using a human vision colour appearance model [35]. The CIECAM97 [36] colour appearance model has been applied to extract colour information and to segment and classify traffic signs. CIECAM97 is a standard colour appearance model recommended by the CIE (International Commission on Illumination) in 1997 for measuring colour appearance under various viewing conditions. It takes weather conditions into account and simulates human perception of colours under various viewing conditions and for different media, such as reflection colours, transmissive colours, etc. For the segmentation step, the colour ranges (hue and chroma) for red, blue, black, and white are detected. Only blue and red signs are used in this study. Based on the range of sign colours, candidate traffic signs are segmented from the rest of the scene using a quad-tree histogram method for further processing. Along with the colour features, the method also applies a method for modeling shape features. The recognition rate is very high for still images of signs under artificial transformations that imitate possible real-world sign distortion (up to 50% for noise level, 50 m for distances to signs, and 5° for perspective disturbances).
Fleyeh [37] proposes a fuzzy approach for traffic sign color detection and segmentation. An RGB image taken by a digital camera is converted into HSV and segmented by a set of fuzzy rules based on the hue and saturation channels. The fuzzy rules perform only the segmentation of the sign colors. The model evaluates the appearance and the color of objects with respect to: 1) the color of the incident light, depending on the CIE [36] curve; 2) the reflectance properties of the object, which are a function of the wavelength of the incident light; and 3) the camera properties. The HSV color space is used because hue is invariant to light variations and saturation changes. Seven fuzzy (if-then) rules are applied to the hue and saturation values. The method does not classify the detected signs.
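As a rough illustration of rule-based fuzzy segmentation on hue and saturation, the sketch below evaluates two if-then rules per pixel. The seven rules of [37] are not reproduced here: the trapezoidal membership functions, their breakpoints, the use of min as the fuzzy AND, and the 0.5 decision threshold are all assumptions chosen for the example, and the function names are hypothetical. Only `colorsys.rgb_to_hsv` from the Python standard library is used for the color conversion.

```python
import colorsys


def trapezoid(x, a, b, c, d):
    """Membership that rises on [a, b], equals 1 on [b, c], and falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def hue_is_red(h_deg):
    # Red wraps around 0/360 degrees, so two trapezoids are combined (assumed ranges).
    return max(trapezoid(h_deg, -1, 0, 10, 30), trapezoid(h_deg, 330, 350, 360, 361))


def hue_is_blue(h_deg):
    return trapezoid(h_deg, 190, 210, 250, 270)  # assumed range


def saturation_is_high(s):
    return trapezoid(s, 0.2, 0.4, 1.0, 1.1)  # assumed range


def fuzzy_color_degrees(rgb):
    """Degrees to which an (R, G, B) pixel in 0..255 is 'sign red' or 'sign blue'."""
    r, g, b = (v / 255.0 for v in rgb)
    h, s, _ = colorsys.rgb_to_hsv(r, g, b)
    h_deg = h * 360.0
    sat = saturation_is_high(s)
    # Rule form: IF hue is <color> AND saturation is high THEN pixel is <color>.
    return {"red": min(hue_is_red(h_deg), sat), "blue": min(hue_is_blue(h_deg), sat)}


def segment_pixel(rgb, threshold=0.5):
    """Label a pixel with the strongest rule output, or 'other' below the threshold."""
    degrees = fuzzy_color_degrees(rgb)
    label = max(degrees, key=degrees.get)
    return label if degrees[label] >= threshold else "other"
```

For instance, `segment_pixel((200, 20, 20))` yields `"red"`, while a gray pixel such as `(128, 128, 128)` has zero saturation and is labeled `"other"`.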
Miura et al. [38] use two cameras to recognize traffic signs. One camera has a wide-angle lens and points in the vehicle's direction of travel, whereas the other camera is equipped with a telephoto lens and can change its viewing direction to focus on the target sign. The detection process first extracts candidates by color and intensity. Next, the telephoto camera is directed to the region of interest and captures a closer view of the candidate signs. To detect circles, they use the fact that if an edge point is part of a circle, the center of the circle must lie on the line that passes through the edge point in the direction of its gradient. After the circles are detected with regard to a fixed threshold value, classification is performed by a normalized correlation-based pattern matching technique using a traffic sign image database.
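The gradient-based circle property described above can be sketched as a voting scheme. This is only an illustration under stated assumptions, not the procedure of [38]: edge points are assumed to be given as (x, y, gx, gy) tuples, candidate radii are assumed to be known in advance, and the function names are hypothetical. Each edge point votes for the two positions that lie one radius away along its gradient line, and a center is accepted when its vote count reaches a fixed threshold.

```python
import math
from collections import Counter


def vote_circle_centers(edges, radii):
    """Accumulate votes for (cx, cy, r) from edge points (x, y, gx, gy)."""
    votes = Counter()
    for x, y, gx, gy in edges:
        norm = math.hypot(gx, gy)
        if norm == 0:
            continue  # no usable gradient direction
        ux, uy = gx / norm, gy / norm
        for r in radii:
            # The gradient may point toward or away from the center, so vote both ways.
            for sign in (1, -1):
                cx = int(round(x + sign * r * ux))
                cy = int(round(y + sign * r * uy))
                votes[(cx, cy, r)] += 1
    return votes


def detect_circles(edges, radii, min_votes):
    """Return ((cx, cy, r), votes) pairs whose vote count reaches the fixed threshold."""
    votes = vote_circle_centers(edges, radii)
    return [(circle, n) for circle, n in votes.most_common() if n >= min_votes]
```

With edge points sampled from a single circle, all votes cast along the inward direction at the correct radius coincide at the true center, while votes for wrong radii or the outward direction are spread over many accumulator cells.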
Piccioli et al. [39] also incorporate both color and edge information to detect road signs from a single image. They apply Kalman-filter-based temporal integration of the extracted information for further improvement, and they claim that performance could be improved by applying their technique to temporal image sequences. Detecting road signs from only a single image has three problems: 1) the positions and sizes of road signs cannot be predicted, so the search space and time cannot be reduced; 2) it is difficult to correctly detect a road sign when temporary occlusion occurs; and 3) the correctness of the detected road signs is hard to verify. By using a video sequence instead of single images, information from the preceding images, such as the number of road signs and their predicted sizes and positions, can be preserved. This information can be used to increase the speed and accuracy of road sign detection in subsequent images.
Another work by Garcia-Garrido et al. [40] aims to recognize both circular (prohibition and obligation) and triangular signs. The system comprises three stages. First, detection is performed by the Hough transform. The Canny edge detector is preferred because it preserves the contours. The threshold for the Canny algorithm is determined dynamically from the histogram. This approach helps to handle various weather and lighting conditions, and even night-time driving. For triangular signs, the aim is to detect three straight lines that intersect each other at 60-degree angles. However, the Hough transform does not provide the start and end points of the lines. If it were applied to the whole image, it would yield too many intersecting lines. To overcome this, the HT is applied to each contour in turn. Second, neural networks are used for classification. Two different neural networks have been implemented: one identifies whether the candidate is a triangular sign and, if so, its type; the other recognizes the circular signs. Both are backpropagation neural networks whose input is a normalized 32x32 pixel image of the candidate sign. Finally, a Kalman filter is employed for tracking, which provides the system with memory. The Kalman filter clearly reduces the computation time. The experiments show that the proposed system has a recognition rate of 98.5% for speed limit signs and 97.2% for warning signs. The system has proven reliable and robust on sunny, cloudy, and rainy days, and also at night. The average processing time of 30 ms per frame makes the system well suited to real-time operation.
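The triangular-sign criterion (three lines meeting at 60-degree angles) can be sketched as a filter over Hough line candidates. The sketch assumes the lines of one contour are already available in the usual Hough normal form, x cos(theta) + y sin(theta) = rho, as (rho, theta in degrees) pairs; the angular tolerance and the function names are hypothetical and are not taken from [40].

```python
import math
from itertools import combinations


def line_angle_deg(theta_a, theta_b):
    """Acute angle in degrees between two lines given their Hough normal angles."""
    d = abs(theta_a - theta_b) % 180.0
    return min(d, 180.0 - d)


def intersection(line_a, line_b):
    """Intersection of two lines in normal form (rho, theta_deg); None if parallel."""
    (rho_a, ta), (rho_b, tb) = line_a, line_b
    a, b = math.radians(ta), math.radians(tb)
    det = math.sin(b - a)
    if abs(det) < 1e-12:
        return None
    x = (rho_a * math.sin(b) - rho_b * math.sin(a)) / det
    y = (rho_b * math.cos(a) - rho_a * math.cos(b)) / det
    return (x, y)


def find_triangle_candidates(lines, tol_deg=5.0):
    """Return (triple, vertices) for line triples whose pairwise angles are near 60 degrees."""
    candidates = []
    for triple in combinations(lines, 3):
        pairs = list(combinations(triple, 2))
        if all(abs(line_angle_deg(p[1], q[1]) - 60.0) <= tol_deg for p, q in pairs):
            vertices = [intersection(p, q) for p, q in pairs]
            candidates.append((triple, vertices))
    return candidates
```

Running this per contour rather than on the lines of the whole image keeps the number of triples small, which mirrors the motivation given above for applying the HT to each contour in turn.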
In a very recent study, Ruta et al. [41] develop a two-stage symbolic traffic sign detection and classification system. The detector is essentially a circle/regular polygon detector with color pre-filtering. For the classification stage, they introduce a novel feature selection algorithm that extracts, for each sign, a small number of critical local image regions that differ most between that sign and the other signs. The comparison to the set of target signs is made using a distance metric based on color distances. A Kalman-filter-based tracker is additionally employed in each frame to predict the position and the scale of a previously detected sign and hence to reduce computation. Owing to the tracker, the sign detector is triggered only once every several frames, for a set of ranges, to detect new sign candidates. This study has three important aspects. First, feature extraction, and hence training, is simple because it is performed directly on the publicly available sign templates. Second, each template is treated and trained individually, which provides a means of measuring its dissimilarity from the remaining templates. Finally, the color distance metric has proven suitable for modeling various traffic signs even though it is trained on ideal sign templates.
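How a Kalman filter can predict the position and scale of a previously detected sign is sketched below. This is a generic constant-velocity filter written for illustration, not the tracker of [41]: the state model, the noise parameters `q` and `r`, the initial covariance, and the class names are all assumptions. Each tracked quantity (x, y, scale) gets its own two-state filter holding a value and its per-frame velocity.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [value, velocity] and a direct measurement of value."""

    def __init__(self, value, q=0.01, r=1.0):
        self.x = [value, 0.0]
        # Large initial velocity variance: the velocity is unknown at first (assumption).
        self.P = [[1.0, 0.0], [0.0, 100.0]]
        self.q, self.r = q, r

    def predict(self):
        """Advance one frame: x = F x, P = F P F^T + q I, with F = [[1, 1], [0, 1]]."""
        self.x = [self.x[0] + self.x[1], self.x[1]]
        P = self.P
        self.P = [
            [P[0][0] + P[0][1] + P[1][0] + P[1][1] + self.q, P[0][1] + P[1][1]],
            [P[1][0] + P[1][1], P[1][1] + self.q],
        ]
        return self.x[0]

    def update(self, z):
        """Correct the state with measurement z (H = [1, 0])."""
        P = self.P
        s = P[0][0] + self.r                 # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s    # Kalman gain
        y = z - self.x[0]                    # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [
            [(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
            [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]],
        ]


class SignTracker:
    """Tracks the image position (x, y) and scale of one detected sign."""

    def __init__(self, x, y, scale):
        self.filters = [ConstantVelocityKalman1D(v) for v in (x, y, scale)]

    def predict(self):
        return tuple(f.predict() for f in self.filters)

    def update(self, x, y, scale):
        for f, z in zip(self.filters, (x, y, scale)):
            f.update(z)
```

In a per-frame loop, `predict()` would give the region and scale in which to look for the sign, and `update()` would be called with the measurement found there, so that a full-image detector is only needed occasionally to pick up new candidates.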
3. References
1. DARPA Grand Challenge, 2004-2007.
2. Continental advanced driver assistance systems.
3. Mercedes-Benz speed limit assist.
4. European commission: European transport policy for 2010. http://europa.eu/legislation_summaries/environment/tackling_climate_change/l24007_en.htm, Oct 2007.
5. P.V. Hough. Method and means for recognizing complex patterns, 1962.
6. Qing Li, Nanning Zheng, and Hong Cheng. Springrobot: a prototype autonomous vehicle and its algorithms for lane detection. Intelligent Transportation Systems, IEEE Transactions on, 5(4):300-308, Dec. 2004.
7. B. Yu and A.K. Jain. Lane boundary detection using a multiresolution Hough transform. In Image Processing, 1997. Proceedings., International Conference on, volume 2, pages 748-751, Oct 1997.
8. J.C. McCall and M.M. Trivedi. Video-based lane estimation and tracking for driver assistance: survey, system, and evaluation. Intelligent Transportation Systems, IEEE Transactions on, 7(1):20-37, March 2006.
9. W.T. Freeman and E.H. Adelson. The design and use of steerable filters. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 13(9):891-906, Sep 1991.
10. Dean Pomerleau. Neural network vision for robot driving. In M. Arbib, editor, The Handbook of Brain Theory and Neural Networks. 1995.
11. Dong-Joong Kang and Mun-Ho Jung. Road lane segmentation using dynamic programming for active safety vehicles. Pattern Recogn. Lett., 24(16):3177-3185, 2003.
12. Yue Wang, Eam Khwang Teoh, and Dinggang Shen. Lane detection using B-snake. Information, Intelligence, and Systems, International Conference on, 0:438, 1999.
13. Michael Kass, Andrew Witkin, and Demetri Terzopoulos. Snakes: Active contour models. International Journal of Computer Vision, 1(4):321-331, 1988.
14. K. Kluge and S. Lakshmanan. A deformable-template approach to lane detection. In Intelligent Vehicles '95 Symposium., Proceedings of the, pages 54-59, Sep 1995.
15. C. Kreucher, S. Lakshmanan, and K. Kluge. A driver warning system based on the LOIS lane detection algorithm. In Proceedings of IEEE International Conference on Intelligent Vehicles, pages 17-22. Stuttgart, Germany, 1998.
16. S. Lakshmanan and K. Kluge. LOIS: A real-time lane detection algorithm. In Proceedings of the 30th Annual Conference on Information Sciences and Systems, 1996.
17. N. Apostoloff and A. Zelinsky. Robust vision based lane tracking using multiple cues and particle filtering. In Intelligent Vehicles Symposium, 2003. Proceedings. IEEE, pages 558-563, June 2003.
18. Y. Zhou, R. Xu, X. Hu, and Q. Ye. A robust lane detection and tracking method based on computer vision. Measurement Science and Technology, 17(4):736-745, 2006.
19. Massimo Bertozzi, Alberto Broggi, Gianni Conte, and Alessandra Fascioli. Obstacle and lane detection on the ARGO autonomous vehicle. In Proc. IEEE Intelligent Transportation Systems Conf. '97, 1997.
20. Mario Bellino, Yuri Lopez De Meneses, Peter Ryser, and Jacques Jacot. Lane detection algorithm for an onboard camera. In SPIE proceedings of the first Workshop on Photonics in the Automobile, 2004.
21. Wilfried Enkelmann. Video-based driver assistance: from basic functions to applications. Int. J. Comput. Vision, 45(3):201-221, 2001.
22. Li Bai, Yan Wang, and Michael Fairhurst. An extended hyperbola model for road tracking for video-based personal navigation. Know.-Based Syst., 21(3):265-272, 2008.
23. Hiren M. Mandalia and Dario D. Salvucci. Using support vector machines for lane change detection. In Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting, 2005.
24. A. de la Escalera, L.E. Moreno, M.A. Salichs, and J.M. Armingol. Road traffic sign detection and classification. Industrial Electronics, IEEE Transactions on, 44(6):848-859, Dec 1997.
25. Chiung-Yao Fang, Sei-Wang Chen, and Chiou-Shann Fuh. Road-sign detection and tracking. Vehicular Technology, IEEE Transactions on, 52(5):1329-1341, Sept. 2003.
26. C. Bahlmann, Y. Zhu, Visvanathan Ramesh, M. Pellkofer, and T. Koehler. A system for traffic sign detection, tracking, and recognition using color, shape, and motion information. In Intelligent Vehicles Symposium, 2005. Proceedings. IEEE, pages 255-260, June 2005.
27. Hsu and Huang. Road sign detection and recognition using matching pursuit method. Image and Vision Computing, 19(3):119-129, February 2001.
28. S.G. Mallat and Zhifeng Zhang. Matching pursuits with time-frequency dictionaries. Signal Processing, IEEE Transactions on, 41(12):3397-3415, Dec 1993.
29. G. Loy and N. Barnes. Fast shape-based road sign detection for a driver assistance system. In Intelligent Robots and Systems, 2004. (IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on, volume 1, pages 70-75, Sept.-Oct. 2004.
30. S. Maldonado-Bascon, S. Lafuente-Arroyo, P. Gil-Jimenez, H. Gomez-Moreno, and F. Lopez-Ferreras. Road-sign detection and recognition based on support vector machines. Intelligent Transportation Systems, IEEE Transactions on, 8(2):264-278, June 2007.
31. C.G. Kiran, L.V. Prabhu, V.A. Rahiman, K. Rajeev, and A. Sreekumar. Support vector machine learning based traffic sign detection and shape classification using distance to borders and distance from center features. In TENCON 2008 - 2008 IEEE Region 10 Conference, pages 1-6, Nov. 2008.
32. Pedro Gil Jimenez, Saturnino Maldonado Bascon, Hilario Gomez Moreno, Sergio Lafuente Arroyo, and Francisco Lopez Ferreras. Traffic sign shape classification and localization based on the normalized FFT of the signature of blobs and 2D homographies. Signal Process., 88(12):2943-2955, 2008.
33. A. De La Escalera, J.M. Armingol, and M. Mata. Traffic sign recognition and analysis for intelligent vehicles. Image and Vision Computing, 21:247-258, 2003.
34. Aryuanto Soetedjo and Koichi Yamada. Traffic sign classification using ring partitioned method. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E, 88:2419-2426, 2005.
35. X.W. Gao, L. Podladchikova, D. Shaposhnikov, K. Hong, and N. Shevtsova. Recognition of traffic signs based on their colour and shape features extracted using human vision models. Journal of Visual Communication and Image Representation, 17(4):675-685, 2006.
36. M. R. Luo and R. W. G. Hunt. The structure of the CIE 1997 colour appearance model (CIECAM97s). Color Research & Application, 23:138-146, 1998.
37. Hasan Fleyeh. Road and traffic sign color detection and segmentation-a fuzzy approach. 2005.
38. J. Miura, T. Kanda, and Y. Shirai. An active vision system for real-time traffic sign recognition. In Intelligent Transportation Systems, 2000. Proceedings. 2000 IEEE, pages 52-57, 2000.
39. Giulia Piccioli, Enrico De Micheli, and Marco Campani. A robust method for road sign detection and recognition. In ECCV '94: Proceedings of the third European conference on Computer vision (vol. 1), pages 495-500, Secaucus, NJ, USA, 1994. Springer-Verlag New York, Inc.
40. M.A. Garcia-Garrido, M.A. Sotelo, and E. Martín-Gorostiza. Fast traffic sign detection and recognition under changing lighting conditions. In Intelligent Transportation Systems Conference, 2006. ITSC '06. IEEE, pages 811-816, Sept. 2006.
41. Andrzej Ruta, Yongmin Li, and Xiaohui Liu. Real-time traffic sign recognition from video by class-specific discriminative features. Pattern Recognition, 43(1):416-430, 2010.
42. John Canny. A computational approach to edge detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on, PAMI-8(6):679-698, Nov. 1986.
43. K. Ratnayake and A. Amer. An FPGA-based implementation of spatio-temporal object segmentation. In Image Processing, 2006 IEEE International Conference on, pages 3265-3268, Oct. 2006.
44. Ades project website.
45. Melanie Mitchell. An Introduction to Genetic Algorithms. The MIT Press, 1998.
46. Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. SURF: Speeded up robust features. In ECCV, pages 404-417, 2006.
47. Kevin P. Murphy, Antonio B. Torralba, Daniel Eaton, and William T. Freeman. Object detection and localization using local and global features. In Jean Ponce, Martial Hebert, Cordelia Schmid, and Andrew Zisserman, editors, Toward Category-Level Object Recognition, volume 4170 of Lecture Notes in Computer Science, pages 382-400. Springer, 2006.
48. Alan Koncar, Holger Janßen, and Saman Halgamuge. Gabor wavelet similarity maps for optimising hierarchical road sign classifiers. Pattern Recognition Letters, 28(2):260-267, 2007.
49. Avinoam Borowsky, David Shinar, and Yisrael Parmet. Sign location, sign recognition, and driver expectancies. Volume 11, pages 459-465, 2008.
50. Miguel S. Prieto and Alastair R. Allen. Using self-organising maps in the detection and recognition of road signs. Image and Vision Computing, 27(6):673-683, 2009.
51. X.W. Gao, L. Podladchikova, D. Shaposhnikov, K. Hong, and N. Shevtsova. Recognition of traffic signs based on their colour and shape features extracted using human vision models. Journal of Visual Communication and Image Representation, 17(4):675-685, 2006.
52. Seung Gweon Jeong, Chang Sup Kim, Kang Sup Yoon, Jong Nyun Lee, Jong Il Bae, and Man Hyung Lee. Real-time lane detection for autonomous navigation. In Intelligent Transportation Systems, 2001. Proceedings. 2001 IEEE, pages 508-513, 2001.
53. Wenhong Zhu, Fuqiang Liu, Zhipeng Li, Xinhong Wang, and Shanshan Zhang. A vision based lane detection and tracking algorithm in automatic drive. In Computational Intelligence and Industrial Application, 2008. PACIIA '08. Pacific-Asia Workshop on, volume 1, pages 799-803, Dec. 2008.















