Poster Paper, Proc. of Int. Conf. on Advances in Computer Engineering 2011
Mobile Robot Localization using Single Camera Vision for Automated Test Bed by using DWT and HSI Color Model

Piyush Kapoor, Pavan Chakraborty, G. C. Nandi, Vibhor Agrawal
Indian Institute of Information Technology, Allahabad, India
piyushkapoor7@yahoo.com, pavan@iiita.ac.in, gcnandi@iiita.ac.in, vibhor_vaibhav@hotmail.com
Abstract— The color information of the environment, sensed by a camera to perform localization, is stored more precisely in the single 'Hue' component of the 'HSI' color model, whereas with the 'RGB' color model all three 'R', 'G' and 'B' components must be used separately to measure color or its distribution. Likewise, representing the distribution of edge pixels in RGB requires an extra edge-computation step. The Discrete Wavelet Transform (DWT) is used to reduce the color-information area to one fourth, and the higher-frequency components can be used directly to represent the edge pixels. The problem of changing camera posture is also taken into consideration; a histogram technique is utilized for this. When implementing a real-time system, time complexity is an important factor, and the computation is reduced by making use of the concepts defined above. The results show that the proposed algorithm solves the localization problem as required by the ongoing Automated Car project.

Index Terms— HSI Color Model, Discrete Wavelet Transform (DWT)
I. INTRODUCTION

For any robot to navigate independently in an outdoor environment, it must be able to localize itself, i.e., the robot must know which location it is in. Regardless of the type of robot in use, whether wheeled, tracked or legged, localization is an essential task. Localization can be performed by utilizing information about the external environment sensed by the sensors employed on the robot. Sonar sensors are widely used to perceive the robot's external environment, but their major drawbacks are cost, since they are not cheaply and easily available to the labs where they are to be used and tested, and the high complexity involved in using them. The alternative is cameras, which are easily and cheaply available. An image also provides the maximum information about the environment, which the system performing localization can exploit by applying image processing operations, as depicted in various previous works [1]-[5]. The only drawback of this methodology is time complexity. The methodology we develop here is tailored to, and better suited for, our application.

The authors of [6] extracted four features of the image to train the localization system. The image is first divided into blocks of size 16 × 16, and the following features are extracted from each block: (1) RGB color components, (2) edge density, (3) degree of distribution of edge directions, computed via circular statistics, and (4) degree of existence of line segments, computed from the largest voting value in Hough space for the edge points in that block. The method was sensitive to changing camera posture, as it depended heavily on the camera posture used when taking the training images; the reported success rate was about 78%. The same authors in [7] used panoramic images instead of normal images to improve the results; accuracy increased, but at the cost of increased computational complexity. The work in [8] focuses on changing illumination conditions and changing camera posture in the robot localization problem. Instead of using direct image features, they used histograms as features, and they used the YCrCb color model instead of RGB to reduce the effect of changing lighting conditions, because the Cr and Cb components represent the red and blue color differences with respect to the corresponding intensity image, i.e., the 'Y' component. The problem of changing camera posture was tackled by using an edge density histogram computed from the edge count of each block, thus considering the overall edge distribution, which is independent of the camera position from which the image is taken. In our application, the changing camera posture will be an issue, but changing lighting and illumination conditions are not a big issue, since the automated car will roam around the campus only in the daytime. Still, to account for them, data was collected on five different days and at different times of day.
II. BASIC CONCEPTS OF THE ALGORITHM
The main aim of the present work is to develop a localization module that can be implemented on an automated four-wheeled car under development in our Robotics & Artificial Intelligence Lab at the Indian Institute of Information Technology Allahabad. The car will roam throughout the campus and could be used for various applications, so our campus is used for localization; however, the model can equally be applied to perform localization in any other environment. For efficient localization, the issue of changing camera posture is essential, because every time the car/robot passes through a location via the road, its orientation when taking an image will be different; sometimes images will be taken from nearby, and sometimes from farther away. Since the car/robot is presently being developed to wander the campus in the daytime only, changing illumination and lighting conditions are not much of an issue.
Rather, we have utilized this fact to improve our location recognition results. So, instead of the YCrCb color model, it is more beneficial to use the Hue, Saturation and Intensity (HSI) color model, because the 'H' and 'S' components represent color in a much better and more concise way than the 'Cr' and 'Cb' components. Moreover, while the RGB color model needs all three components to use color as a feature, in the HSI color model Hue alone is able to represent the color, and Saturation can be used as an extra feature alongside Hue to increase accuracy. Another issue is computational complexity: since the module is to be implemented on a real-time system, results must be delivered in real time without any loss to the car/robot, and without losing any relevant details of the external environment. For this, we first convert the captured video frame into its Discrete Wavelet Transform, which yields four time-frequency components, LL, LH, HL and HH, each one fourth the size of the original image ('L' and 'H' denoting low and high frequency). The color information of the environment is preserved in the LL component, and the LH or HL components can be used directly for the edges in the scene, so no separate edge-finding function is needed. Thus, the image area to be processed for feature extraction is reduced to one fourth without losing any features, while eliminating the edge-finding module; this reduces the computational cost.
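As an illustration of this decomposition, the following is a minimal sketch (not the authors' code), assuming OpenCV and PyWavelets are available; OpenCV's HSV conversion stands in for the HSI model used in the paper:

```python
import cv2
import numpy as np
import pywt

def decompose_frame(frame_bgr):
    """Single-level 2-D DWT per channel; HSV conversion of the LL band."""
    channels = cv2.split(frame_bgr.astype(np.float32))
    # pywt returns (cA, (cH, cV, cD)), i.e. roughly (LL, (LH, HL, HH)).
    bands = [pywt.dwt2(c, 'haar') for c in channels]

    # Reassemble LL as a 3-channel image, one fourth of the original area.
    ll = cv2.merge([b[0] for b in bands])
    ll = np.clip(ll / 2.0, 0, 255).astype(np.uint8)  # Haar LL roughly doubles the range

    # The H and S components come from here (HSV as a stand-in for HSI).
    ll_hsv = cv2.cvtColor(ll, cv2.COLOR_BGR2HSV)

    # HL band of the green channel is kept as the edge-information source
    # (the channel choice is our assumption; the paper does not specify one).
    hl = bands[1][1][1]
    return ll_hsv, hl
```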
III. PROPOSED ALGORITHM

The whole process is divided into the following steps: (1) collection of a large data set to be used for training as well as testing, (2) feature extraction, (3) training the localization system, and (4) a testing phase to evaluate and analyze the results. A large amount of data was collected from the IIIT-A campus on five different days; in total, more than ten thousand image frames were acquired from video clips captured throughout the campus, of which about 500 images were used to train the localization system. The feature extraction phase involves the following steps:
o Compute the 2-D DWT of the RGB color image of size N × N: [LL, HL, LH, HH] = DWT(Image(R, G, B))
Fig. 1 (a) Original image of campus (b) Image after taking the Discrete Wavelet Transform
o Convert LL from RGB to HSI color space: LL(H, S, V) = Convert(LL(R, G, B))
o Divide both the 'H' and 'S' components of the LL band into non-overlapping blocks of size M × M.
o Compute the average value of each block.
o The histograms of these average values from the 'H' and 'S' components are used as features.
o Take the HL band.
o Convert the HL band image into binary, if it is not already.
o Divide it into non-overlapping blocks of size L × L.
o On each block, count the number of edge pixels: ed(i, j) = e_n, the number of edge pixels in block (i, j).
o A histogram of the number of edge pixels per block is generated and used as a feature to train the localization system.
o Combine both computed features to generate a generalized feature for training (see the sketch below).
The system was trained using around five hundred images. The finalized feature vector used as a reference at classification time was found by computing the centroid of the features of each class. The classification algorithm is as follows (sketched in code after Fig. 2):
Step 1: Extract the feature from the image to be tested.
Step 2: Generate the Location Identifier (defined in the next section).
Step 3: Using the Location Identifier, compare the extracted feature with the reference feature vectors of the considered classes.
Step 4: The closest (nearest) feature vector determines the class assigned to the image.
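A minimal sketch of this feature extraction pipeline is given below. It is illustrative only: the block sizes M and L, the histogram bin count, and the binarization threshold are our assumptions, since the paper does not give exact values.

```python
import numpy as np

def block_reduce_mean(img, block):
    """Average of each non-overlapping block x block region of a 2-D array."""
    h = img.shape[0] - img.shape[0] % block
    w = img.shape[1] - img.shape[1] % block
    v = img[:h, :w].reshape(h // block, block, w // block, block)
    return v.mean(axis=(1, 3))

def extract_feature(ll_hsv, hl, m=16, l=16, bins=32):
    """Histogram features from H and S block averages plus HL edge counts."""
    h_avg = block_reduce_mean(ll_hsv[..., 0].astype(np.float32), m)
    s_avg = block_reduce_mean(ll_hsv[..., 1].astype(np.float32), m)
    hue_hist, _ = np.histogram(h_avg, bins=bins, range=(0, 180))  # OpenCV hue range
    sat_hist, _ = np.histogram(s_avg, bins=bins, range=(0, 256))

    # Binarize the HL band (threshold is an assumption) and count edge
    # pixels in each L x L block, then histogram the per-block counts.
    edge_map = (np.abs(hl) > np.abs(hl).mean()).astype(np.float32)
    edge_counts = block_reduce_mean(edge_map, l) * (l * l)
    edge_hist, _ = np.histogram(edge_counts, bins=bins, range=(0, l * l))

    # Combine all histograms into one generalized feature vector.
    return np.concatenate([hue_hist, sat_hist, edge_hist]).astype(np.float32)
```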
Fig. 2 (a) LL band image (b) LL band converted into 'H', 'S' and 'V' color components
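The centroid training and the nearest-feature classification of Steps 1-4 above can be sketched as follows; the Euclidean distance metric and the data layout are our assumptions, not specified by the paper:

```python
import numpy as np

def train_location_identifiers(features_by_class):
    """features_by_class: dict of class name -> list of feature vectors.
    The reference vector of a class is the centroid of its training features."""
    return {name: np.mean(np.stack(feats), axis=0)
            for name, feats in features_by_class.items()}

def classify(feature, identifiers):
    """Steps 1-4: compare the extracted feature with each reference vector
    and return the class whose reference is nearest (Euclidean here)."""
    return min(identifiers, key=lambda name: np.linalg.norm(feature - identifiers[name]))
```

With the four campus regions of Section IV as classes, classify() would return one of the four labels for each test frame.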
After performing the training operation, it is desirable to check whether the samples taken for training the system actually work. So, after the features were extracted from the training images, the location identifiers were generated by the centroid computation. For this purpose, the cross-correlation between the location identifiers of the four different classes (three location identifiers exist for each class, as defined above in the algorithm) was computed and is tabulated below:

TABLE I. CROSS-CORRELATION BETWEEN THE 'HUE DISTRIBUTION' LOCATION IDENTIFIERS FOR ALL FOUR CLASSES
TABLE II. CROSS-CORRELATION BETWEEN THE 'SATURATION DISTRIBUTION' LOCATION IDENTIFIERS FOR ALL FOUR CLASSES
The cross-correlation results show that, through the proposed scheme, the system is able to distinguish between the various locations mainly through the Hue and Edge identifiers. Saturation has not proven to be a good location identifier for the current locations, owing to the similarity of the locations across the IIIT-A campus. This does not mean, however, that saturation will always be an unusable location identifier; it depends upon the type of data being worked on. That is why, despite its unimportance for our data, it has not been removed from the proposed scheme.
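For reference, the cross-correlation between two location identifiers reported in Tables I-III can be computed as below. This is a sketch; the exact normalization the authors used is not specified, so the zero-mean normalized form is our assumption:

```python
import numpy as np

def cross_correlation(id_a, id_b):
    """Normalized (zero-mean) cross-correlation of two identifier histograms;
    values near 1 indicate classes the identifier cannot separate."""
    a = id_a - id_a.mean()
    b = id_b - id_b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```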
Fig. 3 (a) 'H' and 'S' images (b) Division into blocks (c) Histograms taken as features (d) Edge distribution extracted from the HL band image as a histogram
IV. EXPERIMENTAL RESULTS

Fig. 4 below shows the automated robot/car for which the localization module is being developed. The efficiency of the algorithm was tested on one thousand images of every class, or location, to be identified. The whole IIIT-A campus is divided into the following four regions: (1) garden area, (2) building outdoor area, i.e., in front of building walls, (3) road area, and (4) building indoor area.
TABLE III. CROSS-CORRELATION BETWEEN THE 'EDGE DISTRIBUTION' LOCATION IDENTIFIERS FOR ALL FOUR CLASSES
These cross-correlation values also suggest that the very simple centroid computation methodology is efficient in our algorithm, so there is no need to use a more complex method that would increase the time complexity of the overall algorithm. The CPU computation time, measured on a system with an Intel(R) Core(TM)2 2.00 GHz processor and 1 GB RAM, was found to be 3.4823 seconds for an 800 × 600 frame. This means our system can process an image frame approximately every four seconds.
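Such a per-frame measurement can be reproduced as sketched below; 'frame.jpg' is a hypothetical test image, and decompose_frame/extract_feature are the illustrative sketches defined earlier:

```python
import time
import cv2

frame = cv2.imread("frame.jpg")  # e.g. an 800 x 600 campus frame (hypothetical file)
t0 = time.perf_counter()
ll_hsv, hl = decompose_frame(frame)
feature = extract_feature(ll_hsv, hl)
print(f"per-frame processing time: {time.perf_counter() - t0:.4f} s")
```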
Fig. 4 Picture of the automated robot/car for which the localization module is being developed
A similar system is being employed on the automated test bed. Processing an image every four seconds is safe enough for decision making in our project; the speed at which the car travels across the campus is low enough that this processing rate poses no problem. The time required to compute the 2-D DWT of an image frame is negligible compared with the time required to extract features from a frame of four times that size, and all of this is done without any loss of relevant information. We were also able to use the saturation component of the frame, in addition to hue, to improve the localization results without placing too much of a time constraint on the CPU. In the testing phase, more than a thousand image frames of the IIIT-A campus were used to evaluate the algorithm's efficiency. The number of frames taken for testing from each class is as follows:
TABLE IV. NUMBER OF FRAMES TAKEN FOR TESTING FROM EACH CLASS
The confusion matrix for the classification is tabulated below:

TABLE V. CONFUSION MATRIX
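From such a confusion matrix, the per-class accuracy reported next can be derived as sketched below (assuming rows are true classes and columns are predicted classes):

```python
import numpy as np

def per_class_accuracy(confusion):
    """Fraction of correctly classified frames per class:
    diagonal over row sums, assuming rows are true classes."""
    confusion = np.asarray(confusion, dtype=np.float64)
    return confusion.diagonal() / confusion.sum(axis=1)
```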
So, the accuracy of the above algorithm, in percent, is tabulated below:

TABLE VI. CLASSIFICATION ACCURACY
Clearly, the above results show that high accuracy is achieved with a fairly simple and efficient implementation; the hardware setup is also quite simple and can be handled by any computer science person. The scheme works because it makes use of the maximum information about the color and shape of the various objects present in the image frames. Even though the saturation matrix did not suit our data well (as seen in the results), it is still incorporated in the algorithm to provide at least a little weightage to the classification accuracy. Looking at the confusion matrix, we find that most of the Garden frames misclassified as Building Outdoor are due to the large cross-correlation value in the Saturation distribution. Using histograms, i.e., the overall distribution of information throughout the image frame rather than using it directly with index values, has made the proposed technique independent of camera posture.

CONCLUSION

Localization is performed efficiently in the application by using the aforesaid technique. The classification accuracy is enhanced by making use of the 'HSI' color model, and the computational time complexity is low enough for implementation in a real-time robotic system. The integrity of the results is also maintained under changes of the camera's orientation and posture (the algorithm was tested on a large data set). The first step of localization in the project is thus completed. The second module is to use the above algorithm to localize in a more fine-grained manner: for instance, the robot should be able to tell not only that it is in front of a building, but also which building it is, which road of the campus it is on, or which lab of a building it is inside. Research on this problem is ongoing in our institute.
REFERENCES
[1] Stephen Se, David Lowe, Jim Little, "Local and Global Localization for Mobile Robots using Visual Landmarks", Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2001), pp. 414-420, 2001.
[2] Jurgen Wolf, Wolfram Burgard, Hans Burkhardt, "Using an Image Retrieval System for Vision-Based Mobile Robot Localization", Proc. International Conference on Image and Video Retrieval (CIVR), pp. 108-119, 2002.
[3] Kenji Tanaka, Mitsuru Hirayama, Nobuhiro Okada, Eiji Kondo, "Location-Driven Image Retrieval for Images Collected by a Mobile Robot", JSME International Journal, Series C, Vol. 49, No. 4, 2006.
[4] A. Remazeilles, F. Chaumette, "Image-based robot navigation from an image memory", Robotics and Autonomous Systems, Vol. 55, No. 4, pp. 345-356, 2007.
[5] S. A. Engelson, "Using image signatures for place recognition", Pattern Recognition Letters, Vol. 19, No. 10, pp. 941-951, 1998.
[6] H. Morita, M. Hild, J. Miura, Y. Shirai, "View-based localization in outdoor environments based on support vector learning", Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pp. 3083-3088, 2005.
[7] H. Morita, M. Hild, J. Miura, Y. Shirai, "Panoramic view-based navigation in outdoor environments based on support vector learning".
[8] Tomohiro Uchimoto, Sho'ji Suzuki, Hitoshi Matsubara, "A method to estimate robot's location using vision sensor for various types of mobile robots".