Finding the 3D Orientation of a Line Using Hough Transform and a Stereo Pair


FINDING THE 3D ORIENTATION OF A LINE USING HOUGH TRANSFORM AND A STEREO PAIR

A. GASTERATOS AND C. BELTRAN
LABORATORY FOR INTEGRATED ADVANCED ROBOTICS (LIRA-Lab)
DEPARTMENT OF COMMUNICATION, COMPUTER AND SYSTEM SCIENCES
UNIVERSITY OF GENOA, VIALE CAUSA, GENOA, ITALY
E-MAIL: antonis@lira.dist.unige.it

Technical Report LIRA-TR 3/00, November 2000


ABSTRACT: In the human vision system, the differences between the left and right images are used to recover 3D properties of a scene. Similarly, in an artificial vision system the differences between the two images can be used to extract many useful 3D characteristics, such as the depth, the surface normal and the exact position of a point. In this report we derive the formulation for the computation of the orientation of a line in space, based on its projections on a stereo pair of images and on the angle of convergence of the two cameras. Experiments have been carried out which show that, with our setup, the proposed technique is rather unstable. Solutions for future implementations are proposed, namely raising the estimation range of the Hough accumulation array and increasing the length of the baseline.

Keywords: LINE ORIENTATION, HOUGH TRANSFORM, STEREO VISION



I. INTRODUCTION

The orientation of a line in space is given by two angles, named elevation and azimuth, respectively. It is well known, and also intuitively apparent, that this orientation cannot be recovered from a single view. It can be recovered, though, using a single moving camera or a stereo pair. In both cases the slope of the line on the image plane is combined with the relative transformation matrix (rotation and translation) between the two views.

A well-known and widely used method for detecting straight lines in an image is the so-called Hough transform [1]. In the general case the Hough transform is used to represent any geometrical shape as a single point in a space defined by its geometrical characteristics [2], [3]. For a line these characteristics may be its distance from the centre of the image and its slope. By applying the Hough transform for straight lines on an image, therefore, recovering the slope of any line on the 2D plane is a straightforward process. The Hough transform has proven very accurate for straight-line extraction, even in the presence of noise. Its main disadvantage is its slowness: the time complexity is O(N·m), where N is the number of black pixels and m is the skew estimation range divided by the resolution. To overcome this drawback, several techniques for fast extraction of the Hough transform have been proposed in the recent literature. These include parallel implementation [4], a hierarchical scheme [5] and rectangular block decomposition [6].

In 3D space the straight lines constitute a 4D parameter space. Therefore, using a 4D Hough transform to detect such line fragments would be irksome. However, in the recent literature there can be found examples of utilising the Hough transform on monocular or binocular platforms for the extraction of 3D segments. Bhattacharya et al. [7] illustrate the use of the (2+2)D Hough space for the detection of straight lines in range images of scenes consisting of quadrangular shapes. Meribout et al. [8] have presented a hardware structure for real-time 3D segment extraction using binocular vision and a generalised Hough transform.

In this report we derive the formulation for the computation of the azimuth and elevation angles of a line. We utilise a robotic head with 3 degrees of freedom. The 3D position of a straight line is extracted by utilising the Hough transform on each image and the convergence angle of the two optical axes. We assume that our stereo setup is finely aligned and the vergence angle symmetric [9]. In order to identify the line on each image plane a CAD model is used [10]. The line is extracted from the CAD model and projected on each image plane using a simple pin-hole model. The best match, using a weighted Least-Mean-Square (LMS) algorithm, between the projected line and the lines found by the Hough transform is assumed to be the real projection of the line onto the image plane.

Experiments have been carried out to measure the accuracy of the proposed method. Its main limitation in accuracy is the quantisation of the Hough space: the finer the resolution of the Hough angle, the greater the accuracy of the line orientation. However, as mentioned above, the computation of the Hough transform demands long computation times, which are related to the resolution of the angle. Therefore, there is a trade-off between fast extraction and accuracy.
The rest of the report is organised as follows. In section II a brief overview of the Hough transform for line detection on 2D planes is given; moreover, our technique for the elimination of false line detections is described. A brief description of the measurement technique for the 3D coordinates of a single point is provided in section III. In section IV the formulation for the determination of the 3D pose of a line segment is derived. Experimental results are given in section V and, finally, in section VI concluding remarks are made.



II. THE HOUGH TRANSFORM

Let us suppose that we are looking for straight lines in an image. If we take a point (X, Y) in the image, all the lines which pass through that pixel have the form:

Y = aX + b    (1)

for varying values of a and b (see Figure 1). Therefore, we can say that through a certain point (X, Y) on the image plane a family of lines passes, with characteristics:

b = -aX + Y    (2)

where X, Y are now considered to be constants, and a, b the variables.

However, as can be seen in Figure 1, when the line is perpendicular to the X axis, a is not defined. For this reason an equivalent, but more robust, way to represent this family of lines is:

ρ = X·cos θh + Y·sin θh    (3)

Figure 1: Lines passing through a point

where θh is the angle of the perpendicular from (0, 0) to the line and ρ is the length of that perpendicular (see Figure 2). In order to cover the whole space it is enough that either ρ is signed and 0 ≤ θh < π, or ρ is taken in absolute value and 0 ≤ θh < 2π. Throughout this report we follow the first convention.
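To make equation (3) concrete, the following short Python sketch computes the (ρ, θh) curve traced in Hough space by a single image point; the sampling of θh is an arbitrary choice for illustration.

```python
import numpy as np

def point_to_hough_curve(X, Y, n_theta=180):
    """Sinusoid rho(theta_h) traced in Hough space by the image point (X, Y),
    with signed rho and 0 <= theta_h < pi (the convention adopted here)."""
    theta_h = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rho = X * np.cos(theta_h) + Y * np.sin(theta_h)  # equation (3)
    return theta_h, rho
```

Two collinear points produce two such sinusoids that intersect at the (ρ, θh) cell of their common line, which is the property exploited next.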

Figure 2: Lines passing through a point, in the (ρ, θh) representation

Now consider two points in the (X, Y) space which lie on the same line, as shown in Figure 3a. For each point, all of the possible lines through it are represented by a single curve in (ρ, θh) space (Figure 3b).


Figure 3: (a) Two points in the (X, Y) space and (b) the same points represented by curves in the (ρ, θ) space



Thus, the single line in the (X, Y) space which passes through these two points corresponds to the intersection of the two curves in (ρ, θh) space, as shown in Figure 3b. Therefore, all the pixels which lie on the same line in the (X, Y) space are represented by curves that intersect at a single point in the (ρ, θh) space. The coordinates of this single point, through which all the curves pass, give the values of ρ and θh in equation (3) and, thus, a sufficient description of the line on the image plane. The procedure that is followed to detect straight lines in an image is the following (a minimal sketch of the four steps is given after the list):

1. Quantise the (ρ, θh) space into a two-dimensional array (the Hough accumulation array, HAA).
2. Initialise all elements of HAA(ρ, θh) to zero.
3. For each edgel¹ (X, Y), increase by 1 all the elements of HAA(ρ, θh) whose indices ρ, θh satisfy equation (3).
4. Search for the elements of HAA(ρ, θh) with the largest values. These correspond to lines in the original image.
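A minimal Python sketch of steps 1 to 4, assuming the edge map is available as a binary NumPy array; the 1-degree resolution and the choice of 7 peaks mirror the example of Figure 4 but are otherwise arbitrary.

```python
import numpy as np

def hough_accumulate(edges, n_theta=180, n_peaks=7):
    """Steps 1-4: build the Hough accumulation array (HAA) and return the
    (rho, theta_h) pairs of the strongest cells."""
    h, w = edges.shape
    rho_max = int(np.ceil(np.hypot(h, w)))                  # largest |rho|
    theta = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    haa = np.zeros((2 * rho_max + 1, n_theta), dtype=np.int32)  # steps 1-2

    ys, xs = np.nonzero(edges)                              # the edgels
    for X, Y in zip(xs, ys):                                # step 3
        rho = X * np.cos(theta) + Y * np.sin(theta)         # equation (3)
        haa[np.round(rho).astype(int) + rho_max, np.arange(n_theta)] += 1

    flat = np.argsort(haa, axis=None)[::-1][:n_peaks]       # step 4
    r_idx, t_idx = np.unravel_index(flat, haa.shape)
    return [(int(r) - rho_max, float(theta[t])) for r, t in zip(r_idx, t_idx)]
```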

Figure 4: (a) Original image and (b) the 7 dominant lines found in HAA, following the steps from 1 to 4

However, following the aforementioned strategy, problems like the one shown in Figure 4 might arise. In the case of Figure 4b spurious lines appear along the diagonal of the image. The reason is that, since there are more pixels along the diagonals, it is also more probable that more edgels fall there without necessarily belonging to a line that runs along the diagonal.

¹ Edgel = edge element, i.e. a pixel that has been detected by an edge detector as belonging to an edge in the image.



Figure 5: (a) HAA normalisation mask and (b) the 7 dominant lines after the application of the previous mask on the HAA of Figure 4a; the improvement is significant if one compares Figures 4b and 5b

Another common problem is that, for luminosity reasons, the edge detector may detect many more edgels around the real line, or, for quantisation reasons, an edgel may participate in more than one line. The result is that many neighbouring peaks appear in the HAA, representing lines with small deviations in ρ or θh. In order to overcome the two aforementioned problems, the following two steps are proposed to be sequenced after the 3rd step of the procedure (a sketch of both is given below):

5. A normalisation mask such as the one shown in Figure 5a should be applied to the HAA just after its extraction. This mask reduces all the lines passing through the centre of the image plane to the same length. In this way the probability of spurious lines appearing along the diagonals is equalised. Figure 5b presents the result of the normalisation mask of Figure 5a on the HAA of Figure 4b.
6. A binary decision mask should be "convolved" with the HAA. The mask retains the maximum of the neighbourhood that it overlays, whilst it forces the rest to zero. Figure 6 presents the result of the maximum-retain 5x5 round mask on the HAA of Figure 4a.
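A possible rendering of steps 5 and 6 in Python, assuming the HAA built in the previous sketch. The chord-length normalisation below is a crude sampling-based stand-in for the mask of Figure 5a, whose exact values are not reproduced here; the maximum-retain mask follows the description of step 6 directly.

```python
import numpy as np
from scipy import ndimage

def chord_length(rho, theta_h, h, w, n=400):
    """Approximate length of the line (rho, theta_h) inside an h x w image,
    by sampling points along it; used to build the normalisation mask."""
    diag = np.hypot(h, w)
    t = np.linspace(-diag, diag, n)
    x = rho * np.cos(theta_h) - t * np.sin(theta_h)
    y = rho * np.sin(theta_h) + t * np.cos(theta_h)
    inside = (x >= 0) & (x < w) & (y >= 0) & (y < h)
    return inside.mean() * 2 * diag

def normalise(haa, h, w, rho_max):
    """Step 5: equalise line lengths so that diagonals are not favoured."""
    n_rho, n_theta = haa.shape
    out = haa.astype(float)
    for i in range(n_rho):
        for j in range(n_theta):
            length = chord_length(i - rho_max, np.pi * j / n_theta, h, w)
            out[i, j] = out[i, j] / length if length > 0 else 0.0
    return out

def max_retain(haa, radius=2):
    """Step 6: keep each cell only if it is the maximum under a round
    (5x5 for radius=2) mask centred on it; zero the rest."""
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    footprint = (yy ** 2 + xx ** 2) <= radius ** 2
    local_max = ndimage.maximum_filter(haa, footprint=footprint)
    return np.where(haa == local_max, haa, 0)
```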

Figure 6: The 7 dominant lines after the application of a maximum-retain 5x5 round mask on the HAA of Figure 4a after step 5; compared with Figure 5b the improvement is significant.



III. MEASUREMENT OF THE 3D COORDINATES OF A SINGLE POINT

A junction is the intersection of two or more lines on the image plane. The junction tracking algorithm and the measurement method are described in [11]. Briefly, the sequence described in the previous section is followed to extract the lines on the image planes of a stereo pair. The lines and the related junctions are provided by the CAD model of the observed object, and the extracted lines and junctions are related to the CAD model using a weighted LMS method. Then a closed-loop method is followed, so that by simultaneously moving the three d.o.f. of the robotic head the junction is brought to the principal point of the image in both images (right and left). When this is the case, the two cameras are verging on that junction and the direct kinematics of the head are applied in order to determine the 3D position of the junction relative to the head.

IV. EXTRACTING THE 3D INFORMATION FROM A LINE

One way to describe the orientation of a line in space is by two angles (α, β), as shown in Figure 7. These two angles have been chosen for simplicity, because, as will be shown below, the extraction of α is straightforward. These angles, along with the coordinates (x1, y1, z1) of a point Z1 of the line, provide the full pose of the line. The extracted 3D position of Z1 is computed as described in section III, using a closed-loop approach. Let us now consider another point Z2 = (x2, y2, z2) belonging to the same line. The sought angles are given by:

α = atan((z2 − z1)/(y2 − y1))    (4)

and

β = atan((x2 − x1)/√((y2 − y1)² + (z2 − z1)²))    (5)

Figure 7: The two angles α and β are expressed in a reference frame parallel to that of the left camera and determine the orientation of the line in space
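A direct transcription of equations (4) and (5); plain atan is kept to match the text, so the results live in (−90°, 90°).

```python
import numpy as np

def line_angles(Z1, Z2):
    """Equations (4)-(5): the angles alpha and beta, in degrees, of the line
    through the 3D points Z1 and Z2 (frame of Figure 7)."""
    x1, y1, z1 = Z1
    x2, y2, z2 = Z2
    alpha = np.arctan((z2 - z1) / (y2 - y1))                  # equation (4)
    beta = np.arctan((x2 - x1) / np.hypot(y2 - y1, z2 - z1))  # equation (5)
    return np.degrees(alpha), np.degrees(beta)
```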

Let us consider the case of Figure 8. The head is converging at the junction Z1 = (x1, y1, z1), which is also a point of the sought line. Therefore, with respect to the frame of the left camera:

Z1 = (x1, y1, z1) = (d/(2·sin v) − c, 0, 0)    (6)

where d is the baseline and c is the distance from the rotation point of the camera to the camera sensor. By substituting into (4) and (5) we respectively get:

α = atan(z2/y2)    (7)

and

β = atan((x2 − d/(2·sin v) + c)/√(y2² + z2²))    (8)

Figure 8: A robotic head converging with symmetric vergence angle v on a 3D point Z1
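In code, the verging geometry of Figure 8 reads as follows (angles in radians; d and c as defined above):

```python
import numpy as np

def verging_point(d, c, v):
    """Equation (6): the fixated junction Z1 in the left camera frame."""
    return np.array([d / (2.0 * np.sin(v)) - c, 0.0, 0.0])

def angles_while_verging(Z2, d, c, v):
    """Equations (7)-(8): alpha and beta when the head verges on Z1."""
    x2, y2, z2 = Z2
    alpha = np.arctan(z2 / y2)                                     # eq (7)
    beta = np.arctan((x2 - d / (2.0 * np.sin(v)) + c)
                     / np.hypot(y2, z2))                           # eq (8)
    return alpha, beta
```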



By utilising the pin-hole model on the left image plane (see also Figure 9), the 3D points Z1 = (x1, y1, z1) and Z2 = (x2, y2, z2) are represented by the 2D points (X1, Y1) and (X2, Y2), which are:

X1 = f·y1/(f − x1) = 0 ,  Y1 = f·z1/(f − x1) = 0    (9)

and

X2 = f·y2/(f − x2) ,  Y2 = f·z2/(f − x2)    (10)

respectively, where f is the focal length of the camera. By combining equations (7) and (10) we get:

α = atan(Y2/X2) = π/2 − θhL    (11)

where θhL is the representative angle of the line, in the Hough plane, for the left image.

Figure 9: The projection of a line in space (green) onto an image plane (red), according to the pinhole model
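A quick numerical check of equation (11): project a second point of the line under the pin-hole model and compare atan(Y2/X2) with π/2 − θhL. The focal length and the test point below are arbitrary values chosen for the check.

```python
import numpy as np

def project(P, f):
    """Equations (9)-(10): pin-hole projection onto the image plane."""
    x, y, z = P
    return f * y / (f - x), f * z / (f - x)

f = 8.0                           # focal length, arbitrary test value
Z2 = (400.0, 50.0, 120.0)         # second point of the line, arbitrary
X2, Y2 = project(Z2, f)
alpha = np.arctan(Y2 / X2)        # equals atan(z2/y2) of equation (7)
theta_hL = np.pi / 2 - alpha      # equation (11)
```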

Given the topology of the head presented in Figure 8, the transformation matrix from the left to the right camera frame is:

T = | cos 2v   -sin 2v   0   c·cos 2v + d·sin v - c |
    | sin 2v    cos 2v   0   c·sin 2v - d·cos v     |
    |   0         0      1             0            |
    |   0         0      0             1            |    (12)

and its inverse:

T⁻¹ = |  cos 2v   sin 2v   0   c·cos 2v + d·sin v - c |
      | -sin 2v   cos 2v   0   -c·sin 2v + d·cos v    |
      |    0        0      1             0            |
      |    0        0      0             1            |    (13)
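The consistency of (12) and (13) can be verified numerically; in the sketch below (d, c, v are placeholder values) np.linalg.inv reproduces the closed form of equation (13):

```python
import numpy as np

def T_left_to_right(d, c, v):
    """Equation (12): homogeneous transform between the camera frames."""
    s2, c2 = np.sin(2 * v), np.cos(2 * v)
    return np.array([[c2, -s2, 0, c * c2 + d * np.sin(v) - c],
                     [s2,  c2, 0, c * s2 - d * np.cos(v)],
                     [0,    0, 1, 0],
                     [0,    0, 0, 1]])

d, c, v = 0.3, 0.02, np.radians(12.0)      # placeholder geometry
T = T_left_to_right(d, c, v)
T_inv = np.linalg.inv(T)                    # matches equation (13)
assert np.allclose(T @ T_inv, np.eye(4))
```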

If the points Z1 and Z2 have, with respect to the frame of the right camera, the coordinates (x′1, y′1, z′1) and (x′2, y′2, z′2), then, given the matrix T, these coordinates are:

(x′1, y′1, z′1) = (d/(2·sin v) − c, 0, 0)    (14)

and

(x′2, y′2, z′2) = (x2·cos 2v + y2·sin 2v + c·cos 2v + d·sin v − c, −x2·sin 2v + y2·cos 2v + d·cos v − c·sin 2v, z2)    (15)

respectively. Using the pin-hole model for the right camera as well, we obtain:

tan(π/2 − θhR) = Y′2/X′2 = z′2/y′2    (16)

Combining (7), (15) and (16), and after some algebraic and trigonometric manipulation, we get:



 [2 1  1 tan Ď‘ hR − tan Ď‘hL cos 2Y + (G cos Y − F sin 2Y ) =  ] 2 sin 2Y  ]2 

(17)

From Figure 7 and equation (11) it can be derived that:

\ 2 + ] 2 = ] 2 /sinÎą = ] 2 /sin (Ď€ / 2 − Ď‘hL ) = ] 2 /cosĎ‘hL 2

2

(18)

By combining (8), (17) and (18) we finally obtain that: β=atan( cosθ hL

tan θ hL cos 2Y − tan θ hR ) sin 2Y

(19)

Equations (11) and (19) are adequate to provide the orientation of the line in space with respect to the frame of the left camera. The only data needed are the Hough angles of the line in both images and the vergence angle. Moreover, using also equation (6), the full pose of the line with respect to the left camera frame is known. The necessary conditions for the above to hold are (a compact sketch of the recovery follows the list):

i. the vergence angle of the head is symmetric;

ii. the junction Z1 is put in the principal point of both images.
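Putting equations (6), (11) and (19) together, the complete recovery can be written compactly (angles in radians; it presumes conditions i and ii):

```python
import numpy as np

def line_pose_from_hough(theta_hL, theta_hR, v, d, c):
    """Orientation (alpha, beta) and point Z1 of the line, in the left camera
    frame, from the two Hough angles and the symmetric vergence angle v."""
    alpha = np.pi / 2 - theta_hL                                   # eq (11)
    beta = np.arctan(np.cos(theta_hL)
                     * (np.tan(theta_hL) * np.cos(2 * v) - np.tan(theta_hR))
                     / np.sin(2 * v))                              # eq (19)
    Z1 = (d / (2.0 * np.sin(v)) - c, 0.0, 0.0)                     # eq (6)
    return alpha, beta, Z1
```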

However, equation (19) is quite unstable in the discrete case that is used to extract the orientation of the lines in the image planes (Hough transform). This is shown in Figure 10. We have examined a common case for our system: for several values of the vergence angle, corresponding to distances from about 1 m to 10 m, and for a given value of θhL = 0, we have plotted the corresponding value of β against θhR ranging from −5 degrees to 5 degrees. As stated above, this is a common case for our system, which explores the inner part of a ship structure. This means that the typical distances of the head from a line are in the chosen range. Besides, the environment is usually composed of horizontal and vertical lines; hence, with the head parallel to the floor, θhL = 0 is a common case and, moreover, in this case θhR should be in the range of ±5 degrees. Therefore, as can be observed in Figure 10, an error of one degree due to quantisation might affect the angle β by tens of degrees at long distances, whilst even for short distances the error is not negligible, i.e. more than 5 degrees.

Figure 10: Plots of angle β against θhR for several vergence angles and for θhL = 0

The solutions in this case would be either to increase the quantisation of the HAA or to increase the length of the baseline. The first is obvious, but, as stated previously in this report, it slows down the efficiency of the system significantly. By using a longer baseline, for the same distances, the vergence angle is bigger and, from Figure 10, the error in the angle β smaller. However, in this case the hardware used is fixed, and it should be taken into account for future implementations. The sensitivity can be reproduced numerically, as sketched below.
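A short numerical illustration of this instability; the baseline value below is assumed for the example (the Eurohead baseline is not restated in this report) and c is neglected:

```python
import numpy as np

def beta_from_hough(theta_hL, theta_hR, v):
    """Equation (19), all angles in radians."""
    return np.arctan(np.cos(theta_hL)
                     * (np.tan(theta_hL) * np.cos(2 * v) - np.tan(theta_hR))
                     / np.sin(2 * v))

d = 0.18                                   # assumed baseline, metres
for dist in (1.0, 5.0, 10.0):              # fixation distance, metres
    v = np.arcsin(d / (2.0 * dist))        # vergence, from equation (6)
    err = beta_from_hough(0.0, np.radians(1.0), v)   # 1-degree Hough error
    print(f"{dist} m: beta error = {np.degrees(err):.1f} degrees")
```

With these assumptions, a one-degree quantisation error in θhR maps to a β error of roughly 6 degrees at 1 m and several tens of degrees at 10 m, in line with Figure 10.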



V. EXPERIMENTS

The experiments presented in this section verify the theoretical conclusions derived above. Several 3D points were measured using the method described in [11], with the accuracy presented in [9]. For the measurements, a perfect metallic cube with a side of 100 mm was placed in front of the head, roughly as shown in Figure 11. The junctions 1, 4, 5, 6, 7 and 8 were measured and the results are given in Table 1.

Figure 11: The setup used for the experiments; the cube with respect to the head coordinate system

Table 1: Point measurements (angles in degrees, coordinates in mm)

junction       pan          tilt       vergence       x         y         z
    1         8.1552     -21.2135    -10.4638      474.7      68.0     186.1
    4        -0.261702   -24.9251    -12.2728      396.0      -1.8    -189.0
    5         8.05029     -9.99141   -11.1275      471.0      66.5     -83.8
    7        -9.43388    -10.2873    -11.2058      465.4     -77.3     -85.6
    8        -0.498334   -12.1128    -13.2441      394.7      -3.4     -84.7

The figures in Table 1 are enough to recover the points Z1 and Z2 with respect to the frame of the left camera. The coordinates x, y, z are given with respect to the head base. Therefore, by applying a transformation using the appropriate pan, tilt and vergence values, the coordinates of any of these eight points can be recovered for any of these eight positions of the head. As an example, let us take the case where the head is verging on point 8. In this case the values of the Hough angles for the line formed by points 8 and 4 were found to be θhL = 178 degrees and θhR = 1 degree. On the other hand, the coordinates of points 8 and 4 with respect to the frame of the left camera are (393.8, 0, 0) and (415.9, 6.9, -101.7), respectively. Using the above data with equations (5) and (19) it resulted that β = 12.23 degrees and β = -6.23 degrees, respectively. The error, therefore, is 18.46 degrees (the sketch below reproduces these figures). The same approach was followed for the rest of the lines, and the error was found to vary in the range ±25 degrees, which verifies our theoretical hypothesis.
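The worked example can be reproduced directly; the coordinates are taken from the text above and the vergence value is the signed Table 1 reading for point 8:

```python
import numpy as np

# Ground truth from the measured coordinates, equation (5):
Z1, Z2 = np.array([393.8, 0.0, 0.0]), np.array([415.9, 6.9, -101.7])
beta_true = np.arctan((Z2[0] - Z1[0]) / np.hypot(Z2[1] - Z1[1], Z2[2] - Z1[2]))

# Estimate from the Hough angles, equation (19):
thL, thR, v = np.radians(178.0), np.radians(1.0), np.radians(-13.2441)
beta_est = np.arctan(np.cos(thL)
                     * (np.tan(thL) * np.cos(2 * v) - np.tan(thR))
                     / np.sin(2 * v))

print(np.degrees(beta_true), np.degrees(beta_est))   # about 12.23 and -6.23
print(np.degrees(beta_true - beta_est))              # about 18.46 degrees
```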


VI. CONCLUSIONS

A method to compute the 3D orientation of a line was proposed in this report. The computation requires a stereo pair and the Hough transform. Using our setup it was shown that this technique is rather unstable. In order for the technique to be accurate, it is proposed that the quantisation of the HAA be denser and the baseline longer. If, for example, the quantisation of the HAA is 0.1 degrees and the baseline is 0.5 m, the error varies in the range ±2.5 degrees for lines at distances of about 5 m. However, it should be noted that by increasing the estimation range of the HAA, the time complexity increases linearly. Moreover, the baseline of the existing setup (Eurohead) is fixed. For future implementations, therefore, the utilisation of a fast Hough transform, such as the one presented in [6], and a longer baseline are proposed. The length of the baseline affects not only the accuracy of the azimuth and elevation of the sought line, but also the accuracy of the measurement of the coordinates of a single 3D point (see [9]) and, consequently, the accuracy of the pose of the line in space.

ACKNOWLEDGMENTS

The work presented in this report has been supported by the Esprit project ROBVISION (EP28867).

REFERENCES

[1] R. O. Duda and P. E. Hart (1972) "Use of the Hough Transform to Detect Lines and Curves in Pictures", Communications of the ACM, 15: 11-15.
[2] P. D. Picton (1987) "Hough Transform References", International Journal of Pattern Recognition and Artificial Intelligence, 413-425.
[3] J. Illingworth and J. Kittler (1988) "A Survey of the Hough Transform", Computer Vision, Graphics, and Image Processing, 87-116.
[4] W. A. Gotz and H. J. Druckmuller (1995) "A Fast Digital Radon Transform - An Efficient Means for Evaluating the Hough Transform", Pattern Recognition, (12): 1985-1992.
[5] S. C. Jeng and W. H. Tsai (1990) "Fast Generalized Hough Transform", Pattern Recognition Letters, (11): 725-733.
[6] S. J. Perantonis, B. Gatos and N. Papamarkos (1999) "Block Decomposition and Segmentation for Fast Hough Transform Evaluation", Pattern Recognition, (5): 811-824.
[7] P. Bhattacharya, H. Liu, A. Rosenfeld and S. Thompson (2000) "Hough-transform Detection of Lines in 3D Space", Pattern Recognition Letters, (9): 843-849.
[8] M. Meribout, M. Nakanishi, E. Hosoya and T. Ogura (2000) "Hough Transform Algorithm for Three-Dimensional Segment Extraction and its Parallel Hardware Implementation", Computer Vision and Image Understanding, (2): 177-205.
[9] A. Gasteratos and G. Sandini (2000) "On the Accuracy of the Eurohead", Technical Report, LIRA-Lab, DIST, University of Genova, Genova, pp. 15.
[10] M. Vincze, M. Ayromlou, S. Galt, A. Gasteratos, C. Gramkow, N. Hewer, S. Hoffgaard, O. Madsen, R. Martinotti, O. Neckelmann, G. Sandini, R. Waterman and M. Zillich (2000) "RobVision - Visually Guiding a Walking Robot through a Ship Structure", 1st International Conference on Computer Applications and Information Technology in the Maritime Industries (COMPIT), Berlin, Germany.
[11] A. Gasteratos, R. Martinotti, G. Metta and G. Sandini (2000) "Precise 3D Measurements with a High Resolution Stereo Head", First International Workshop on Image and Signal Processing and Analysis (IWISPA 2000), June 14-15, Pula, Croatia, pp. 171-176.

