JOURNAL of AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS
Editor-in-Chief Janusz Kacprzyk
Executive Editor: Anna Ładan aladan@piap.pl
(Systems Research Institute, Polish Academy of Sciences; PIAP, Poland)
Associate Editors: Mariusz Andrzejczak (PIAP, Poland) Katarzyna Rzeplińska-Rykała (PIAP, Poland)
Co-Editors: Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA), Kaoru Hirota (Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Japan), Witold Pedrycz (ECERF, University of Alberta, Canada), Roman Szewczyk (PIAP, Warsaw University of Technology, Poland)
Webmaster: Tomasz Kobyliński tkobylinski@piap.pl
Editorial Office: Industrial Research Institute for Automation and Measurements PIAP Al. Jerozolimskie 202, 02-486 Warsaw, POLAND Tel. +48-22-8740109, office@jamris.org
Copyright and reprint permissions Executive Editor
Editorial Board: Chairman: Janusz Kacprzyk (Polish Academy of Sciences; PIAP, Poland) Plamen Angelov (Lancaster University, UK) Zenn Bien (Korea Advanced Institute of Science and Technology, Korea) Adam Borkowski (Polish Academy of Sciences, Poland) Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany) Oscar Castillo (Tijuana Institute of Technology, Mexico) Chin Chen Chang (Feng Chia University, Taiwan) Jorge Manuel Miranda Dias (University of Coimbra, Portugal) Bogdan Gabryś (Bournemouth University, UK) Jan Jabłkowski (PIAP, Poland) Stanisław Kaczanowski (PIAP, Poland) Tadeusz Kaczorek (Warsaw University of Technology, Poland) Marian P. Kaźmierkowski (Warsaw University of Technology, Poland) Józef Korbicz (University of Zielona Góra, Poland) Krzysztof Kozłowski (Poznań University of Technology, Poland) Eckart Kramer (Fachhochschule Eberswalde, Germany) Andrew Kusiak (University of Iowa, USA) Mark Last (Ben–Gurion University of the Negev, Israel) Anthony Maciejewski (Colorado State University, USA) Krzysztof Malinowski (Warsaw University of Technology, Poland)
Andrzej Masłowski (PIAP, Poland) Tadeusz Missala (PIAP, Poland) Fazel Naghdy (University of Wollongong, Australia) Zbigniew Nahorski (Polish Academy of Science, Poland) Antoni Niederliński (Silesian University of Technology, Poland) Witold Pedrycz (University of Alberta, Canada) Duc Truong Pham (Cardiff University, UK) Lech Polkowski (Polish-Japanese Institute of Information Technology, Poland) Alain Pruski (University of Metz, France) Leszek Rutkowski (Częstochowa University of Technology, Poland) Klaus Schilling (Julius-Maximilians-University Würzburg, Germany) Ryszard Tadeusiewicz (AGH University of Science and Technology in Kraków, Poland)
Stanisław Tarasiewicz (University of Laval, Canada) Piotr Tatjewski (Warsaw University of Technology, Poland) Władysław Torbicz (Polish Academy of Sciences, Poland) Leszek Trybus (Rzeszów University of Technology, Poland) René Wamkeue (University of Québec, Canada) Janusz Zalewski (Florida Gulf Coast University, USA) Marek Zaremba (University of Québec, Canada) Teresa Zielińska (Warsaw University of Technology, Poland)
Publisher: Industrial Research Institute for Automation and Measurements PIAP
If in doubt about the proper edition of contributions, please contact the Executive Editor. Articles are reviewed, excluding advertisements and descriptions of products. The Editor does not take responsibility for the contents of advertisements, inserts, etc. The Editor reserves the right to make relevant revisions, abbreviations and adjustments to the articles.
All rights reserved ©
JOURNAL of AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS VOLUME 5, N° 1, 2011
CONTENTS

REGULAR PAPERS
3    Computational methods for investigation of stability of models of 2D continuous-discrete linear systems
     Mikołaj Busłowicz
8    Interaction management between social agents and human
     Abdelhak Moussaoui, Alain Pruski, Brahim Cherki
16   Bayesian model for multimodal sensory information fusion in humanoid robot
     Wei Kin Wong, Chu Kiong Loo, Tze Ming Neoh, Ying Wei Liew, Eng Kean Lee
23   Real-time coordinated trajectory planning and obstacle avoidance for mobile robots
     Lucas C. McNinch, Reza A. Soltan, Kenneth R. Muske, Hashem Ashrafiuon, James C. Peyton
30   A framework for unknown environment manipulator motion planning via model based realtime rehearsal
     Dugan Um, Dongseok Ryu
36   Intelligent leader-follower behaviour for unmanned ground-based vehicles
     Pouria Sadeghi-Tehran, Javier Andreu, Plamen Angelov
47   Robotic rehabilitation of stroke patients using an expert system
     Pradeep Natarajan, Arvin Agah, Wen Liu
58   Considerations on coverage and navigation in wireless mobile sensor network
     Tadeusz Goszczynski, Zbigniew Pilat

SPECIAL ISSUE SECTION
64   Hybrid Intelligent Systems for Optimization and Pattern Recognition - Part II
     Guest Editors: Oscar Castillo and Patricia Melin
68   Fusion of door and corner features for scene recognition
     Mario I. Chacon-Murguia, Rafael Sandoval-Rodriguez, Cynthia P. Guerrero-Saucedo
77   Optimization of a modular neural network for pattern recognition using parallel genetic algorithm
     Martha Cárdenas, Patricia Melin, Laura Cruz Reyes
85   Adaptive ant-colony algorithm for semantic query routing
     Claudia Gómez Santillán, Laura Cruz Reyes, Elisa Schaeffer, Eustorgio Meza, Gilberto Rivera Zarate
95   A new optimization algorithm based on a paradigm inspired by nature
     Leslie Astudillo, Patricia Melin, Oscar Castillo
101  Application of the bee swarm optimization BSO to the knapsack problem
     Marco Aurelio Sotelo-Figueroa, Rosario Baltazar, Juan Martín Carpio
COMPUTATIONAL METHODS FOR INVESTIGATION OF STABILITY OF MODELS OF 2D CONTINUOUS-DISCRETE LINEAR SYSTEMS Received 9th September 2010; accepted 20th September 2010.
Mikołaj Busłowicz
Abstract: The problem of asymptotic stability of models of 2D continuous-discrete linear systems is considered. Computer methods for investigation of asymptotic stability of the Fornasini-Marchesini type and the Roesser type models are given. The methods proposed require computation of the eigenvalue-loci of complex matrices. Effectiveness of the stability tests is demonstrated on numerical examples. Keywords: continuous-discrete system, hybrid system, linear system, stability, computational methods.
1. Introduction
In continuous-discrete systems both continuous-time and discrete-time components are relevant and interacting and these components cannot be separated. Such systems are called hybrid systems. Examples of hybrid systems can be found in [6], [8], [9], [16]. The problems of dynamics and control of hybrid systems have been studied in [5], [6], [16]. In this paper we consider the continuous-discrete linear systems whose models have structure similar to the models of 2D discrete-time linear systems. Such models, called the 2D continuous-discrete or 2D hybrid models, have been considered in [11] in the case of positive systems. The new general model of positive 2D hybrid linear systems has been introduced in [12] for standard and in [13] for fractional systems. The realization and solvability problems of positive 2D hybrid linear systems have been considered in [11], [14] and [15], [17], respectively. The problems of stability and robust stability of 2D continuous-discrete linear systems have been investigated in [1-4], [7], [18-20]. The main purpose of this paper is to present computational methods for investigation of asymptotic stability of the Fornasini-Marchesini and the Roesser type models of continuous-discrete linear systems.
The following notation will be used: $\mathbb{R}$ - the set of real numbers, $\mathbb{R}_+ = [0, \infty)$, $\mathbb{Z}_+$ - the set of non-negative integers, $\mathbb{R}^{n \times m}$ - the set of $n \times m$ real matrices and $\mathbb{R}^n = \mathbb{R}^{n \times 1}$, $\|x(\cdot)\|$ - the norm of $x(\cdot)$, $\lambda_i\{X\}$ - the i-th eigenvalue of the matrix $X$.

2. Preliminaries and formulation of the problem
The state equation of the Fornasini-Marchesini type model of a continuous-discrete linear system has the form [11]
$$\dot{x}(t,i+1) = A_0 x(t,i) + A_1 \dot{x}(t,i) + A_2 x(t,i+1) + B u(t,i), \quad i \in \mathbb{Z}_+, \ t \in \mathbb{R}_+, \qquad (1)$$
where $\dot{x}(t,i) = \partial x(t,i)/\partial t$, $x(t,i) \in \mathbb{R}^n$, $u(t,i) \in \mathbb{R}^m$, and $A_0, A_1, A_2 \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$.
Definition 1. The Fornasini-Marchesini type model (1) is called asymptotically stable (or Hurwitz-Schur stable) if for $u(t,i) \equiv 0$ and bounded boundary conditions
$$x(0,i), \ i \in \mathbb{Z}_+, \quad x(t,0), \ \dot{x}(t,0), \ t \in \mathbb{R}_+, \qquad (2)$$
the condition $\lim_{t,i \to \infty} \|x(t,i)\| = 0$ holds.
The characteristic matrix of the model (1) has the form
$$H(s,z) = s z I_n - A_0 - s A_1 - z A_2. \qquad (3)$$
The characteristic function
$$w(s,z) = \det H(s,z) = \det[\, s z I_n - A_0 - s A_1 - z A_2 \,] \qquad (4)$$
of the model (1) is a polynomial in two independent variables s and z, of the general form
$$w(s,z) = \sum_{k=0}^{n} \sum_{j=0}^{n} a_{kj}\, s^k z^j, \quad a_{nn} = 1. \qquad (5)$$
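The construction of (3)-(5) is easy to automate with a computer algebra system. The following small sketch is not part of the original paper and uses illustrative matrices; it only shows one way of obtaining the characteristic function (4) symbolically.

```python
import sympy as sp

# Build the characteristic matrix (3) and expand its determinant to obtain the
# characteristic polynomial w(s, z) of (4)-(5) for an illustrative FM model.
s, z = sp.symbols('s z')
A0 = sp.Matrix([[0, 1], [-1, -2]])                               # example matrices,
A1 = sp.Matrix([[sp.Rational(1, 10), 0], [0, sp.Rational(1, 10)]])  # not taken from
A2 = sp.Matrix([[sp.Rational(1, 5), 0], [0, sp.Rational(1, 5)]])    # the paper
n = A0.shape[0]

H = s*z*sp.eye(n) - A0 - s*A1 - z*A2     # characteristic matrix (3)
w = sp.expand(H.det())                   # characteristic function (4) as a polynomial in s, z
print(sp.collect(w, [s, z]))
```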
The state equation of the Roesser type model of a continuous-discrete linear system has the form [11]
$$\begin{bmatrix} \dot{x}^h(t,i) \\ x^v(t,i+1) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x^h(t,i) \\ x^v(t,i) \end{bmatrix} + \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} u(t,i), \quad t \in \mathbb{R}_+, \ i \in \mathbb{Z}_+, \qquad (6)$$
where $\dot{x}^h(t,i) = \partial x^h(t,i)/\partial t$, $x^h(t,i) \in \mathbb{R}^{n_1}$ and $x^v(t,i) \in \mathbb{R}^{n_2}$ are the horizontal and the vertical state vectors, respectively, $u(t,i) \in \mathbb{R}^m$ is the input vector and $A_{11} \in \mathbb{R}^{n_1 \times n_1}$, $A_{12} \in \mathbb{R}^{n_1 \times n_2}$, $A_{21} \in \mathbb{R}^{n_2 \times n_1}$, $A_{22} \in \mathbb{R}^{n_2 \times n_2}$, $B_1 \in \mathbb{R}^{n_1 \times m}$, $B_2 \in \mathbb{R}^{n_2 \times m}$.
The boundary conditions for (6) are as follows
$$x^h(t,0), \ x^v(t,0), \ t \in \mathbb{R}_+, \quad x^h(0,i), \ x^v(0,i), \ i \geq 1, \ i \in \mathbb{Z}_+. \qquad (7)$$
Definition 2. The Roesser type model (6) is called asymptotically stable (or Hurwitz-Schur stable) if for $u(t,i) \equiv 0$ and bounded boundary conditions (7) the conditions $\lim_{t,i \to \infty} \|x^h(t,i)\| = 0$ and $\lim_{t,i \to \infty} \|x^v(t,i)\| = 0$ hold.
The characteristic matrix of the model (6) has the form
$$H(s,z) = \begin{bmatrix} sI_{n_1} - A_{11} & -A_{12} \\ -A_{21} & zI_{n_2} - A_{22} \end{bmatrix}. \qquad (8)$$
Using the rules for computing the determinant of block matrices [10], we obtain that the characteristic function $w(s,z) = \det H(s,z)$ of the Roesser type model can be computed from one of the following equivalent formulae
$$w(s,z) = \det(zI_{n_2} - A_{22}) \det\big(sI_{n_1} - A_{11} - A_{12}(zI_{n_2} - A_{22})^{-1}A_{21}\big), \qquad (9a)$$
$$w(s,z) = \det(sI_{n_1} - A_{11}) \det\big(zI_{n_2} - A_{22} - A_{21}(sI_{n_1} - A_{11})^{-1}A_{12}\big). \qquad (9b)$$
The characteristic function of the Roesser type model can be written in the form
$$w(s,z) = \sum_{k=0}^{n_1} \sum_{j=0}^{n_2} a_{kj}\, s^k z^j, \quad a_{n_1 n_2} = 1. \qquad (10)$$
From [1], [7] we have the following theorem.
Theorem 1. The Fornasini-Marchesini type model (1) with characteristic function (4) (or the Roesser type model (6) with characteristic function (9)) is asymptotically stable if and only if
$$w(s,z) \neq 0, \quad \operatorname{Re} s \geq 0, \ |z| \geq 1. \qquad (11)$$
The polynomial $w(s,z)$ satisfying condition (11) is called continuous-discrete stable (C-D stable) or Hurwitz-Schur stable [1]. The main purpose of this paper is to present computational methods for checking the condition (11) of asymptotic stability of the Fornasini-Marchesini type model (1) and the Roesser type model (6) of continuous-discrete linear systems.

3. Solution of the problem
Theorem 2. The condition (11) is equivalent to the following two conditions
$$w(s, e^{j\omega}) \neq 0, \quad \operatorname{Re} s \geq 0, \ \forall \omega \in [0, 2\pi], \qquad (12)$$
$$w(jy, z) \neq 0, \quad |z| \geq 1, \ \forall y \in [0, \infty). \qquad (13)$$
Proof. From [7] it follows that (11) is equivalent to the conditions
$$w(s,z) \neq 0, \quad \operatorname{Re} s \geq 0, \ |z| = 1, \qquad (14)$$
$$w(s,z) \neq 0, \quad \operatorname{Re} s = 0, \ |z| \geq 1. \qquad (15)$$
It is easy to see that conditions (14) and (15) can be written in the forms (12) and (13), respectively. ∎

3.1. Asymptotic stability of the Fornasini-Marchesini type model
From (4) for $z = e^{j\omega}$ we have
$$w(s, e^{j\omega}) = \det[\, s(I_n e^{j\omega} - A_1) - A_2 e^{j\omega} - A_0 \,]. \qquad (16)$$
Lemma 1. The condition (12) for the Fornasini-Marchesini type model (1) with $A_1 \neq \pm I_n$ holds if and only if all eigenvalues of the complex matrix $S_1^{FM}(e^{j\omega})$ have negative real parts for all $\omega \in [0, 2\pi]$, where
$$S_1^{FM}(e^{j\omega}) = (I_n e^{j\omega} - A_1)^{-1}(A_2 e^{j\omega} + A_0). \qquad (17)$$
Proof. If $A_1 \neq \pm I_n$ then the matrix $I_n e^{j\omega} - A_1$ is non-singular for all $\omega \in [0, 2\pi]$ and
$$[\, s(I_n e^{j\omega} - A_1) - A_0 - A_2 e^{j\omega} \,] = [\, I_n e^{j\omega} - A_1 \,][\, sI_n - S_1^{FM}(e^{j\omega}) \,], \qquad (18)$$
where $S_1^{FM}(e^{j\omega})$ has the form (17). From (16) and (18) it follows that
$$w(s, e^{j\omega}) = \det(I_n e^{j\omega} - A_1) \det(sI_n - S_1^{FM}(e^{j\omega})). \qquad (19)$$
This means that for $A_1 \neq \pm I_n$ the eigenvalues of the matrix $S_1^{FM}(e^{j\omega})$ are the roots of the polynomial $w(s, e^{j\omega})$. ∎
Lemma 2. The condition (13) for the Fornasini-Marchesini type model (1) with $A_2 \neq I_n$ holds if and only if all eigenvalues of the complex matrix $S_2^{FM}(jy)$ have absolute values less than one for all $y \geq 0$, where
$$S_2^{FM}(jy) = (jyI_n - A_2)^{-1}(A_0 + jyA_1). \qquad (20)$$
Proof. Substituting $s = jy$ in (4) one obtains
$$w(jy, z) = \det[\, z(jyI_n - A_2) - A_0 - jyA_1 \,]. \qquad (21)$$
If $A_2 \neq I_n$ then the matrix $jyI_n - A_2$ is non-singular for all $y \geq 0$ and
$$[\, z(jyI_n - A_2) - A_0 - jyA_1 \,] = [\, jyI_n - A_2 \,][\, zI_n - S_2^{FM}(jy) \,], \qquad (22)$$
where $S_2^{FM}(jy)$ is defined by (20). From (21) and (22) it follows that
$$w(jy, z) = \det(jyI_n - A_2) \det(zI_n - S_2^{FM}(jy)). \qquad (23)$$
Hence, if $A_2 \neq I_n$ then the eigenvalues of the matrix $S_2^{FM}(jy)$ are the roots of the polynomial $w(jy, z)$. ∎
Theorem 3. The Fornasini-Marchesini type model (1) with $A_1 \neq \pm I_n$ and $A_2 \neq I_n$ is asymptotically stable if and only if the conditions of Lemmas 1 and 2 hold, i.e.
$$\operatorname{Re} \lambda_i\{S_1^{FM}(e^{j\omega})\} < 0, \quad \forall \omega \in [0, 2\pi], \ i = 1, 2, \ldots, n, \qquad (24)$$
and
$$|\lambda_i\{S_2^{FM}(jy)\}| < 1, \quad \forall y \geq 0, \ i = 1, 2, \ldots, n, \qquad (25)$$
where the matrices $S_1^{FM}(e^{j\omega})$ and $S_2^{FM}(jy)$ have the forms (17) and (20), respectively.
Proof. It follows from Theorem 2 and Lemmas 1 and 2. ∎
From (17) for $\omega = 0$ and $\omega = \pi$ we have
$$S_1^{FM}(1) = (I_n - A_1)^{-1}(A_2 + A_0), \qquad (26a)$$
$$S_1^{FM}(-1) = (-I_n - A_1)^{-1}(-A_2 + A_0). \qquad (26b)$$
From the theory of matrices it follows that if $(-1)^n \det S_1^{FM}(1) \leq 0$ then not all eigenvalues of the matrix (26a) have negative real parts. A similar condition holds for the matrix (26b). Hence, we have the following remark.
Remark 1. Simple necessary conditions for asymptotic stability of the Fornasini-Marchesini type model (1) with $A_1 \neq \pm I_n$ have the form
$$(-1)^n \det(I_n - A_1) \det(A_2 + A_0) > 0, \qquad (27a)$$
$$(-1)^n \det(-I_n - A_1) \det(-A_2 + A_0) > 0. \qquad (27b)$$
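The conditions (24), (25) and (27) lend themselves to a direct numerical check: sweep the eigenvalue loci of $S_1^{FM}(e^{j\omega})$ over a grid of $\omega \in [0, 2\pi]$ and of $S_2^{FM}(jy)$ over a finite grid of $y \geq 0$. The sketch below is not from the paper; a finite grid can only approximate the "for all $\omega$, $y$" conditions, and the default grid bound follows the $y \in [0, 100]$ range used in the examples. The Roesser type model of Section 3.2 can be checked in the same way using the matrices (30) and (32).

```python
import numpy as np

def fm_necessary_conditions(A0, A1, A2):
    """Necessary conditions (27a), (27b) of Remark 1."""
    n = A0.shape[0]
    I = np.eye(n)
    c1 = (-1)**n * np.linalg.det(I - A1) * np.linalg.det(A2 + A0) > 0
    c2 = (-1)**n * np.linalg.det(-I - A1) * np.linalg.det(-A2 + A0) > 0
    return c1 and c2

def fm_stability_check(A0, A1, A2, n_omega=2000, y_max=100.0, n_y=2000):
    """Grid check of conditions (24) and (25) of Theorem 3."""
    n = A0.shape[0]
    I = np.eye(n)
    for w in np.linspace(0.0, 2.0*np.pi, n_omega):
        S1 = np.linalg.solve(np.exp(1j*w)*I - A1, np.exp(1j*w)*A2 + A0)   # (17)
        if np.linalg.eigvals(S1).real.max() >= 0.0:
            return False                      # condition (24) violated at this omega
    for y in np.linspace(0.0, y_max, n_y):
        S2 = np.linalg.solve(1j*y*I - A2, A0 + 1j*y*A1)                   # (20)
        if np.abs(np.linalg.eigvals(S2)).max() >= 1.0:
            return False                      # condition (25) violated at this y
    return True
```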
Example 1. Consider the Fornasini-Marchesini type model (1) with the matrices
$$A_0 = \begin{bmatrix} -0.4 & 1 & 0 \\ 0 & 0.2 & 0.5 \\ 0 & -0.1 & -0.1 \end{bmatrix}, \quad A_1 = \begin{bmatrix} -0.5 & 0.1 & 0 \\ 0 & 0.1 & -0.4 \\ 0 & 0.2 & -0.2 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -0.4 & -1.8 & 0 \\ 0.1 & -0.4 & 0 \\ 0 & 0 & -0.7 \end{bmatrix}. \qquad (28)$$
It is easy to check that the necessary conditions (27) hold. Computing eigenvalues of the matrices $S_1^{FM}(e^{j\omega})$, $\omega \in [0, 2\pi]$, and $S_2^{FM}(jy)$, $y \in [0, 100]$, one obtains the plots shown in Figures 1 and 2. It is easy to check that the eigenvalues of $S_2^{FM}(jy)$ remain in the unit circle for all $y > 100$. From Figures 1 and 2 it follows that the conditions (24) and (25) of Theorem 3 are satisfied and the system is asymptotically stable.
Fig. 1. Eigenvalues of the matrix $S_1^{FM}(e^{j\omega})$, $\omega \in [0, 2\pi]$.
Fig. 2. Eigenvalues of the matrix $S_2^{FM}(jy)$, $y \in [0, 100]$.

3.2. Asymptotic stability of the Roesser type model
From (8) for $z = e^{j\omega}$ we have
$$w(s, e^{j\omega}) = \det \begin{bmatrix} sI_{n_1} - A_{11} & -A_{12} \\ -A_{21} & I_{n_2}e^{j\omega} - A_{22} \end{bmatrix}. \qquad (29)$$
Lemma 3. The condition (12) for the Roesser type model (6) with $A_{22} \neq \pm I_{n_2}$ holds if and only if all eigenvalues of the complex matrix $S_1^{R}(e^{j\omega})$ have negative real parts for all $\omega \in [0, 2\pi]$, where
$$S_1^{R}(e^{j\omega}) = A_{11} + A_{12}(I_{n_2}e^{j\omega} - A_{22})^{-1}A_{21}. \qquad (30)$$
Proof. If $A_{22} \neq \pm I_{n_2}$ then the matrix $I_{n_2}e^{j\omega} - A_{22}$ is non-singular for all $\omega \in [0, 2\pi]$ and from (9a) it follows that
$$w(s, e^{j\omega}) = \det(I_{n_2}e^{j\omega} - A_{22}) \det(sI_{n_1} - S_1^{R}(e^{j\omega})), \qquad (31)$$
where $S_1^{R}(e^{j\omega})$ has the form (30). This means that for $A_{22} \neq \pm I_{n_2}$ the eigenvalues of the matrix $S_1^{R}(e^{j\omega})$ are the roots of the polynomial $w(s, e^{j\omega})$. ∎
Lemma 4. The condition (13) for the Roesser type model (6) with $A_{11} \neq I_{n_1}$ holds if and only if all eigenvalues of the complex matrix $S_2^{R}(jy)$ have absolute values less than one for all $y \geq 0$, where
$$S_2^{R}(jy) = A_{22} + A_{21}(jyI_{n_1} - A_{11})^{-1}A_{12}. \qquad (32)$$
Proof. If $A_{11} \neq I_{n_1}$ then the matrix $jyI_{n_1} - A_{11}$ is non-singular for all $y \geq 0$. From (9b) for $s = jy$ we have
$$w(jy, z) = \det(jyI_{n_1} - A_{11}) \det\big(zI_{n_2} - A_{22} - A_{21}(jyI_{n_1} - A_{11})^{-1}A_{12}\big). \qquad (33)$$
From (32) and (33) it follows that
$$w(jy, z) = \det(jyI_{n_1} - A_{11}) \det(zI_{n_2} - S_2^{R}(jy)), \qquad (34)$$
where $S_2^{R}(jy)$ is defined by (32). If $A_{11} \neq I_{n_1}$ then the eigenvalues of the matrix $S_2^{R}(jy)$ are the roots of the polynomial $w(jy, z)$. ∎
Theorem 4. The Roesser type model (6) with $A_{22} \neq \pm I_{n_2}$ and $A_{11} \neq I_{n_1}$ is asymptotically stable if and only if the conditions of Lemmas 3 and 4 hold, i.e.
$$\operatorname{Re} \lambda_i\{S_1^{R}(e^{j\omega})\} < 0, \quad \forall \omega \in [0, 2\pi], \ i = 1, 2, \ldots, n_1, \qquad (35)$$
and
$$|\lambda_i\{S_2^{R}(jy)\}| < 1, \quad \forall y \geq 0, \ i = 1, 2, \ldots, n_2, \qquad (36)$$
where the matrices $S_1^{R}(e^{j\omega})$ and $S_2^{R}(jy)$ have the forms (30) and (32), respectively.
Proof. The proof follows from Theorem 2 and Lemmas 3 and 4. ∎
From (30) for $\omega = 0$ and $\omega = \pi$ it follows that
$$S_1^{R}(1) = A_{11} + A_{12}(I_{n_2} - A_{22})^{-1}A_{21}, \qquad (37a)$$
$$S_1^{R}(-1) = A_{11} + A_{12}(-I_{n_2} - A_{22})^{-1}A_{21}. \qquad (37b)$$
From the above and the theory of matrices we have the following remark.
Remark 2. Simple necessary conditions for asymptotic stability of the Roesser type model (6) with $A_{22} \neq \pm I_{n_2}$ are as follows: $(-1)^{n_1} \det S_1^{R}(1) > 0$ and $(-1)^{n_1} \det S_1^{R}(-1) > 0$.
Example 2. Consider the Roesser type model (6) with the matrices
A11 = A12 =
–1 0.1
0 – 0.5 , A12 = –5 –1
0 , 0
– 0.5 – 1 – 0.5 0.8 , A22 = . 0 –1 0.2 0.4
(38)
Eigenvalues of the matrices $S_1^{R}(e^{j\omega})$, $\omega \in [0, 2\pi]$, and $S_2^{R}(jy)$, $y \in [0, 100]$, are shown in Figures 3 and 4. It is easy to check that the eigenvalues of $S_2^{R}(jy)$ remain in the unit circle for all $y > 100$. From Figures 3 and 4 it follows that the conditions (35) and (36) of Theorem 4 are satisfied and the system is asymptotically stable.
Fig. 3. Eigenvalues of the matrix $S_1^{R}(e^{j\omega})$, $\omega \in [0, 2\pi]$.
Fig. 4. Eigenvalues of the matrix $S_2^{R}(jy)$, $y \in [0, 100]$.

4. Concluding remarks
Computational methods for investigation of asymptotic stability of the Fornasini-Marchesini type model (1) (Theorem 3) and the Roesser type model (6) (Theorem 4) of continuous-discrete linear systems have been given. These methods require computation of the eigenvalue-loci of the complex matrices (17) and (20) for the Fornasini-Marchesini type model and of the complex matrices (30) and (32) for the Roesser type model.

ACKNOWLEDGMENTS
The work was supported by the Ministry of Science and Higher Education of Poland under grant No. N N514 1939 33.

AUTHOR
Mikołaj Busłowicz - Białystok University of Technology, Faculty of Electrical Engineering, ul. Wiejska 45D, 15-351 Białystok, Poland. E-mail: busmiko@pb.edu.pl.

References
[1] Y. Bistritz, "A stability test for continuous-discrete bivariate polynomials", in: Proc. Int. Symp. on Circuits and Systems, vol. 3, 2003, pp. 682-685.
[2] M. Busłowicz, "Robust stability of the new general 2D model of a class of continuous-discrete linear systems", Bull. Pol. Ac.: Tech., vol. 57, no. 4, 2010.
[3] M. Busłowicz, "Stability and robust stability conditions for general model of scalar continuous-discrete linear systems", Measurement Automation and Monitoring, vol. 56, no. 2, 2010, pp. 133-135.
[4] M. Busłowicz, "Improved stability and robust stability conditions for general model of scalar continuous-discrete linear systems", Measurement Automation and Monitoring (submitted for publication).
[5] M. Dymkov, I. Gaishun, E. Rogers, K. Gałkowski, D.H. Owens, "Control theory for a class of 2D continuous-discrete linear systems", Int. J. Control, vol. 77, no. 9, 2004, pp. 847-860.
[6] K. Gałkowski, E. Rogers, W. Paszke, D.H. Owens, "Linear repetitive process control theory applied to a physical example", Int. J. Appl. Math. Comput. Sci., vol. 13, no. 1, 2003, pp. 87-99.
[7] J.P. Guiver, N.K. Bose, "On test for zero-sets of multivariate polynomials in non-compact polydomains", Proc. of the IEEE, vol. 69, no. 4, 1981, pp. 467-469.
[8] J. Hespanha, "Stochastic Hybrid Systems: Application to Communication Networks", Techn. Report, Dept. of Electrical and Computer Eng., Univ. of California, 2004.
[9] K. Johanson, J. Lygeros, S. Sastry, "Modelling hybrid systems", in: H. Unbehauen (Ed.), Encyclopedia of Life Support Systems, EOLSS, 2004.
[10] T. Kaczorek, Vectors and Matrices in Automatics and Electrotechnics, WNT: Warszawa, 1998, p. 70 (in Polish).
[11] T. Kaczorek, Positive 1D and 2D Systems, Springer-Verlag: London, 2002.
[12] T. Kaczorek, "Positive 2D hybrid linear systems", Bull. Pol. Ac.: Tech., vol. 55, no. 4, 2007, pp. 351-358.
[13] T. Kaczorek, "Positive fractional 2D hybrid linear systems", Bull. Pol. Ac.: Tech., vol. 56, no. 3, 2008, pp. 273-277.
[14] T. Kaczorek, "Realization problem for positive 2D hybrid systems", COMPEL, vol. 27, no. 3, 2008, pp. 613-623.
[15] T. Kaczorek, V. Marchenko, Ł. Sajewski, "Solvability of 2D hybrid linear systems - comparison of the different methods", Acta Mechanica et Automatica, vol. 2, no. 2, 2008, pp. 59-66.
[16] D. Liberzon, Switching in Systems and Control, Birkhäuser: Boston, 2003.
[17] Ł. Sajewski, "Solution of 2D singular hybrid linear systems", Kybernetes, vol. 38, no. 7/8, 2009, pp. 1079-1092.
[18] Y. Xiao, "Stability test for 2-D continuous-discrete systems", in: Proc. 40th IEEE Conf. on Decision and Control, vol. 4, 2001, pp. 3649-3654.
[19] Y. Xiao, "Robust Hurwitz-Schur stability conditions of polytopes of 2-D polynomials", in: Proc. 40th IEEE Conf. on Decision and Control, vol. 4, 2001, pp. 3643-3648.
[20] Y. Xiao, "Stability, controllability and observability of 2-D continuous-discrete systems", in: Proc. Int. Symp. on Circuits and Systems, vol. 4, 2003, pp. 468-471.
INTERACTION MANAGEMENT BETWEEN SOCIAL AGENTS AND HUMAN Received 18th March 2010; accepted 10th December 2010.
Abdelhak Moussaoui, Alain Pruski, Brahim Cherki
Abstract: In this paper we address the problem of managing social interactions between a human user and a set of social agents in structured environments. These agents have to regulate a measurable characteristic of the user. Our study focuses mainly on therapeutic, elderly care and assessment applications. The agent concept considered in this study is suitable for multi-robot systems (physical or service robots, mobile or not), but also for virtual agents (avatars) in a virtual reality (VR) application. The proposed approach has been adapted for an application of social phobia treatment using virtual reality. Thus, we conducted a case study to show the effectiveness of our contribution. Keywords: multi-robot system, multi-agent system, social interaction, virtual reality, virtual reality exposure therapy.
1. Introduction
Currently, research on multi-robot systems (MRS) is one of the most important areas in robotics. This can be explained by the many benefits of cooperation between several robots, especially for engineering applications. An MRS allows a greater tolerance to faults; it has better efficiency and performance while offering opportunities for task development and expansion. According to [1] and [2], the research work on MRSs can be classified into three categories. The first one concerns reconfigurable robot systems, whose idea is inspired by the organization of insects and animals [3-5]. The second category concerns the trajectory planning of a group of robots, such as the control of aircraft traffic [6]. The third focuses on cooperative multi-robot architectures, such as the project ACTRESS [7] on maintenance in a known and structured environment. Unlike conventional control systems, cooperation must be considered in the modeling, analysis and improvement of an MRS. Implementing this task is not easy. One of the most used tools for overcoming this difficulty is multi-agent systems (MAS). An MRS can be considered as a robot with a distributed artificial intelligence. On the other hand, recent years have seen a significant growth in the development of non-invasive, wearable and wireless sensor technologies [8-10]. These techniques allow for physiological measures such as HR (heart rate), SC (skin conductance), EMG (electromyogram) and EEG (electroencephalogram). These tools have been widely used in clinical, therapeutic and assistance applications.
The objective of this work is to propose an approach to manage social interaction between a human user and a set of agents integrated in a structured environment. Their mission is mainly to regulate a measurable characteristic of the user. We cite as an example an elderly person in a structured flat. This person will be assisted by a group of robots to take his medication in case his blood glucose falls, or to be assisted in achieving a daily task (for example preparing a meal). The agent concept considered in this paper concerns physical robots, mobile or not, service robots, as well as avatars in a virtual environment in a virtual reality (VR) experience. Another application that interests us is treating cognitive disorders using VR; in fact, several research works have documented the effectiveness of exposure therapy using VR, particularly with phobias [11-16]. The principle of this technique is to expose the patient to anxiety-provoking situations in a gradual, repeated and complete manner, in order to desensitize him/her [17], [18]. Our idea is to integrate virtual actors in a virtual world and to create anxiety-provoking situations as would happen in real life. Our contribution is based on three basic concepts:
- make the user live an experience close to the real world by participating in a story, planned and designed by an expert;
- let the user be free to decide and move in the virtual environment (VE); he/she is guided by the progress of events;
- introduce the user's measured state in the control of the actions to activate.
The system consists of the user and the world made of objects and agents. Our system must satisfy the following points:
- regulate the state of the user around a level given by an expert, which depends on the user and the application;
- ensure the coherence of the agents' behaviors in their world and hence the cohesion of the story experienced by the user.
We propose an application of virtual reality exposure therapy guided by a measure of the patient's state. To implement our approach, we conducted a pilot study for treating social phobia that we will detail in this paper.
2. Related works
There are many research projects to develop cooperative MRS. Research work on MRSs in different application areas has continued to grow since the '80s. Planetary exploration has received notable attention from researchers. The experimental project "Robot Work Crew" developed by NASA [19] addressed the problem of tightly
coupled coordination of two robots and the cooperation and coordination problem for heterogeneous and homogeneous MRS in difficult environments. The robotic deployment of a modular solar photovoltaic tent using the CAMPOUT architecture [20] was also studied. Robot soccer is also a framework for developing cooperative MRS, and the results are of interest for other application areas. The soccer robot system developed by the FU-Fighters team [21] is popular. The system is characterized by a generic and modular architecture composed of layers provided with 3 elements: sensors (Perceptual Dynamics), behavior (Activation Dynamics) and actuators (Target Dynamics). The project MARTHA is one of the few cooperative MRS designed to serve humans [22]. Its objectives are the control and the management of a fleet of autonomous mobile robots for trans-shipment tasks in harbors, airports and marshalling yards. Experiments were conducted with a family of 3 mobile robots called Hilare. There are also other research projects that conduct laboratory experiments on more specific issues related to cooperative MRS; their framework is more general than that of a specific target application. In this paper, we conduct a study which finds application in MRS for therapeutic, evaluation and assistance purposes. The project eCIRCUS (Education through Characters with emotional-Intelligence and Role-playing Capabilities that Understand Social interaction) proposed a new approach in the use of ICT to support social and emotional learning within Personal and Social Education (PSE). To attain their aim, eCIRCUS studies educational role-play through the use of autonomous synthetic characters and the involvement of the user through affective engagement [23].
3. General principle of our approach
3.1. Methodology
The problem considered in this paper concerns the management of social interaction between a user and a set of agents.
Fig. 1. Regulation loop of the human user state.
The mission of the agents is to control the state of the user around a value given by an expert; they must also ensure the coherence of their behavior. The "state" of the user is a generic term that can represent a measurable characteristic that depends on the use of the system (a position in space, a physiological value, a gestural characteristic, the achievement of an action). For example, in a virtual reality exposure therapy session, the state of the user is the measured anxiety level. From the user's point of view, the system can be seen as a control loop of his/her state, as shown in Fig. 1. The "Assessment" block informs the system of the user's state calculated from the recorded measurements.
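The regulation loop of Fig. 1 can be summarized by a short control-loop skeleton. The sketch below only illustrates the structure; the class and method names (sensors, decide, execute, and so on) are assumptions introduced here and do not come from the paper.

```python
# Skeleton of the regulation loop of Fig. 1 (illustrative names only).
def regulation_loop(sensors, engine, agents, reference_state):
    while True:
        measured_state = sensors.assess_user_state()               # "Assessment" block
        actions = engine.decide(measured_state, reference_state)   # "Behaviors Management Engine"
        agents.execute(actions)                                    # agents act on the user and the world
```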
The "Behaviors Management Engine" block is the regulating element of the system. It compares the state measured on the user with a reference state fixed by an expert, in order to decide on the nature and degree of the actions to generate.
3.2. The Behaviors Management Engine
a) Multi-Agent System
Our solution to the problem discussed in this paper is based on the theory of MAS, which [24] consists of a group of rational agents interacting with each other. Agents may share the same interest, conflicting interests or more general ones. There are several definitions of an agent and none of them is prevailing [24]. An agent can be anyone that perceives his/her environment through sensors and acts on it through effectors [24]. An ideal rational agent tries to maximize the performance level with regard to what he perceives and the knowledge at hand [24]. The applications envisaged for our approach consist of user agents and their entourage made up of cooperating objects.
b) The agents and their actions
We refer to an agent's actions also as behaviors. An agent can have two kinds of actions: "Stimulating" actions and "Background" actions. "Stimulating" actions are those which are likely to induce a change in the user's state. The "help" actions and the "auxiliary" actions belong to this type. The "help" actions allow the user to follow the scenario's progress. The "auxiliary" actions are the ones that must be activated simultaneously with another action for coherence reasons of the scene. An "auxiliary" action has an indifferent effect on the user. The "Background" actions are activated by the agents when the user is interacting neither with them nor with another agent. $A_s$ represents the set of stimuli and $A_f$ the set of "Background" actions. A "Stimulating" action is mainly characterized by its effect. This effect can be "Positive", "Neutral" or "Negative", with a degree of relevance given by an expert.
c) The MAS Environment
Let us remember that the user is an element of the MAS environment. In order to be able to react correctly, agents must know, at any time, the user's state. We express the state with two variables: the magnitude and its derivative. The user's state to be measured depends on the context of use. If the goal is to lead the user to a given point, the measure is the position relative to the target. This measure determines the progression of the person to the expected objective state. The magnitude of the measurement is divided into three intervals: "Low", "Normal" or "High". For the derivative, we decided to divide it into three intervals: "Falling" (negative), "Stable" (null) and "Rising" (positive). In Fig. 2, the cells of the table correspond to the user's states; the arrows represent the possible transitions between states. The set of user's states is denoted by S.
The objective is to bring the user's state back to the state ("Normal", "Stable"), denoted $s^* = (N, S)$. The transition model of an environment expresses the way it changes state when a given action is activated. The user's transition model is stochastic: activating a stimulating action does not necessarily mean that the user passes into another state. Let $s$ be the user's state at instant $t$ and $s'$ the state into which he passes after activating action $a$, with $s, s' \in S$ and $a \in A_s$. The transition model is a probability distribution $p(s' \mid s, a)$ over all possible states $s'$.
Fig. 2. User's states.
Fig. 3. State utility (the utility values range from 1 to 5, with the maximum at the target state (Normal, Stable)).
d) User's state controlling
In stochastic environments, finding the series of actions that enables the objective to be reached is not a simple search on a graph, because transitions between the different states are non-deterministic and agents must take into consideration the probabilistic aspect of transitions in their decisions. The expected utility concept is one of those used to introduce the transition uncertainty between states into the agents' decision [24]. The utility of a state $s$, denoted $U(s)$, is a real number that expresses its importance in the achievement of a given objective, $U(s): S \to \mathbb{R}$. The objective of an agent is the environment's state $s^* = (N, S)$, the state with the highest utility and desired by his environment. Fig. 3 shows the utility of the user's different states. We also note that in our case, a state's utility does not change from one agent to another, because the agents cooperate and share the same objective. For an agent to be rational in state $s$, he must choose an optimal action $a^*$ that maximizes the expected utility given by $\sum_{s' \in S} p(s' \mid s, a)\, U(s')$:
$$a^* = \arg\max_{a \in A_s} \sum_{s' \in S} p(s' \mid s, a)\, U(s').$$
Each agent calculates his own optimal action $a^*$ that maximizes the sum over all the products of the probability to pass into each possible state $s'$ by its utility $U(s')$. The utilities of the states are given by the importance of the state in achieving the objective of the agents. We are interested in the calculation of the probability distribution. Suppose that the user is in state $s$. We can say that the user "Reacts" to action $a$ if:
- $a$ is "Positive" and the user passes into a higher state than $s$ (according to the order given in the definition of the set S);
- $a$ is "Negative" and the user passes into a lower state than $s$;
- $a$ is "Neutral" and the user does not change state.
$s^-$ is the state into which the user goes if he "Reacts" to action $a$. We suppose that the user either passes into state $s^-$ or does not change state. From this last characteristic, we can write that:
$$a^* = \arg\max_{a \in A_s} \big( p(s^- \mid s, a)\, U(s^-) + (1 - p(s^- \mid s, a))\, U(s) \big).$$
The probability $p(s^- \mid s, a)$ does not depend only on $s$, $s^-$ and the nature of $a$, but on certain other parameters as well, which are related to the effect of $a$ on the user and to its activation conditions. For this reason, $p(s^- \mid s, a)$ depends on two different elements: the probability for the user to pass from state $s$ to state $s^-$ if $a$ is activated in conditions where the user sees or hears $a$ with certainty; and the activation conditions of $a$ in the actual context of application.
e) Conflicts management
In the case of collaborative agents, coordination ensures that no agent perturbs the others when he chooses an action, and that his actions are useful to the collective objective of the group. Assignment of roles to agents is a widely used coordination technique. A role can be seen as a masking operator on all of an agent's actions in some specific situations. We decided to let only one agent interact with the user, by introducing the principal agent's role. The latter will be assigned to the agent most capable of making the user pass into a useful state. For example, in the case of a virtual reality exposure therapy, if two virtual agents decide to speak simultaneously to the patient, they may interfere with each other. The stimulus expected utility suits this end perfectly. Let $r_i$ be the capacity of agent $i$ to be the principal agent; $r_i$ is the highest expected utility calculated over all possible actions of agent $i$. The "Main Agent" role will be assigned according to the following algorithm:
For each agent i, in parallel:
1. Calculate $r_i$ of agent $i$.
2. Transfer $r_i$ to the other agents.
3. Wait until the others' potentials are received.
4. Attribute the "Main Agent" role to the agent that maximizes $r$.
End For.
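A minimal sketch of the expected-utility action choice and of the "Main Agent" election described above is given below. The state utilities and the probability function are placeholders (the paper only specifies that the target state (Normal, Stable) has the highest utility), and all identifiers are illustrative assumptions, not names from the paper.

```python
def expected_utility(p_react, s, s_minus, utility):
    """EU(a) = p(s-|s,a)*U(s-) + (1 - p(s-|s,a))*U(s)."""
    return p_react * utility[s_minus] + (1.0 - p_react) * utility[s]

def best_stimulus(agent_stimuli, s, utility, p_react):
    """Return a* and r_i: the agent's optimal stimulating action and its expected utility."""
    best_a, best_eu = None, float("-inf")
    for a in agent_stimuli:
        eu = expected_utility(p_react(a, s), s, a["target_state"], utility)
        if eu > best_eu:
            best_a, best_eu = a, eu
    return best_a, best_eu

def elect_main_agent(agents, s, utility, p_react):
    """Each agent computes its potential r_i; the maximizer receives the "Main Agent" role."""
    potentials = {name: best_stimulus(stimuli, s, utility, p_react)[1]
                  for name, stimuli in agents.items()}
    return max(potentials, key=potentials.get)
```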
f) The story manager
Remember that the MAS must regulate the user's state and generate coherent stories. Our approach consists in proposing a set of tools that allow people not familiar with cinematography to transform a story into feasible scenarios whose principal element is the user. The actors are represented by agents.
g) The scenario
The story manager helps the agents to determine all the possible actions with respect to the situation they are in. An agent's situation is defined by his geographical location and the scenario progress. While evolving in the world, the user generates events that depend on his geographical location. These events help, on one side the story manager and on the other side the agents, to react to the user's actions, for example, going to a specific location to take drugs. The accomplishment of tasks by the user is indicated by events. Let E be the set of events. Each task the user is asked to fulfill corresponds to a node in the scenario. A scenario is made of an ordered sequence of nodes from the beginning to the end of a session. The couple (a, e), where $e \in E$ and $a \in \bigcup_{j=0}^{n-1} A_j^s$, represents action a that asks the user to accomplish a task, while event e indicates that the task has been accomplished by the user, with n the number of agents and $A_j^s$ the set of stimuli of agent j. A node does not correspond to one couple (a, e) only but to many. In fact, a task can be asked of the user either in different ways by the same agent, which corresponds to different actions, or by other agents. Similarly, a task can be accomplished by the user in different manners, which corresponds to different events. Therefore, node i is associated to a set $C_i = \{c_i', c_i'', c_i''', \ldots\}$ of couples $c_i = (a, e)_i$, allowing the transition from node i to i+1. Couple $c_i$ represents a condition of passage from node i of the scenario to node i+1: action a asks the user to accomplish a task, event e indicates its accomplishment. So, a scenario with m nodes can be defined by the ordered series $Sc: C_0, C_1, C_2, \ldots, C_m$.
h) Reaction to events
The events triggered by the user enable agents to react to his actions. A given event is associated, by an agent, to one or several actions which are going to be activated. Let $M_e: E \times \bigcup_{j=0}^{n-1} A_j^s \times N \to B$ be the matrix that gives the event-action links, with n the number of agents and N the interval of nodes. As an example, $M_e(e, a, 3) = 1$ indicates that the accomplishment of event e at the third node of the scenario will be followed by the activation of a.
i) Links between actions
In some situations, when an action is selected by an agent, it must be followed by another action, either from the same agent or another one. The answer to this question consists in using priority rules represented by a graph of links between actions. This is valid only in the case of a reduced number of actions, as it is in our study. $M_p: \bigcup_{j=0}^{n-1} A_j^s \times \bigcup_{j=0}^{n-1} A_j^s \times N \to B$ is the matrix that gives the action-action links, with n the number of agents and N the interval of nodes. For instance, $M_p(a', a'', 2) = 1$ indicates that at the second node of the scenario, the execution of action a' will be followed by the execution of action a''.
Synthesis of the mechanism of action selection: this mechanism is given by the following algorithm, where the main steps are summarized (a minimal code sketch follows this list):
1. Determine the user's state.
2. If a node is validated, then go to the next one.
3. Find the set of possible actions coming from events at the actual node of the scenario.
4. Find the set of possible actions coming from priorities between actions at the actual node of the scenario.
5. Find all possible actions $A_p$.
6. Calculate the expected utility of the stimuli that belong to $A_p$ and find the optimal action $a^*$.
7. Assign the role of "Main Agent".
8. If an agent is the "Main Agent", then activate stimulus $a^*$; otherwise choose randomly and activate a "Background" action.
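The sketch announced above summarizes one cycle of the behaviors management engine, following steps 1-8. All names (assess_state, actions_from_events, and so on) are hypothetical helpers introduced only for illustration; they are not defined in the paper.

```python
import random

def control_cycle(engine):
    s = engine.assess_state()                          # 1. determine the user's state
    if engine.current_node_validated():                # 2. a (a, e) couple of the node holds
        engine.advance_node()                          #    -> move to the next scenario node
    node = engine.current_node
    possible = (engine.actions_from_events(node)       # 3. actions triggered via Me
                | engine.actions_from_links(node))     # 4. actions forced via Mp
    a_star, r_i = engine.best_stimulus(possible, s)    # 5.-6. expected-utility optimum over Ap
    if engine.is_main_agent(r_i):                      # 7. role assignment (election on r_i)
        engine.activate(a_star)                        # 8a. the main agent plays the stimulus
    else:
        engine.activate(random.choice(engine.background_actions()))  # 8b. background action
```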
4. Application in virtual reality exposure therapy
To implement our approach we have developed an application in the context of cognitive and behavioral therapies. During the last two decades, many research works tried to experiment with and evaluate the usage of VR to treat cognitive and behavioral disorders, and particularly phobias. The idea consists of making the patient evolve in a virtual environment (VE) where he/she gradually faces situations that are feared in real life, in order to desensitize him/her [25], [26]. We are interested in social phobia, which is a pathology whose fearful conditions can be easily created by a VR-based inductor using everyday life scenes, while at the same time obtaining some measurable emotional reactions from the patients. The virtual actors are represented by the agents of the MAS; the patient will be considered the user. Video recordings of real actors embody the actions of the different agents; these videos are inlaid in the virtual environment by the chroma-key technique. An evaluation of the effect of these actions has been established by a psychologist. The virtual environment is a bank. We chose a bank for many reasons: it offers the possibility to diversify the anxiety-provoking situations, such as talking to strangers or carrying out tasks while others are looking at you. The patient will also be motivated by the final objective, that is, withdrawing money. The agents present in the VE are: the secretary "Madeline", at the reception desk (on the right in Fig. 4), a customer in the hall, "Simon" (on the left), and a counter clerk, "Guy" (in the middle). The VE is projected on a wide screen. The patient wears a surfing tool that allows him to move around in the bank. At the beginning of a session, the patient is at the bank entrance and the therapist explains to him the
objective without giving him any details. The patient enters the bank and an electronic display says "Please go to the reception desk". At the reception desk, the secretary asks him what she can do for him. He answers, and then she phones the counter clerk to get his opinion about the operation. Next, she tells the patient to go to desk number 2 to withdraw money. Finally, the counter clerk gives him his banknotes. This scenario corresponds to the main framework of the story, but even if the patient does not follow this scenario, voluntarily or involuntarily, the agents help him to go back to it.
Fig. 4. The virtual environment and virtual agents.
Fig. 5. Examples of stimuli.
Fig. 6. Examples of three basic actions.
The patient's state can be "Reassured", "Anxious" or "Very Anxious", associated respectively to "Low", "Normal" and "High". Similarly, a stimulus may be "reassuring", "anxiety provoking" or "neutral", associated respectively with "Positive", "Negative" and "Neutral". In Fig. 5, three stimuli are shown; these are, respectively from left to right, "Reassuring", "Anxiety Provoking" and "Neutral". In Fig. 6, three "Background" actions are illustrated. The principle of exposure therapy is to put the patient in front of anxiety-provoking stimuli. Therefore, we can consider that our goal is to bring the patient back to the "Anxious"-"Stable" state. We are interested in calculating $p(s^- \mid s, a)$, the probability that the patient passes from state $s$ to $s^-$. We said earlier that $p(s^- \mid s, a)$ depends on the contextual conditions of activation of $a$. These conditions are summarized with two parameters: the distance between the patient and the source of the action, and the vision direction of the patient with respect to the source of the action.
Let:
- $P(S)$ be the probability that the patient "Reacts" to stimulus $a$ (the patient reacts to the action $a$ if he passes to the state compatible with the nature of the action);
- $P(R)$ be the probability that the patient hears or sees stimulus $a$;
- $P(S \mid R)$ be the probability that the patient passes from state $s$ to $s^-$, knowing that he has seen stimulus $a$;
- $P(R \mid S)$ be the probability that the patient has seen or heard action $a$, knowing that he passed from $s$ to $s^-$.
According to Bayes' theorem [24], we have:
$$P(S) = \frac{P(S \cap R)}{P(R \mid S)} = \frac{P(S \mid R)\, P(R)}{P(R \mid S)}.$$
Remember that we considered action $a$ as the only stimulation source of the patient in the VE (this comes from the role assignment to a single agent). Therefore, if he changes state from $s$ to $s^-$, it is because of action $a$. For this reason, we can conclude that $P(R \mid S) = 1$, and we have:
$$p(s^- \mid s, a) = P(S) = P(S \mid R)\, P(R).$$
$P(S \mid R)$ is the probability that the patient passes from $s$ to $s^-$ knowing that he has seen or heard action $a$. We consider that $P(S \mid R)$ depends on the effect $e(a)$ of action $a$, according to the following relations:
$$P(S \mid R) \approx \tfrac{3}{2}\big(e(a) - \tfrac{1}{3}\big) \text{ if } a \text{ is "Anxiety Provoking"}, \quad P(S \mid R) \approx 1 - 3\,|e(a)| \text{ if } a \text{ is "Neutral"}, \quad P(S \mid R) \approx -\tfrac{3}{2}\big(e(a) + \tfrac{1}{3}\big) \text{ if } a \text{ is "Reassuring"}.$$
$P(R)$ is the probability that the patient sees or hears action $a$. A model of this type is not available in the literature. We defined a simplified model based on the following rules:
- the patient hears or sees with certainty the actions that are in his axis of vision and at a distance of less than 4 m;
- actions with a 30° displacement with respect to the vision axis and at a distance below a given threshold have a probability of 0.7;
- actions with a 90° displacement with respect to the vision axis and at a distance below a given threshold have a probability of 0.3;
- the probability of actions displaced by more than 90° and at a distance below a given threshold is 0.1;
- beyond 2 m, the probability $P(R)$ is inversely proportional to the distance between the action and the patient.
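As a rough illustration, the simplified visibility/audibility model can be coded as below. Only the values explicitly listed above (certainty on the vision axis within 4 m, 0.7 at 30°, 0.3 at 90°, 0.1 beyond 90°, inverse-distance decay beyond 2 m) are taken from the text; the angular margin used for "on the axis of vision" and the way the distance decay is combined with the angular factor are assumptions.

```python
def p_sees_or_hears(distance_m, angle_deg):
    """P(R): probability that the patient sees or hears the source of an action."""
    # Angular factor (the 15 degree margin for "on the axis of vision" is assumed).
    if angle_deg <= 15.0:
        p = 1.0
    elif angle_deg <= 30.0:
        p = 0.7
    elif angle_deg <= 90.0:
        p = 0.3
    else:
        p = 0.1
    # Beyond 2 m the probability decreases inversely with distance (assumed combination).
    if distance_m > 2.0:
        p *= 2.0 / distance_m
    return p
```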
5. Evaluation
In order to examine the story manager and the agents' behavior, we carried out a pilot assessment on non-phobic patients. It is difficult to show the effectiveness of our approach with a small example of three agents, like the one above, but the therapists who tested the developed tool attest to its usefulness, particularly for the management of virtual agents. In a first version of the system, no measure of the anxiety state is taken; the system feedback is provided by the
therapist acting as a Wizard of Oz: he estimates the patient's state and transmits it to the system through the keyboard. The sessions were recorded in order to go back and examine the moments of interaction that interest us. We now give some examples.
5.1. Situation n°1: Outside the bank (Fig. 7)
The situation starts outside the bank, at the entrance. All agents activate some "Background" actions. The door is half-opaque and the patient can see what is going on inside.
Fig. 7. The entrance of the bank.
5.2. Situation n°2: The patient enters the bank (Fig. 8)
Now the patient enters the bank. While going through the main door, the electronic screen displays "Please go to the reception desk" with a simultaneous beep to attract the patient's attention; he could either go straight to the reception desk or to counter 2, or just walk in the hall.
Fig. 8. The patient enters the bank.
5.3. Situation n°3: The patient walks in the bank
In this situation, the patient hesitates before going to the reception desk and continues his walk in the hall. Depending on his anxiety state, the customer in the hall might look at him to stimulate him (Fig. 9) in case he is "Reassured", or ignore him if he is already anxious (Fig. 10).
Fig. 9. The customer in the hall stimulates the patient.
Fig. 10. The customer in the hall looks the other way.
5.4. Situation n°4: Instead of going to the reception desk, the patient goes to Counter 2
In this situation, the patient is wrong and instead of going to the reception, he goes to Counter 2. In this case, the cashier activates a "help" action and tells the patient to go to the reception. The way the patient is sent back to the reception could be "Reassuring" (Fig. 11), "Anxiety Provoking" (Fig. 12) or "Neutral".
Fig. 11. "Reassuring" "help" action.
Fig. 12. "Anxiety Provoking" "help" action.
5.5. Situation n°5: The patient is in front of the secretary
The patient is at the reception. First, the secretary greets him and then gives him enough time to ask his question. Three possible ways of reception are shown in Fig. 13. The other agents continue to activate "Background" actions.
Fig. 13. Three ways of reception: "Anxiety provoking", "Neutral" and "Reassuring".
5.6. Situation n°6: A phone conversation between the secretary and the counter clerk
The patient asks the secretary if it is possible to withdraw money. She telephones the counter clerk to ask him if this is possible. She activates an action that stimulates the patient, and the counter clerk activates an "auxiliary" action that is required for a coherent scene (Fig. 14).
Fig. 14. The secretary phones the counter clerk.
5.7. Situation n°7: The patient is in front of the secretary but looks at the customer
In this situation, the secretary tells the patient to go to counter n°2 (Fig. 15). The patient, who is looking in the customer's direction but not at her, hears what she says (Fig. 16). Though the customer could stimulate him, he does not do it in order not to disturb the dialog between the secretary and the patient. Here, the secretary is the "Main Agent".
Fig. 15. The secretary talks to the patient.
Fig. 16. What the patient really sees.
5.8. Situation n°8: The patient takes the money
The patient is at counter 2. The counter clerk gives him the money. Here too, the clerk can activate a "Neutral", "Reassuring" or "Anxiety Provoking" action, depending on the patient's state. Fig. 17 shows the "Anxiety Provoking" action.
Fig. 17. The counter clerk gives the money to the patient.

6. Real case study
To implement our approach, we conducted a case study of a social phobic patient (male, 24 years). Several subjective scales and objective measures were taken to assess the effectiveness of our clinical method, including: the Liebowitz Social Anxiety Scale (LSAS), the Short Beck Depression Inventory (BDI-13), the iGroup Presence Questionnaire (IPQ), the Subjective Unit of Discomfort Scale (SUDS 0-100) and heart rate (HR). We scheduled the clinical protocol in two phases. The first phase includes one 45-minute session in which the patient discovers the therapeutic program and becomes familiar with the experimental device. The patient also completed the self-administered questionnaires given above. The second phase is organized into eight weekly sessions of 30 to 45 minutes each. In each session the patient experiences the virtual environment of the bank projected on a screen. The patient completes the questionnaires previously given before the start of the fifth session and at the end of the eighth. The results showed that the solution formed by the automatic control system and the proposed therapeutic protocol can be applied successfully in therapy for social phobia. According to the results of the measurement of presence in the environment of the bank (by IPQ), the patient had a large involvement in the virtual environment. The psychotherapist has confirmed both the fluidity of our therapeutic approach and the easiness of managing this study; he gave more time to the patient and the therapy.

7. Conclusion
We have presented our approach to managing social interactions between a human and a set of social agents in structured environments. The proposed approach is based on a cooperative MAS in which agents share the same objective and coordinate their actions to achieve it. We used coordination techniques between agents in a MAS to express the coherence between the activated actions. The effectiveness of the proposed approach has been demonstrated by adapting it to treat social phobia by VR. The results of the social phobic patient case study testify to the usefulness of the developed application. The therapist has confirmed both the fluidity of the considered therapeutic approach and the easiness of managing therapy sessions. We are currently studying the development of an application with real robots in order to aid elderly people to plan and organize daily tasks.
AUTHORS
Abdelhak Moussaoui, Alain Pruski* - Laboratoire d'Automatique humaine et de Sciences Comportementales, University of Metz, 7, rue Marconi, 57070 Metz Technopôle, tel: +33 387 315 281, fax: +33 387 547 301. E-mail: {abdelhak.moussaoui, alain.pruski}@univ-metz.fr
Brahim Cherki - The Automatic Control Laboratory of Tlemcen, University of Tlemcen, FSI, BP 230, Chetouane 13000, tel: +213 43 28 56 89, fax: +213 43 28 56 85. E-mail: b_cherki@yahoo.fr
[14]
[15]
[16]
* Corresponding author [17]
References [1]
[2]
[3] [4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
L.E. Parker et al., “Current state of the art in distributed autonomous mobile robotics”. Distributed Autonomous Robotic Systems, vol. 4, 2000, p. 312. T. Arai, E. Pagello, L.E. Parker, “Editorial: Advances in multi-robot systems”. IEEE Transactions on Robotics and Automation, vol. 18, 2002, pp. 655-661. E. Bonabeau et G. Théraulaz, “L'intelligence en essaim”. Pour la Science, vol. 282, 2000, p. 66-73. G. Beni, S. Hackwood, “Stationary waves in cyclic swarms”. In: Proc. of the 1992 IEEE International Symposium on Intelligent Control, 1992, pp. 234-242. T. Fukuda, S. Nakagawa, “Dynamically reconfigurable robotic system”. In: Proceedings IEEE International Conference on Robotics and Automation, 1988, pp. 1581-1586. S. Premvuti, S. Yuta, “Consideration on the cooperation of multiple autonomous mobile robots”. IEEE International Workshop on Intelligent Robots and Systems, Towards a New Frontier of Applications, Ibaraki, Japan, pp. 59-63. H. Asama, A. Matsumoto, Y. Ishida, “Design of an autonomous and distributed robot system: ACTRESS”. IEEE/RSJ IROS, 1989, pp. 283290. A. Teller, “A platform for wearable physiological computing”. Interacting with Computers, vol. 16, Oct. 2004, pp. 917-937. J. Anttonen, V. Surakka, “Emotions and heart rate while sitting on a chair”. Portland, Oregon, USA: ACM, 2005, pp. 491-499. F.H. Wilhelm, M.C. Pfaltz, P. Grossman, “Continuous electronic data capture of physiology, behavior and experience in real life: towards ecological momentary assessment of emotion”. Interact. Comput., vol. 18, 2006, pp. 171-186. E. Klinger, I. Chemin, P. Légeron, S. Roy, F. Lauer, P. Nugues, “Designing Virtual Worlds to treat Social Phobia”. Cybertherapy 2003 (Wiederhold BK, Riva G, Wiederhold MD, eds.), San Diego, CA: Interactive Media Institute: 2003, pp. 113-121. M.M. North, S.M. North, J.R. Coble, “Virtual Reality Therapy: An effective Treatment for the Fear of Public Speaking”. International Journal of Virtual Reality, vol. 3, 1998, pp. 1-7. H. Grillon, F. Riquier, B. Herbelin, D. Thalmann, “Virtual Reality as Therapeutic Tool in the Confines
[18]
[19]
[20]
[21] [22]
[23]
[24] [25]
[26]
N° 1
2011
of Social Anxiety Disorder Treatment”. International Journal in Disability and Human Development, vol. 5, 2006, pp. 243-250. C. Botella, H. Villa, A. García Palacios, S. Quero, R.M. Baños, M. Alcaniz, “The use of VR in the treatment of panic disorders and agoraphobia”. Studies in Health Technology and Informatics, vol. 99, 2004, pp. 73-90. C. Botella, C. Perpiñá, R.M. Baños, A. García-Palacios, “Virtual reality: a new clinical setting lab”. Studies in Health Technology and Informatics, vol. 58, 1998, pp. 73-81. P. Anderson, B.O. Rothbaum, L.F. Hodges, “Virtual reality exposure in the treatment of social anxiety”. Cognitive and Behavioral Practice, vol. 10, Summer. 2003, pp. 240-247. E. Klinger, S. Bouchard, P. Légeron, S. Roy, F. Lauer, I. Chemin, P. Nugues, “Virtual reality therapy versus cognitive behavior therapy for social phobia: a preliminary controlled study”. Cyberpsychology & Behavior: The Impact of the Internet, Multimedia and Virtual Reality on Behavior and Society, vol. 8, 2005, pp. 76-88. A. Moussaoui, A. Pruski, B. Cherki, “Emotion regulation for social phobia treatment using virtual reality”, HuMaN07, Timimoun: 2007. A. Trebi-Ollennu, H.D. Nayar, H. Aghazarian, A. Ganino, P. Pirjanian, B. Kennedy, T. Huntsberger, P. Schenker, “Mars rover pair cooperatively transporting a long payload”. In: Proc. IEEE International Conference on Robotics and Automation, 2002, pp. 3136-3141. T. Huntsberger, P. Pirjanian, A. Trebi-Ollennu, H.D. Nayar, H. Aghazarian, A.J. Ganino, M. Garrett, S.S. Joshi, P.S. Schenker, “CAMPOUT: A control architecture for tightly coupled coordination of multirobot systems for planetary surface exploration”. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, vol. 33, 2003, pp. 550-559. D. Polani, RoboCup 2003, Springer, 2003. R. Alami, S. Fleury, M. Herrb, F. Ingrand, F. Robert, “Multi-robot cooperation in the MARTHA project”. IEEE Robotics & Automation Magazine, vol. 5, 1998, pp. 36-47. L.C.O.C. Weiss, E. André, “ECIRCUS: Building Voices for Autonomous Speaking Agents”. 6th Speech Synthesis Workshop, Bonn, Germany: 2007. S.J. Russell, P. Norvig, Artificial Intelligence: A Modern Approach (2nd edition). Prentice Hall, 2003. E. Klinger, P. Légeron, S. Roy, I. Chemin, F. Lauer, P. Nugues, “Virtual reality exposure in the treatment of social phobia”. Studies in Health Technology and Informatics, vol. 99, 2004, pp. 91-119. A. Moussaoui, A. Pruski, B. Cherki, “Social interaction control in the framework of virtual reality therapy”, France, 2009.
BAYESIAN MODEL FOR MULTIMODAL SENSORY INFORMATION FUSION IN HUMANOID ROBOT Received 23rd January 2010; accepted 15th September 2010.
Wei Kin Wong, Chu Kiong Loo, Tze Ming Neoh, Ying Wei Liew, Eng Kean Lee
Abstract: In this paper, the Bayesian model for bimodal sensory information fusion is presented. It is a simple and biologically plausible model of sensory fusion in the human brain. It is adopted in a humanoid robot to fuse the spatial information gained from analyzing auditory and visual input, with the aim of increasing the accuracy of object localization. The Bayesian fusion model requires prior knowledge of the weights of the sensory systems. These weights can be determined from the standard deviation (SD) of the unimodal localization error obtained in experiments. The performance of auditory and visual localization was tested under two conditions: fixation and saccade. The experimental results show that the Bayesian model did improve the accuracy of object localization. However, the fused position of the object was not accurate when both sensory systems were biased towards the same direction. Keywords: multimodal, Bayesian fusion, fixation, saccade, humanoid robot.
1. Introduction
Realizing audiovisual object localization in a humanoid robot is a challenging task. As the possible locations of the object are computed from image and sound independently, the results need to be fused to form the final percept. Sensory information fusion has the advantages of reducing the uncertainty of the information, providing information that is unavailable from a single type of sensor, and making the system fault tolerant [8]. Over the years, researchers have developed sophisticated methods for multisensory information fusion in robots. For example, Yong-Ge Wu et al. [10] proposed an information fusion algorithm based on the generalized Dempster-Shafer theory of evidence (DSTE). Their result shows that higher accuracy is achieved when the vision evidence is dependent, which violates the basic assumption of DSTE. J.A. Lopez-Orozco et al. [11] proposed an enhanced Kalman filter multi-sensor fusion system used to calculate the position and orientation of an autonomous mobile robot. They focused on simplifying the Kalman filter multi-sensor fusion to lower its computational cost and on solving the dependency on the selection of different sampling periods and assimilation waiting intervals. Kentaro Toyama et al. [12] developed a Markov dynamic network model that integrates the analyses of multiple visual tracking algorithms to enable better head tracking. Their analysis shows that Bayesian system fusion usually outperforms any of its constituent systems, often making estimates close to the system estimate with the least error.
Futoshi Kobayashi et al. [13] proposed a Recurrent Fuzzy Inference (RFI) with recurrent inputs and applied it to a multi-sensor fusion system in order to estimate the state of systems. The membership functions of the RFI are expressed by Radial Basis Functions (RBF) with insensitive ranges, and the shape of the membership functions can be adjusted by a learning algorithm. Lucy Y. Pao et al. [14] developed a multi-target tracking algorithm based on Joint Probabilistic Data Association (JPDA) for use in multi-sensor tracking situations. They found that the multisensor data association probability is actually the product of the single-sensor data association probabilities. Instead of fusing the spatial information of the object, several humanoid robots use auditory information only as a guide for the visual system. For example, Carlos Beltrán-González et al. [15] proposed a crossmodal perceptual architecture that can segment objects based on visual-auditory sensory cues by constructing an associative sound-object memory and creating visual expectations using a sound recognition algorithm. Hans-H. Bothe et al. [16] described a hierarchically organized technical system performing auditory-visual sound source localization and camera control. In their fusion system, the auditory maps that belong to one video interval are fused. The resultant map is then filtered by three types of filters, and the results are fused again. Finally, the fovea position of the object is calculated using quadratic prediction. Hiromichi Nakashima et al. [17] proposed a learning model for sound source localization through interactions between motion and audio-visual sensing. The model consists of two modules: a visual estimation module consisting of a three-layer perceptron and an auditory estimation module consisting of a neural network with a look-up-table algorithm. In contrast to the fusion models discussed above, recent research in the field of neuroscience [1-5] strongly suggests that multisensory cues, especially spatial cues, may be combined in a much simpler, statistically optimal manner in the brain. For example, Paola Binda et al. [2] tested the human ability to localize visual, auditory and audiovisual stimuli during fixation and saccades. They assumed that saccades have little effect on auditory space perception and that auditory and visual spatial cues are conditionally independent. They showed that the fusion of auditory and visual spatial cues can be well modeled by Bayes' theorem and that the precision of localization is better for multimodal presentation compared to unimodal presentation. At the same time, their results also showed that the auditory signal becomes more important for localization at the time of saccades, suggesting that the visual signal has become transiently
noisy and therefore receives less weight. In order to adapt this Bayesian model of multisensory data fusion to a robot, the weights of all sensory systems under different conditions, such as fixation or saccade, must be determined. This can be done by conducting experiments to measure their error levels. In this paper, we focus on adopting the Bayesian fusion model in a humanoid robot to fuse the spatial properties of an object detected by visual and auditory sensors. We also propose a way to calculate the weights of the visual and auditory systems based on the localization error obtained through experiments, which is the main contribution of this paper. In order to reduce the complexity of calculation, only the azimuth position of the target object is considered. The visual system of the robot locates the target object based on its dominant color, determined using a color segmentation process that involves a log-polar transformation, temporal differencing and a hue-saturation (H-S) histogram. The auditory system locates the target object using the interaural time difference (ITD), which is the difference in time for the sound to reach the left and right microphones. The ITD is determined using the generalized cross-correlation (GCC) method performed in the frequency domain. Note that unimodal object localization refers to object localization using only the auditory or the visual system, while bimodal object localization utilizes information from both sensory systems. In Section 2 of this paper, the Bayesian fusion model is described. The description of the humanoid robot, the experiment and the conclusion are given in Sections 3, 4 and 5, respectively.
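For illustration, the following sketch shows one way a GCC-based ITD estimate of the kind described above can be computed and mapped to an azimuth angle. The sample rate, microphone spacing, speed of sound and far-field model are illustrative assumptions, not parameters taken from the robot described in this paper.

```python
import numpy as np

def itd_to_azimuth(left, right, fs=44100, mic_distance=0.2, c=343.0):
    """Estimate the azimuth (deg) of a sound source from two microphone signals.

    The ITD is taken from the peak of the cross-correlation computed in the
    frequency domain (generalized cross-correlation) and converted to an
    azimuth angle with a simple far-field model.  All physical parameters
    here are placeholder values.
    """
    n = len(left) + len(right)
    L = np.fft.rfft(left, n)
    R = np.fft.rfft(right, n)
    cross = L * np.conj(R)                        # cross-power spectrum
    cross /= np.abs(cross) + 1e-12                # optional PHAT-style weighting
    cc = np.fft.irfft(cross, n)
    max_lag = int(fs * mic_distance / c)          # physically possible lag range
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    itd = (np.argmax(np.abs(cc)) - max_lag) / fs  # delay in seconds
    # Far-field approximation: ITD = d * sin(theta) / c
    sin_theta = np.clip(itd * c / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))
```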
2. Bayesian Fusion Model
In short, Bayes' theorem can be described as a way of converting one conditional probability into another by reweighting it with the relative probability of the two variables [1]. In this paper, the weights of the sensory systems during fixation and saccade were computed based on the SD of the localization error. In order to derive the relationship between the signal inputs, the weights and the resultant output, assume that the independent sensor output is denoted by the vector $X = (x_1, x_2, \ldots, x_m)$ and the object property (e.g. three-dimensional position) is denoted by $Y = (y_1, y_2, \ldots, y_m)$, so that $p(X \mid Y)$ is the probability of the sensor output being $X$ given that the object property is $Y$, and $p(Y \mid X)$ is called the posterior probability of the object property being $Y$ given that the sensor output is $X$. In this particular application, $p(X \mid Y)$ can be computed from data collected during tests, while $p(Y \mid X)$ is the desired outcome. These two probabilities are related by Bayes' theorem as follows [8]:

$$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)} \qquad (1)$$

where the marginal probability $p(X)$ and the prior probability $p(Y)$ are the unconditional probabilities of the sensor output and the object property being $X$ and $Y$, respectively. Then, assume that the system contains $k$ sensors which give the readings $\chi = (X_1, X_2, \ldots, X_k)$; the best estimate of the object property $Y$ can be developed from these $k$ sensor readings. This can be achieved using the likelihood estimate, in which $Y$ is computed such that the following is maximized:

$$p(\chi \mid Y) = \prod_{i=1}^{k} p(X_i \mid Y) \qquad (2)$$

It is usually easier to deal with the logarithm of the likelihood than with the likelihood itself, because the product turns into a sum and the terms involving exponents can be simplified. Let $L(Y)$ be the log-likelihood function:

$$L(Y) = \log p(\chi \mid Y) = \sum_{i=1}^{k} \log p(X_i \mid Y) \qquad (3)$$

Assume that the sensor readings follow a Gaussian density function, so that $p(X_i \mid Y)$ is given by:

$$p(X_i \mid Y) = \frac{1}{(2\pi)^{n/2}\,|C_i|^{1/2}} \exp\!\left(-\tfrac{1}{2}(X_i - Y)^{T} C_i^{-1}(X_i - Y)\right) \qquad (4)$$

where $C_i$ is the variance-covariance matrix, $T$ denotes the transpose and $|\cdot|$ denotes the determinant. The expression for the log-likelihood (Eq. 3) then becomes:

$$L(Y) = \sum_{i=1}^{k}\left[-\tfrac{1}{2}\log\!\left((2\pi)^{n}|C_i|\right) - \tfrac{1}{2}(X_i - Y)^{T} C_i^{-1}(X_i - Y)\right] \qquad (5)$$

The best estimate $\bar{Y}$ of $Y$ can be found by differentiating $L$ with respect to $Y$, equating the result to zero and solving for $Y$:

$$\bar{Y} = \frac{\sum_{i=1}^{k} C_i^{-1} X_i}{\sum_{i=1}^{k} C_i^{-1}} \qquad (6)$$

Consider a system that contains only two sensors, let both $Y$ and $X_i$ in Eq. 6 be scalar measurements, and let $C_i$ simply be the SD of the localization error of sensor $i$. Eq. 6 then becomes:

$$\bar{y} = \frac{\dfrac{x_1}{\sigma_1^2} + \dfrac{x_2}{\sigma_2^2}}{\dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_2^2}} \qquad (7)$$

In our case, the symbol $S$ is used to denote the azimuth position, so that $y = S_{v,a}$ is the best estimate of the object azimuth position, and $x_1 = S_v$ and $x_2 = S_a$ are the visual and auditory spatial information in the azimuth plane computed by the visual and auditory systems. Eq. 7 is then rewritten as:

$$S_{v,a} = \frac{\dfrac{S_v}{\sigma_v^2} + \dfrac{S_a}{\sigma_a^2}}{\dfrac{1}{\sigma_v^2} + \dfrac{1}{\sigma_a^2}} \qquad (8)$$

Clearly, Eq. 8 shows that the result of bimodal localization is the weighted sum of the results of the two unimodal localizations, where the weights of the visual and auditory systems ($w_v$ and $w_a$) [2] are defined as:

$$w_v = \frac{\dfrac{1}{\sigma_v^2}}{\dfrac{1}{\sigma_v^2} + \dfrac{1}{\sigma_a^2}} \qquad (9)$$

and

$$w_a = \frac{\dfrac{1}{\sigma_a^2}}{\dfrac{1}{\sigma_v^2} + \dfrac{1}{\sigma_a^2}} \qquad (10)$$

Thus, Eq. 8 can be further simplified into
$$S_{v,a} = w_v S_v + w_a S_a \qquad (11)$$

The predicted bimodal threshold $\sigma_{v,a}$ is then given by

$$\sigma_{v,a}^2 = \frac{\sigma_v^2\,\sigma_a^2}{\sigma_v^2 + \sigma_a^2} \qquad (12)$$

Clearly, according to Eq. 11, the SD denotes the reliability of the sensory system. A reliable sensory system with a lower SD will be assigned a larger weight and will become more important in estimating the target's properties under a given condition (e.g. fixation or saccade).
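A minimal numerical sketch of Eqs. (8)-(12): given the two unimodal azimuth estimates and the SDs of their localization errors, it returns the weights, the fused estimate and the predicted bimodal SD. The input estimates in the usage line are arbitrary illustrative values.

```python
def fuse_azimuth(s_v, s_a, sd_v, sd_a):
    """Inverse-variance (Bayesian) fusion of visual and auditory azimuth
    estimates, following Eqs. (8)-(12).  s_v, s_a are the unimodal azimuth
    estimates (deg); sd_v, sd_a are the SDs of the unimodal localization
    errors measured in the calibration experiments."""
    w_v = (1.0 / sd_v**2) / (1.0 / sd_v**2 + 1.0 / sd_a**2)   # Eq. (9)
    w_a = 1.0 - w_v                                           # Eq. (10)
    s_va = w_v * s_v + w_a * s_a                              # Eq. (11)
    var_va = (sd_v**2 * sd_a**2) / (sd_v**2 + sd_a**2)        # Eq. (12)
    return s_va, w_v, w_a, var_va**0.5

# Example with the fixation SDs from Table 1 (sd_a = 0.8732, sd_v = 1.3020):
# the auditory weight comes out near 0.69, matching the table.
print(fuse_azimuth(s_v=5.0, s_a=4.0, sd_v=1.3020, sd_a=0.8732))
```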
3. E-Bee the humanoid robot
Fig. 1. E-Bee's head.

The humanoid robot used in the experiment is called E-Bee, shown in Fig. 1. Its head has 5 degrees of freedom (DOF) and is driven by servomotors. These servomotors are powered by a D.C. power supply unit and are controlled by an SSC-32 servomotor controller connected to the computer. The computer contains an Intel Xeon E5335 quad-core processor, 2 GB RAM, a 100 Mbps network controller, a standard 44.1 kHz soundcard and a 128 MB graphics card. A pair of Logitech Quickcam Express cameras and a pair of Sony ECM-PC50 omni-directional microphones are used on the robot to capture images and sound. The Linux-based asymmetric multiprocessing (AMP) system is equipped with the Intel open source computer vision library (OpenCV) and the Intel Integrated Performance Primitives (Intel IPP), which contain a C/C++ compiler, the Math Kernel Library and the Signal Processing Library.
4. Experiment
In order to study the Bayesian model for sensory information fusion and to compute the weights of the visual and auditory systems during fixation and saccade, tests were conducted to obtain the SDs of these two sensory systems. In these tests, only the azimuth position of the stimulus was considered. We assumed that the errors and sampling periods of the auditory and visual systems were independent and that the time lag between the visual and auditory signals can be ignored. The image background was a static green surface located in front of the humanoid robot at a distance of 60 cm from the eye-center, as shown in Fig. 2. The visual stimulus was a 2 cm x 2 cm red square on the green background, and the auditory stimulus was continuous white noise played through a normal speaker. Fluorescent lighting was used with no additional light source such as sunlight. Due to hardware limitations, the neck was turned relative to the stimulus to make the audiovisual stimulus move horizontally in the captured image, instead of driving the stimulus to move relative to the head. The range of the head's motion was limited to 30° to the left and right. The sampling rate of the program was 6 Hz, and the eyes and ears of the robot share a common center in the horizontal plane.
Fig. 2. (a) The robot facing forward, which is the direction of the red square and white noise, and (b) top view of the experiment setup; θ indicates the azimuth angle with respect to the head center (modified from [9]).

A total of five tests were conducted in this experiment: a fixation test, 20 deg/sec, 30 deg/sec and 60 deg/sec saccade tests, and a condition specific calibration saccade test. During the fixation test, the robot's head and eyes were initially fixated towards the background at position (0°, -20°) (elevation angle of 0° and azimuth angle of -20°; positive values indicate right and up of the humanoid robot, while negative values indicate left and down). The head was then turned in 1° steps and fixated for 1 sec after each step, until it reached (0°, +20°) and returned to (0°, -20°). The test was repeated three times. The perceived positions of the auditory and visual stimulus were recorded each time the robot head fixated. During the saccade tests, the neck was turned at three different angular speeds (20 deg/sec, 30 deg/sec and 60 deg/sec). The robot head was first fixated at (0°, -30°). Once the tests started, the head was turned between positions (0°, +30°) and (0°, -30°), and only the data that fell within the range of 20° to the left and right were recorded. Each test was repeated 50 times in order to minimize the influence of random hardware error, and the motion range was extended to 30° on both sides during the saccade tests to avoid obtaining data during saccade onset and changes of direction of motion, because the images were noisy during these conditions. After the results of these saccade tests were obtained, the robot system was recalibrated based on the amount of error obtained during the 60 deg/sec saccade test. Then, the condition specific
calibration test was carried out using a setup similar to the 60 deg/sec saccade test. The robot head was assumed to turn at a constant speed until the end of motion, after a short time from saccade onset and changes of direction of motion. The actual azimuth position of the robot head was estimated using the simple equation shown below:
$$\theta = n(30 - \omega t), \qquad n = \begin{cases} -1 & \text{for motion from left to right} \\ \;\;\,1 & \text{for motion from right to left} \end{cases} \qquad (13)$$

where $\theta$ denotes the azimuth position of the stimulus, $\omega$ denotes the angular speed and $t$ denotes the time relative to the saccade onset of each turn.

Fig. 3 demonstrates the results of all five tests. The result of the fixation test indicates that during fixation, auditory and visual localization was accurate, with little object localization error. According to the results of the 20 deg/sec, 30 deg/sec and 60 deg/sec saccade tests, the error of unimodal object localization increased as the angular speed of the robot head increased. This is because the captured images were blurred and unreliable during the saccade. This phenomenon affected the result more and more significantly as the angular speed of the robot head increased, especially when the saccadic motion started and during changes of direction of motion. Table 1 and Fig. 4 summarize the calculated mean and SD of the error of unimodal and bimodal object localization. Table 1 also summarizes the calculated weights of unimodal object localization under the different conditions. The bimodal threshold $\sigma_{v,a}$ was calculated by comparing the result of bimodal localization with the ideal position, rather than by using Eq. 12. According to the results in Table 1, it is clear that the Bayesian model of sensory information fusion did improve the accuracy of object localization, as the mean error of bimodal object localization was smaller than the mean error of object localization by the visual system at all times. Also, during all three saccade tests, the auditory system became more reliable as it carried a lower error than the visual system. However, the results of object localization by the auditory system during all saccade tests were still not reliable, since they deviated greatly (2° to 7°) from the ideal position throughout the tests. As a result, the result of bimodal object localization was distorted greatly when both sensory systems were inaccurate. This clearly demonstrates the disadvantage of the Bayesian model: in the absence of a reliable sensory system, such as the auditory system in humans [2], the result of the Bayesian model can become inaccurate as well.

Fig. 3. (a) A portion of the results of unimodal and bimodal localization during the fixation test. (b)(c)(d) Results of the 20 deg/sec, 30 deg/sec and 60 deg/sec saccade tests, respectively. (e) Result of the condition specific calibration saccade test (adopted from [9]).
Table 1. Comparison of the mean and SD of the localization error of auditory, visual and Bayesian audiovisual localization, as well as the weights of unimodal localization [9].

Test                                   | Auditory mean (x̄_a) | SD (σ_a) | Weight (w_a) | Visual mean (x̄_v) | SD (σ_v) | Weight (w_v) | Audiovisual mean (x̄_v,a) | SD (σ_v,a)
Fixation                               | 0.7159 | 0.8732 | 0.6897 | 0.7683  | 1.3020  | 0.3103 | 0.6341 | 0.7905
20 deg/sec saccade                     | 1.9910 | 2.5389 | 0.4636 | 2.0713  | 2.3605  | 0.5364 | 1.7941 | 2.1015
30 deg/sec saccade                     | 2.7428 | 3.3786 | 0.6584 | 4.2705  | 4.6905  | 0.3416 | 3.1689 | 3.6338
60 deg/sec saccade                     | 6.6249 | 6.9261 | 0.7239 | 10.7557 | 11.2159 | 0.2761 | 7.7643 | 7.9769
Condition specific calibration saccade | 1.9197 | 2.3324 | 0.7132 | 2.9261  | 3.6779  | 0.2868 | 1.9492 | 2.3861
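As a worked check of Eq. (12) against Table 1, the short sketch below computes the predicted bimodal SD from the measured unimodal SDs in each row and compares it with the measured bimodal SD (which, as noted above, was obtained by comparison with the ideal position). The predicted values come out at or below the measured ones, consistent with the observation that fusion degrades when both systems are biased.

```python
# Unimodal and bimodal SDs copied from Table 1: (sd_a, sd_v, measured sd_va).
rows = {
    "fixation":            (0.8732, 1.3020, 0.7905),
    "20 deg/s saccade":    (2.5389, 2.3605, 2.1015),
    "30 deg/s saccade":    (3.3786, 4.6905, 3.6338),
    "60 deg/s saccade":    (6.9261, 11.2159, 7.9769),
    "calibrated saccade":  (2.3324, 3.6779, 2.3861),
}
for name, (sd_a, sd_v, sd_meas) in rows.items():
    # Eq. (12): predicted bimodal variance from the two unimodal variances.
    sd_pred = (sd_a**2 * sd_v**2 / (sd_a**2 + sd_v**2)) ** 0.5
    print(f"{name:20s} predicted {sd_pred:6.3f}   measured {sd_meas:6.3f}")
```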
Fig. 4. (a)(b)(c)(d)(e) Mean and SD of auditory, visual and audiovisual localization under the different conditions. (f) Comparison of the mean localization error. The mean error increased as the saccade speed increased in all tests, except in the last test, where the condition specific calibration helped to lower the mean error.

Hence, in order to overcome this weakness of the Bayesian model, condition specific calibration was proposed. After the condition specific calibration, the mean and SD of the unimodal object localization error became much smaller (see Fig. 4) compared to the 60 deg/sec saccade test. However, the disadvantage of this solution is that the humanoid robot must always be aware of the angular speed of its head, and the condition specific calibration values must be known. As in a biological system, altering the robot design will eventually alter the weights of all sensory systems.
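The condition specific calibration can be pictured as storing, for each motion condition, the mean unimodal localization errors measured during a calibration run and subtracting them before fusion. The sketch below only illustrates that idea; the bias values and condition names are hypothetical placeholders, not the robot's actual calibration data.

```python
# Hypothetical per-condition bias table (deg), e.g. obtained from the mean
# errors of a 60 deg/sec saccade calibration run; the numbers are placeholders.
bias = {
    "fixation":      {"visual": 0.0, "auditory": 0.0},
    "saccade_60deg": {"visual": 9.5, "auditory": 6.0},
}

def calibrated_estimates(s_v, s_a, condition):
    """Subtract the stored condition specific mean errors before fusion.
    The robot must know which condition (head angular speed) it is in."""
    b = bias[condition]
    return s_v - b["visual"], s_a - b["auditory"]

s_v_corr, s_a_corr = calibrated_estimates(s_v=18.0, s_a=14.0,
                                          condition="saccade_60deg")
```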
5. Conclusion
In this paper, the Bayesian model for multimodal sensory information fusion has been described. We show that Bayesian fusion of multisensory information is simple and can be applied in a humanoid robot to increase the accuracy of object localization. However, the current model requires prior knowledge of the reliability of the sensory systems, obtained through a series of experiments. Moreover, the Bayesian model failed to generate an accurate spatial position of the audiovisual stimulus when both sensory systems were unreliable, especially during saccades. Future work should focus on using neural networks and reinforcement learning to give the Bayesian fusion model the ability to learn the weights online.

ACKNOWLEDGMENTS
This research was funded by an Intel Research Grant from 2007 to 2009.

AUTHORS
Wei Kin Wong, Tze Ming Neoh, Chu Kiong Loo* - Centre for Robotics and Electrical Systems, Multimedia University, Jalan Ayer Keroh Lama, 75450 Melaka, Malaysia. E-mails: kin2031@yahoo.com, neohtm@gmail.com, ckloo@mmu.edu.my.
Ying Wei Liew, Eng Kean Lee - Intel Malaysia Sdn. Bhd., Bayan Lepas Free Industrial Zone, 11900 Penang, Malaysia. E-mails: ying.wei.liew@intel.com, eng.kean.lee@intel.com.
* Corresponding author

References
[1] Knill D.C., "Bayesian models of sensory cue integration". In: Doya K., Ishii S., Pouget A., Rao R.P.N. (eds.), Bayesian Brain: Probabilistic Approaches to Neural Coding, The MIT Press, Cambridge, 2007, pp. 189-206.
[2] Binda P., Bruno A., Burr D.C., Morrone M.C., "Fusion of visual and auditory stimuli during saccades: a Bayesian explanation for perisaccadic distortions". The Journal of Neuroscience, vol. 27, 2007, pp. 8525-8532.
[3] Deneve S., Pouget A., "Bayesian multisensory integration and cross-modal spatial links". Journal of Physiology-Paris, vol. 98, 2004, pp. 249-258.
[4] Burr D.C., Alais D., "Combining visual and auditory information". Progress in Brain Research, vol. 155, 2006, pp. 243-258.
[5] Battaglia P.W., Jacobs R.A., Aslin R.N., "Bayesian integration of visual and auditory signals for spatial localization". Journal of the Optical Society of America, vol. 20, 2003, pp. 1391-1397.
[6] Bolognini N., Rasi F., Làdavas E., "Visual localization of sounds". Neuropsychologia, vol. 43, 2005, pp. 1655-1661.
[7] Sommer K.-D., Kuhn O., Puente León F., Siebert B.R.L., "A Bayesian approach to information fusion for evaluating the measurement uncertainty". Robotics and Autonomous Systems, vol. 57, 2009, pp. 339-344.
[8] Hackett J.K., Shah M., "Multisensor fusion: a perspective". In: Proc. of the IEEE International Conference on Robotics and Automation, vol. 2, 1990, pp. 1324-1330.
[9] Wong W.K., Neoh T.M., Loo C.K., Ong C.P., "Bayesian fusion of auditory and visual spatial cues during fixation and saccade in humanoid robot". Lecture Notes in Computer Science, vol. 5506, 2008, pp. 1103-1109.
[10] Wu Y.-G., Yang J.-Y., Liu K., "Obstacle detection and environment modeling based on multisensor fusion for robot navigation". Artificial Intelligence in Engineering, vol. 10, 1996, pp. 323-333.
[11] Lopez-Orozco J.A., de la Cruz J.M., Besada E., Ruiperez P., "An asynchronous, robust, and distributed multisensor fusion system for mobile robots". The International Journal of Robotics Research, vol. 19, 2000, pp. 914-932.
[12] Toyama K., Horvitz E., "Bayesian modality fusion: probabilistic integration of multiple vision algorithms for head tracking". In: Proc. of the 4th Asian Conference on Computer Vision, 2000.
[13] Kobayashi F., Arai F., "Sensor fusion system using recurrent fuzzy inference". Journal of Intelligent and Robotic Systems, vol. 23, 1998, pp. 201-216.
[14] Pao L.Y., O'Neil S.D., "Multisensor fusion algorithms for tracking". In: Proc. of the American Control Conference, 1993, pp. 859-863.
[15] Beltrán-González C., Sandini G., "Visual attention priming based on crossmodal expectations". In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005, pp. 1060-1065.
[16] Bothe H.-H., Persson M., Biel L., Rosenholm M., "Multivariate sensor fusion by a neural network model". In: Proc. of the 2nd International Conference on Information Fusion, 1999, pp. 1094-1101.
[17] Nakashima H., Ohnishi N., "Acquiring localization ability by integration between motion and sensing". In: Proc. of the IEEE International Conference on Systems, Man, and Cybernetics, vol. 2, 1999, pp. 312-317.
[18] Körding K.P., Beierholm U., Ma W.J., et al., "Causal inference in multisensory perception". PLoS ONE, vol. 2, 2007, e943.
[19] Mishra J., Martinez A., Sejnowski T.J., Hillyard S.A., "Early cross-modal interactions in auditory and visual cortex underlie a sound-induced visual illusion". The Journal of Neuroscience, vol. 27, 2007, pp. 4120-4131.
[20] Sakagami Y., Watanabe R., Aoyama C., et al., "The intelligent ASIMO: system overview and integration". In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 3, 2002, pp. 2478-2483.
[21] Metta G., Sandini G., Vernon D., et al., "The iCub humanoid robot: an open platform for research in embodied cognition". In: Performance Metrics for Intelligent Systems Workshop, Gaithersburg, USA, 2008.
[22] Metta G., Gasteratos A., Sandini G., "Learning to track colored objects with log-polar vision". Mechatronics, vol. 14, 2004, pp. 989-1006.
[23] Berton F., A Brief Introduction to Log-Polar Mapping. LIRA-Lab, University of Genova, 2006.
[24] Natale L., Metta G., Sandini G., "Development of auditory-evoked reflexes: visuo-acoustic cues integration in a binocular head". Robotics and Autonomous Systems, vol. 39, 2002, pp. 87-106.
[25] Kee K.C., Loo C.K., Khor S.E., Sound Localization Using Generalized Cross Correlation: Performance Comparison of Pre-Filters. Centre of Robotics and Automation, Multimedia University, 2008.
REAL-TIME COORDINATED TRAJECTORY PLANNING AND OBSTACLE AVOIDANCE FOR MOBILE ROBOTS Received 23rd July 2010; accepted 25th November 2010.
Lucas C. McNinch, Reza A. Soltan, Kenneth R. Muske, Hashem Ashrafiuon, James C. Peyton Jones
Abstract: A novel method for real-time coordinated trajectory planning and obstacle avoidance of autonomous mobile robot systems is presented. The desired autonomous system trajectories are generated from a set of first order ODEs. The solution to this system of ODEs converges to either a desired target position or a closed orbit defined by a limit cycle. Coordinated control is achieved by utilizing the nature of limit cycles, where independent, non-crossing paths are automatically generated from different initial positions that smoothly converge to the desired closed orbits. Real-time obstacle avoidance is achieved by specifying a transitional elliptically shaped closed orbit around the nearest obstacle blocking the path. This orbit determines an alternate trajectory that avoids the obstacle. When the obstacle no longer blocks a direct path to the original target trajectory, a transitional trajectory that returns to the original path is defined. The coordination and obstacle avoidance methods are demonstrated experimentally using differential-drive wheeled mobile robots. Keywords: path planning, obstacle avoidance, ODE, mobile robots.
1. Introduction
A typical method for trajectory planning is to describe the trajectory by a time-varying transitional state variable that converges to the desired target from an arbitrary initial position. A well-defined system trajectory determined in this manner, however, should not have any discontinuities and should admit tracking controller designs that are able to feasibly follow the resulting trajectory. The coordination of a group of autonomous systems is an even more challenging problem because of the potential for dynamic interaction and collision between members of the group and/or obstacles. A number of different approaches for the coordination of a group of autonomous systems into user-defined formations have been proposed. These approaches can be categorized into virtual structure, behavior-based, and leader-follower techniques. A survey of recent research in cooperative control of multi-vehicle systems is presented in [1]. Much of the work in obstacle avoidance has focused on trajectory optimization, decentralized control approaches, behavior-based approaches, and potential field methods. A review of path planning and obstacle avoidance techniques is presented in [2]. In this work, a novel trajectory planning and obstacle avoidance approach that provides coordinated control of multiple autonomous agents is presented. The proposed strategy is composed of two parts
consisting of a coordinated trajectory planning algorithm [3] and a real-time obstacle avoidance algorithm [4]. The main advantage of this approach is a practical, computationally efficient trajectory planning algorithm with the capability for coordination of multiple autonomous agents and real-time collision avoidance for moving obstacles. To accomplish these tasks, transitional trajectories are defined using sets of ordinary differential equations (ODEs) of two general forms depending on the situation. If the objective is to catch and follow a desired target trajectory, the transitional trajectory is defined by a set of exponentially stable equations whose solution converges to the target trajectory. In the cases of coordinated motion and obstacle avoidance, the trajectory is defined by a set of equations whose solution is a stable limit cycle. A related approach to trajectory planning that is applicable to industrial robot manipulators is presented in [5]. Limit cycles of finite size and shape are used as a way of defining complex obstacles to be avoided. Applications of real-time navigation methods for mobile robots using limit cycles have also been addressed in [6], [7]. The limit cycles are defined as circles with constant coefficients, which may result in impractical control demand depending on the distance of the vehicle to the limit cycle. The use of constant coefficients in the limit cycle equations also applies only to stationary limit cycles. In this work, these approaches are expanded to dynamic limit cycles with more general elliptical geometries.
2. Mobile robot platform
Fig. 1. Experimental mobile robot.
The experimental mobile robot system used in this work is the differential-drive wheeled robot shown in Figure 1, based on the LEGO Mindstorms NXT controller. It has the ability for wireless communication between individual robots and a host computer. The coordinated trajectory planning and real-time obstacle avoidance application is developed for this platform in the Matlab/Simulink programming environment using the ECRobot rapid prototyping development tools [8]. The relatively low cost and high performance of this device result in a very powerful, flexible, and cost-effective mobile computing platform that can be used to implement advanced autonomous system applications directly on the target hardware. Further detail on this mobile robot platform can be found in [9].

A. Kinematic Model
The autonomous mobile robot consists of two independently actuated front drive wheels of radius r mounted on a common axis as shown in Figure 2. The track width d represents the length of the segment connecting the wheel centers. A passive caster supports the rear of the mobile robot. The three degree of freedom planar posture of the robot is described in the inertial reference frame by the vector $p = [x \;\, y \;\, \theta]^T$ and the motion of the mobile robot is subject to the nonholonomic constraint

$$\dot{y}\cos\theta - \dot{x}\sin\theta = 0 \qquad (1)$$

arising from the two-dimensional planar kinematics of a differential drive mobile robot. The planar motion is described by the velocity vector $q = [v \;\, \omega]^T$, where v is the forward velocity of the robot in the direction orthogonal to the drive-wheel axis and ω is the angular velocity of the robot about the center of the drive-wheel axis. The kinematic equations relating the body fixed velocities q to the inertial reference frame are written as

$$\dot{x} = v\cos\theta, \qquad \dot{y} = v\sin\theta, \qquad \dot{\theta} = \omega \qquad (2)$$
Fig. 2. Differential drive mobile robot schematic.
Assuming there is no wheel slip, the body fixed velocities are related to the angular velocities of the drive wheels by

$$v = \frac{r\,(\omega_R + \omega_L)}{2}, \qquad \omega = \frac{r\,(\omega_R - \omega_L)}{d} \qquad (3)$$

where $\omega_L$ and $\omega_R$ are the angular velocities of the left and right wheels, respectively [10].

B. Position Feedback
Two methods of position feedback are available for the experimental mobile robots. The first uses encoder feedback from each motor to determine the wheel velocities. The current forward velocity v and angular velocity ω are then computed using equation (3), and the current position in the inertial reference frame is determined using the kinematic model presented in equation (2). A discrete-time approximation of the current pose of the robot at sample k is
(4)

where $p_k = [x_k \;\, y_k \;\, \theta_k]^T$ and T is the sample period. This method assumes that there is no wheel slip and that the mobile robot travels on a perfectly flat planar surface. Because there are typically wheel slip and other unmeasured disturbances present, a more accurate method to determine the robot position, based on real-time processing of a camera image, is also available. A digital black and white camera mounted at a fixed position above the mobile robots is used to view the two infrared LEDs installed at the front and rear of the centerline of the robot, as shown in Figure 1. Each camera frame provides a measurement of the position and orientation of each robot that is determined from the pixel locations of the LEDs [9]. Position measurements are available at approximately 20 Hz using the camera and at least 100 Hz using the encoders. The advantage of using encoder feedback is that the position can be determined locally, without the need to transmit a camera-based position to each robot from the host computer at the 20 Hz sample rate of the camera. The advantage of using camera feedback is that it is an absolute measurement that is not subject to error from wheel slip, uneven surfaces, collisions, and other disturbances. A combination of encoder and camera feedback using multi-rate estimation schemes can be employed to obtain the benefits of each measurement while minimizing the communication requirements; however, this technique is not considered in this work.

C. Local Tracking Control
The reference posture and velocity vectors describing the desired trajectory are defined at each sample period, where the reference posture $p_k^r = [x_k^r \;\, y_k^r \;\, \theta_k^r]^T$ and velocity $q_k^r = [v_k^r \;\, \omega_k^r]^T$ vectors must adhere to the nonholonomic constraint equation (1). The position and orientation
tracking error vector $e_k^p = [e_k^x \;\, e_k^y \;\, e_k^\theta]^T$ is expressed as

(5)

where the current pose of the robot is $p_k$. Various control laws for tracking and position control of mobile robots have been developed. In this work, the following kinematic tracking control law proposed in [11] is used

(6)

where the tracking errors are defined in equation (5), $v_k^s$ is the corrected forward velocity setpoint, $\omega_k^s$ is the corrected angular velocity setpoint, and the controller gains $k_x$, $k_y$ and $k_\theta$ are strictly positive constants. The corresponding wheel velocity setpoints, which compensate for any tracking error in the reference trajectory, are calculated by inverting equation (3)

(7)

where the rotational speed setpoints $\omega_L^s$ and $\omega_R^s$ are sent to the individual wheel speed controllers. An extension of this controller to motor torque and position setpoint control is proposed in [12].
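The sketch below illustrates the control pipeline of this section: wheel speeds to body velocities (Eq. (3)), Euler integration of the unicycle kinematics for encoder-based odometry, the kinematic tracking law in the form usually quoted from Kanayama et al. [11] (standing in for Eqs. (5)-(6), whose exact printed form is not reproduced in the extracted text), and the wheel speed setpoints obtained by inverting Eq. (3), as in Eq. (7). The wheel radius, track width and gains are illustrative values, not the experimental robot's parameters.

```python
import numpy as np

def body_from_wheels(w_l, w_r, r=0.03, d=0.12):
    """Eq. (3): body velocities from wheel angular velocities (placeholder r, d)."""
    v = r * (w_r + w_l) / 2.0
    w = r * (w_r - w_l) / d
    return v, w

def integrate_pose(p, v, w, T):
    """Euler discretization of the unicycle kinematics (Eq. (2))."""
    x, y, th = p
    return np.array([x + T * v * np.cos(th),
                     y + T * v * np.sin(th),
                     th + T * w])

def tracking_setpoints(p, p_ref, q_ref, gains=(1.0, 5.0, 2.0), r=0.03, d=0.12):
    """Kinematic tracking law in the standard form of Kanayama et al. [11];
    the body-frame error transform plays the role of Eq. (5) and the wheel
    speed inversion the role of Eq. (7).  Gains are illustrative."""
    kx, ky, kth = gains
    x, y, th = p
    xr, yr, thr = p_ref
    vr, wr = q_ref
    # Tracking error expressed in the robot body frame
    ex = np.cos(th) * (xr - x) + np.sin(th) * (yr - y)
    ey = -np.sin(th) * (xr - x) + np.cos(th) * (yr - y)
    eth = thr - th
    vs = vr * np.cos(eth) + kx * ex
    ws = wr + vr * (ky * ey + kth * np.sin(eth))
    # Invert Eq. (3) to obtain the individual wheel speed setpoints
    wls = (vs - d * ws / 2.0) / r
    wrs = (vs + d * ws / 2.0) / r
    return vs, ws, wls, wrs
```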
3. Experimental application
An experimental verification of a novel method for combining trajectory planning and coordination or formation control of robotic and autonomous systems, presented in [3] for autonomous surface vessels, is demonstrated in this section. The method generates target trajectories that are either asymptotically stable or result in a stable limit cycle. The former case is used to implement formation control. Coordination is guaranteed in the latter case due to the nature of limit cycles, where non-crossing independent paths are automatically generated from different starting positions that smoothly converge to closed orbits.
Fig. 4. Open-loop and closed-loop paths.

A. Trajectory Tracking Performance
A demonstration of the mobile robot tracking control performance is provided by defining a reference trajectory that approaches and then follows a circular orbit with a diameter of 1 meter, represented by the solid line in Figure 3. The mobile robot must avoid two obstacles in this path, which are contained within the circles indicated by the dashed lines in Figure 3. These two circles represent the obstacle boundaries that should not be crossed. Figure 4 presents the desired reference and experimental mobile robot paths with and without tracking control. As shown in this figure, the tracking error increases as the robot turns without encoder feedback tracking control. The maximum tracking error is reduced more than five-fold using the closed-loop tracking control law presented in equation (6). The corresponding tracking errors are presented in Figure 5.
Fig. 5. Trajectory tracking errors.
Fig. 3. Trajectory tracking example.
B. Encoder and Camera Image Position Feedback
Figure 6 compares the desired reference, encoder, and camera image paths for the mobile robot under tracking control using encoder feedback on a flat surface with minimal wheel slip. As shown in this figure, there is very little difference between the position determined from the encoders and that determined from the camera image. Figure 7 presents a similar scenario, except that there is significant wheel slip, obtained by using plastic wheels, as opposed to rubber, on a slick surface and specifying
a trajectory with a smaller turning radius. A large error in the encoder position is now present under these conditions. In the sequel, the conditions in Figure 6 are used such that encoder feedback will provide an acceptable position measurement. The advantage of encoder feedback is that the position can be determined locally by each robot without the need to transmit a camera-based position to each robot at the 20 Hz sample rate of the camera.
Fig. 6. Closed-loop trajectory with camera position.

Fig. 7. Closed-loop tracking error with slip.

C. Reference Trajectory Determination
The reference trajectory for each mobile robot is defined using a set of two ordinary differential equations (ODEs) in terms of the two planar global position variables x and y

(8)

where $x_1$ and $x_2$ are the state variables, $D_i$ is an open and connected subset of $\mathbb{R}^2$, and $f_i$ is a locally Lipschitz map from $D_i$ into $\mathbb{R}^2$. The state variables $x_1$ and $x_2$ are defined as

(9)

where the definition of $x^*$ and $y^*$ and the form of the functions $f_1$ and $f_2$ are determined based on the type of reference trajectory. The reference posture $p^r$ and velocity $q^r$ are obtained from the state variables by

(10)

(11)

where the expressions for $\theta^r$ and $\omega^r$ arise from the nonholonomic constraint in equation (1). Two general types of reference trajectory are presented in this work. The first is implemented when the mobile robot is required to reach a target position, which may be either moving or stationary. This type of reference trajectory is referred to as a Transitional Target Trajectory. The second is implemented when the mobile robot is required to smoothly converge to and then follow a closed orbit. This type is referred to as a Transitional Limit Cycle Trajectory.

D. Transitional Target Trajectory
The transitional target trajectory defines a reference trajectory for which the mobile robot path converges exponentially to a stationary or moving target position. The desired reference position is denoted by $x_r(t)$ and $y_r(t)$ and the target position is denoted by $x_t(t)$ and $y_t(t)$ in the inertial reference frame. The state variables introduced in equation (9) are defined as

(12)

such that when the ODEs in equation (8) are asymptotically or finite time stable, $x_1(t), x_2(t) \rightarrow 0$, which implies that $x_r \rightarrow x_t(t)$ and $y_r \rightarrow y_t(t)$. The exponentially stable transitional target trajectory used in this work is defined by the following linear ODE system

(13)
where $t_0$ is the initial time. The strictly positive parameters $k_1(t)$ and $k_2(t)$ are assumed to be functions of time in order to generate desired reference trajectories that conform to the physical constraints of the mobile robot system. In general, they are selected as monotonically increasing positive time functions that start from a small value and are defined based on actuator limitations and the distance to the target. Fifth order polynomials allow for smooth and monotonic transitions of these parameters to their final values. The following polynomial may be used to compute each parameter $k_i$, $i = 1, 2$ during the transition period $t_0 \le t \le t_1$

(14)

where $\Delta t = t - t_0$. The following boundary conditions are selected to smoothly increase the value of $k_i$ from 1% to 100% of its final value $\bar{k}_i$
(15)

where $\dot{k}_i$ and $\ddot{k}_i$ are the 1st and 2nd time derivatives of $k_i$. The six polynomial coefficients $k_{ij}$, $j = 0, \ldots, 5$ are derived using the six boundary conditions specified in equation (15). Note that $k_i = \bar{k}_i$ for $t > t_1$, and $t_1$ is selected based on the initial distance of the target from the vessel.

E. Transitional Limit Cycle Trajectory
The transitional limit cycle trajectory defines a globally exponentially stable limit cycle to which the desired reference trajectory will converge. The state variables introduced in equation (9) for the transitional limit cycle trajectory are defined as

(16)

where $x_O(t)$ and $y_O(t)$ denote the position of the origin of the limit cycle in the inertial reference frame. The origin of the limit cycle is assumed to be a function of time to allow for dynamic obstacles. The transitional limit cycle trajectory has the following form

(17)

where $\lambda(x_1, x_2, t)$ defines the limit cycle geometry, which may be an explicit function of time to account for obstacle planar rotation. The functions $h_1(x_1, x_2, t)$ and $h_2(x_1, x_2, t)$ represent the planar particle motion kinematics on the limit cycle, i.e. when $\lambda(x_1, x_2, t) = 0$. The solution of equation (17) guarantees that any trajectory with an initial position outside of the limit cycle will converge to the limit cycle without crossing it. The positive parameters $k_1(t)$ and $k_2(t)$ are again assumed to be functions of time in order to generate realizable trajectories and are derived using equation (14).

F. Elliptical Limit Cycles
In this work an elliptical limit cycle geometry is used. This is a very general shape which can be used for obstacle avoidance, coordinated control, aerial coverage or any number of other trajectory planning applications. The limit cycle $\lambda(x_1, x_2, t)$ is defined by the general equation of an ellipse with semi-major and semi-minor axes a and b, respectively, and origin at $(x_O(t), y_O(t))$

(18)

where $x_1(t)$ and $x_2(t)$ are defined in equation (16) and $\phi(t)$ is the angle representing the orientation of the ellipse semi-major axis relative to the global horizontal axis. This angle can be time dependent if the desired limit cycle is rotating. The functions $h_1(x_1, x_2, t)$ and $h_2(x_1, x_2, t)$ are defined based on the motion of a particle around an ellipse

(19)

where $\Omega(t)$ is the average angular velocity of the particle on the limit cycle. Note that $\Omega(t) > 0$ represents counter-clockwise (CCW) rotation and $\Omega(t) < 0$ represents clockwise (CW) rotation. The time derivative of equation (19) is

(20)

where $\dot{\Omega}$ is the time derivative of $\Omega$. The functions $h_1(x_1, x_2, t)$ and $h_2(x_1, x_2, t)$ are derived by eliminating $\cos\Omega t$ and $\sin\Omega t$ from equations (19) and (20) and are given by

(21)

where

(22)

The average angular velocity $\Omega(t)$ is assumed to be monotonically increasing in magnitude such that $\Omega + \dot{\Omega} \ne 0$. Again, a fifth order polynomial is used to transition $\Omega(t)$ from 1% to 100% of its final constant value $\bar{\Omega}$ in the selected transition period $t_0 \le t \le t_1$

(23)

where $\Delta t = t - t_0$. The following boundary conditions

(24)

are used to derive the six polynomial coefficients $\Omega_i$, $i = 0, \ldots, 5$.
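The fifth order transition polynomial of Eqs. (14)-(15) and (23)-(24) can be computed from its six boundary conditions. The sketch below assumes the common choice of conditions consistent with the text (1% and 100% of the final value at the ends of the transition, with zero first and second derivatives there); the paper's exact Eq. (15) is not reproduced, so this is only an illustrative construction.

```python
import numpy as np

def quintic_ramp_coeffs(final_value, t0, t1, start_fraction=0.01):
    """Coefficients c_j of k(t) = sum_j c_j*(t - t0)**j that ramps a trajectory
    parameter from start_fraction*final_value at t0 to final_value at t1,
    with zero 1st and 2nd derivatives at both ends (assumed boundary set)."""
    dT = t1 - t0
    # Rows: value, 1st and 2nd derivative at dt = 0, then the same at dt = dT.
    A = np.array([
        [1, 0,    0,       0,        0,         0],
        [0, 1,    0,       0,        0,         0],
        [0, 0,    2,       0,        0,         0],
        [1, dT,   dT**2,   dT**3,    dT**4,     dT**5],
        [0, 1,    2*dT,    3*dT**2,  4*dT**3,   5*dT**4],
        [0, 0,    2,       6*dT,     12*dT**2,  20*dT**3],
    ])
    b = np.array([start_fraction * final_value, 0, 0, final_value, 0, 0])
    return np.linalg.solve(A, b)

def evaluate(coeffs, t, t0, t1):
    """Evaluate the ramp; the parameter is held at its final value for t > t1."""
    dt = np.clip(t - t0, 0.0, t1 - t0)
    return sum(c * dt**j for j, c in enumerate(coeffs))

# Example: ramp the limit-cycle angular velocity to 0.45 rad/s over 3 s.
c = quintic_ramp_coeffs(final_value=0.45, t0=0.0, t1=3.0)
```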
4. Coordinated control
An application of the transitional limit cycle trajectory is coordinated control. The trajectories defined in equation (17), which converge to stable limit cycles, are ideal for coordinated control because all paths to the limit cycle are independent and non-crossing due to the uniqueness of the solution to the ODEs. The following experimental example considers the coordinated motion of three identical mobile robots that join an elliptical reference trajectory at spaced intervals and maintain the elliptical motion. The mobile robots are initially located at the (x, y) positions (-1, 0), (0.5, 0.5), and (0.5, -0.5), expressed in meters. Each robot joins the elliptical reference trajectory with semi-major and semi-minor axes of a = 0.5 m and b = 0.3 m, respectively, oriented with φ = -30 degrees, its center at (0, 0), and a desired angular velocity of Ω = 0.45 rad/sec. The final values of the trajectory parameters are selected as k1 = k2 = 0.8. Figure 8 presents the actual trajectories of the three mobile robots under tracking control using encoder feedback. The trajectories shown remain within 2 cm of the desired reference trajectory for the duration of the experiment. The markers on each trajectory represent the mobile robot positions at 2.5 second intervals for the first ten seconds of the trajectory. The velocities during the initial portion of the trajectory, as the mobile robot joins the limit cycle, depend on the initial distance from the closed orbit. Robots that start closer to the closed orbit proceed at a slower velocity, while those that start further
away must approach at a higher velocity, such that all three robots converge to the limit cycle in the proper orientation. Note that these velocities are automatically generated by the ODEs in equation (17). The trajectories presented in Figure 8 are calculated based on the encoder feedback used by the control law. However, because of wheel slip, changes in elevation of the ground surface and/or other external disturbances, the trajectory that the robot actually follows may be different from that calculated based on encoder feedback. Figure 9 presents the reference and actual trajectories of the robot which started at (1.5, 0.5), including the camera image trajectory. This figure shows that, in contrast with the trajectory presented in Figure 6, under some circumstances the actual trajectory of the mobile robot using encoder-based feedback may deviate from the desired reference trajectory. It may become necessary to include the camera image feedback in the tracking control law when this deviation becomes significant.
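To picture why trajectories generated this way are suitable for coordination, the sketch below integrates a textbook vector field with a globally attracting circular limit cycle from the three initial positions used in the experiment. It is not the authors' Eq. (17), only an illustration of the non-crossing convergence property that follows from the uniqueness of ODE solutions.

```python
import numpy as np

def limit_cycle_field(x, y, radius=0.5, k=1.0, omega=0.9):
    """A vector field with a globally attracting circular limit cycle of the
    given radius; a textbook construction, not the paper's Eq. (17)."""
    lam = (x**2 + y**2) / radius**2 - 1.0   # lam = 0 on the closed orbit
    dx = -omega * y - k * x * lam           # tangential motion + radial pull
    dy = omega * x - k * y * lam
    return dx, dy

def simulate(x0, y0, T=20.0, dt=0.01):
    """Integrate one robot's reference trajectory from its initial position."""
    xs, ys = [x0], [y0]
    for _ in range(int(T / dt)):
        dx, dy = limit_cycle_field(xs[-1], ys[-1])
        xs.append(xs[-1] + dt * dx)
        ys.append(ys[-1] + dt * dy)
    return xs, ys

# Three robots started from different positions converge to the same closed
# orbit along paths that never cross each other.
paths = [simulate(*p0) for p0 in [(-1.0, 0.0), (0.5, 0.5), (0.5, -0.5)]]
```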
Fig. 8. Experimental coordinated robot trajectories.

Fig. 9. Elliptical trajectory with camera position data.

5. Obstacle avoidance
An application of both the transitional limit cycle and transitional target trajectories is the problem of obstacle avoidance. In this application, a mobile robot is commanded to converge on a target position in the presence of several stationary or moving obstacles. The obstacles can be approximated by elliptical limit cycles as developed in Section III-F. At each sample time, the nearest obstacle on a straight-line path from the mobile robot to its target is detected. Real-time obstacle avoidance is carried out by transitioning from the target tracking trajectory to a trajectory that approaches the nearest obstacle in the original path. As soon as the mobile robot is around the obstacle, the trajectory is switched either to another limit cycle approximating and enclosing the next obstacle or to the original target-tracking trajectory. In the following experimental example, the mobile robot starts at the origin and its target position is (1.5, 0). Obstacles are located at (0.5, 0) and (1, -0.1) and are approximated by circular limit cycles with radius r = 0.15. Figure 10 shows the desired reference, encoder, and camera image paths for the robot under tracking control with encoder feedback. The numbered markers along the desired reference trajectory represent the desired positions at four different times. Marker 1 represents the initial position of the mobile robot and marker 4 represents the target position. At the first instant of the experiment, the obstacle avoidance algorithm recognizes that obstacle 1 is the closest obstacle lying in the straight-line path to the target, and a transitional limit cycle trajectory surrounding the obstacle is generated. The mobile robot approaches obstacle 1 following the transitional limit cycle trajectory until the next obstacle is detected in its path. A new transitional limit cycle trajectory is generated beginning at marker 2. This trajectory is followed until marker 3, at which point the obstacle no longer lies in the direct path to the target. Finally, the transitional target trajectory defined in Section III-D is used to converge on the final target position.

Fig. 10. Experimental obstacle avoidance trajectories.

6. Conclusions
A novel method combining coordinated control and real-time obstacle avoidance for autonomous systems is presented and experimentally verified using mobile robots. The method generates transitional reference trajectories as a function of the current system trajectory and the desired target trajectory. The method can be used for coordinated control of several robots, guaranteeing unique collision-free paths and coordinated operation. It can also be used for real-time obstacle avoidance and target tracking applications.
ACKNOWLEDGMENTS
Support for this work from the Office of Naval Research under ONR Grant N00014-04-1-0642, the National Science Foundation under NSF Grant DUE-0837637, and the Center for Nonlinear Dynamics and Control (CENDAC) at Villanova University is gratefully acknowledged.
AUTHORS
Lucas C. McNinch*, Reza A. Soltan, Kenneth R. Muske, Hashem Ashrafiuon, James C. Peyton Jones - Center for Nonlinear Dynamics and Control, Villanova University, Villanova, PA 19085 USA. E-mails: lucas.mcninch@villanova.edu, reza.aliakbarsoltan@villanova.edu, kenneth.muske@villanova.edu, hashem.ashrafiuon@villanova.edu, james.peyton-jones@villanova.edu.
* Corresponding author

References
[1] R. Murray. Recent research in cooperative control of multivehicle systems. Journal of Dynamic Systems, Measurement, and Control, 2007, 129(5), pp. 571-583.
[2] V. Kunchev, L. Jain, V. Ivancevic, A. Finn. Path planning and obstacle avoidance for autonomous mobile robots: A review. In: Proc. of the 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, volume 4252 of Lecture Notes in Artificial Intelligence, Springer-Verlag, 2006, pp. 537-544.
[3] R. Soltan, H. Ashrafiuon, K. Muske. Trajectory planning and coordinated control of robotic systems. In: Proc. of the 2009 ASME IDETC/CIE Conference, 2009.
[4] R. Soltan, H. Ashrafiuon, K. Muske. Trajectory real-time obstacle avoidance for underactuated unmanned surface vessels. In: Proc. of the 2009 ASME IDETC/CIE Conference, 2009.
[5] L.P. Ellekilde, J. Perram. Tool center trajectory planning for industrial robot manipulators using dynamical systems. The International Journal of Robotics Research, 2005, 24(5), pp. 385-396.
[6] D.H. Kim, J.H. Kim. A real-time limit-cycle navigation method for fast mobile robots and its application to robot soccer. Robotics and Autonomous Systems, 2003, 42(1), pp. 17-30.
[7] R. Grech, S.G. Fabri. Trajectory tracking in the presence of obstacles using the limit cycle navigation method. In: Proc. of the 20th IEEE International Symposium on Intelligent Control and the 13th Mediterranean Conference on Control and Automation, 2005, pp. 101-106.
[8] T. Chikamasa. Embedded Coder Robot NXT instruction manual, 2009. www.mathworks.com/matlabcentral/fileexchange/13399/.
[9] L. McNinch, R. Soltan, K. Muske, H. Ashrafiuon, J. Peyton Jones. An experimental mobile robot platform for autonomous systems research and education. In: Proc. of the 14th IASTED International Conference on Robotics and Applications, in press, 2009.
[10] G. Antonelli, S. Chiaverini, G. Fusco. A calibration method for odometry of mobile robots based on the least-squares technique: Theory and experimental validation. IEEE Transactions on Robotics, 2005, 21(5), pp. 994-1004.
[11] Y. Kanayama, Y. Kimura, F. Miyazaki, T. Noguchi. A stable tracking control method for an autonomous mobile robot. In: Proc. of the 1990 IEEE International Conference on Robotics and Automation, 1990, pp. 384-389.
[12] R. Fierro, F.L. Lewis. Control of a nonholonomic mobile robot: Backstepping kinematics into dynamics. Journal of Robotic Systems, 1997, 14(3), pp. 149-163.
A FRAMEWORK FOR UNKNOWN ENVIRONMENT MANIPULATOR MOTION PLANNING VIA MODEL BASED REALTIME REHEARSAL Received 18th June 2010; accepted 29th September 2010.
Dugan Um, Dongseok Ryu
Abstract: In this paper, we propose a novel framework for unknown environment planning for manipulator type robots. Unknown environment motion planning, by its nature, requires a sensor based approach. The problem domain of unknown environment planning, when compared to model based approaches, is notoriously harder (NP-hard), demanding greater technical depth especially for difficult cases. The framework we propose herein is a sensor based planner composed of a sequence of multiple MBPs (Model Based Planners), in the spirit of cognitive planning using realtime rehearsal. That is, one can use a certain model based planner as a tactical tool to attack location specific problems in the overall planning endeavor for an unknown environment. The enabling technology for the realtime rehearsal is a sensitive skin type sensor introduced in this paper. We demonstrate the feasibility of solving a difficult unknown environment problem using the introduced sensor based planning framework. Keywords: sensor based planning, randomized sampling, unknown environment motion planning, collision avoidance, cognitive planning.
1. Introduction
Unknown environment motion planning is one of the most daunting tasks in path planning research. Sensor based approaches have been the dominant trend in the study of unknown environment planning for decades. When it comes to unknown environment planning, a planner calls for continuous perception and planning, thereby closing the loop between sensing and actuation. For instance, SLAM (Simultaneous Localization and Mapping) is one of the trends for solving mobile robot navigation problems in unknown environments. In manipulator planning, especially for unknown environments, a similar notion is applicable to solving difficult cases. For instance, in [1], Lee and Choset extended the GVG concept to the HGVG to solve higher order unknown environment planning problems. The main point of that study is to introduce a new roadmap (HGVG) construction method by which a systematic roadmap of the free configuration space can be incrementally constructed using line-of-sight sensor data. Another example is introduced in [2], where simultaneous path planning and free space exploration is proposed using a skin type sensor. In their approach, a robot is assumed to be equipped with a sensitive skin. The roadmap approach proposed by Yu and Gupta [3] solves sensor-based planning problems for articulated
robots in unknown environments. They incrementally build a roadmap that represents the connectivity of the free c-space. However, the usefulness of a collision sensor attached to the end-effector for detecting all possible collisions is uncertain. Sensor based planners have been advanced by numerous technological breakthroughs in sensor technology, including laser scanners, vision sensors, proximity sensors, etc. Despite today's advanced sensor technologies, unknown environment planning still falls short of a full-fledged solution, especially for difficult cases. On the other hand, model based planners are also limited in their capability to deal with certain planning problems due to the absence of a realtime environment mapping capability. The sensitive skin [3] proposed by Lumelsky, and innovated thereafter by Chung and Um [5], poses itself as a tool to bridge the gap between the model based planning and unknown environment planning domains. It is unique in that it can either report an impending collision with any part of the body at any moment or impart a realtime environment mapping capability during the course of a planning operation. Endowed with such uniqueness, the sensitive skin results in a SLAM capability for many-degree-of-freedom robotic manipulators. In addition, we believe that the sensitive skin is unique in that many advanced strategies studied in the model based planning domain can be utilized to solve unknown environment planning problems, including problems with difficult cases. We also believe that if a sensor based planner utilizes model based planning strategies in a case sensitive manner, the majority of planning problems can be resolved (see Figure 1). To that end, we propose a novel framework of sensor based planning constituting the totality as a sequence of model based planners using a sensitive skin and realtime rehearsal in a cognitive sense. In each step of the sequence, one can use a particular model based planner tailored to attack the location specific problems perceived by the realtime sensing capability of a skin type sensor. The notion of a cognitive planner imparts the decision capability so that the planner can select the best strategy case by case in each local area to increase the probability of overall convergence (see Figure 2). For instance, if the perceived local area at a step contains reasonably open space, one can choose PRM [4] or RRT [7] for fast sampling with evenly distributed sample points, while importance sampling or adaptive sampling [8] can be selected for path finding in areas with difficulties. For better results in narrow passages with many degrees of freedom, the bridge test algorithm [9] can be selected for a more effective path search.
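A schematic sketch of the case-by-case MBP selection described above is shown below: given simple statistics of the locally sensed c-space patch, a planner type is chosen. The statistics, thresholds and planner labels are illustrative assumptions; the framework's actual criterion is the spatial distribution affordance measured through the realtime rehearsal.

```python
def select_planner(free_space_ratio, narrow_passage_suspected):
    """Pick a model-based planner (MBP) for the locally sensed c-space patch.
    Thresholds and labels are placeholders for illustration only."""
    if narrow_passage_suspected:
        return "bridge_test"        # bridge test sampling [9]
    if free_space_ratio > 0.6:
        return "uniform_prm_rrt"    # PRM [4] / RRT [7] style uniform sampling
    return "adaptive_sampling"      # importance / adaptive sampling [8]

def plan_step(local_cspace_map):
    """One perceive-then-plan step of the rehearsal loop (schematic).
    `local_cspace_map` is assumed to expose the two statistics used below."""
    return select_planner(local_cspace_map.free_ratio,
                          local_cspace_map.narrow_passage)
```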
Perception objective: Obtain environmental affordance. Action objective: In crowded area, increase traversability. In less crowded area, increase manipulability and observability.
Fig. 1. Sensitive skin as a tool to bridge the gap of two planning domains. The new framework proposed in this paper is to host multiple MBPs as tactical tools within which the best local algorithm will be eclectically selected based on spatial examination. Therefore, of importance in the proposed planner is the cognitive sense of spatial perception for the optimality in the MBP selection process.
The environmental affordance we measure is the spatial distribution of sampled spaces. To that end, the realtime rehearsal via a sensitive skin type sensor takes place to generate a local c-space map and to examine the spatial distribution affordance. Increased manipulability and observability, as well as traversability, result in probabilistically higher reachability to the goal position. In order to meet the action objective, we propose an action schema of multiple MBPs, among which the best suited one will be exercised. In summary, the robotic action selection will occur as the action schema in the local area, which is cognitively connected to the perception stage by the affordance relations. Before we discuss the proposed framework, the details of the IPA skin sensor are discussed in Section 2, followed by the algorithm and simulation results in Section 3.
2. IPA Skin Sensor
Fig. 2. Concept of sensor based planning framework. When the planning expands its scope to perception and action, sampling and path finding problems will fall into a cognitive problem domain. In the recent literature, a trend is often evident of probabilistic motion planning expanding its scope from planning to perception and action in c-space for better manipulability. As detailed in [10], for example, the planner moves the mobile base to a less cluttered area to obtain better manipulability and achieve a higher success ratio in grasping path planning. In [11], a next-best-view search is discussed to increase the success ratio, in which the constructed 3D spatial information of a workspace is utilized.
The enabling technology for the realtime rehearsal is a sensitive skin type perception sensor, namely the IPA (Infrared Proximity Array) skin sensor from DINAST (Figure 3). Included in Figure 4 are some demonstration pictures of the IPA skin sensor capability. The 3D depth measurement capability of the sensor is shown in the pictures at the upper left and right corners. With this capability, the proposed framework enables instantaneous 3D virtual map generation for locally sensed areas. The lower left picture depicts a test of the finger tracking function of the IPA sensor to demonstrate realtime operation capability. Finally, the one in the lower right corner is the reconstructed 3D surface map of a human face, showing the precision surface modeling capability of the IPA sensor. From the unknown environment planning standpoint, the IPA skin sensor is unique in that it can impart a collision shield (see Figure 5) to an entire manipulator, so that the IPA equipped robot can sample an arbitrary c-space in realtime, thereby performing realtime collision checks in the workspace. One example of an IPA sensitive skin equipped robot is shown in Figure 6. In summary, the IPA sensor endows the realtime rehearsal capability to a multi-DOF robotic manipulator with the following features: 1. Collision shield; 2. Realtime workspace mapping.
In the framework we propose, for demonstration purpose, global planning objective is rather simple and similar to that of the potential field planner, thus the robot is simply biased toward a goal seeking the shortest path, while the local planning objective is divided into perception and action objectives as below. Fig. 3. IPA skin sensor, sensing range.
depending on the application, at the cost of resolution loss. The active retina principle can also be adopted for regional magnification in highly variant areas. In addition, the IPA sensitive skin is equipped with a fish-eye lens with image dewarping capability, thus imparting complete hemisphere coverage for maximum sensing capability (Figure 3).
Fig. 4. IPA skin sensor demonstration.
Table 1. IPA sensitive skin sensor specification.
Description         Specification
Size                6.5 cm x 4.5 cm x 4 cm
Sensing range       350 mm
Sensing rate        20 Hz at 640x480 resolution
Measurement error   ±8 mm at 350 mm
Viewing angle       hemisphere, complete coverage
2.2. Cross talk prevention between modules
Prevention of crosstalk between sensor modules is one concern when multiple IPA sensitive skin sensors are put together to form a collision shield. For instance, IR light emitted from one sensor can be seen by another IPA module as an object, which results in a false alarm. In order to prevent crosstalk, a higher level control logic is needed for robust sensing and 3D mapping. Figure 7 shows the firing diagram based on groups of sensors that have little or no correlation in sensitivity due to their geometric configuration. The higher level controller drives each group of sensors in sequence, based on the firing diagram in Figure 7, to prevent crosstalk. The firing diagram results in a 50% increase in sampling time due to the sequential IR light emission. Frequency modulation can be used for minimum loss in sampling time if more groups of sensors are needed.
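To make the group-firing idea concrete, the sketch below cycles through sensor groups so that only mutually non-interfering modules emit IR light in the same time slot. The group membership, module identifiers and the 20 Hz frame rate are illustrative assumptions, not details of the IPA firmware.

```python
import time

# Hypothetical grouping: modules whose fields of view overlap are placed in
# different groups so that they never emit IR light at the same time.
SENSOR_GROUPS = [
    ["s0", "s2", "s4"],   # group A: mutually non-overlapping modules
    ["s1", "s3", "s5"],   # group B: fired in the next time slot
]
FRAME_PERIOD = 1.0 / 20.0                      # nominal 20 Hz frame (assumed)
SLOT = FRAME_PERIOD / len(SENSOR_GROUPS)       # sequential firing halves the rate

def fire(module_id):
    """Placeholder for triggering one module's IR emission and depth read."""
    return {module_id: 0.0}                    # depth-map stub

def scan_once():
    """Fire each group in sequence; modules within a group fire together."""
    depth = {}
    for group in SENSOR_GROUPS:
        for module in group:
            depth.update(fire(module))
        time.sleep(SLOT)                       # wait before the next group emits
    return depth

if __name__ == "__main__":
    print(scan_once())
```

With two groups, the sequential emission doubles the scan period, which matches the 50% sampling-time penalty mentioned above.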
Fig. 5. The concept of collision shield.
Fig. 7. Multiple Sensor Firing Diagram for crosstalk prevention.
3. RRT-Cabomba planner
Fig. 6. IPA sensitive skin installation on Adept robot.
2.1. IPA sensitive skin sensor specification. The IPA sensitive skin is a DSP technology based smart sensor with an array of about 0.3 million sensing elements synchronised with each other. At the moment, the refresh rate of the sensor is 20 ms for realtime operation. The sensor itself is compact in size, and thus can conform to any surface configuration, with 3D realtime surface modeling power to support advanced motion planning capability. The detailed sensor specification is given in Table 1. The frame rate of the IPA sensitive skin is adjustable
The planner we propose to implement the concept in Figure 2 and to maximize the usefulness of the IPA sensitive skin is a RRT variant, namely RRT-Cabomba. The planner resembles how a natural aquatic cabomba stretches in water (Figure 9). RRT-Cabomba searches an unknown space based on local probing via the IPA skin and develops a local map in a virtual workspace perceived in sensing range. In each local virtual workspace, RRTCabomba grows RRT to examine spatial distribution. Depending on the result of the spatial distribution, RRT-Cabomba will determine which MBP is the most suitable for the sensed local area. Search completeness in local area will be granted as long as each MBP used is a complete planner, but the global search completeness has yet to be proved as a whole. The sensor model (Figure 8) used for
simulation is similar to the IPA sensitive skin depicted in Figure 3, but in 2D plane.
In step 2, piecewise information from each sensor is put together to form a complete virtual environment. Cartesian coordinate with orientation of each sensor on the manipulator body is obtained via forward kinematics. Obtained data of each sensor are then mapped and projected in a virtual environment for workspace object construction followed by a realtime rehearsal. Growing a c-space RRT with respect to a constructed local workspace provides the result of realtime rehearsal for collision in workspace, and at the same time, generates sampled nodes in c-space. We utilize nth order standard distance distribution as a measure of spatial distribution in Step 4 as detailed below.
Fig. 8. Sensor model.
sN = [ (1/N) Σ_{i=1}^{N} (qi - qmean)² ]^(1/2)
where qi is the sampled value of the ith joint and qmean is the mean value of qi. sN, if larger than sthreshold, signals an open c-space, or a crowded space otherwise. The value of sthreshold is determined empirically; in our case, 0.3 yields reasonable performance (the maximum value is 0.5). For step 5, in our simulation, we use only the following two MBPs for the open and crowded cases.
MBP #1 (Open c-space): Move to a node closest to the mean values of sampled nodes. MBP #2 (Crowded c-space): Move along the q axis with largest standard deviation in c-space.
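A minimal sketch of this step-4/step-5 decision follows: compute the spread of the sampled joint values and pick MBP #1 or MBP #2 accordingly. The 0.3 threshold is taken from the text; how the per-joint spread is aggregated (here, the maximum over joints), the node representation and the demo values are assumptions for illustration only.

```python
import math

S_THRESHOLD = 0.3   # empirically chosen in the paper (maximum value 0.5)

def standard_distance(samples):
    """Standard distance s_N of one joint's sampled values."""
    n = len(samples)
    mean = sum(samples) / n
    return math.sqrt(sum((q - mean) ** 2 for q in samples) / n)

def select_mbp(nodes):
    """nodes: list of c-space samples, one tuple of joint values per node.
    Returns 'MBP1' (open c-space) or 'MBP2' (crowded c-space) and the score."""
    per_joint = list(zip(*nodes))                       # group samples by joint
    s_n = max(standard_distance(j) for j in per_joint)  # most spread-out joint
    return ("MBP1" if s_n > S_THRESHOLD else "MBP2"), s_n

if __name__ == "__main__":
    nodes = [(0.10, 0.40), (0.15, 0.05), (0.12, 0.45), (0.11, 0.02)]
    print(select_mbp(nodes))
```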
Fig. 9. RRT-Cabomba.
In summary, RRT-Cabomba is a planner with the following logic.
RRT-Cabomba:
1. Sensor data collection: sense a local workspace.
2. Virtual workspace generation: generate a local virtual environment with the measured 3D depth information.
3. Realtime rehearsal: grow an RRT in the local virtual environment established in step 2.
4. Cognitive decision: measure the spatial distribution of the grown local tree and determine which MBP is the most suitable for the sensed area.
5. Advance the robot: move the robot to the next position depending on the MBP strategy chosen at step 4.
6. Convergence check: check if the goal position is reached. If not, loop back to step 1.
In step 1, we apply a global algorithm such that the cabomba tree is steered toward the goal position by controlling the sensing direction, biased toward the goal position. Steering toward a goal generally, but not always, yields faster convergence to the goal, at the cost of possible entrapment at local minima. Sensor data collection is necessary to build a complete local environment model in order to fully grow a c-space RRT in each movement.
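The six steps can be read as the control loop sketched below. Every callable is a stand-in for the corresponding module (IPA sensing, virtual-map construction, local RRT growth, MBP selection and execution), so this is a structural sketch under those assumptions, not the authors' implementation.

```python
def rrt_cabomba(sense, build_map, grow_rrt, choose_mbp, execute, at_goal,
                max_steps=200):
    """Structural sketch of the RRT-Cabomba loop (steps 1-6)."""
    for _ in range(max_steps):
        scan = sense()                    # 1. sensor data collection
        vmap = build_map(scan)            # 2. virtual workspace generation
        tree = grow_rrt(vmap)             # 3. realtime rehearsal (local RRT)
        strategy = choose_mbp(tree)       # 4. cognitive decision
        execute(strategy, tree)           # 5. advance the robot
        if at_goal():                     # 6. convergence check
            return True
    return False

# Trivial stand-ins so the skeleton runs; a real system would wire in the
# IPA-skin driver, the virtual-map builder and the MBP implementations.
if __name__ == "__main__":
    steps = {"n": 0}
    done = rrt_cabomba(
        sense=lambda: None,
        build_map=lambda scan: None,
        grow_rrt=lambda vmap: None,
        choose_mbp=lambda tree: "MBP1",
        execute=lambda s, t: steps.__setitem__("n", steps["n"] + 1),
        at_goal=lambda: steps["n"] >= 5,
    )
    print(done, steps["n"])
```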
Fig. 10. RRT-Cabomba for difficult area path search test.
Original RRT-cabomba
RRT-cabomba switching to bug algorithm in crowded area Fig. 11. Comparison between original and one with bug algorithm. For MBP #1, as mentioned earlier, moving to a mean value of qi offers probabilistically best location for next Articles
move in terms of maneuverability for each joint, and thus may increase reachability as well. MBP #2 is chosen for reasonable traversability, but it can be replaced with a variety of other MBP algorithms suited to specific needs in each local area. One execution example of RRT-Cabomba is shown in Figure 10.
ba, another simulation set is prepared. Shown in Figure 12 is an execution example with the bug algorithm as the second MBP. As demonstrated in the figure, a group of MBPs in conjunction with realtime rehearsal produces reasonable results in a difficult unknown environment case.
Table 1. Results of the two algorithms.
                            Original RRT-Cabomba    RRT-Cabomba with bug algorithm
Search time                 125 s                   155 s
Total collision nodes       6838                    5892
Total collision-free nodes  21218                   21162
Total No. of nodes          28056                   27054
Here we use the identical sensor model as the physical IPA skin sensor: a hemispherical frontal sensing range with 3-DOF (x, y, q) mobility. One big drawback of RRT-Cabomba is not being able to handle local minima problems effectively, due to the global steering algorithm. A global steering mechanism in a completely unknown environment is still a daunting and not fully addressed issue. Therefore, in this study, we focus more on reasonable search results for a completely unknown environment with the RRT-Cabomba planner in conjunction with the IPA skin sensor, leaving the global completeness of the algorithm as future work. As for the algorithm complexity, since RRT-Cabomba uses a mixture of local algorithms depending on the situation, the complexity varies as the planner changes. For instance, if the local planner selected is the RRT planner, then it has the same complexity as RRT in the local area. However, although c-space construction is known to be polynomial in the geometric complexity for unknown environments, the proposed algorithm effectively generates the c-space map as the robot explores the environment. Therefore the c-space construction is not the bottleneck of the algorithm.
Local minima avoidance with bug algorithm
In order to demonstrate the versatility of RRT-Cabomba, in this case to tackle the local minima problem, we change MBP #2 to the bug algorithm. The bug algorithm proposed by Lumelsky [12] and improved thereafter has a unique property of global convergence in 2D and 3D unknown environments. For higher order manipulator path problems, one can utilize the algorithms introduced in [8], [9], [13], which deal with difficult cases in higher order path problems. In step 4, if sN is smaller than sthreshold, we switch the local strategy to the bug algorithm. Shown in Figure 11 and Table 1 are the results of the previous RRT-Cabomba and the one with the bug algorithm. No significant statistical difference is evidenced in running performance. Notice though that the second planner follows the wall of the first obstacle it encounters, because of the bug algorithm engaged in crowded areas. In order to demonstrate the local minima avoidance capability of RRT-Cabom-
Fig. 12. Local minima avoidance with bug algorithm as a crowded area search strategy.
Table 2. Local minima avoidance simulation.
Algorithm                   RRT-Cabomba with bug algorithm
Search time                 187 s
Total collision nodes       11512
Total collision-free nodes  25562
Total No. of nodes          37074
Total foot steps            50
One limiting factor of the RRT-Cabomba from the implementation perspective in real world is the realtime rehearsal for advancing a manipulator without interruption when continuous motion is needed. In order to perform the realtime rehearsal, however, virtual world generation with 3D-depth measure of a local area followed by a realtime rehearsal with RRT in local area has to take place with zero time lag. In the previous simulation with a computer equipped with an Intel Core 2 CPU at 2.13 GHz and 2Gb RAM, each foot step takes 3.7 second in average, meaning the robot in the real world has to wait for 3.7 second in each step motion for path search. This bottleneck can be tackled with several improvements in implementation such as faster computer, smaller step size, distributed simulation and so forth. However, the algorithm needs to be significantly improved for a full scale, 6 DOF robotic manipulator operating in a completely unknown environment.
4. Conclusion A novel manipulator path search framework with a sensitive skin type sensor for a completely unknown environment planning, especially for difficult cases with local minima, has been investigated. To that end, a novel IPA sensitive skin has been developed and demonstrated its capabilities in the paper. The proposed algorithm, RRT-Cabomba, provides a unique solution for difficult unknown environment plan-
ning cases by merging sensor based planning and model based planning ideas for maximum synergy. In RRTCabomba, multiple MBPs can be employed to tackle local area specific planning problems. Cognitive decision making has been studied with realtime rehearsal for each step motion to eclectically select the best possible local strategy in each step. For the feasibility test of the proposed algorithm, a series of simulations has been performed and the results are shared in the paper. Time-lag in simulation due to expensive calculations would be a bottleneck for a real world implementation, but for the proof of the usefulness, the concept of realtime distributed simulation of the proposed algorithm is under investigation at the moment. In addition, real world implementation with IPA sensitive skin is under consideration for future work.
AUTHORS
Dugan Um*, Dongseok Ryu - Texas A&M University - Corpus Christi, Science and Technology Suite 222, 6300 Ocean Drive, Unit 5797, Corpus Christi, Texas 78412. Tel. 361-825-5849, Fax 361-825-3056. E-mail: Dugan.Um@tamucc.edu.
* Corresponding author

References
[1] Lee J.Y., Choset H., "Sensor-based Planning for a Rod-shaped Robot in Three Dimensions: Piecewise Retracts of R^3 x S^2". International Journal of Robotics Research, vol. 24, no. 5, May 2005, pp. 343-383.
[2] Mehrandezh M., Gupta K.K., "Simultaneous path planning and free space exploration with skin sensor". In: Proc. of the IEEE Int. Conf. on Robotics & Automation, Washington DC, May 2002, pp. 3838-3843.
[3] Yu Y., Gupta K.K., "Sensor-based probabilistic roadmaps: Experiments with an eye-in-hand system". Journal of Advanced Robotics, 2000, pp. 515-536.
[4] Cheung E., Lumelsky V., "A sensitive skin system for motion control of robot arm manipulators". Journal of Robotics and Autonomous Systems, vol. 10, 1992, pp. 9-32.
[5] Kavraki L., Latombe J.-C., "Randomized preprocessing of configuration space for fast path planning". In: Proc. Int. Conf. on Robotics & Automation, San Diego, May 1994, pp. 2138-2145.
[6] Um D., "Sensor Based Randomized Lattice Diffusion Planner for Unknown Environment Manipulation". In: Proc. of the IEEE Int. Conf. on Intelligent Robots and Systems, Beijing, China, 2006, pp. 5382-5387.
[7] LaValle S.M., Kuffner J.J., "Rapidly-exploring random trees: Progress and prospects". In: Proc. of the Workshop on the Algorithmic Foundations of Robotics, 2000.
[8] Kurniawati H., Hsu D., "Workspace importance sampling for probabilistic roadmap planning". In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 2, Sendai, Japan, Sept. 2004, pp. 1618-1623.
[9] Zheng S., Hsu D., Reif J.H., "Narrow passage sampling for probabilistic roadmap planning". IEEE Transactions on Robotics, vol. 21, 2005, pp. 1105-1115.
[10] Berenson D., Kuffner J., Choset H., "An optimization approach to planning for mobile manipulation". In: Proc. of the IEEE International Conference on Robotics and Automation, Pasadena, CA, 19th-23rd May 2008, pp. 1187-1192.
[11] Dunn E., Berg J.V., Frahm J.-M., "Developing Visual Sensing Strategies through Next Best View Planning". In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, 11th-15th Oct. 2009, pp. 4001-4008.
[12] Lumelsky V., "Algorithmic and complexity issues of robot motion in an uncertain environment". Journal of Complexity, 1987, pp. 146-182.
[13] Amato N.M., Wu Y., "A randomized roadmap method for path and manipulation planning". In: Proc. of the IEEE Int. Conf. on Robotics & Automation, 1996, pp. 113-120.
INTELLIGENT LEADER-FOLLOWER BEHAVIOUR FOR UNMANNED GROUND-BASED VEHICLES Received 6th May 2010; accepted 20th December 2010.
Pouria Sadeghi-Tehran, Javier Andreu, Plamen Angelov, Xiaowei Zhou
Abstract: In this paper an autonomous leader-follower is presented and tested in an unknown and unpredictable environment. Three different types of controller, namely a first-principles-based proportional (P) controller, a Fuzzy Logic Controller, and a Model-based Predictive Controller, are developed and tested in real-time to provide a smooth following behaviour. The follower uses the leader's status, sent by a smart phone, to differentiate between obstacles and the leader, and then uses two types of sensor, laser and sonar, during the obstacle avoidance procedure. In order to identify the leader again out of many obstacles around, two alternative techniques are proposed: superposition of the scans collected by the laser, and prediction of the leader's trajectory using evolving Takagi-Sugeno (eTS) fuzzy systems. At the end, experiments with a real mobile robot performed at Lancaster University are presented. Keywords: leader-following robot, human-robot interaction, evolving Takagi-Sugeno.
1. Introduction
The role of robotics has grown significantly in a wide variety of applications such as defence, security, industry, etc. Many autonomous robots are developed to operate with humans in work environments like nursing homes [1], hospitals [2] and office buildings. In order for such robots to be socially acceptable, they need to interact with humans and navigate in the way that people expect them to, without requiring specific technical expertise. Also, in some applications a robot is required to operate in an unknown environment and needs to possess the capability of performing a multitude of tasks autonomously, without complete a priori information, while adapting to continuous changes in the working environment [3]. This paper focuses on the ability of mobile robots to follow a moving object/leader. Following the leader can present significant challenges. In order to follow moving objects in an unknown environment, the robot should be aware of the surrounding obstacles and also be capable of distinguishing obstacles from the leader. Furthermore, the follower should be aware of any, possibly unpredictable, behaviour of the leader beforehand and respond to it. This paper describes an intelligent person-follower behaviour for Unmanned Ground-based Vehicles (UGV) aiming to follow the movements of a leader with unknown trajectory or kinematic model of motion. The proposed behaviour requires minimal or no intervention from the leader while following him/her in an unknown and unpredictable environment. It should be highlighted that it operates autonomously with a high level of intelligence and situation awareness, in contrast to currently available UGVs which are operated remotely, but manually [4], [5], and often rely on GPS or pre-loaded maps. The remainder of the paper is organized as follows. First, in Section 2 related work on person-following is summarized and some challenges in leader-follower tasks are pointed out. The proposed approach is briefly outlined in Section 3. Section 4 describes the basic follower by introducing three different controllers, namely a first principle controller, a fuzzy logic controller, and a predictive controller. Section 5 introduces detecting obstacles and the method for differentiating them from the leader. Sections 6 and 7 describe the status of the leader's motion and the obstacle avoidance behaviours. Section 8 describes how the follower rejoins the leader after avoiding the obstacle. Section 9 presents the experimental results. Finally, Section 10 provides conclusions and future work.
2. Related Work
One of the most common methods in person following is to use an external device to attach to a person/leader or wearing an emitting device located in the range of sight of the mobile robot [6], [7]. However, existing systems that provide positioning information are not a practical solution since they mostly rely on pre-installed and calibrated environment infrastructures [7]. Several researchers have investigated using a camera for tracking moving objects. In [8] a camera-based method is used for face detection of the leader; nevertheless, this method is not robust due to varying background colour and illumination conditions as the robot moves through various environments. Tarokh and Ferrari [9] implemented a fuzzy controller with camera-based target-position estimation to produce steering speed commands to keep the moving target centred in the camera view. In order to improve the target recognition algorithm, they allowed a few objects besides the target to enter the camera's field of view [9]. To address the difficulties of the camera-based method for person-following, the method developed in [10], [12] integrated laser-based and camera-based techniques together. Laser is used to find the leader's legs and a camera to detect the leader's face. Using face detection, however, requires the leader always to face the robot, and is, thus, practically impossible and inconvenient for the leader when the robot follows the person behind. On the other hand, Cielniak and his colleagues used an omni-directional camera [11], which requires a new training of an artificial neural network for each person that needs to be followed. Such a requirement
Fig. 1. Schematic representation of the proposed approach with its three layers: 1) Basic Follower; 2) Obstacle avoidance; 3) Rejoining. The key feature of the person-following behaviour is maintaining a distance and heading of the robot directed towards the leader [15]. The robot has to maintain a certain safe distance from the leader and, at the same time, follow the leader in a smooth motion. To achieve a robust tracking, a first principle controller based on the linearization
3. The proposed approach
The approach introduced in this paper is innovative and has been implemented on a laboratory demonstrator for a UK Ministry of Defence (MoD) funded project [31]. It has a hierarchical architecture with three main layers (Fig. 1). The Basic Follower procedure operates in the first layer, which will be explained in more detail in the following section. The second layer is activated only when the follower meets an obstacle and avoids it by manoeuvring around it. At the third layer, the follower determines the current position of the person and rejoins the leader.
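The layer switching can be summarised as a small state machine; a minimal sketch follows. The two boolean flags stand in for the detection and avoidance logic of Sections 5 and 7, and the state names are illustrative, not part of the implemented system.

```python
def follower_step(obstacle_detected, obstacle_cleared, state):
    """One tick of the three-layer behaviour (Fig. 1):
    'follow' -> 'avoid' -> 'rejoin' -> 'follow'."""
    if state == "follow" and obstacle_detected:
        return "avoid"        # layer 2: obstacle avoidance behaviour
    if state == "avoid" and obstacle_cleared:
        return "rejoin"       # layer 3: determine where the leader is
    if state == "rejoin":
        return "follow"       # leader re-acquired, back to basic following
    return state

if __name__ == "__main__":
    state = "follow"
    for detected, cleared in [(False, False), (True, False),
                              (False, True), (False, False)]:
        state = follower_step(detected, cleared, state)
        print(state)
```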
error [16] is used. Alternatively, in order to obtain more accurate tracking, a fuzzy controller can be applied. Note, that a well-tuned fuzzy controller [17], [23] can also achieve a higher accuracy comparing to the simple linear controller. However, the challenging issue to the conventional controllers is that they generate the manipulated value (control command) according to the observation of system status at the current and the past time instants while the purpose of the control is to minimise the observed error in the forthcoming (future) time instant. Taking into account the dynamic nature of the target system, this delay, in turn, may lead to larger errors. In order to minimise the response delay and improve the control quality, we propose a predictive controller using evolving Takagi-Sugeno (eTS) fuzzy rules [19], [20] as a model predictor. This novel method [18], [19] allows the TS fuzzy model to be designed on-line during the process of the control and operation. This is especially suitable for applications such as UGV autonomous navigation where the mobile robots operate in an unknown environment [21]. The person-follower algorithm has the role of providing the information needed by first principle/fuzzy/ predictive controller to control the motion of the robot. The algorithm analyses the information provided by the laser sensor to detect the position of the leader. Laser provides the distance, d relative to the robot and angle/bearing towards the moving target, q measured in real-time (Fig. 2). The obtained relative distance and relative bearing are then made available as input information to the controller. The outputs (control action) are the velocities of the robot's wheels which will determine how much speed change is needed by the robot in order to follow the leader. The aim is maintaining a pre-defined distance, dref and a bearing angle of 0째 so that the target is closely followed without a collision to any obstacle. Note, that due to unpredictable movement of the target/follower, there will be some misalignment between the heading of the leader and the robot (bearing q in Fig. 2).
restricts the generality of the method. Other methods for person-following require a previously acquired map of the environment. Montemerlo and colleagues developed a laser-based leader-follower using mobile robots [13]. However, in order to follow a person by a robot a prior map of the environment is required which makes the method unsuitable for cases where a map of the environment is unavailable or the robot is required to frequently change the place during its operation. Our proposed approach is similar to Shaker and Saade [14], who use a laser range finder to detect a person's position providing distance information to the system for the control process. However, they used the relative velocity of the robot to the person's leg and the difference between the relative distance and the safe distance in [14] instead of using the angle/bearing of the robot towards the moving object (leader) as the inputs of the fuzzy inference system as we do. Moreover, the authors of [14] did not address the challenge when the robot meets an obstacle and the algorithm is only tested in indoor environment when no other obstacles are around.
Fig. 2. The leader (target) and the follower attempting to keep a constant distance, dref and to nullify the misalignment, q. In an unknown environment, the follower should be aware of all surrounding obstacles and have enough intelligence to detect and avoid them. To make this happen, we propose two conditions to be checked by the follower. The first condition is a sudden and large drop in the distance Articles
which causes the negative velocity of the wheels; the second alternative condition is the signal from the leader which indicates its motion status. After collision detection and avoidance procedure, the follower should be able to determine the new position of the leader out of, possibly, many objects in front of it, and rejoin the leader again. Two alternative techniques aiming to identify the current position of the leader are introduced which rely on the prediction of the trajectory of the leader and on super-positioning of the scans taken by the laser. The proposed approach for person-following will be explained in more details in the following sections.
4. Basic Follower
In order to track the leader, the heading of the mobile robot is controlled indirectly through the difference of the velocities of both wheels. Three alternative controllers are used to control the robot motion while following a person: 1) a first-principles-based proportional (P) controller; 2) a Fuzzy Logic Controller (FLC); 3) a Model-based Predictive Controller (MBPC).
4.1. First Principles-based Controller
The P controller is based on the explicit linear description of the problem. It keeps the acceleration of the robot proportional to the distance to the target/leader, d [22]. Due to the inertia of real systems, it takes a short period of time after a velocity command is received by the motor for the desired velocity to be reached. The velocities of both wheels (left and right) are selected as control values, and the turning of the robot is achieved by control of the velocity difference between the left and right wheels. When the velocity of the left wheel is higher than the velocity of the right wheel, the robot makes a right turn and vice versa. Based on these principles, the wheel velocity control model is described by the following equations:

Vleft = Vf + Vl,    Vright = Vf + Vr    (1)

It consists of two components: Vf, the component for maintaining dref, and the pair of velocities Vl and Vr, which determine the heading of the mobile robot. The two components are defined by the following equations:

Vf = k1 (d - dref)    (2)

Vl = 0 if |q| < q̄,  Vl = k2 q if |q| > q̄;    Vr = 0 if |q| < q̄,  Vr = -k2 q if |q| > q̄    (3)

where q̄ is the threshold of insensitivity which filters the sensors, and k1 and k2 are proportionality coefficients (k1 = 2 and k2 = 3 are chosen based on preliminary tests). Fig. 3 depicts the linear proportionality between: a) the velocity of the robot and the distance to the target measured from the robot (Fig. 3a); b) the heading angle (determined by the left and right velocities) and the angle/bearing to the target/leader, respectively (Fig. 3b). When the distance between the robot and the target is too large, the velocity component Vf gets a higher value in order to maintain dref, which leads to a large acceleration of the robot. However, if the distance is smaller than dref (the required reference distance at which we want the UGV to follow the leader), the velocity component Vf is set to negative. The technical limit of the Pioneer 3-DX robot is Vmax = 1400 mm/s.
Fig. 3a. Distance component.
Fig. 3b. Angular component.
For the angular component (Fig. 3b):

when |q| < q̄:  a1 = a2 = 0    (4.1)
when |q| > q̄:  a1 = -k2 q,  a2 = k2 q    (4.2)
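Equations (1)-(3) translate directly into code. The sketch below uses the stated gains k1 = 2 and k2 = 3 and clamps the commands to the Pioneer's 1400 mm/s limit; the numeric value of the insensitivity threshold, the 500 mm reference distance and the sign convention for the bearing are assumptions for illustration.

```python
K1, K2 = 2.0, 3.0          # proportional gains from the preliminary tests
D_REF = 500.0              # reference following distance [mm] (assumed here)
THETA_BAR = 2.0            # insensitivity threshold [deg]; illustrative value
V_MAX = 1400.0             # Pioneer 3-DX wheel-speed limit [mm/s]

def clamp(v, lo, hi):
    return max(lo, min(hi, v))

def p_controller(d, theta):
    """Wheel speeds from distance d [mm] and bearing theta [deg], eqs (1)-(3)."""
    v_f = K1 * (d - D_REF)                       # eq (2): distance keeping
    if abs(theta) < THETA_BAR:                   # eq (3): dead-band on bearing
        v_l = v_r = 0.0
    else:
        v_l, v_r = K2 * theta, -K2 * theta       # steer towards the leader
    v_left = clamp(v_f + v_l, -V_MAX, V_MAX)     # eq (1)
    v_right = clamp(v_f + v_r, -V_MAX, V_MAX)
    return v_left, v_right

if __name__ == "__main__":
    print(p_controller(d=588.0, theta=-9.1))
```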
4.2. Fuzzy Logic Controller (FLC) Fuzzy controllers have recently been used in a variety of applications such as underground trains, robotics, etc. owing mainly to their flexibility and higher accuracy which is achieved due to their non-linearity. The data from the detection of moving objects provided by the laser is, sometimes, noisy and could be, occasionally, incorrect. Therefore, in order to provide a more flexible (non-linear) relation between the inputs (distance and bearing to the target/leader) and outputs (the velocities of both wheels), a Zadeh-Mamdani type FLC [23] has been implemented which consists of twenty five fuzzy rules (Table 1) for each left/right robot wheels.
Table 1.a. Rule table of "Left Velocity".
        d:  Crash   Close   Proper dist.   Not Far   Far
q: N        QB      QB      H              QF        QF
   SN       QB      SB      H              SF        QF
   S        QB      SB      H              SF        QF
   SP       QB      SB      H              H         SF
   P        QB      SB      H              H         SF
(N=Negative, SN=Small Negative, S=Small, SP=Small Positive, P=Positive, QB=Quick Backward, SB=Slow Backward, H=Hold, SF=Slow Forward, QF=Quick Forward)

Vl = Σ_{i=1}^{R} li · a1i,    Vr = Σ_{i=1}^{R} li · a2i    (7)
where a=[a11,...,a1R;a21,...,a2R] are the parameters of the consequent part of the FLC. The antecedent part of the fuzzy rules is defined by linguistically interpretable terms that describe the distance (Fig. 5a) and angle/bearing (Fig. 5b); the consequent fuzzy sets are defined in respect to the left velocity and right velocity (Fig. 6).
Table 1.b. Rule table of "Right Velocity".
        d:  Crash   Close   Proper dist.   Not Far   Far
q: N        QB      SB      H              H         SF
   SN       QB      SB      H              H         SF
   S        QB      SB      H              SF        SF
   SP       QB      SB      H              SF        QF
   P        QB      QB      H              QF        QF
Each rule represents a typical situation during the "Leader following" task. The input vector is formed based on real-time measurements by the sensors (laser or sonar) providing the distance, d and angle/bearing q (see Fig. 4).
Fig. 5a. Fuzzy sets for distance (mm).
Fig. 5b. Fuzzy sets for angle/bearing (deg). Fig. 4. FLC schematic diagram. The closeness between the measured input vector and the focal points of each fuzzy set is calculated based on triangular/trapezoidal membership functions illustrated in Fig. 5. The result is aggregated to form the degree of firing for each rule and normalized and aggregated further to form the overall output of the FLC [23].
ti = mi1(d) × mi2(q),    i = 1, ..., R    (5)
where mi denotes the ith membership function; d and q are the inputs; R is the number of fuzzy rules, R = 25. The commonly used aggregation method called "centre of gravity" is applied to determine the firing strength of each fuzzy rule as follows:

li = ti / Σ_{j=1}^{R} tj    (6)
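A compact sketch of the inference chain of eqs (5)-(7) follows: membership degrees are multiplied per rule, normalised by the centre-of-gravity rule, and the per-rule output velocities are summed with those weights. The triangular membership breakpoints and the two example rules are illustrative assumptions, not the paper's 25-rule base.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Two illustrative rules: each has a membership function for distance, one for
# bearing, and consequent wheel speeds (a1i, a2i) as in eq. (7).
RULES = [
    {"mu_d": lambda d: tri(d, 300, 500, 700),    # "proper distance"
     "mu_q": lambda q: tri(q, -20, 0, 20),       # "small bearing"
     "a": (0.0, 0.0)},                           # hold
    {"mu_d": lambda d: tri(d, 500, 900, 1300),   # "far"
     "mu_q": lambda q: tri(q, -20, 0, 20),
     "a": (400.0, 400.0)},                       # quick forward
]

def flc(d, q):
    taus = [r["mu_d"](d) * r["mu_q"](q) for r in RULES]          # eq (5)
    total = sum(taus) or 1.0
    lambdas = [t / total for t in taus]                           # eq (6)
    v_left = sum(l * r["a"][0] for l, r in zip(lambdas, RULES))   # eq (7)
    v_right = sum(l * r["a"][1] for l, r in zip(lambdas, RULES))
    return v_left, v_right

if __name__ == "__main__":
    print(flc(d=750.0, q=3.0))
```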
Fig. 6. Fuzzy sets for left/right velocity (mm/s).
4.3. Model-based Predictive Controller
As mentioned earlier, the key aim in using a controller is to minimise the observed error provided by the observation of the system's status at the current and past time instants. However, the delay in response caused by the dynamic nature of the target may lead to large errors. In order to minimise the error caused by the delay, a more advanced class of controllers is used, namely model-based predictive controllers (MBPC) [24], [27]. MBPC is an optimal
control strategy that uses the dynamic model of the system to obtain an optimal control sequence by minimising an objective function (12). The MBPC used employs a discrete-time prediction model and control law. The discrete-time model of the robot motion used can be given by [35]:

x(k+1) = x(k) + v(k) cos q(k) T
y(k+1) = y(k) + v(k) sin q(k) T    (8)
q(k+1) = q(k) + w(k) T
where v and w are the linear and angular velocities; and T is a sampling period.
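Eq. (8) is the standard discrete-time unicycle model and can be iterated directly; the sampling period and velocities in the demo below are assumed values, not taken from the experiments.

```python
import math

def step(x, y, theta, v, w, T=0.1):
    """One step of the discrete-time motion model (8).
    v: linear velocity, w: angular velocity, T: sampling period [s]."""
    return (x + v * math.cos(theta) * T,
            y + v * math.sin(theta) * T,
            theta + w * T)

if __name__ == "__main__":
    pose = (0.0, 0.0, 0.0)
    for _ in range(10):                 # drive a shallow arc
        pose = step(*pose, v=0.5, w=0.2)
    print(pose)
```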
and report achieving high reliability. Babuska [27] used a Takagi-Sugeno (TS) fuzzy model as a model predictor; however, the model was predefined off-line with a fixed structure, which assumes knowing the order and nature of the non-linearity associated with the leader's motion. In this paper, we propose to go further and use evolving Takagi-Sugeno (eTS) fuzzy rule-based systems [18-20], [36], which require no assumptions on the order and complexity of the non-linearity or on the structure of the controller. We propose an MBPC using eTS (eMBPC) to predict the next position of the leader online, for a horizon of several steps ahead, based on the historical observations (a time-series of realtime readings): xk+H = eTS(xk, ..., xk-D)
(13a)
yk+H = eTS(yk, ..., yk-D)
(13b)
p
In a general, state space form it can be given as: p
X(k +1) = fd (X(k), u(k))
(9)
where u = [v w]T is the control input; and X = [x y q]T describes the configuration The leader's trajectory xn and the robot motion control (velocities) vector, un are related by: Xn(k +1) = fn (Xn(k), un(k))
(10)
The problem of trajectory leader tracking can be stated as "to find a control law in such a way that at any time instant", k satisfies: X(k) - Xn(k) = 0
(11)
This tracking problem is solved by employing MBPC. At each sampling instant, the model is used to predict the behaviour of the system over a prediction horizon, Hp. The predicted output values, denoted X(k + j) for j = 1,..., Hp depend on the state of the process at the current time instant, k and on the future control signals, u(k + j) for j = 0,..., Hc - 1, where Hc is the control horizon. The sequence of future control signal u(k + j) for j = 0,..., Hc - 1, is computed by optimising a given objective function, in order to track the reference trajectory, xr as close as possible [27], [37]. Hp
J = b Σ_{j=1}^{Hp} [Xr(k+j) - X(k+j)]² + l Σ_{j=1}^{Hc} [Δu(k+j-1)]²    (12)
where b ≥ 0, l ≥ 0 are the output-error and control-increment weightings. The first term accounts for minimising the variance of the process output from the reference, while the second term represents a penalty on the control effort. Ohya and Munekata [24] proposed an algorithm to predict the next position and speed of the leader based on the history of the leader's position with time instants recorded. However, they assume that the leader will move with constant acceleration and the same angular velocity, which is unrealistic. On the other hand, some other approaches [25], [26] propose using a predictive target tracking algorithm based on the well established Kalman filter
where eTS(.) is a short-hand notation for the eTS fuzzy rule-based model and D-represent the memory used. In this technique, the controller continuously estimates the future predicted position of the leader as he/she moves. Once the leader's future position is estimated, the motion controller implements the required control signals to the robot wheels having in mind the future position of the leader. This leads to minimising the tracking error caused by the delay in response of the controller and increasing the performance and reliability of the Leader following behaviour. MBPC operates in real-time, on-line and is self-learning (both in terms of its parameters and in terms of its structure). Moreover, it does not need to be fed by an initial structure or tuned which makes it generic. The data are processed on-line (in one-pass) and, therefore, it requires very limited computational resource and is suitable for onboard implementation on the mobile robots (UGV). The improved performance of the prediction is discussed in Section 8.
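The way the predictor of eq. (13) plugs into the control loop can be illustrated as below. The eTS learning itself is not reproduced; a constant-velocity extrapolation stands in for eTS(.), and the memory depth and horizon are assumed values, purely to show the interface.

```python
from collections import deque

class TrajectoryPredictor:
    """Stand-in for the eTS predictor of eq. (13): it is fed the last D leader
    positions and returns an H-step-ahead estimate. Real eTS rules evolve
    on-line; here a simple constant-velocity extrapolation is used instead."""
    def __init__(self, memory=5, horizon=3):
        self.buf = deque(maxlen=memory)
        self.horizon = horizon

    def update(self, x, y):
        self.buf.append((x, y))

    def predict(self):
        if len(self.buf) < 2:
            return self.buf[-1] if self.buf else (0.0, 0.0)
        (x0, y0), (x1, y1) = self.buf[-2], self.buf[-1]
        return (x1 + self.horizon * (x1 - x0),
                y1 + self.horizon * (y1 - y0))

if __name__ == "__main__":
    p = TrajectoryPredictor()
    for k in range(5):
        p.update(0.1 * k, 0.05 * k)      # leader walking along a straight line
    print(p.predict())                   # position expected 3 samples ahead
```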
5. Detecting Obstacles Differentiating between an obstacle and the leader is one of the challenging aspects of this task. As mentioned earlier, in Section 3, distance and angle/bearing to the nearest object are used as inputs of the controller. If the leader disappears from the view of the robot after meeting and obstacle (for example quickly avoiding it), the latter may be misled. The robot may find it difficult to distinguish between cases when the leader stopped and cases when the leader quickly disappeared and the nearest object is, in fact, an obstacle that needs to be avoided. The UGV should have enough intelligence to differentiate between these two cases which may look similar if only distance and bearing are used as inputs. To address this problem, we proposed two conditions to be checked; i. Sudden large drop of the distance ii. Signal related to the leader's motion status If the distance to the leader suddenly drops significantly (more than dref), the robot moves backwards to maintain the reference distance which leads to a negative velocity of the robot wheels (Zone B, Fig. 7). Such moves (that lead to a distance drop) often take place during the
process of leader following (Zone A, Fig. 7) but the amount of the distance drop is usually not large since the controller aims to minimise the deviations from dref. However, when an obstacle is detected or the leader stops the distance drop is large and sudden (contrast zone B to zone A, Fig. 7).
Fig. 7. Distance and velocity of the wheels measurements (data from the real-life experiment). The second condition assumes a short informative signal from the leader to be sent wirelessly which indicates its motion status (to be described in more detail in Section 6). Summarising, the two conditions used to distinguish the leader (when (s)he stops) from an obstacle are formulated as; i) the mean velocity of both wheels to be negative, and; ii) the leader to have motion status 'walking'. If both conditions are satisfied at the same time, the robot starts executing an obstacle avoidance manoeuvre (to be described in Section 7). This part of the algorithm can be illustrated by the following IF-THEN rule. IF (mean velocity) < 0 && (leader's status is 'walking') { ObstacleAvoidance procedure } Else
smaller purpose-build microprocessing system. Nokia N97 has on board an ARM11 343 Mhz processor and works with Symbian OS v.9.4. The selected technology to connect the robot and the leader was Wi-Fi upon the 802.11 b/g standard [33]. We also experimented the use of another (3G) mobile connection technology; however, it entails the installation of third party antennas which would make system deployment very complex. Therefore, Nokia N97 was selected. Note, that the message that is transmitted between the Nokia N97 and the on-board computer of the Pioneer robot is extremely short (few bits - the status can be a single bit 0/1 plus the address of the receiver in a datagram packet) and can be sent at an extremely low bit rate (it can be done only by request from the robot when in Zone B, Fig. 6) - this situations practically occurs quite rarely, e.g. with a frequency in mHz range). The smart phone Nokia N97 offers some useful features such as; the accelerometer, orientation, azimuth, GPS data, and distance to the nearby objects by a proximity sensor. We discarded GPS data because it implies using satellite connections and also proximity sensor as lacking enough range and precision. From a software point of view, the smart phone has a Java Midlet (J2ME). At the receiver side (the robot/ UGV) a Java Application (J2SE) is deployed on its onboard computer [28]. This application is in charge of gathering all datagram packets sent by the smart phone via UDP/IP protocol. TCP/IP connections were initially discarded because no order of the data is required, hence giving priority to the speed of the reply. Using the UDP datagram protocol allows establishing a connectionless network (the address is a part of the packet) which is a convenient solution for the particular problem. The sampling rate (with which the raw data of the acceleration of the leader are being collected) is being set by default to 40 Hz. The data from the internal (to the smart phone) sensor have been filtered and processed before sending over the channel to avoid peaks and outliers in the signal and to minimise the communication (sending the status data instead of the raw acceleration data reduces the bitrate and the possibility of an interception dramatically). The robot requests a signal from the leader's device only when Zone B, Fig. 6 occurs (either when the leader stops or an obstacle obstructs suddenly his/her way). It is important to stress again that the latter situation happens very rarely with a frequency in mHz range.
{ Basic Follower procedure } After avoiding the obstacle, the robot switches again to the Basic Follower procedure (layer 1, Fig. 1).
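The IF-THEN rule above reduces to a two-condition test. A minimal sketch follows; it assumes the mean wheel velocity and the phone-reported status are already available, and the function and value names are illustrative.

```python
def classify_nearest_object(mean_wheel_velocity, leader_status):
    """Distinguish 'leader stopped' from 'obstacle ahead' (Section 5).
    mean_wheel_velocity: mean of left/right wheel speeds [mm/s], negative
    when the robot backs away after a sudden distance drop.
    leader_status: 'walking' or 'stopped', as reported by the smart phone."""
    if mean_wheel_velocity < 0 and leader_status == "walking":
        return "obstacle"        # trigger the ObstacleAvoidance procedure
    return "leader"              # keep running the Basic Follower procedure

if __name__ == "__main__":
    print(classify_nearest_object(-42.0, "walking"))   # -> obstacle
    print(classify_nearest_object(-42.0, "stopped"))   # -> leader
```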
6. Status of Leader's Motion A very robust and effective way of informing the robot about the motion status of the leader (walking or stopped) is by using a very short signal emitted by a device carried by him/her, e.g. The latest generation smart phone (Nokia N97) which includes several on-board sensors and more than enough processing capacity to handle them. Eventually, this off-the-shelf device can be substituted by a much
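In the reported system the status message is produced by a Java midlet on the phone and collected by a Java application on the robot; the Python snippet below only illustrates the idea of a one-byte status datagram over UDP. The IP address, port and payload encoding are assumptions, not the project's actual configuration.

```python
import socket

ROBOT_ADDR = ("192.168.0.10", 5005)   # assumed robot IP address and port

def send_status(walking: bool):
    """Leader side: send the motion status as a single-byte UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(b"\x01" if walking else b"\x00", ROBOT_ADDR)

def receive_status(port=5005, timeout=1.0):
    """Robot side: True ('walking'), False ('stopped') or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(("", port))
        s.settimeout(timeout)
        try:
            data, _ = s.recvfrom(1)
            return data == b"\x01"
        except socket.timeout:
            return None
```

Because UDP is connectionless, a lost datagram simply means the robot repeats its request, which matches the low-rate, on-demand use of the channel described above.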
7. Obstacle avoidance behaviour The ability to avoid obstacles is one of the most fundamental competences in mobile robotics and autonomous systems. In order to prevent collision with obstacles, the follower (UGV) should have Obstacle Avoidance behaviour. In our case, after the follower successfully distinguishes an obstacle from the leader, the ObstacleAvoidance procedure is being activated. In order to increase its reliability, we used two types of sensors, sonar and laser. Sonar sensors are used to check if the UGV fully passed the obstacle, and laser, alternatively, is used to monitor if there are no obstacles between the robot and the leader. As a result, the UGV successfully avoided all obstacles that were identified (see Section 9 for the experimental result). Articles
8. Rejoining the leader After an obstacle is being avoided, the UGV have to identify the leader/target out of the, possibly, many objects that are visible. Two alternative techniques are proposed for the re-joining the Leader behaviour; i. Predicting the leader's trajectory ii. Superposition of the scans taken by the laser 8.1. Predicting the Leader's Trajectory The module for predicting the movements of the leader contributes to the increase of the situation awareness and enables the UGV to rejoin him/her once the obstacle is being avoided. As described in Section 4.3, we use evolving Takagi-Sugeno (eTS) fuzzy rule-based systems [1820] for predicting in real-time the position of the leader/ target. This is illustrated in Fig. 8 for one step ahead prediction. However, to predict the position of the leader after avoiding the obstacle is very difficult (leader may go to a completely opposite direction). In order to minimise the error and achieve a more precise re-joining behaviour another alternative technique is proposed using superposition of the laser's scan which is explained in more details in the next section.
objects (e.g. the leader). This approach also provides information about the distance in terms of direction of the moving object(s) (e.g. the leader). It is sparse in terms of space with their total mean deviation bigger than the others.
Fig. 9. Scans taken by the laser mounted on the mobile robot. The static obstacles (walls) are clearly seen as clusters of circle dots centers of these clusters were identified on-line as the square dots; one of the cluster centers was moving which indicates the leader - it is shown as a string of cross points (the position of the cluster centers in different time instants are linked with a solid line indicating of the leader's motion direction).
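A toy version of the scan-superposition idea is sketched below: points from consecutive scans, already transformed to the absolute frame, are assigned to the nearest existing cluster centre within a radius, new centres are spawned otherwise, and a centre that drifts between scans is taken as a leader candidate. The radius, motion threshold and averaging update are assumptions; the paper's recursive Euclidean-distance clustering [32] is not reproduced here.

```python
import math

RADIUS = 0.4          # assignment radius [m]; illustrative
MOVE_THRESHOLD = 0.1  # a centre drifting more than this between scans is "moving"

def assign(points, centres):
    """Greedy nearest-centre assignment; returns updated centre positions."""
    updated = list(centres)
    for p in points:
        if not updated:
            updated.append(p)
            continue
        j, c = min(enumerate(updated), key=lambda ic: math.dist(p, ic[1]))
        if math.dist(p, c) <= RADIUS:
            updated[j] = ((c[0] + p[0]) / 2.0, (c[1] + p[1]) / 2.0)  # drag centre
        else:
            updated.append(p)                                        # new cluster
    return updated

def moving_centres(old, new):
    """Centres that drifted between two superimposed scans (leader candidates)."""
    return [c for o, c in zip(old, new) if math.dist(o, c) > MOVE_THRESHOLD]

if __name__ == "__main__":
    scan1 = [(0.0, 2.0), (0.1, 2.0), (3.0, 1.0)]   # wall points plus the leader
    scan2 = [(0.0, 2.0), (0.1, 2.1), (3.3, 1.1)]   # leader has moved slightly
    c1 = assign(scan1, [])
    c2 = assign(scan2, c1)
    print(moving_centres(c1, c2))                  # only the leader cluster drifts
```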
9. Experimental Result
Fig. 8. Leader's Trajectory prediction using eTS (black dotted line - line trajectory of the leader, red solid line - the prediction using eTS). 8.1.1. Superposition of the Laser's Scans An alternative technique to implement Rejoin the leader behaviour is proposed which is based on super-imposing the scans taken by the laser in real-time. The idea is to subtract the motion of the robot and, thus, to leave the objects that are static clearly separated and visible from the moving objects for which the cloud (cluster) of points will be moving. This is illustrated in Fig. 9. We propose using online clustering to identify the moving object(s)/target(s) and to distinguish them from the static object(s). The online clustering is based on recursive calculation of the Euclidian distance between the new points in respect to the previously gathered centres [32]. From Fig. 8 it is clearly seen that points which belong to static objects remain in the same place of the space, overlapping the previously gathered points corresponding to the same objects. (The reason that overlap in Fig. 9 is not absolutely perfect is the noise associated with the real measurements and data streams induced from both laser sensor and movement of the robot). However, it is easy to distinguish points which belong to an object from points that belong to moving 42
Fig. 10a. Outdoor experiment.
Fig. 10b. Indoor experiment. The experiment was carried out outdoors (Fig. 10a) and indoors (Fig. 10b) at Lancaster University in October 2009 with a Pioneer 3-DX mobile robot. The Pioneer 3DX [29] is equipped with a digital compass, a laser, sonar sensors and onboard computer. It is very suitable for various research tasks including mapping, vision, monito-
ring, localisation, etc [21], [22]. The leader (the role was played by the first author of the paper) carried a late generation smart phone Nokia N97 in his pocket used only in wi-fi mode to provide leader's status ('moving' or 'stationary') as described in Section 6. The experiment was performed in 3 main phases and was fully unsupervised (without using GPS or any external links; the wireless connection only was used to download data; the only manual intervention was to start and stop the robot). In phase 1, the robot started with the Basic Follower behaviour (layer 1, Fig. 1). In phase 2, in order to test the ability of the robot/UGV to autonomously distinguish obstacles from the cases when the leader stops, he quickly disappeared behind an obstacle (Fig. 11a). The robot detected the obstacle after the two conditions were checked and satisfied (Section 5) and avoided the obstacle successfully (Fig. 11b). In phase 3, both Rejoining the leader behaviours (Section 8) were tested successfully (Fig. 11c).
The video clips from the experiments are available at: http://www.lancs.ac.uk/staff/angelov/Research.htm
9.1. Basic Follower
When the Basic Follower behaviour is realised, a 90° fan area is scanned using the laser with 1° resolution [30]. It returns the distance, d, and the bearing/angle, q, to the nearest object (Table 2).
Table 2. Example of the data collected in realtime with the control inputs and outputs.
Time (ms)   d (mm)    q (°)    Velocity Left (mm/s)   Velocity Right (mm/s)
0           588.07    -9.13    203.54                 148.75
200         563.28     8.30    134.05                 183.87
400         570.84     0.467   143.09                 140.29
600         583.25    -8.17    191.01                 141.99
800         566.37     6.27    131.90                 151.57
1000        575.64    18.81    207.7                  94.86
Fig. 11a. Leader quickly disappears behind the obstacle. Fig. 12.a. Distance measured in real-time (the interval when the robot stopped is clearly marked; the high variance when the robot is moving is due to the noise and imperfection of the laser scanner and the controller).
Fig. 11b. Robot avoided the obstacle.
Fig. 12.b. Angle measured in real-time (the interval when the robot stopped is clearly marked; the high variance of the readings when the robot is moving is due to the noise and imperfection of the laser scanner and the controller).
Fig. 11c. Robot searched for the leader and rejoins him.
The aim was to control the distance, d as close as possible to dref (in this experiment dref = 500mm). The sampling frequency was 10Hz (100ms per sample). The control values were generated at each sampling interval. Although, during the experiment the target/leader performed a series of behaviours including acceleration, deceleraArticles
tion, turning, reversing, the UGV, following the leader smoothly and even moving back when the leader stopped (the distance, d became smaller than the distance, dref ). The velocity was measured by the tachometer (odometer) of the robot [28]. An example of the measured distance and angle to the target is illustrated in Fig. 12. 9.1.1. Controllers Comparison Traditional (P) controllers are known to be heavily reliant on tuning. Therefore, when a P controller was used it required some tests to establish suitable parameters (k1 and k2). FLC was also not free form this disadvantage MBPC is self-learning and, therefore, it did not require extensive prior tuning. Table 2 illustrates the comparison between the three controllers. The discrepancy between the real observation and the target values of distance and angle has been used to calculate the errors. The mean absolute error (MAE) and the root mean square error (RMSE) are used as the criteria for the comparison of the three controllers. As it is clear from the table, predictive controller has assisted the fuzzy controller to achieve a better control performance in terms of distance and to some extent in terms of tracking angle to minimise the delay in control response. FLC also has a better performance compare to P controller in term of distance. In order to improve the angle tracking in FLC which is worse than P controller, more rules describing the response to different observation in angles can be added to the fuzzy controller to achieve higher accuracy. Also, the off-line techniques such as ANFIS [34] can be used in order to get the optimal parameters of fuzzy controller. Table 3. Result comparison.
              d (mm)             q (°)
Controller    RMSE     MAE       RMSE    MAE
P Controller  8.5      150.6     0.53    10.00
FLC           4.93     112.3     0.60    11.21
MBPC          4.45     100.6     0.54    8.58
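The error figures used for the comparison are straightforward to reproduce from logged tracking errors. A minimal sketch follows; the sample values are made up for illustration and are not the experimental data.

```python
import math

def mae(errors):
    """Mean absolute error of a sequence of tracking errors."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean square error of a sequence of tracking errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

if __name__ == "__main__":
    # Illustrative distance-tracking errors d - d_ref [mm], not measured data.
    e_d = [88.0, 63.3, 70.8, 83.2, 66.4, 75.6]
    print(f"MAE = {mae(e_d):.1f} mm, RMSE = {rmse(e_d):.1f} mm")
```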
9.2. ObstacleAvoidance When the leader meets an obstacle and disappears quickly from the view of the robot, this is detected by activating the procedure described in Section 5 using a signal sent through wi-fi connection from the smart phone that the leader carries. Combining the two conditions the robot/follower initiates an obstacle avoidance manoeuvre. 9.3. Rejoining the Leader This behaviour is activated immediately after the obstacle is being avoided. The challenging part is the need the robot to identify the leader out of many objects. One alternative was to use the prediction (as described in Section 8). Another approach is to super-impose the laser's scans. The following sub-sections describe the results using each of the two techniques. 9.3.1. Predicting the Leader's Trajectory The precision of the prediction as described in Section 8.1 is in order of 100mm for a prediction horizon of 3-5 s (the prediction horizon is determined by the obstacle avoidance module) [31]. Note, however, that this is an useful 44
range of errors, because the aim is to rejoin the leader after an obstacle avoidance and to approximately identify the location of the leader in such a way that there is no other obstacle in the vicinity of dozen cm which is reasonable, although not perfect. A higher precision may be preferable, but the task is very complex since the motion of the leader, especially, when avoiding an obstacle himself is very difficult to predict for a long horizon. Note, also, that the leader is not a dot in the space, but a physical object with width of about 500 mm. 9.3.2. Superposition of the Laser's Scan After avoiding the obstacle, the scans are taken by the laser every 200 ms with 180째 fan view. The detectable range of the laser is [150; 10,000] mm. However, while the robot is moving, these scans had to be adjusted in respect to an absolute coordinate system (e.g. linked to the initial robot's position), see Fig. 2. This is done by super-positioning scans stored in a buffer and ordered by tuples regarding the time of collection taken at consecutive time instants (Fig. 9). Once several (10-15 in this experiment) scans were collected, the robot could detect the position of the leader (red point) out of several static objects (green point) and rejoin the leader and to continue with the Basic Follower procedure, Fig. 1.
10. Conclusion
10.1. Summary
In this paper we demonstrated a novel approach to leader-follower behaviour of a UGV in an uncertain environment. The hardware platform was a Pioneer 3-DX robot with a laser, sonar sensors, and a compass on board. As an external sensor, a smart phone with an accelerometer and a compass on board was also used. The software components of the proposed approach were divided into three main layers: following, detecting and avoiding obstacles, and rejoining the leader. Three controllers (the P-controller, FLC, and eMBPC) were applied separately in the Basic Follower layer. In order to detect obstacles and differentiate them from the leader, two conditions were checked: i) a sudden large drop in the distance, and ii) the current status of the leader provided by the Nokia N97 smart phone. When an obstacle is detected, two sensors (laser and sonar) are used for the obstacle avoidance procedure. Finally, in order to rejoin the leader after the robot has successfully avoided the obstacle, two alternative techniques were proposed: i) predicting the leader's trajectory, and ii) superposition of the scans taken by the laser. The overall behaviour was tested successfully indoors and outdoors at Lancaster University.
10.2. Future Work
The leader-follower behaviour presented in this paper assumed that the robot follows the leader from behind. However, in order to develop a robot with social skills and intelligence, it needs to engage the leader and accompany him/her as a human would. We intend to address this issue in our future work by developing behaviours which allow the robot to travel side by side with the leader in a socially acceptable manner. Another task is to improve the controller by developing a self-adaptive fuzzy controller which works online without the need for offline pre-training and without prior knowledge of the differential equations governing the system. In addition, to improve the robot's performance in identifying the leader among many moving objects (after avoiding the obstacle), superposition of the scans collected by the laser can be combined with the direction of movement of the leader measured by the magnetometer of the smart phone carried by him/her. This issue will be addressed in another publication.
ACKNOWLEDGMENTS
The authors would like to acknowledge the funding support by the UK Ministry of Defence (MoD), Centre for Defence Enterprise - Call 'Reducing the Burden on the Dismounted Soldier Capability Vision, Task 4, Stage 1 - Assisted Carriage', project RT/COM/7/047.

AUTHORS
Pouria Sadeghi-Tehran*, Javier Andreu, Plamen Angelov - Intelligent Systems Laboratory, School of Computing and Communications, Infolab 21, Lancaster University, LA1 4WA, United Kingdom. E-mails: {p.sadeghi-tehran; j.andreu; p.angelov}@lancaster.ac.uk.
Xiaowei Zhou - EntelSenSys Ltd., Lancaster, UK. E-mail: x.zhou@entelsensys.com.
* Corresponding author

References
[1] M. Montemerlo, J. Pineau, N. Roy, S. Thrun, V. Verma, "Experiences with a mobile robotics guide for the elderly". In: Proc. of the National Conference of Artificial Intelligence, 2002, pp. 587-592.
[2] J. Evans, "Helpmate: An autonomous mobile robot courier for hospitals". In: Proc. IEEE/RSJ/GI Int. Conf. Intell. Robots Syst., 1994, pp. 1695-1700.
[3] D. Hujic, E.A. Croft, G. Zak, R.G. Fenton, J.K. Mills, and B. Benhabib, "The robotic interception of moving objects in industrial settings: Strategy development and experiment". IEEE/ASME Trans. Mechatron., vol. 3, 1998, pp. 225-239.
[4] http://www.globalsecurity.org/military/systems/ground/fcs-arv.htm, accessed 20 April 2010.
[5] http://www.eca.fr/en/robotic-vehicle/roboticsterrestrial-ugvs-tsr-202-e.o.d.-mobile-robot-forremote-manipulation-of-dangerous-objects/28.htm, accessed 20 April 2010.
[6] R. Bianco, M. Caretti, and S. Nolfi, "Developing a robot able to follow a human target in a domestic environment". In: Proc. of the First Robocare Workshop, 2003, pp. 11-14.
[7] O. Gigliotta, M. Caretti, S. Shokur, S. Nolfi, "Toward a person-follower robot". In: Proc. of the Second Robocare Workshop, 2005.
[8] C. Schlegel, J. Illmann, K. Jaberg, M. Schuster, and R. Worz, "Vision based person tracking with a mobile robot". In: Proc. of the Ninth British Machine Vision Conference, 1998, pp. 418-427.
[9] M. Tarokh and P. Ferrari, "Robot person following using fuzzy control and image segmentation". Journal of Robotic Systems, 2003, pp. 557-568.
[10] M. Kleinehagenbrock, S. Lang, J. Fritsch, F. Lomker, G.A. Fink, and G. Sagerer, "Person tracking with a mobile robot based on multi-model anchoring". In: Proc. of the 2002 IEEE Int. Workshop on Robot and Human Interactive Communication, 2002, pp. 423-429.
[11] G. Cielniak, M. Miladinovic, D. Hammarin, L. Goransson, A. Lilienthal, and T. Duckett, "Appearance-based tracking of persons with an omnidirectional vision sensor". In: Proc. of the Fourth IEEE Workshop on Omnidirectional Vision, 2003, p. 84.
[12] M. Kobilarov, G. Sukhatme, J. Hyams, and P. Batavia, "People tracking and following with mobile robot using an omnidirectional camera and a laser". In: IEEE International Conference on Robotics and Automation, 2006, pp. 557-562.
[13] M. Montemerlo, S. Thrun, and W. Whittaker, "Conditional particle filters for simultaneous mobile robot localisation and people-tracking". In: IEEE International Conference on Robotics and Automation, 2002, pp. 695-701.
[14] S. Shaker, J.J. Saade, and D. Asmar, "Fuzzy inference-based person-following robot". International Journal of Systems Applications, Engineering & Development, 2008, pp. 29-34.
[15] P.X. Liu, M.Q.-H. Meng, "Online Data-Driven Fuzzy Clustering with Applications to Real-Time Robotic Tracking". In: IEEE Trans. on Fuzzy Systems, 2004, pp. 516-523.
[16] K. Astrom and B. Wittenmark, "Computer Controlled Systems: Theory and Design". Prentice Hall: NJ, USA, 1984.
[17] B. Carse, T.C. Fogarty, and A. Munro, "Evolving Fuzzy Rule-based Controllers using GA". Fuzzy Sets and Systems, 1996, pp. 273-294.
[18] P. Angelov, "Evolving Rule-based Models: A Tool for Design of Flexible Adaptive Systems". Berlin, Germany: Springer Verlag, 2002.
[19] P. Angelov, D. Filev, "An Approach to On-line Identification of Takagi-Sugeno Fuzzy Models". In: IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, ISSN 1094-6977, 2004, pp. 484-498.
[20] P. Angelov, "A Fuzzy Controller with Evolving Structure". Information Sciences, ISSN 0020-0255, vol. 161, 2004, pp. 21-35.
[21] X. Zhou, P. Angelov, "An Approach to Autonomous Self-localization of a Mobile Robot in Completely Unknown Environment using Evolving Fuzzy Rule-based Classifier". In: International Conference on Computational Intelligence Applications for Defence and Security, 2007, pp. 131-138.
[22] X. Zhou, P. Angelov, and C. Wang, "A predictive controller for object tracking of a mobile robot". In: 2nd International Workshop on Intelligent Vehicle Control Systems, IVCS 2008, 5th International Conference on Informatics in Control, Automation, and Modeling, ISBN 978-989-8111-34-0, 2008, pp. 73-82.
[23] R.R. Yager, D.P. Filev, "Essentials of Fuzzy Modeling and Control", John Wiley and Sons, New York, USA, 1994.
[24] A. Ohya, T. Munekata, "Intelligent escort robot moving together with human interaction in accompany behavior". FIRA Robot Congress, 2002.
[25] A.E. Hunt, A.C. Sanderson, "Vision-based predictive robotic tracking of a moving target". Robot. Inst., Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-RI-TR-82-15, 1982.
[26] R.A. Singer, "Estimating optimal tracking filter performance for manned maneuvering targets". In: IEEE Trans. Aerosp. Electron. Syst., AES-6, 1970, pp. 473-483.
[27] R. Babuska, "Fuzzy Modelling for Control". Kluwer Publishers, Dordrecht, The Netherlands, 1998.
[28] Pioneer-3DX, User Guide, Active Media Robotics, Amherst, NH, USA, 2004.
[29] Pioneer P3-DX: The High Performance All-Terrain Robot. Homepage: http://www.mobilerobots.com/ResearchRobots/ResearchRobots/PioneerP3DX.aspx.
[30] Laser Navigation Package. Homepage: http://www.mobilerobots.com/ResearchRobots/Accessories/LaserNavigationPkg.aspx.
[31] P. Angelov, J. Andreu, P. Sadeghi-Tehran, X. Zhou, "Assisted Carriage: Intelligent Leader-Follower Algorithms for Ground Platform". Tech. Rep., Lancaster University, 18 Nov. 2009, p. 24.
[32] P. Angelov, "An Approach for Fuzzy Rule-base Adaptation using On-line Clustering". International Journal of Approximate Reasoning, vol. 35, no. 3, 2004, pp. 275-289.
[33] IEEE 802.11: "Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications" (2007 revision). IEEE-SA, 12 June 2007.
[34] J.-S. Jang, "ANFIS: adaptive-network-based fuzzy inference system". In: IEEE Transactions on Systems, Man, and Cybernetics, vol. 23, no. 3, 1993, pp. 665-685.
[35] F. Kuhne, J. da Silva, W.F. Lages, "Mobile Robot Trajectory Tracking Using Model Predictive Control", 2005.
[36] P. Angelov, "An Approach to On-line Design of Fuzzy Controller with Evolving Structure". In: 4th International Conference RASC-2002, Dec. 2002, pp. 55-56.
[37] D.W. Clarke, C. Mohtadi, and P.S. Tuffs, "Generalised Predictive Control. Part 1: The Basic Algorithm. Part 2: Extensions and Interpretation". Automatica, 23(2), 1987, pp. 137-160.
ROBOTIC REHABILITATION OF STROKE PATIENTS USING AN EXPERT SYSTEM
Received 2nd June 2010; accepted 7th September 2010.
Pradeep Natarajan, Arvin Agah, Wen Liu
Abstract: Approximately 50 to 60 percent of the more than five million stroke survivors are moderately or minimally impaired, and may greatly benefit from rehabilitation. There is a strong need for cost-effective, long-term rehabilitation solutions, which require the therapists to provide repetitive movements to the affected limb. This is a suitable task for specialized robotic devices; however, with the few commercially available robots, the therapists are required to spend a considerable amount of time programming the robot, monitoring the patients, analyzing the data from the robot, and assessing the progress of the patients. This paper focuses on the design, development, and clinical testing of an expert system-based post-stroke robotic rehabilitation system for the hemiparetic arm. The results suggest that it is not necessary for a therapist to continuously monitor a stroke patient during robotic training. Given the proper intelligent tools for a rehabilitation robot, cost-effective long-term therapy can be delivered with minimal supervision.
Keywords: rehabilitation robots, intelligent systems, human-robot interaction, expert systems, stroke therapy.
1. Introduction
The goal of this work is to investigate a novel methodology that would enable physical and occupational therapy clinicians to provide long-term robotic rehabilitation of the hemiparetic arm for stroke survivors with minimal supervision. Neither the use of robotics nor the use of artificial intelligence (AI) techniques is new to the field of medicine. However, the idea of applying robotics to the field of rehabilitation medicine is relatively new. Robotic rehabilitation has yet to attain popularity among mainstream clinicians. With the increased use of robots in rehabilitation medicine, there are numerous opportunities where AI-assisted technologies could play a vital role in assisting the clinicians. This paper focuses on designing, developing, and clinically evaluating an expert system-based post-stroke robotic rehabilitation system for the hemiparetic arm [31]. The system serves as a valuable tool for therapists in analyzing the data collected from the robot when it is used by the patient, helping the therapists to make the right decisions regarding the progress of the stroke patient, and suggesting a future training plan. The system is designed, developed, and evaluated by conducting a clinical study. The effectiveness and the usefulness of such a rehabilitation system are analyzed in this paper. If it can be shown that the proposed expert system-based robotic rehabilitation
is effective and useful for the therapists and the patients, this work could pave the way for an affordable, easy-to-use, long-term robotic rehabilitation solution for stroke survivors. Moreover, such a system, which requires minimal intervention from the therapist, will play a role in making remote stroke therapy a reality.
1.1. Motivation
Stroke is extremely prevalent and its effect is long-lasting; yet the availability of long-term rehabilitation is limited. Every 45 seconds, someone in the United States has a stroke. Stroke is a leading cause of serious, long-term disability in the United States. From the early 1970s to early 1990s, the estimated number of non-institutionalized stroke survivors increased from 1.5 to 2.4 million, and an estimated 5.6 million stroke survivors were alive in 2004 [2]. Approximately 50 to 60 percent of stroke survivors are moderately or minimally impaired, and may greatly benefit from rehabilitation [11], [23]. Loss of voluntary arm function is common after a stroke, and it is perceived as a major problem by the majority of chronic stroke patients, as it greatly affects their independence [4]. In recent years, clinical studies have provided evidence that chronic stroke patients show motor recovery even 4 to 10 years after the onset of stroke [39], [29]. Given this fact, there has been a strong demand from patients and caregivers to develop effective, long-term treatment methods to improve sensorimotor function of the hemiparetic arm and hand for stroke survivors. Even partial recovery of arm and hand sensorimotor function could improve the patients' quality of life, and reduce the socioeconomic impact of this disease-induced disability.
1.2. Problem Statement
The major challenges involved in post-stroke rehabilitation are the repetitiveness of the therapy, and the availability of therapists for long-term treatment. Many rehabilitation techniques involve repetitive mechanical movement of the affected arm by a therapist. The utilization of robots for rehabilitation has assisted the therapists with the repetitive tasks of the therapy. However, the therapists are still required to spend a considerable amount of time in programming the robot, monitoring the patients, analyzing the data from the robot and assessing the progress of the patients. Even the few commercially available robots neither include any tools for analyzing the data, nor do they have any decision making capabilities. The commercial rehabilitation robots do not take full advantage of the available computing power. Hence, this paper focuses on designing an expert system-based robot that contains tools
for post-stroke rehabilitation of the upper limb that are easy to use by the therapists. In order to overcome the repetitiveness of the rehabilitation therapy, the interactive robotic therapist was developed [16], [17]. Since then, even though a number of rehabilitation robotic systems have been developed, they all have one thing in common: they lack intelligence, and hence are not easy to use by the therapists. Although most of the systems have the capability to collect different kinds of data during patient training, they still require a therapist (or someone assisting the therapist) to analyze the collected data in order to make the decision regarding the changes in the training program. This makes the system difficult to use and it takes the therapist's time away from the patient. The hypothesis of this work: Stroke patients rehabilitated using an expert system-based robotic system will experience the same improvement as stroke patients using the same robot without the expert system. The neuro-motor function of the hemiparetic upper limb will be assessed primarily using Fugl-Meyer and Motor Status scores [1], [40].
1.3. Approach
Integrating an expert system with the rehabilitation robot allows a new training program to be selected by the robot, based on the data collected during the patient's training. This essentially serves as a feedback loop. The primary aim of this research work is to design and implement an expert system-based robotic rehabilitation system, and evaluate it in a clinical setting. The expert system-based rehabilitation robot will be easy to use by the medical personnel (not requiring any type of programming expertise) and therefore will reduce the therapist's time. Since the expert system is developed using the knowledge acquired from multiple therapists, the decisions made by the expert system are no longer the opinion of one therapist, which is the case in conventional therapy. The primary research question addressed by this work is whether or not an expert system-based robotic rehabilitation system, in which the robot is able to autonomously analyze data and make suggestions for the future training exercises depending on the progress of the patient, can provide the same or better result than the robotic rehabilitation system without an expert system, while making the system easier to use for the therapists. This paper is organized into eight sections. The next section provides an introduction to stroke and rehabilitation of the upper limb. Section 3 presents an overview of the current state of robotics in upper limb rehabilitation for stroke. Section 4 gives an introduction to expert systems and the steps followed to develop a successful expert system. Section 5 describes how the entire system was developed and the rationale behind the design of various system components. Sections 6, 7 and 8 detail the clinical study, the results, and the conclusions.
2. Stroke rehabilitation
A stroke (also known as a cerebral vascular accident) occurs when blood flow to any part of the brain stops. When blood supply to a part of the brain is interrupted, it results in depletion of the necessary oxygen and glucose to
that area. The functioning of the brain cells (or neurons) that no longer receive oxygen will be immediately stopped or reduced, and the oxygen-starved neurons will start to die [34], [8]. Brain cells that have died cannot be revived, and the body parts that were receiving signals from those cells for various functions like walking, talking, and thinking may no longer do so. Stroke can cause paralysis, hemiparesis (paralysis of one side of the body), affect speech and vision, and cause other problems [2]. Stroke rehabilitation is the process by which the survivors undergo treatment in order to return to a normal, independent life as much as possible. It is known that most of the motor recovery takes place in the first three to six months after stroke. However, depending on the therapy, minor but measurable improvement in voluntary hand/arm movement occurs even long after the onset of stroke [5]. Some clinical studies have shown that the brain retains the capacity to recover and relearn motor control even four years after the stroke onset [39], [29]. Therapy to reestablish the stroke patients' functional movement is a learning process based on normal adaptive motor programming [3]. The motor relearning of the stroke patients is based on the brain's capacity to reorganize and adapt with the remaining neurons. It has been reported that rehabilitation and intensive repetitive training can influence the pattern of reorganization [20], [35], [28]. Even though many different treatment approaches have been proposed, e.g., [37], physical therapy practice heavily relies on each therapist's training and clinical experience [33]. Studies of robot-aided motor training for stroke patients have demonstrated that it is not only more productive for patient treatment, but also more effective in terms of functional improvement of the hemiparetic upper limb compared to conventional physical therapy [6], [24]. Robot-aided motor training could therefore have great potential to evolve into a very effective and efficient clinical treatment.
3. Robotics in Upper Limb Rehabilitation
One of the earliest robots developed for manipulation of the human arm was the interactive robotic therapist [16], [17]. The interactive robotic therapist allows for simultaneous diagnosis and training by therapists through interactions with patients. This system is also used for the quantification of the patients' recovery and progress. Following the successful results of this robotic therapist, several rehabilitation robots were designed, including the Mirror-Image Motion Enabler (MIME) [7], Assisted Rehabilitation and Measurement Guide (ARM Guide) [36], Motorized Upper-Limb Orthotic System (MULOS) [21], and the GENTLE/s haptic system [28]. Researchers agree that in general, compared with conventional treatment, robot-assisted treatment definitely has therapeutic benefits [7], [30]. Robot-assisted treatment has been demonstrated to improve strength and motor function in stroke patients. In one clinical trial, even follow-up evaluations for up to three years revealed sustained improvements in elbow and shoulder movements for those who were administered robotic therapy [1], [40]. The InMotion2 robot is a commercially available rehabilitation robot [19]. The InMotion2 robot can be programmed
to interact with a patient to shape his/her motor skills by guiding the patient's limb through a series of desired exercises with a robotic arm. The patient's limb is brought through a full range of motion along a single horizontal plane to rehabilitate multiple muscle groups [16], [17], [24]. The InMotion2 robot is available in the Neuromuscular Research Laboratory (NRL) at the University of Kansas Medical Center (KUMC) and was used in this study.
4. Expert systems
An expert system is an interactive computer-based decision tool that uses both facts and heuristics to solve difficult decision problems based on the knowledge acquired from human experts. Any application that needs heuristic reasoning based on facts is a good candidate for expert systems. Some of the earliest expert systems were developed to assist in areas such as chemical identification (DENDRAL), speech recognition (HEARSAY I and II), and diagnosis and treatment of blood infections (MYCIN) [18]. The expert system development process consists of four phases: knowledge acquisition, knowledge representation, tool selection and development, and verification and validation [18], [26]. Each phase of the expert system development process is discussed in this section.
Knowledge Acquisition: Knowledge acquisition refers to any technique by which computer systems can gain the knowledge they need to perform some tasks [38]. Knowledge often goes through stages of refinement in which it becomes increasingly formal and precise. Part of this process involves identifying the conditions under which the knowledge is applicable, and any exceptional conditions. In order to design the knowledge base, several discussions with physical and occupational therapists were conducted. In these meetings, the capabilities of the rehabilitation robot were demonstrated to the therapists. Based on the discussions, it was clearly understood that expert knowledge in the field of physical therapy is very complex, based on practical experience, and highly dependent on the patient and the type of motor impairment. A pilot survey to better understand the current clinical practices in stroke rehabilitation was conducted among physical and occupational therapists in Kansas and Missouri [33].
Knowledge Representation: The knowledge-based expert system should encapsulate the expertise of the therapists, in order to be an effective tool during rehabilitation therapy. The captured information takes into account factors that are relevant to stroke rehabilitation, such as:
- the general principles of therapy;
- the initial conditions of the stroke patient;
- the most effective training exercises, along with the determinants for each exercise;
- the methodology by which therapists assess the patient's progress.
The knowledge gathered from the experts is first refined in a manner such that it is applicable to the InMotion2 robot, which is used for stroke rehabilitation of the arm. Next, the refined knowledge is represented using a standard format such as a production system.
A production system is a system based on rules. A rule is a unit of knowledge represented in the following form [15], [14]: IF <conditions> THEN <actions> (a schematic sketch of this rule form in code is given at the end of this section). The expert knowledge represented as a production system is used to make the decisions regarding the selection and/or modification of any training exercise. The expert system is used to monitor the progress of the patient's motor learning and modify the training exercises. From the accumulated records of the patient's arm movement data, the progress of the patient can be determined.
Tool Selection and Development: The expert system was developed as high-level intelligence for the InMotion2 robot controller programs. This high-level intelligence monitors the progress of the patient and issues appropriate guidance to the robot for its low-level motion control. One of the most commonly used open-source tools for developing expert systems is the C Language Integrated Production System (CLIPS) [14]. Programs for analyzing the raw data collected during the patient training are developed in C. Tcl/Tk is used for the user interface of the robot. A training exercise is defined as a sequence of tasks or training goals. The rule-based expert system analyzes the symbolic information provided by the system, such as the initial subject conditions, the various movement parameters, and the current movement data, and evaluates the subject's progress. As a result of the assessment, the expert system can modify the training exercise. The modification could include selecting a new exercise from the given set of exercises, and/or determining how the determinants of the exercise should be adjusted. In many cases the same exercise could be selected with different determinants (such as the range of motion, velocity of motion, and assistive or resistive forces). Figure 1 shows the architecture of the expert system while it is in use.
Verification and Validation: The expert system was tested at the Neuromuscular Research Laboratory of the Department of Physical Therapy and Rehabilitation Sciences at the University of Kansas Medical Center (KUMC). Next, the expert system was demonstrated to some therapists at KUMC in order to be validated. The experts presented different cases and made sure the expert system satisfied the requirements.
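As a purely illustrative sketch of the IF <conditions> THEN <actions> form, the fragment below hard-codes two rules in C. The actual system encodes its rules in CLIPS, so the field names, rule names, and actions here are assumptions used only to show the shape of a production rule (condition paired with action) over patient facts.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative facts about the current patient state (hypothetical fields). */
typedef struct {
    bool prom_limited;   /* passive range of motion limited? */
    bool tone_high;      /* muscle tone elevated?            */
} Facts;

/* A production rule: IF condition(facts) THEN action(facts). */
typedef struct {
    const char *name;
    bool (*condition)(const Facts *);
    void (*action)(const Facts *);
} Rule;

/* Treatment plan 1 in the paper: normal tone and limited PROM -> stretching. */
static bool normal_tone_limited_prom(const Facts *f)
{ return !f->tone_high && f->prom_limited; }
static void select_stretching(const Facts *f)
{ (void)f; puts("select ROM (stretching) training"); }

/* No PROM limitation -> strength training (as chosen in the clinical study). */
static bool no_prom_limit(const Facts *f) { return !f->prom_limited; }
static void select_strength(const Facts *f)
{ (void)f; puts("select strength training"); }

int main(void)
{
    Rule rules[] = {
        { "treatment-plan-1", normal_tone_limited_prom, select_stretching },
        { "treatment-plan-2", no_prom_limit,            select_strength   },
    };
    Facts patient = { .prom_limited = false, .tone_high = false };

    /* Fire every rule whose condition matches the current facts. */
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; ++i)
        if (rules[i].condition(&patient))
            rules[i].action(&patient);
    return 0;
}
```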
5. Research methodology
This section addresses the design, development, and clinical evaluation of an expert system-based robotic rehabilitation system using the InMotion2 robot. The first step in developing this system is to understand the current stroke rehabilitation practices. Based on interviews with therapists, a literature review, and a survey among clinicians in Kansas and Missouri, a list of patient conditions and corresponding treatment protocols was formulated. These protocols were fine-tuned to suit robotic therapy with the help of therapists. This became the knowledge base (also known as the rule base) for the expert system. After the knowledge base was finalized, the protocol for the clinical study was developed. Based on the requirements of the clinical study, the software components of the system were developed and implemented. The system was tested and the clinical study was conducted upon approval from the Institutional Review Board at KUMC.
Fig. 1. Expert System while in use.
5.1. Design Requirements
In conventional rehabilitation, the therapist performs an initial assessment of the stroke patient's sensorimotor skills. Many standard tests, such as the Fugl-Meyer and Motor Status Assessments, are widely accepted as quantitative tests. Based on the initial assessment, the therapist chooses one or more exercises for the patient and starts the rehabilitation process. This cycle of assessing the patient and administering therapeutic exercises is repeated as long as it is feasible. Several studies have shown that robotic post-stroke therapy can be effective, but there is no published literature that outlines a comprehensive and generic treatment procedure. In most of the reported studies, a therapist makes an initial assessment of the patient, then chooses one or more exercises with suitable determinants (the variable parameters for each exercise), and then begins the robotic rehabilitation. When the patient undergoes therapy with the robot, the therapist visually observes the various motor skills of the patient and assesses the progress of the patient. In some cases the therapist manually analyzes the data collected from the training programs and makes a decision regarding the patient's progress. Depending on the therapist's assessment, once again one or more training exercises with suitable parameters are chosen for the patient and this process is repeated. In the expert system-based rehabilitation system, instead of the therapist continuously monitoring the patient and providing the robot with the training exercise and the parameters, the expert system makes the necessary decisions. The system undertakes the usually time-consuming task of analyzing voluminous training data in order to evaluate the patient's progress without the intervention of the therapist. The expert system then presents the future training exercise and the parameters, along with the explanation for the decisions. The therapist reviews the decisions and the explanation. Once the therapist approves, the robotic training is repeated. This allows the therapist to supervise the entire process for multiple patients within a short amount of time. For the therapist, it is not necessary to monitor each patient continuously. The determinants, or variable parameters, of each robotic training exercise include the single-plane movement patterns, the number of repetitions or desired time duration, velocity of the training motion, assistive forces, resistive forces, and range of motion. These parameters need to be selected and modified by the expert system after taking into consideration the various patient conditions. The entire decision tree for the expert system is presented as the treatment protocols.
5.2. Knowledge Base Development
Understanding the current practices is imperative for the development of an expert system-based robotic stroke rehabilitation system. Given the broad range of therapy approaches, it is important to obtain data on what stroke rehabilitation methods are actually being used by clinicians. The pilot survey was aimed at understanding the current stroke rehabilitation practices of physical and occupational therapists who were providing care in two Midwest states: Kansas and Missouri. More than 100 clinicians participated in the survey. The knowledge collected from clinical experts enabled the development of the treatment protocols which serve as the rehabilitation system's knowledge base. The stroke rehabilitation methods adopted by therapists vary widely and they seem to combine principles from different approaches in their current practice. This may be an indication of a need for an optimal approach to be developed through more research. The majority responses from the clinicians were used to construct the knowledge base of the expert system for robotic rehabilitation. The self-reported background information of the clinicians correlates with the dated treatment choices reported in sections of the questionnaire. The uncertainty among clinicians revealed in some sections of the survey shows that more evidence of clinical approaches is needed to ensure efficacious treatments [33].
5.3. Knowledge Representation
The knowledge gathered is implemented as rules for the expert system. These rules are used by the expert system to determine the appropriate robotic training for the patients during the clinical study. A step-by-step treatment protocol has been developed in conjunction with the knowledge gathered from the experts and the current literature. This protocol is given in a diagrammatic format in this section. When a stroke patient begins the robotic therapy, the therapist makes an initial assessment which includes all the base-line evaluations. During the initial assessment three main conditions of the patient are determined: tone, strength, and Passive Range of Motion (PROM). For each patient, tone can be normal or high, strength can be diminished, and PROM can be limited or normal. Figure 2 shows the treatment plan that has to be followed if the patient's tone is normal and the PROM is limited. Similar to treatment plan 1, two more treatment protocols were developed: one for treating diminished strength, and the other for high tone and limited PROM. Definitions of acronyms used in the treatment plan are:
ROM - Range of Motion of the patient.
PROM - Passive ROM, the range in which the patient is unable to actively contract the muscles to perform the movement on his/her own.
AROM - Active ROM, the range in which the patient is
able to actively contract the muscles to perform the movement on his/her own.
Fig. 2. Treatment Plan 1 - normal tone and limited PROM.
The patient's progress during the stretching treatment is monitored primarily using the range of motion. Subsequent training exercise parameters are modified as follows:
- increase amplitude as tolerated to increase ROM.
The patient's progress during the strength treatment is monitored primarily using the accuracy. Subsequent training exercise parameters are modified according to the following:
- accuracy of 90% or better over a given number of repetitions, number of trials, or time;
- progress resistance for patients functioning with AROM;
- if applicable, wean patients off assistance as tolerated;
- adjust time demand;
- modify target constraints to make the task more difficult.
5.4. Software Implementation
The software components in this research were developed according to the needs of the clinical study. The design provides the human user full control and maximum flexibility for manual override at any point during the clinical study. Figure 3 gives an overview of the software architecture as well as the flow of data through the system. The components can be grouped into three categories: (1) the expert system developed using CLIPS, (2) the robot testing and training programs in Tcl/Tk, and (3) the analysis program developed using C. The functioning of the overall system can be explained in a step-by-step manner as follows: In the patient's first visit, initial testing with the robot is done. During this test, various arm movement parameters and the patient's conditions regarding tone, strength, and PROM are recorded and saved in the
parameters and conditions data file, respectively. The list of parameters is given in Table 1. The expert system takes the data files as input and selects an appropriate treatment regimen for the patient. In addition to selecting the training exercises, the expert system also makes the necessary modifications to the parameters data file. The robot training program takes the parameters data file as input and provides the appropriate exercise to the patient.
Fig. 3. Overview of the software components.
During the training, the program records the data relevant to the patient's movement at 30-millisecond intervals. The data includes the x and y position, the x and y forces, the x and y velocities, and the time at every data point. The data from the training sessions is analyzed by the analysis program. This program calculates the average deviation, the percentage accuracy, the average velocity, and the average of the peak resultant velocity. The analysis program stores the calculated values back in the parameters data file.
The parameters data file is the input to the expert system. The expert system checks if the new parameters are different from the old parameters in terms of accuracy, range of motion, velocity, etc., as shown in the treatment plan. If they are different, the conditions for progress are checked and subsequently any applicable changes are made to the parameters data file. This new parameters data file is used as input by the training program and the training cycle repeats. The list of parameters in the file is given in Table 1, and how the parameters are represented in the InMotion2 robotic training is explained below.
Center of y-axis - The center position (origin) of the y-axis can vary from patient to patient due to reasons such as the position of the chair in front of the robot, the length of the patient's arm, etc.
ROM - The range of motion is represented as the radius of a circle. This implies that the range is not direction specific. For example, if a patient has an AROM of 0.14 m, the patient can actively move the arm under his/her own power to any point within the circle of radius 0.14 m from the center (origin) position.

Table 1. Patient Parameters in Data File.
PARAMETER     SHORT DESCRIPTION
AROM          Active Range of Motion (m)
PROM          Passive Range of Motion (m)
resist_force  Maximum tolerable resistance (N/m)
assist_force  Minimum required assistance (N/m)
center_y      Center position, origin of y-axis (± m)
deviation     Average deviation from straight line path (m)
accuracy      Average % accuracy with respect to length of motion segment
velocity      Average velocity calculated from time taken (m/s)
max_res_vel   Average of the peak resultant velocity (m/s)
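The same parameters can be pictured as a simple record. The C struct below is only an in-memory sketch mirroring Table 1; the actual on-disk layout of the InMotion2 parameters data file is not specified in the paper, so the struct and field names are assumptions.

```c
/* In-memory mirror of the patient parameters listed in Table 1.
 * Field names follow the table; the file layout itself is assumed. */
typedef struct {
    double arom;          /* Active Range of Motion (m)                    */
    double prom;          /* Passive Range of Motion (m)                   */
    double resist_force;  /* maximum tolerable resistance (N/m)            */
    double assist_force;  /* minimum required assistance (N/m)             */
    double center_y;      /* center position, origin of y-axis (+/- m)     */
    double deviation;     /* average deviation from straight-line path (m) */
    double accuracy;      /* average % accuracy w.r.t. segment length      */
    double velocity;      /* average velocity from time taken (m/s)        */
    double max_res_vel;   /* average of the peak resultant velocity (m/s)  */
} PatientParams;
```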
Resistance - In the InMotion2 robot, any force is a function of a parameter called stiffness. This is similar to the stiffness of a spring, called the spring constant. The robot uses back-drivable electric servo motors to implement a spring-like feel. The stiffness is measured in Newtons per meter. For a spring, it is the amount of force required to stretch the spring by one meter, and it can be represented as:

F = -k x

where k is the spring constant and x is the displacement of the spring. When the robot arm is set to be stationary at a point and one tries to move the arm, one moves against the resistance of the arm. This resistance feels like the stiffness of a spring, and the force experienced increases as the arm is moved farther away from the set position. This method is used in the strength training exercises.
Assistance - The assistive force applied to a patient's arm by the robot arm is also manipulated as a function of the stiffness. When the patient does not need any assistance from the robot, the stiffness can be set to 0, i.e., no force from the robot. As the stiffness is increased and the robot arm is programmed to move along a specified path, it exerts an assistive force on the patient's arm. Higher stiffness means that the robot arm follows the programmed path more closely and provides increased assistance to the patient's arm.
Deviation - During training, the robot is programmed to record the position data about every 30 milliseconds. The data file also stores the information about the desired straight-line path in the form of a starting point and an ending point. If the starting point is given as (x1, y1) and the ending point is (x2, y2), then the equation of the straight line can be given as:

Ax + By + C = 0, where A = y2 - y1, B = x1 - x2, and C = (x2·y1) - (x1·y2).

Using this equation of the line, the perpendicular distance to the line from any given point (xp, yp) can be calculated as:

d = |A·xp + B·yp + C| / √(A² + B²)

The calculated distance is the deviation from the desired straight-line path at any given instant.
Accuracy - The calculated accuracy is an extension of the deviation. The average deviation is represented as a fraction of the length of the motion segment. For example, more than 96% accuracy means that the average deviation is less than 4% of the length of the motion segment.
Velocity - The velocity is calculated from the time taken to complete a motion segment. Although the instantaneous velocity is recorded every 30 ms, this velocity is not constant. Therefore, in order to calculate the average velocity of the patient's arm, the time taken to complete each motion segment is noted. Based on the time and the distance of the motion segment, the average velocity is determined.
Resultant Velocity - The velocity recorded in the data file from the robot controller consists of the instantaneous x and y velocity components. The magnitude of the resultant is calculated using the formula:

Rvel = √(xvel² + yvel²)
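The core of these calculations can be sketched in C. This is an illustration of the formulas above under assumed data structures (sample layout, function names), not the authors' analysis program; in particular, the average velocity derived from segment completion times is omitted here for brevity.

```c
#include <math.h>
#include <stddef.h>

/* One logged sample: x/y position (m) and x/y velocity (m/s),
 * recorded roughly every 30 ms during training.                        */
typedef struct { double x, y, vx, vy; } Sample;

/* Perpendicular distance from (px, py) to the line through
 * (x1, y1) and (x2, y2): |A*px + B*py + C| / sqrt(A^2 + B^2).          */
static double deviation_from_line(double px, double py,
                                  double x1, double y1,
                                  double x2, double y2)
{
    double A = y2 - y1, B = x1 - x2, C = x2 * y1 - x1 * y2;
    return fabs(A * px + B * py + C) / sqrt(A * A + B * B);
}

/* Average deviation, accuracy (% of segment length) and mean resultant
 * velocity over one motion segment of n samples.                       */
static void analyse_segment(const Sample *s, size_t n,
                            double x1, double y1, double x2, double y2,
                            double *avg_dev, double *accuracy,
                            double *avg_res_vel)
{
    double dev_sum = 0.0, vel_sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        dev_sum += deviation_from_line(s[i].x, s[i].y, x1, y1, x2, y2);
        vel_sum += sqrt(s[i].vx * s[i].vx + s[i].vy * s[i].vy);
    }
    double seg_len = hypot(x2 - x1, y2 - y1);   /* length of the segment */
    *avg_dev     = dev_sum / (double)n;
    /* e.g. average deviation of 4% of the segment length -> 96% accuracy */
    *accuracy    = 100.0 * (1.0 - *avg_dev / seg_len);
    *avg_res_vel = vel_sum / (double)n;
}
```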
6. Clinical study
The aim of this clinical study is to test various aspects of the newly developed expert system-based stroke rehabilitation system in a clinical setting. The results of the clinical study serve as a "proof of concept" for a possible full-scale study. Two chronic stroke patients were recruited for this study with the help of the Kansas Stroke Registry, established at the University of Kansas Medical Center. Subjects in this study were adults, older than 21 years of age, diagnosed with symptoms of mild to moderate stroke. One subject, assigned to the experimental group, underwent robotic training with the expert system; the other subject, assigned to the control group, underwent
robotic training without the expert system. A step-by-step overview:
STEP 1. Base-line (initial) testing should be done for all subjects by a therapist. This will be used for post-training comparisons. The therapist should also determine the patient conditions regarding Passive Range of Motion (PROM limited/normal), and tone in the passive range (normal/MAS1/MAS1+/MAS2).
STEP 2. Initial testing using the robot should be done for all subjects. The various parameters of arm movement for the subject are measured using the appropriate robotic testing programs. The parameters measured are listed in Table 1.
(Steps 3, 4, 6 and 7 are for the experimental group. For the control group only step 5 is given; however, the parameters for step 5 are manually calculated and verified by the therapist.)
STEP 3. Use the recorded data file from the testing program and run the analysis program on it. This program calculates the average deviation, accuracy, average velocity and peak resultant velocity, and stores the results.
STEP 4. Run the expert system to determine the steps in training.
STEP 5. Subject training: run the training programs as suggested by the expert system (or by the therapist for the control group) after the parameters are verified by the therapist. The warm-up programs stretch the ROM and slowly increase the velocity. There are five trials, each with two repetitions on the diagonal pattern. Resistance and assistance are minimal. The train_ROM and train_strength programs are used for stretching the range and for strengthening the affected arm, respectively. There are 10 trials, each with two repetitions.
STEP 6. After every two sessions (i.e., one week of training with the same parameters), use the recorded data file from the training sessions and run the analysis program on it. This program calculates the average deviation, accuracy, average velocity, and peak resultant velocity, and stores the results.
STEP 7. Run the expert system to determine the progress and the future training steps. Repeat steps 5, 6 and 7 for four weeks (two sessions/week).
STEP 8. Repeat step 1 to collect all the end-treatment data.
7. Experimental results
As explained in the study protocol, two human subjects were recruited for this study. One subject was assigned to the experimental group and one to the control group.
7.1. Baseline Evaluations
A baseline evaluation was conducted for each subject to assess sensorimotor function. The primary measures of motor function are the Motor Status Score for shoulder and elbow (MS1) and the Fugl-Meyer Assessment (FMA) score for the upper extremity. A quantitative assessment of motor function of the hemiparetic upper limb was also conducted using the robot. Other neuromotor functional assessment techniques which were used include the Motor Status Score for wrist and fingers (MS2) and the Modified Ashworth Scale (MAS).
Motor Status Score [12] - MS1 consists of a sum of scores given to 12 shoulder movements and five elbow/forearm movements (maximum score = 40). MS1 uses a six-point ordinal (unequal intervals) grading scale (0, 1-, 1, 1+, 2-, and 2), ranging from no volitional movement to faultless movement. The MS1 is capable of detecting a significant advantage of robot therapy for shoulder and elbow [1], [40].
Fugl-Meyer Assessment [13] - The FMA scale is applied to measure the ability to move the hemiparetic arm outside the synergistic pattern on a three-point scale (maximum score of 66 points). The FMA scale is also widely used in evaluating the effectiveness of robot-aided therapy [7], [30].
Modified Ashworth Scale - MAS is a six-point rating scale that is used to measure muscle tone.
Quantitative assessment of motor function - During initial testing, the program displays a circular pattern with eight different targets along the circumference and the subjects are asked to move the robot arm from the center of the circle to each of the targets sequentially. During this movement the velocity of the motion, the active and passive range of motion, the accuracy of movement, and the required assistive/resistive forces are measured and recorded. These values provide a quantitative assessment of the subject's upper limb motor function.
7.2. Subject Training
The therapist and the subjects were unaware of the subjects' group assignment. The therapist's opinion was sought regarding the best robot-aided treatment option for both subjects. The therapist opined that since the subjects did not have a PROM limitation (i.e., within the applicable range of robot therapy), they should be trained for improving strength and accuracy.
Experimental Subject Training: For the experimental subject, the expert system is used to determine the treatment plan. Since the subject does not have a PROM limitation, the expert system chose strength training treatment. For the robot training, the parameter values are chosen from the initial testing data. However, if the subject's AROM is greater than 14 cm, it is automatically capped at 14 cm [10]. In the strength training exercise, the robot arm is positioned at the center of a square and the targets are placed at the four corners. A screenshot of this program is shown in Figure 4, and a subject using the program is shown in Figure 5. The subject is asked to reach the targets moving along the diagonal of the square. The robot arm resists any movement away from the center position. The maximum tolerable resistance measured during the testing session is
used by the training program. The training session consists of 10 trials in which the subject is required to reach the targets twice, i.e., two repetitions in each trial. The training program also keeps track of the number of targets missed by the subject. Each target has a maximum timeout period of about two minutes, after which the target expires and moves to the next position. The analysis program is used on the experimental subject's training data to measure the average deviation, percentage accuracy, and velocity of movement. To allow for day-to-day variations in the motor functions, analysis is done after two training sessions instead of every session. Following the analysis, the expert system is used to determine the progress made by the subject. According to the expert system, perceivable progress is made only if these conditions are all satisfied (a sketch of this check follows below):
- at least 95% of the targets are reached;
- measured accuracy is better than 90%;
- velocity has improved compared to the previous result.
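A hedged sketch of how this progress check might look in code is shown below. The threshold values are taken from the conditions above, but the type names, the resistance increment, and the idea of returning an updated resistance are illustrative assumptions; the actual rule lives in the CLIPS rule base.

```c
#include <stdbool.h>

/* Summary of one analysed block of training sessions (illustrative). */
typedef struct {
    double targets_reached_pct;  /* % of targets reached          */
    double accuracy_pct;         /* average accuracy in %         */
    double velocity;             /* average velocity (m/s)        */
} SessionSummary;

/* Progress is recognised only when all three conditions hold:
 * >= 95% of targets reached, accuracy better than 90%, and the
 * velocity improved over the previous result.                     */
static bool progress_detected(const SessionSummary *cur,
                              const SessionSummary *prev)
{
    return cur->targets_reached_pct >= 95.0 &&
           cur->accuracy_pct        >  90.0 &&
           cur->velocity            >  prev->velocity;
}

/* If progress is detected, the resistance for future training is
 * increased; the increment used here is an assumed placeholder.   */
static double next_resistance(double current_resistance,
                              const SessionSummary *cur,
                              const SessionSummary *prev)
{
    const double step = 5.0;  /* N/m, illustrative value only */
    return progress_detected(cur, prev) ? current_resistance + step
                                        : current_resistance;
}
```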
Fig. 4. Screenshot of the strength training program.
For the experimental subject, only once during the four-week training did the expert system detect progress and subsequently increase the resistance value for future training. This change in resistance was approved by the therapist.
Control Subject Training: As mentioned earlier, the therapist determined that strength training would be appropriate for the control subject as well. The same strength training program is used under the supervision of the therapist. The main difference in the treatment for the control subject is that the performance of the subject during training was visually monitored, and it was manually noted whether any targets were missed by the subject. Based on this observation, the therapist determined whether the subject had made enough progress to warrant any increase in resistance.
7.3. End-Treatment Evaluations
The end-treatment evaluation was conducted within five days after the completion of the training. The same tests that were used during the baseline evaluation were used again to assess the neuromotor functions of the subjects. The results of the end-treatment evaluations of the two subjects are given in Table 2. In addition, the table also shows the change in scores compared to the baseline values.
Fig. 5. A subject using the strength training program.
Table 2. Comparison of Subjects.
7.4. Effectiveness of the Rehabilitation System
After the initial testing, the expert system was used to determine a treatment plan for the experimental subject. The expert system arrived at the conclusion that strength training should be carried out because the subject had no PROM limitation within the range of the robot. The therapist agreed with this decision of the expert system and the subject went through strength training for four weeks. During the four weeks, the expert system was used to monitor the progress. Only once during this period did the expert system detect progress and increase the resistance. After reviewing the data, the therapist agreed with that decision as well. For the control subject, after the initial assessment, the therapist determined that strength training would be most suitable. After that decision, the control subject's initial conditions were input to the expert system and it arrived at the same conclusion as the therapist. After the first two sessions of strength training, the therapist observed that the subject had made progress and so decided to increase the resistance. The therapist's decision was checked against the expert system, but it did not detect enough progress with the subject because the velocity had not improved. The therapist was very pleased with the analysis program. The data collected during a training session typically contains close to 115,000 data points (one data entry every 30 ms). The analysis program makes it possible to quickly analyze and summarize the data from an entire training session.
8. Conclusions
The objective of this work is to design, develop, and evaluate an expert system-based post-stroke robotic rehabilitation system [31]. The new rehabilitation system can be a valuable tool for therapists in analyzing the data from the robot, helping them make the right decisions regarding the progress of the stroke patient, suggesting future training exercises, and delivering robotic therapy. To understand the current stroke rehabilitation practices, a survey was conducted among clinicians in Kansas and Missouri. The majority responses from the clinicians were used to construct a treatment plan for robotic rehabilitation. The treatment plan was implemented as the rule base of the expert system. The delivery of robotic rehabilitation required the development of certain testing programs and training programs, and a data analysis program that can analyze the voluminous training data and summarize it for the expert system. These associated components were developed as part of a new robotic rehabilitation system. The rehabilitation system was evaluated in a clinical setting with two human subjects. The clinical study is intended to verify the effectiveness of expert system-based robotic rehabilitation. The effectiveness of the expert system, the testing and training programs, and the analysis program was evident from the fact that the therapist agreed with the analysis and the decisions made by the system. The expert system-based rehabilitation was studied both for its correctness and its usefulness. The correctness of the expert system was evaluated based on how close its decisions are to those of the therapist. Twice the expert system made decisions regarding the treatment plan and the progress of the subject in the experimental group. Both times the therapist agreed with the decisions and was satisfied by the explanation provided by the expert system. For the control subject, the therapist made the decisions about the treatment plan and the progress. When the expert system was used to test the therapist's decision, it produced the same treatment plan but not the same decision
about the subject's progress. The one time in which the expert system produced a different result can be attributed to the fact that the therapist made the decision about the subject's progress based mainly on visual observation. The therapist did not use any tools to analyze the quantitative data. The therapist followed the procedure that clinicians follow in everyday practice. The training programs record data at an interval of about 30 milliseconds. The data file produced by the training programs on average consists of about 115,000 data points. Manual analysis of one of the data files using a spreadsheet program could take an average computer user anywhere from one to two hours. A therapist using the robot does not have the time to quantitatively analyze all of the patient's data. The data analysis program developed as part of this rehabilitation system can analyze the data file and produce summaries within a few seconds. It produces information such as the average deviation of the subject's arm from the targeted straight-line path, the percentage accuracy as a fraction of the length of the path, the average time taken to reach the targets and thereby the velocity, the average x and y directional forces, and the mean peak resultant velocity. Having this information immediately after a training session would enable the therapist to make sound decisions based on quantitative data. The ability to summarize a training session also means that the therapist does not have to observe the patient continuously. The therapist can simply look at the summarized results at the end of a training session and make decisions. The results also show that a subject trained with the robot tends to show improvement in his/her motor functions. This result is consistent with many other studies that have shown that robot therapy improves motor function in the hemiparetic arm of stroke patients [1], [30]. Figure 6 shows the movement of the affected arm of the experimental subject before and after robot-aided strength training. It can also be seen that the accuracy has improved marginally.
Fig. 6. Movement of the experimental subject's arm along the x-y plane, before and after treatment. The graph on the left shows the data from the initial testing and the graph on the right shows the data from the end-treatment testing.
The mean deviation before the therapy was 0.0139 m and after the therapy it was 0.0134 m. The results of this clinical study presented in Table 2 also show that the experimental subject's FMA sensory and motor scores each improved by two. Similarly, for the control subject, the FMA motor score improved by three. The end-treatment testing with the robot also revealed that the experimental subject had a marked improvement (32%) in the velocity of the affected arm. There is also a slight increase in arm strength of both subjects as measured using the robot. At the end of the clinical study, both subjects were asked to fill out an exit survey. Their feedback showed that the subjects were comfortable and enjoyed using the robot. This illustrates that robotic therapy, in addition to being effective, can be entertaining and enjoyable for stroke patients as well. The results presented in Table 2 show that both the control subject and the experimental subject benefited from robotic therapy. The improvement in motor performance was similar for both subjects. This suggests that the quality of care provided by the expert system-based rehabilitation system is comparable to the care provided by a therapist using existing robotic treatment methods. The main limitation of this rehabilitation system is that stroke therapy is highly subjective, varying from therapist to therapist and also from patient to patient. Currently, there is no consensus among the therapists regarding the best treatment options. It is still possible that a therapist might reach a conclusion different from the one suggested by the expert system, based on his/her beliefs and clinical experience. Since the clinical study conducted to evaluate the rehabilitation system is limited to only two stroke patients, it should be construed as a "proof of concept", as the results are not statistically significant. One area of immediate focus following this research could be modifying the expert system to behave as a low-level intelligent controller for the robot, producing a real-time adaptive system. Another area directly related to this research is tele-rehabilitation [9], [22]. The results from this research indicate that minimally supervised therapy is possible. A rehabilitation robot can be made available at a community clinic, and a therapist at a remote location can periodically review the suggestions made by the expert system in order to approve or modify them. The delivery of robotic rehabilitation and its effectiveness could be augmented by incorporating virtual reality and haptic feedback devices [32], [27] or a portable exoskeleton arm [25].
AUTHORS Pradeep Natarajan - IBM Corporation, Lenexa, KS 66219 USA. E-mail: pnatara@us.ibm.com. Arvin Agah* - Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045 USA. E-mail: agah@ku.edu. Wen Liu - Department of Physical Therapy and Rehabilitation Sciences, University of Kansas Medical Center, Kansas City, KS 66160 USA. E-mail: wliu@kumc.edu. * Corresponding author
References
[1] Aisen M.L., Krebs H.I., Hogan N., McDowell F., Volpe B.T., The effect of robot-assisted therapy and rehabilitative training on motor recovery following stroke. Archives of Neurology, vol. 54, 1997, pp. 443-446.
[2] American Stroke Association, 2010. http://www.strokeassociation.org.
[3] Bach-Y-Rita P., Balliet R., Recovery from a stroke. In: Duncan P.W., Badke M.B. (Eds.), Stroke Rehabilitation: The Recovery of Motor Control, Yearbook Medical Publishers, Chicago, 1987.
[4] Broeks J.G., Lankhorst G.J., Rumping K., Prevo A.J., The long-term outcome of arm function after stroke: Results of a follow-up study. Disability and Rehabilitation, vol. 21, no. 8, 1999, pp. 357-364.
[5] Bruno-Petrina A., Motor recovery in stroke. eMedicine Medscape, 2010.
[6] Burgar C.G., Lum P.S., Shor P.C., Van der Loos H.F.M., Rehabilitation of upper limb dysfunction in chronic hemiplegia: robot-assisted movements vs. conventional therapy. Archives of Physical Medicine and Rehabilitation, vol. 80, 1999, p. 1121.
[7] Burgar C.G., Lum P.S., Shor P.C., Van der Loos H.F.M., Development of robots for rehabilitation therapy: The Palo Alto VA/Stanford experience. Journal of Rehabilitation Research and Development, vol. 6, no. 37, 2000, pp. 663-673.
[8] Caplan L.R., Mohr J.P., Kistler J.P., Koroshetz W., Grotta J., Should thrombolytic therapy be the first-line treatment for acute ischemic stroke? New England Journal of Medicine, vol. 337, 1997, pp. 1309-1310.
[9] Carignan C.R., Krebs H.I., Telerehabilitation robotics: Bright lights, big future? Journal of Rehabilitation Research and Development, vol. 43, no. 5, 2006, pp. 695-710.
[10] Dipietro L., Krebs H.I., Fasoli S.E., Volpe B.T., Stein J., Bever C.T., Hogan N., Changing motor synergies in chronic stroke. Journal of Neurophysiology, vol. 98, 2007, pp. 757-768.
[11] Dombovy M.L., Rehabilitation and the course of recovery after stroke. In: Whisnant J. (Ed.), Stroke Population, Cohorts and Clinical Trials, Butterworth-Heinemann, Boston, 1993.
[12] Ferraro M., Demaio J.H., Krol J., Trudell C., Rannekleiv K., Edelstein L., Christos P., Aisen M.L., England J., Fasoli S., Krebs H.I., Hogan N., Volpe B.T., Assessing the motor status score: A scale for the evaluation of upper limb motor outcomes in patients after stroke. Neurorehabilitation and Neural Repair, vol. 16, no. 3, 2002, pp. 283-289.
[13] Fugl-Meyer A.R., Jaasko L., Leyman I., Olsson S., Steglind S., The post stroke hemiplegic patient. I. A method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, vol. 7, 1975, pp. 13-31.
[14] Giarratano J., Riley G., Expert Systems: Principles and Programming, Second Edition, PWS Publishing Company, Boston, MA, 1994.
[15] Grzymala-Busse J.W., Managing Uncertainty in Expert Systems, Kluwer Academic Publishers, Norwell, MA, 1991.
[16] Hogan N., Krebs H.I., Charnnarong J., Srikrishna P., Sharon A., MIT-MANUS: A workstation for manual therapy and training. In: Proceedings of the IEEE International Workshop on Robot and Human Communication, Tokyo, Japan, 1992, pp. 161-165.
[17] Hogan N., Krebs H.I., Charnnarong J., Interactive robotic therapist. US patent 5,466,213, Massachusetts Institute of Technology, 1995.
[18] Ignizio J.P., Introduction to Expert Systems: The Development and Implementation of Rule-Based Expert Systems, McGraw-Hill, Inc., New York, NY, 1991.
[19] Interactive Motion Technologies, Inc., 2010. http://www.interactive-motion.com/.
[20] Jenkins W.M., Merzenich M.M., Oches M.T., Allard T., Guic-Robles E., Functional reorganization of primary somatosensory cortex in adult owl monkeys after behaviorally controlled tactile stimulation. Journal of Neurophysiology, vol. 63, 1990, pp. 82-104.
[21] Johnson G.R., Carus D.A., Parrini G., Marchese S.S., Valeggi R., The design of a five-degree-of-freedom powered orthosis for the upper limb. In: Proceedings of the Institution of Mechanical Engineers, vol. 215, no. H, 2001, pp. 275-284.
[22] Johnson M.J., Loureiro R.C.V., Harwin W.S., Collaborative tele-rehabilitation and robot-mediated therapy for stroke rehabilitation at home or clinic. Intelligent Service Robotics, vol. 1, no. 2, 2008, pp. 109-121.
[23] Jorgensen H.S., Nakayama H., Raaschou H.O., Vive-Larsen J., Stoier M., Olsen T.S., Outcome and time course of recovery in stroke. Part I: The Copenhagen stroke study. Archives of Physical Medicine and Rehabilitation, vol. 76, 1995, pp. 406-412.
[24] Krebs H.I., Hogan N., Aisen M.L., Volpe B.T., Robot-aided neurorehabilitation. IEEE Transactions on Rehabilitation Engineering, vol. 6, no. 1, 1998, pp. 75-87.
[25] Lee S., Agah A., Bekey G.A., An intelligent rehabilitative orthotic system for cerebrovascular accident. In: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Los Angeles, California, 1990, pp. 815-819.
[26] Liebowitz J., The Handbook of Applied Expert Systems, CRC Press LLC, New York, NY, 1998.
[27] Liu W., Mukherjee M., Tsaur Y., Kim S.H., Liu H., Natarajan P., Agah A., Development and feasibility study of a sensory-enhanced robot-aided motor training in stroke rehabilitation. In: Proceedings of the 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Minneapolis, Minnesota, 2009, pp. 5965-5968.
[28] Loureiro R., Amirabdollahian F., Topping M., Driessen B., Harwin W., Upper limb robot mediated stroke therapy - GENTLE/s approach. Autonomous Robots, vol. 15, 2003, pp. 35-51.
[29] Luft A.R., McCombe-Waller S., Whitall J., Forrester L.W., Macko R., Sorkin J.D., Schulz J.B., Goldberg A.P., Hanley D.F., Repetitive bilateral arm training and motor cortex activation in chronic stroke: a randomized controlled trial. Journal of the American Medical Association, vol. 292, 2004, pp. 1853-1861.
[30] Lum P.S., Burgar C.G., Shor P.C., Majmundar M., Van der Loos H.F.M., Robot-assisted movement training compared with conventional therapy techniques for the rehabilitation of upper-limb motor function after stroke. Archives of Physical Medicine and Rehabilitation, vol. 83, 2002, pp. 952-959.
[31] Natarajan P., Expert System-based Post-Stroke Robotic Rehabilitation for Hemiparetic Arm. Ph.D. Dissertation, Department of Electrical Engineering and Computer Science, University of Kansas, 2007.
[32] Natarajan P., Liu W., Oechslin J., Agah A., Haptic display for robotic rehabilitation of stroke. In: Proceedings of the 11th International Symposium on Robotics and Applications, Budapest, Hungary, vol. ISORA-70, 2006, pp. 1-6.
[33] Natarajan P., Oelschlager A., Agah A., Pohl P.S., Ahmad S.O., Liu W., Current clinical practices in stroke rehabilitation: Regional pilot survey. Journal of Rehabilitation Research and Development, vol. 45, no. 6, 2008, pp. 841-850.
[34] National Institute on Aging, 2010. http://www.nia.nih.gov.
[35] Nudo R.J., Wise B.M., SiFuentes F., Milliken G.W., Neural substrates for the effects of rehabilitative training on motor recovery after ischemic infarct. Science, vol. 272, 1996, pp. 1791-1794.
[36] Reinkensmeyer D.J., Lum P.S., Winters J.M., Emerging technologies for improving access to movement therapy following neurologic injury. In: Winters J.M. (Ed.), Emerging Challenges in Enabling Universal Access, RESNA Press, 2002, pp. 136-150.
[37] Sanchez R.J., Liu J., Rao S., Shah P., Smith R., Cramer S.C., Bobrow J.E., Reinkensmeyer D.J., Automating arm movement training following severe stroke: functional exercises with quantitative feedback in a gravity-reduced environment. IEEE Transactions on Neural and Rehabilitation Engineering, vol. 14, no. 3, 2006, pp. 378-389.
[38] Stefik M., Introduction to Knowledge Systems, Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1995.
[39] Taub E., Miller N.E., Novack T.A., Cook E.W., Fleming W.C., Nepomuceno C.S., Connell J.S., Crago J.E., Technique to improve chronic motor deficit after stroke. Archives of Physical Medicine and Rehabilitation, vol. 74, 1993, pp. 347-354.
[40] Volpe B.T., Krebs H.I., Hogan N., Edelstein L., Diels C., Aisen M., A novel approach to stroke rehabilitation: robot-aided sensorimotor stimulation. Neurology, vol. 54, no. 10, 2000, pp. 1938-1944.
CONSIDERATIONS ON COVERAGE AND NAVIGATION IN WIRELESS MOBILE SENSOR NETWORK Received 8th August 2010; accepted 14th December 2010.
Tadeusz Goszczynski, Zbigniew Pilat
Abstract: In wireless mobile sensor networks, periodic calculation of coverage is very important, since mobile sensors can be moved according to current needs, thus increasing the coverage. Those moves require the execution of navigation tasks. Global, centralized algorithms for those tasks are very costly in terms of energy consumption and computational resources. The considerations presented herein pertain to the best algorithms for a network of numerous small mobile sensor nodes used for monitoring of large terrains. The localized algorithms suggested in this paper help calculate the coverage on-line and locally, with the involvement of neighboring nodes only. Furthermore, localized collaborative navigation is presented. It enables position estimation without the use of GPS and ensures that estimates improve rather than deteriorate over time. It uses multi-sensor fusion algorithms based on optimization and a distributed iterative extended Kalman Filter. Keywords: wireless, mobile sensor network, coverage, navigation.
1. Introduction
There are two major problems regarding wireless mobile sensor networks, i.e. whether a network has adequate coverage of the monitored area and whether a network is able to rearrange sensor nodes to fulfill specific coverage requirements. The ability to self-deploy and self-configure is of critical importance for mobile sensor networks because of the unattended nature of the intended applications. The network should be able to dynamically adapt its topology to meet application-specific requirements of coverage and connectivity. In static networks, topology control is achieved by controlling the transmission power or sleep/wake schedules of densely deployed nodes. In contrast, mobile sensor networks can exploit control over node positions to affect network topology, thus eliminating the need for over-deployment and increasing the net area sensed. A key challenge posed by this objective is the typically global nature of the desired network properties, one of which is the coverage of a network. A wireless sensor network is a collection of sensors that offer the ability to communicate with one another and the ability to sense the environment around them, but have limited computational and battery capacity, e.g. solar-powered autonomous robots. These considerations concern the best algorithms for a network of numerous small mobile sensor nodes used for monitoring of large terrains, including those used in safety applications. This paper attempts to select the best solution to how a wireless mobile sensor network could take advantage of its mobility to improve its coverage by self-deployment of sensors consuming as little power as possible. Power consumption is of critical importance in such networks. Topology control algorithms that reduce energy consumption have been an area of thorough research and numerous works, many of which are presented in [1]. Our paper describes selected techniques for evaluation of both coverage and localization of nodes, focusing on those presenting a distributed method and requiring as low energy consumption and as little computation resources as possible. The remaining sections are organized as follows: Section 2 provides the theoretical framework of coverage, Section 3 describes evaluation of sensor field exposure, Section 4 contains a short review of existing works concerning algorithms for coverage calculation, Section 5 presents the distributed algorithm for minimal exposure path evaluation, Section 6 pertains to navigation problems in mobile sensor networks, Section 7 presents the distributed multi-sensor fusion algorithm for navigation, and Section 8 presents performance analysis of location estimates. The last section provides some conclusions and gives suggestions for future work.
2. Calculation of coverage
One of the fundamental problems regarding sensor networks is the calculation of coverage. Exposure is directly related to coverage in that it is a measure of how well an object, moving on an arbitrary path, can be observed by the sensor network over a period of time. The minimal exposure path is a path between two given points such that the total exposure acquired from the sensors by traversing the path is minimized [2]. The path provides information about the worst-case exposure-based coverage in the sensor network. Exposure can be defined as an integral of a sensing function that generally depends on the distance between the sensors and a point on a path from a starting point pS to a destination point pD. The specific sensing function parameters depend on the nature of the sensor device and usually have the form d^{-K}, with K typically ranging from 1 to 4. Generally speaking, sensors have broadly diversified theoretical and physical characteristics. Most sensor models share two facets:
- sensing ability diminishes as distance increases;
- sensing ability improves as the sensing time (exposure) increases (due to noise effects).
With this in mind, for a sensor s, the general sensing model S at an arbitrary point p can be expressed as follows:
$S(s,p) = \frac{\lambda}{[d(s,p)]^{K}}$    (1)

where d(s,p) is the Euclidean distance between the sensor s and the point p, and the positive constants λ and K are sensor technology-dependent parameters.
3. Sensor Field Intensity and Exposure
In order to introduce the notion of exposure in sensor fields, the Sensor Field Intensity for a given point p in the sensor field F should be defined. Sensor field intensity can be defined in several ways. Two models are presented for the sensor field intensity: All-Sensor Field Intensity and Closest-Sensor Field Intensity. Exposure for an object O in the sensor field during the interval [t1, t2] along the path p(t) is defined [3] as:

$E(p(t),t_{1},t_{2}) = \int_{t_{1}}^{t_{2}} I(F,p(t)) \left| \frac{dp(t)}{dt} \right| dt$    (2)

where I(F, p) is the sensor field intensity at point p.
All-Sensor Field Intensity for a point p in the field F is defined as the effective sensing measurement at point p from all sensors in F. Assuming there are n active sensors, s1, s2, ..., sn, each contributing the distance-dependent sensing function, it can be expressed as:

$I_{A}(F,p) = \sum_{i=1}^{n} S(s_{i},p)$    (3)
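For illustration, the following minimal sketch evaluates the sensing model of Eq. (1), the all-sensor intensity of Eq. (3), and a discretized approximation of the exposure integral of Eq. (2) for a path given as a polyline. The function names, the default λ and K values, and the midpoint-rule discretization are assumptions made here for clarity, not part of the cited algorithms.

```python
import numpy as np

def intensity_all(point, sensors, lam=1.0, K=2.0):
    """All-sensor field intensity I_A(F, p) of Eq. (3), using the
    sensing model S(s, p) = lam / d(s, p)^K of Eq. (1)."""
    d = np.linalg.norm(sensors - point, axis=1)
    return np.sum(lam / d**K)

def exposure(path, sensors, lam=1.0, K=2.0):
    """Approximate the exposure integral of Eq. (2) along a polyline path
    (an (M, 2) array of way-points) by summing I_A at segment midpoints
    weighted by the segment lengths."""
    seg = np.diff(path, axis=0)
    mid = (path[:-1] + path[1:]) / 2.0
    lengths = np.linalg.norm(seg, axis=1)
    return sum(intensity_all(m, sensors, lam, K) * L for m, L in zip(mid, lengths))
```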
The centralized algorithm requires sensor nodes not only to perform the exposure calculation and shortest-path searching in the sensor network, but also to know the topography of the network. Both functionalities, particularly discovering the network topography, may require extensive computation resources and energy consumption. Communication, required to discover the network topography, is the major energy consumer in wireless sensor networks. Therefore, it is important that a localized minimal exposure path algorithm is developed, so that sensors can estimate the network's minimal exposure path without having to know the entire network's topography. In such a localized algorithm, the number of messages sent across the network and the computation performed at each node should be kept at the minimum level.
4. Algorithms for coverage evaluation
Most research conducted so far has focused on reducing the design and maintenance (including deployment) costs or increasing the sensor network's reliability and extending its lifetime. However, another crucial problem is to determine how successfully the sensor network monitors the designated area. This is one of the most important criteria for evaluating the sensor network's effectiveness [4]. Another problem is the navigation of mobile nodes in order to improve the coverage of the network. We review several techniques of coverage estimation and a few techniques for localization and navigation of mobile sensor nodes in order to move them to improve the coverage of the sensor network. Coverage in a mobile sensor network can be evaluated as an optimization problem for best-case or worst-case coverage. The best-case coverage involves
two approaches: maximum exposure path and maximum support path. The worst-case coverage has another two: maximum breach path and minimum exposure path [5]. For safety applications, the latter two approaches are chosen. The maximum breach path is not a unique one, since it finds a path such that the exposure at any given time does not exceed a given value. Therefore, the minimum exposure path is chosen, since it attempts to minimize the exposure acquired throughout the entire measured time interval. Determining such a path enables the user to change the current location of some nodes to increase the coverage. The authors of [6] suggest that the problem should be transformed into a discrete one by generating an n x n square grid and limiting the existence of the minimal exposure path to the edges and diagonals of each grid square. Each edge is assigned a weight calculated by a special function using numerical integration. The solution produced by the algorithm approaches the optimum at the cost of run-time and storage requirements. [7] discusses an algorithm for a network that initially deploys a fixed number of static and mobile nodes; the static nodes then find the coverage holes and the mobile nodes are relocated to the targeted locations to increase coverage. In the next section, we present a distributed local algorithm which does not have the limitation of a fixed number of nodes. The authors of [8] developed distributed algorithms for self-organizing sensor networks that respond to directing a target through a region and discussed self-organizing sensor networks capable of reacting to their environment and adapting to changes. This approach can also evaluate the coverage of the network in a different way. They described an innovative application: using the sensor network to guide the movement of a user, equipped with a sensor that can communicate with the network, across the area of the network along a safest path. Safety is measured as the distance to the sensors that detect danger. The protocols for solving this problem implement a distributed repository of information that can be stored and retrieved efficiently when needed. In [9], the authors propose a virtual force algorithm (VFA) as a sensor deployment strategy to enhance the coverage after initial random placement of sensors. For a given number of sensors, the VFA algorithm attempts to maximize the sensor field coverage. A combination of attractive and repulsive forces is used to determine virtual motion paths and the rate of movement for the randomly placed sensors. Once the effective sensor positions are identified, a one-time movement with energy considerations incorporated is carried out and the sensors are redeployed to these positions.
5. Localized algorithm for minimum exposure calculation
The best solution for the objective set forth in these considerations is the localized algorithm for minimum exposure calculation proposed by Veltri and others [10]. It ensures low costs of communication and computation in the wireless sensor network, without the necessity to determine the entire network's topography, and does not require a fixed and limited number of nodes. In this algorithm, only neighboring nodes need to be updated with information, and the minimum path can be calculated on-line in an easy and efficient manner.
The following assumptions were made: sensor nodes do not possess the necessary knowledge to compute the shortest path locally and hence rely on forwarding messages to their neighbors using a shortest-path heuristic; a sensor node stores the topological information it receives and forwards the topological information it knows; the Voronoi diagram-based minimum exposure path approximation algorithm is used to further reduce the computation (of exposure) at each sensor node. To use the Voronoi diagram to estimate the minimum exposure path, grid points are placed along Voronoi edges, and grid points on the same Voronoi cell are fully connected (see Fig. 1).
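As a small illustration of this construction, the sketch below builds the Voronoi diagram of the sensor positions and samples grid points along its finite edges; the use of scipy.spatial.Voronoi and the number of points per edge are implementation choices assumed here, not prescribed by [10].

```python
import numpy as np
from scipy.spatial import Voronoi

def voronoi_grid_points(sensors, pts_per_edge=5):
    """Place grid points along the finite edges of the sensors' Voronoi
    diagram, as used by the localized approximation algorithm."""
    vor = Voronoi(sensors)
    grid = []
    for (a, b) in vor.ridge_vertices:
        if a == -1 or b == -1:            # skip edges that extend to infinity
            continue
        p, q = vor.vertices[a], vor.vertices[b]
        for t in np.linspace(0.0, 1.0, pts_per_edge):
            grid.append((1.0 - t) * p + t * q)
    return np.array(grid)

# Example with a handful of randomly placed sensors.
sensors = np.random.default_rng(1).uniform(0.0, 100.0, size=(12, 2))
grid_pts = voronoi_grid_points(sensors)
```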
Fig. 1. Voronoi diagram for a sensor network. Circles represent sensors, ellipses a starting point and a destination point.
The weight of an edge between two Voronoi grid points is the single-sensor optimal solution weight for the sensor corresponding to the Voronoi cell. However, this weight only applies if the shortest path lies entirely within the Voronoi cell. If the path strays beyond the Voronoi cell, a straight line is used to weight the edges. Furthermore, the single-sensor optimum solution is used to bound the areas to search; if the single-sensor optimum solution between two points is larger than an already found estimated solution, those two points are not investigated during subsequent iterations of the localized algorithm. Two types of messages, Forward messages and Search messages, are passed among sensors in the sensor network. Search message - the receiving node will search locally. Forward message - the receiving node will forward it to a neighboring sensor. To reduce the costs of communication and computation in the wireless sensor network, a heuristic method was applied to rapidly come to a solution that is hoped to be close to the best possible answer. A sensor node selects its neighbor node as the recipient of the message. It will pick a node that has a potentially large number of distinct neighbors, so that it can quickly learn the network topography, or one that is close to the destination. In [10], the authors combined these two
factors into a single heuristic value for node j with respect to the sender node i, given by Eq. (4) in [10], where: D(i,j) is the distance between the sender i and its neighbor j, R is the maximum communication radius, h is the number of hops over which the message has currently been transmitted (h starts with 1), and k is a positive constant. To balance the above two unrelated values, both are normalized: the first term rewards nodes that are far away from the sender, and the second term penalizes neighbor j if it is further from the destination than the sender node i (the case in which the second term is negative). The constant k reflects how rapidly the weight is shifted from picking a neighbor remote from the sender to picking a neighbor close to the destination. The method tends to pick a sensor closer to the destination as h, the number of hops, increases, to prevent the message from being circulated endlessly throughout the sensor network (an illustrative sketch of such a neighbor-selection rule is given after the algorithm steps below). The algorithm for localized approximation of the minimum exposure path consists of 4 steps:
- the sensor that is closest to the starting coordinate sends a Search message to the node that is determined on the basis of the above heuristic value;
- when this Search message reaches its destination sensor (the sensor closest to the ending coordinate), the sensor calculates the minimum exposure path using a Voronoi-based approximation algorithm and the network's topological information it has received (the Voronoi-based approximation algorithm gives the near-optimum exposure path within a Voronoi cell without using the grid-based method, which requires extensive computational resources, and hence reduces the computation requirements);
- next, the algorithm selects the sensor in the location that most probably contains the minimum exposure path and sends a Forward message to this sensor; when the appropriate sensor receives the Forward message, it sends a Search message back to the destination sensor to acquire more information on the sensor network's topography that is needed by the Voronoi-based approximation algorithm;
- this process is repeated until no sensor node requires any further topological information or no locations look promising in comparison to the current minimum exposure path calculated.
The minimum exposure path approach is very useful in the network's evaluation. Once the minimum exposure path is known, the user can manipulate sensors in the network or add sensors to the network to increase coverage.
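The sketch below illustrates how such a neighbor-selection rule could be coded. The exact formula of Eq. (4) is given in [10]; the hop-dependent blending used here (the weight w and the constant k) is a hypothetical stand-in that merely follows the qualitative description above, and should not be read as the authors' heuristic.

```python
import math

def next_hop(sender, neighbors, destination, R, h, k=1.0):
    """Pick the neighbor to receive a Forward/Search message.

    sender, destination : (x, y) positions
    neighbors           : {node_id: (x, y)} positions of the sender's neighbors
    R                   : maximum communication radius
    h                   : number of hops the message has travelled so far (>= 1)
    k                   : positive constant controlling how quickly the choice
                          shifts towards neighbors close to the destination
    """
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    w = 1.0 / (k * h)                     # hypothetical hop-dependent blend, not Eq. (4)
    best, best_score = None, -math.inf
    for j, pos in neighbors.items():
        explore = dist(sender, pos) / R                       # reward remote neighbors
        progress = (dist(sender, destination) - dist(pos, destination)) / R
        score = w * explore + (1.0 - w) * progress
        if score > best_score:
            best, best_score = j, score
    return best
```

Early in a message's life this guess favors exploration of remote neighbors; as h grows it increasingly favors progress towards the destination, mirroring the behavior described in the text.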
6. Algorithms for navigation in mobile sensor network
The problem arising after the calculation of the network's coverage is how to navigate nodes effectively in order to move some of them and improve this coverage
[11]. Czapski [12] suggested a fully adaptive routing protocol for wireless sensor networks that takes advantage of the nature of sensed events for the localization of individual nodes. In [13], the authors described techniques enabling the incorporation of GPS measurements with an IMU sensor. A complementary filter known as the Kalman Filter (KF) provides the possibility to integrate values from the two sources whilst minimizing errors, to provide an accurate trajectory of the vehicle. The GPS and INU data is then post-processed by an Extended KF (EKF). In [14], the authors described a multiple-sensor fusion-based navigation approach using an interval analysis (IA)-based adaptive mechanism for an Unscented Kalman Filter (UKF). The robot is equipped with inertial sensors (INS), encoders and ultrasonic sensors. A UKF is used to estimate the robot's position using the inertial sensors and encoders. Since the UKF estimates may be affected by bias, drift etc., an adaptive mechanism using IA to correct these defects in the estimates is suggested. In the presence of landmarks, the complementary robot position information from the IA algorithm using ultrasonic sensors is used to estimate and bound the errors in the UKF robot position estimate. Using a GPS system is costly and power-consuming, and known collaborative methods of localization based on the positions of neighboring nodes also consume plenty of energy and are not precise. In our paper, we consider a network with a dynamically changing number of nodes. Each node can move according to its own kinematics and there is no correlation between them, but their initial positions are known. The task of a navigation algorithm is to estimate the position of each node at all instances in a globally fixed coordinate system. There should be two kinds of measurements: the displacement of each node between two time instances (from an inertial unit) and the distance between any two nodes within a certain range at each time instance (from RF measurements). The multimodal fusion problem can be presented as a graphical model, where nodes represent variables and links represent constraints. The problem is to estimate the values of the variables representing the nodes, given the constraints representing the links. For the estimation of a single value for each variable as the best solution, this can be formulated as a maximum a posteriori (MAP) estimation problem.
7. Iterative distribution algorithm
The standard Kalman Filter approach with a distributed iterative EKF formulation provides excellent accuracy and convergence. In the distributed approach, the location of each node is estimated by an independent computational unit requiring very limited communication overhead to exchange intermediate location estimates with neighbors. This formulation enables practical implementations with minimum reduction of location accuracy. Using a standard EKF algorithm, the state vector of each node can be estimated in parallel. However, since the estimation results of one node are strongly dependent on the estimated locations of the other nodes, one pass through the algorithm is not sufficient to accurately estimate the locations of the entire network. To solve this problem, Wu and others [15] proposed an iterative distribution algorithm, in which:
- each node estimates its state by using its inertial navigation unit (INU) observation;
- each node obtains RF distance measurements from all the remaining nodes;
- each node exchanges its state estimate with all the remaining nodes;
- the RF distance measurements are used to re-estimate each node's state by the EKF.
After multiple iterations of the last two steps, the state of each node will converge to its optimization point. In this manner, the locations of the entire sensor network can be accurately estimated in parallel, with virtually no redundant computation and with minimal inter-node information exchange. The algorithm limits errors in position estimation by continuous fusion of new INU measurements and previously fused location estimates. Position displacements can be determined with MEMS-based INUs, and the distance between two nodes can be measured by time-of-arrival (TOA)-based radio frequency distance measuring sensors. The main underlying assumption of the EKF approach is that the node state can be modeled as a multidimensional Gaussian random variable. This assumption is justified by the fact that an INU measurement gives a highly peaked unimodal distribution. The final distribution of the state, after considering the ranging constraints, exhibits a dominant mode (the correct solution) close to the INU peak. Since the distance measurement is not a linear function of the locations, the standard Kalman Filter approach with a distributed iterative EKF formulation, which provides excellent accuracy and convergence, was used. Since the INU measures the location offset, which is directly proportional to velocity, the filter will have nearly no lag. Moreover, the authors made a few assumptions regarding the node motion and use a simple constant-velocity model for prediction, enabling a flexible implementation that accepts asynchronous sensor inputs. They postulated the integration of two sensors, an INU and a distance sensor, to estimate the absolute and relative positions of sensor nodes. INU sensors prevent geometric ambiguities, and distance sensors reduce the drift rate of the individual INUs by a factor of √n by providing mutual constraints on the possible position estimates of n collaborating nodes. Collaborative navigation gives improvements of performance in drift-rate reduction and in resetting the errors of location estimates even after a dynamic change of system conditions.
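A much simplified sketch of one node's share of this iteration is shown below, with a 2-D position state, an additive INU displacement step, and a linearized range update; the noise values, neighbor positions, and number of iterations are illustrative assumptions only, not parameters from [15].

```python
import numpy as np

def predict(x, P, u, Q):
    """INU step: the node dead-reckons by the measured displacement u."""
    return x + u, P + Q

def range_update(x, P, x_j, z, r_var):
    """EKF update of node state x with one RF range z to a neighbor at x_j
    (the neighbor's current estimate, exchanged over the network)."""
    diff = x - x_j
    d = np.linalg.norm(diff)
    H = (diff / d).reshape(1, 2)            # Jacobian of h(x) = ||x - x_j||
    S = H @ P @ H.T + r_var
    K = P @ H.T / S
    x = x + (K * (z - d)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# One time step for node i; in the full algorithm the neighbors' estimates
# would also be refreshed between iterations (values below are illustrative).
x, P = np.array([0.0, 0.0]), np.eye(2) * 0.01
x, P = predict(x, P, u=np.array([0.1, 0.0]), Q=np.eye(2) * 0.005)
neighbor_estimates = {1: np.array([1.0, 0.2]), 2: np.array([0.3, 1.1])}
ranges = {1: 0.95, 2: 1.05}
for _ in range(3):                           # iterate until estimates converge
    for j, xj in neighbor_estimates.items():
        x, P = range_update(x, P, xj, ranges[j], r_var=0.01)
```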
8. Calculating errors
This section considers the error reduction effects expected when multiple nodes collaborate to determine their locations. If rA and rB are vectors representing the INU-derived estimates of the locations of nodes A and B respectively, and rAB is the estimate of the distance from B to A (TOA measurement), then the location of node A can be estimated by the INU estimate of node A, rA, or by rB - rAB, and the two can be averaged. If rB and rA have independent errors of size σINU and the error of rAB is negligible compared to them, then this average is statistically optimum and has the reduced error σINU/√2. The error can be further reduced when additional
nodes collaborate in RF distance measurements. The error of an average of n independent and identically distributed measurements with common error σx is reduced by a factor of √n: if

$y = \frac{x_{1} + \ldots + x_{n}}{n}$    (5)

and the xi (i = 1, ..., n) are identically distributed measurements with common standard deviation σx, then

$\sigma_{y} = \frac{\sigma_{x}}{\sqrt{n}}$    (6)
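The 1/√n scaling of Eqs. (5)-(6) can be checked with a few lines of simulation. The sketch below assumes the idealized conditions stated above (independent, identically distributed INU errors and negligible ranging error), so fusing n estimates reduces to a plain average:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_inu, trials = 1.0, 20000
for n in (1, 2, 4, 8, 16):
    # n independent estimates of the same location, each with error sigma_inu;
    # ranging is assumed error-free, so fusing them is a plain average.
    fused = rng.normal(0.0, sigma_inu, size=(trials, n)).mean(axis=1)
    print(f"n={n:2d}  empirical error {fused.std():.3f}  "
          f"predicted {sigma_inu / np.sqrt(n):.3f}")
```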
For an n-sized network with direct communication, a given node has n independent INU location estimates. One estimate is from its own INU; the remaining n-1 are based on the INU location estimates of the remaining n-1 nodes, the inter-node distance measurements and the previously fused location estimates. A combination of n independent INU estimates provides a 1/√n error improvement. This improvement is independent of the specific multi-sensor data fusion algorithm employed, provided the fusion algorithm is formulated in a way that leverages the principles of averaging independent sensor estimates. It is also independent of the specific noise model of the underlying sensor (INU or otherwise) that is used to obtain the base position estimate. In practice, instead of requiring the relative location of B from A, only the distance between them is needed for the scaling result to hold. The use of continuous INU measurements and previously fused location estimates virtually eliminates geometric ambiguities like flips, translations, and rotations.
8.1. Sensitivity to Distance Constraint Error
The derivation of the error-scaling law assumes that the ranging measurement error is negligible compared to that of the inertial measurement to achieve the 1/√n improvement. The analysis presented in [16] shows that the scaling law improvement is quite insensitive to distance measurement errors and that its effect should hold quite well even as the distance measurement error approaches that of the inertial sensors. Reasonable values of INU and ranging error result in a deviation from the ideal scaling behavior by only a few percent, while a typical worst-case combination yields at most an 11 percent deviation.
8.2. Resetting errors of all location estimates
When nodes with independent self-location estimates come within range of one another for distance measurement, the errors of all their location estimates are immediately reset to lower values. Reset error levels are independent of the history of ranging activities among the nodes, and long-term error growth is insensitive to splitting or joining of the group. The errors in incremental node displacement estimates contribute to the errors in the location estimates in an additive fashion. Let Δx_i^k and δx_i^k be the real displacement and the INU-estimated displacement of node i made between t_k and t_{k+1}, respectively. The additive uncertainty associated with this estimate is denoted N_i^k, with a common variance σINU². The location estimate of node i at time t_K, obtained on its own, is given in [17] as

$x_{i}^{K}(1) = \sum_{k=1}^{K} \delta x_{i}^{k} = \sum_{k=1}^{K} \Delta x_{i}^{k} + \sum_{k=1}^{K} N_{i}^{k}$    (7)
If at time t_k node i joins a group of n-1 other nodes, then, using the independent averaging assumption, the location estimate of node i becomes

$x_{i}^{K}(n) = \frac{1}{n}\left[ x_{i}^{K}(1) + \sum_{j=1}^{n-1} x_{ij} \right]$    (8)
where x_ij is the highly accurate location of node i relative to node j enabled by the distance measurement. Although the position uncertainty of a node in a small group may grow while the groups are separated, it is immediately reset to the appropriate lower value when the groups merge to form a larger group.
8.3. Evaluation of performance
The analytical predictions of these multi-sensor fusion algorithms should be checked using simulations. In [17], the authors gave a few formulas for the evaluation of the algorithms' performance. Absolute and relative RMS errors can be calculated using the results of Monte Carlo simulations. The absolute RMS error vector of the entire network at t_k is

$e(k) = \sqrt{\frac{1}{mn}\sum_{l=1}^{m}\sum_{i=1}^{n}\left(\hat{x}_{i}^{l}(k) - x_{i}^{l}(k)\right)^{2}}$    (9)

where e(k) = [e_x(k), e_y(k), e_z(k)], and

$\hat{x}_{i}^{l}(k),\; x_{i}^{l}(k)$    (10)

are the estimated locations and real locations, respectively, of the i-th node in the l-th Monte Carlo run, and m is the total number of runs. The relative averaged RMS error can be defined as

$d(k) = \sqrt{\frac{1}{m}\sum_{l=1}^{m}\frac{1}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left(\hat{d}_{ij}^{l}(k) - d_{ij}^{l}(k)\right)^{2}}$    (11)

where $\hat{d}_{ij}^{l}$ and $d_{ij}^{l}$ are the estimated distance and real distance, respectively, between nodes i and j at time t_k from the l-th run.
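For completeness, a small sketch of how Eqs. (9)-(11) can be evaluated from Monte Carlo data is given below; the array layout and the placement of the square root (implied by the RMS interpretation above) are assumptions of this illustration.

```python
import numpy as np

def rms_errors(est, true):
    """Absolute and relative averaged RMS errors of Eqs. (9)-(11).

    est, true : arrays of shape (m, n, 3) with the estimated and real
                locations of n nodes in m Monte Carlo runs at one time step.
    """
    m, n, _ = est.shape
    # Eq. (9): per-axis RMS of the location errors over all runs and nodes
    e = np.sqrt(((est - true) ** 2).sum(axis=(0, 1)) / (m * n))

    # Eqs. (10)-(11): RMS of the errors in all pairwise inter-node distances
    def pair_dist(x):                      # (m, n, 3) -> (m, n*(n-1)/2)
        i, j = np.triu_indices(n, k=1)
        return np.linalg.norm(x[:, i, :] - x[:, j, :], axis=2)

    d_err = pair_dist(est) - pair_dist(true)
    d = np.sqrt((d_err ** 2).sum() / (m * n * (n - 1)))
    return e, d
```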
9. Conclusions
Calculation of exposure is one of the fundamental problems in wireless ad-hoc sensor networks. This paper introduced the exposure-based coverage model, formally defined the exposure and analyzed several of its properties. An efficient and effective algorithm for minimum exposure paths for any given distribution and characteristic of sensor networks was presented. The minimum exposure path algorithm developed as a localized approximation algorithm was chosen for the planned network of numerous small mobile sensor nodes used for monitoring of large terrains. The algorithm works for arbitrary sensing and intensity models and provides an unbounded level of accuracy as a function of run time. It works even after a few nodes of the network are damaged and requires minimum consumption of energy. The second problem arising after the calculation of the network's coverage is to move some of the nodes in order to improve this coverage and to navigate nodes effectively (also in GPS-denied areas). Position displacement can be determined with micro-electromechanical system-based INUs. As the author of [18] convinces us, for many navigation applications, improved accuracy/performance is
not necessarily the most important issue, but meeting performance at reduced cost and size is definitely vital. In particular, small navigation sensor size enables the introduction of guidance, navigation, and control into applications previously considered out of reach. In recent years, three major technologies have enabled advancements in military and commercial capabilities. These are Ring Laser Gyros, Fiber Optic Gyros, and Micro-Electro-Mechanical Systems (MEMS). The smallest INUs presented have a size of 3.3 cc. Distributed fusion of multiple independent sensors using the suggested navigation algorithms can exploit the complementary nature of each sensor node's characteristics for overall improvements in system accuracy and operational performance without sacrificing operational flexibility, estimating both absolute and relative positions for the members of a mobile sensor network by continuously fusing pair-wise inter-node distance measurements and the position displacement measurements of individual nodes. The benefits of the collaborative error reduction can be realized without the use of anchor reference nodes and also with as few as two sensor nodes. It is likely that progress in computing and sensing technologies will soon determine new criteria for algorithms of mobile sensor nodes and, therefore, in our future works, we plan to make adequate simulations with both algorithms working simultaneously on sensor nodes with different environment parameters.
AUTHORS Goszczynski Tadeusz*, Pilat Zbigniew - Industrial Research Institute for Automation and Measurements, PIAP, Al. Jerozolimskie 202, 02-486 Warsaw, tel.: +4822-8740-377, fax: +48-22-8740-204. E-mail: tgoszczynski@piap.pl. * Corresponding author
References
[1] Banerjee S., Misra A., "Routing Algorithms for Energy-Efficient Reliable Packet Delivery in Multihop Wireless Networks", Mobile, Wireless, and Sensor Networks, IEEE Press / John Wiley & Sons Inc., 2006, pp. 105-139.
[2] Pottie G.J., Kaiser W.J., "Wireless Integrated Network Sensors", Communications of the ACM, vol. 43, no. 5, May 2000, pp. 51-58.
[3] Estrin D., Govindan R., Heidemann J., "Embedding the Internet", Communications of the ACM, 2000.
[4] Li X., Wan P., Frieder O., "Coverage Problems in Wireless Ad-Hoc Sensor Networks", IEEE Transactions on Computers, 2002.
[5] Ghosh A., Das S., "Coverage and Connectivity Issues in Wireless Sensor Networks", Mobile, Wireless, and Sensor Networks, IEEE Press / John Wiley & Sons Inc., 2006, pp. 221-256.
[6] Meguerdichian S., Koushanfar F., Qu G., Potkonjak M., "Exposure in wireless ad-hoc sensor networks". In: Proc. 7th Annual Int. Conf. on Mobile Computing and Networking (MobiCom'01), Rome, Italy, 2001, pp. 139-150.
[7] Wang G., Cao G., La Porta T., "A bidding protocol for deploying mobile sensor nodes". In: Proc. IEEE InfoCom (InfoCom'04), Hong Kong, March 2004, pp. 80-91.
[8] Li Q., De Rosa M., Rus D., "Distributed Algorithms for Guiding Navigation across a Sensor Network". In: Proc. MobiCom'03, San Diego, California, USA, 14th-19th September 2003.
[9] Zou Y., Chakrabarty K., "Sensor Deployment and Target Localization Based on Virtual Forces", ACM Transactions on Embedded Computing Systems (TECS), vol. 3, no. 1, February 2004, pp. 61-91.
[10] Veltri G., Huang Q., Qu G., Potkonjak M., "Minimal and maximal exposure path algorithms for wireless embedded sensor networks". In: Proc. 1st Int. Conf. on Embedded Networked Sensor Systems (SenSys'03), Los Angeles, USA, 2003, pp. 40-50.
[11] Hu L., Evans D., "Localization for mobile sensor networks". In: Proc. of the Int'l Conf. on Mobile Computing and Networking, Philadelphia, PA, 2004, pp. 45-57.
[12] Czapski P., "Fully adaptive routing protocol for wireless sensor network", Pomiary Automatyka Robotyka, no. 1/2010, pp. 5-9.
[13] Foy S., Deegan K., Mulvihill C., Fitzgerald C., Markham Ch., McLoughlin S., "Inertial Navigation Sensor and GPS integrated for Mobile mapping". In: Proc. Geographical Information Science Research Conference (GISRUK 2007), NUI Maynooth, Ireland, 2007.
[14] Ashokaraj I., Tsourdos A., Silson P., White B., "Mobile robot localisation and navigation using multi-sensor fusion via interval analysis and UKF", Department of Aerospace, Power and Sensors, RMCS, Cranfield University, Shrivenham, Swindon, UK. Available at: http://cswww.essex.ac.uk/technicalreports/2004/cs
[15] Wu S., Kaba J., Mau S., Zhao T., "Teamwork in GPS-Denied Environments: Fusion of Multi-Sensor Networks", GPS World, Questex Publications, vol. 20, no. 10, 2009, pp. 40-47.
[16] Zhao T., Wu S., Mau S.C., Kaba J., "Collaborative Effects of Mobile Sensor Network Localization through Distributed Multimodal Navigation Sensor Fusion". In: Proceedings of the Institute of Navigation National Technical Meeting 2008, 28th-30th January 2008.
[17] Wu S., Kaba J., Mau S., Zhao T., "Distributed Multi-Sensor Fusion for Improved Collaborative GPS-denied Navigation". In: Proceedings of the 2009 International Technical Meeting of The Institute of Navigation, Anaheim, CA, USA, 2009.
[18] Barbour N., "Advances in Navigation Sensors and Integration Technology. Inertial Navigation Sensors", RTO Educational Notes EN-SET-064, pp. 25-40. www.rta.nato.int
SPECIAL ISSUE
Editorial to the Special Issue Section on Hybrid Intelligent Systems for Optimization and Pattern Recognition - Part II
Guest Editors: Oscar Castillo and Patricia Melin
Editorial Oscar Castillo*, Patricia Melin
Editorial to the Special Issue Section on "Hybrid Intelligent Systems for Optimization and Pattern Recognition - Part II" The special issue section on hybrid intelligent systems for optimization and pattern recognition comprises five contributions, which are selected and extended versions of papers previously presented at the International Seminar on Computational Intelligence held in Tijuana, Mexico, in January 2010. The papers describe different contributions to the area of hybrid intelligent systems with applications in optimization and pattern recognition. In each paper, an optimal combination of intelligent techniques is applied to solve, in an efficient and accurate manner, a problem in a particular area of application. In the first paper, by Mario I. Chacon-Murguia et al., a method for Fusion of Door and Corner Features for Scene Recognition is presented. This paper presents a scenery recognition system using a neural network hierarchical approach. The system is based on information fusion in indoor scenarios. The system extracts relevant information with respect to color and landmarks. Color information is related mainly to the localization of doors. Landmarks are related to corner detection. The corner detection method proposed in the paper, based on corner detection windows, achieves 99% detection of real corners with 13.43% false positives. In the second paper, by Martha Cárdenas et al., the Optimization of a Modular Neural Network for Pattern Recognition using a Parallel Genetic Algorithm is presented. In this paper, the implementation of a Parallel Genetic Algorithm for the training stage and the optimization of a monolithic and a modular neural network for pattern recognition is presented. The optimization consists in obtaining the best architecture in terms of layers and neurons per layer, achieving the lowest training error in a shorter time. The implementation was performed on a multi-core architecture, using parallel programming techniques to exploit its resources. The authors present the results obtained in terms of performance by comparing the results of the training stage for sequential and parallel implementations. In the third paper, by Claudia Gómez Santillán et al., an Adaptive Ant-Colony Algorithm for Semantic Query Routing is presented. In this paper, a new ant-colony algorithm, Adaptive Neighboring-Ant Search (AdaNAS), for the semantic query routing problem (SQRP) in a P2P network is described. The proposed algorithm incorporates an adaptive control parameter tuning technique for runtime estimation of the time-to-live (TTL) of the ants. AdaNAS uses three strategies that take advantage of the local environment: learning, characterization, and exploration. Two classical learning rules are used to gain experience on past performance, using three new learning functions based on the distance traveled and the resources found by the ants. The experimental results show that the AdaNAS algorithm outperforms the NAS algorithm, in which the TTL value is not tuned at runtime. In the fourth paper, by Leslie Astudillo et al., a new Optimization Method Based on a Paradigm Inspired by Nature is described. A new optimization method for soft computing problems is presented, which is inspired by a paradigm from nature: the reaction methods existing in chemistry and the way the elements combine with each other to form compounds, in other words, quantum chemistry. This paper is a first approach to the proposed method, and it presents the background, main ideas, desired goals and preliminary results in optimization.
In the fifth paper, by Marco Aurelio Sotelo-Figueroa et al., the Application of the Bee Swarm Optimization BSO to the Knapsack Problem is presented. In this paper, a novel hybrid algorithm based on the Bees Algorithm (BA) and Particle Swarm Optimization (PSO) is applied to the Knapsack Problem. The Bees Algorithm is a new population-based search algorithm inspired by the natural foraging behavior of honey bees; it performs a kind of exploitative neighborhood search combined with random explorative search to scan the solution space, but the results obtained with this algorithm alone on the Knapsack Problem are not very good.
The combination of BA and PSO, given by BSO (Bee Swarm Optimization), uses the velocity vector and the collective memories of PSO together with the search based on the BA, and the results are much better. In conclusion, this special issue section represents a contribution to the state of the art in the area of hybrid intelligent systems with applications in optimization and pattern recognition.
Guest Editors: Patricia Melin and Oscar Castillo Tijuana Institute of Technology, Tijuana, Mexico pmelin@tectijuana.mx, ocastillo@tectijuana.mx
FUSION OF DOOR AND CORNER FEATURES FOR SCENE RECOGNITION
Mario I. Chacon-Murguia, Rafael Sandoval-Rodriguez, Cynthia P. Guerrero-Saucedo
Abstract: Scene recognition is a paramount task for autonomous systems that navigate in open scenarios. In order to achieve high scene recognition performance it is necessary to use correct information. Therefore, data fusion is becoming a paramount point in the design of scene recognition systems. This paper presents a scenery recognition system using a neural network hierarchical approach. The system is based on information fusion in indoor scenarios. The system extracts relevant information with respect to color and landmarks. Color information is related mainly to the localization of doors. Landmarks are related to corner detection. The corner detection method proposed in the paper, based on corner detection windows, has 99% detection of real corners and 13.43% false positives. The hierarchical neural system consists of two levels. The first level is built with one neural network and the second level with two. The hierarchical neural system, based on feed-forward architectures, presents 90% correct recognition in the first level in training, and 95% in validation. The first ANN in the second level shows 90.90% correct recognition during training, and 87.5% in validation. The second ANN has a performance of 93.75% and 91.66% during training and validation, respectively. The total performance of the system was 86.6% during training, and 90% in validation.
Keywords: scene recognition, robotics, corner detection.

1. Introduction
The advance of science and technology has motivated new and more complex engineering applications. These new challenges must involve not only the design of adaptive and dynamic systems but also the use of correct information. It is every day more evident that good multi-criteria decision-making systems require the fusion of data from multiple sources. A research area where data fusion has become a fundamental issue is autonomous robot navigation. Making a robot navigate and perceive its environment requires similar information to that used by a human [1], [2], [3]. This information usually comes from range detection sensors such as ultrasonic, laser, or infrared, and also from image acquisition sensors, such as CCD or CMOS cameras [4]. The information of each sensor must be processed adequately in order to extract useful information for the navigation system of the robot. One paramount issue in autonomous navigation of robots is related to scenery recognition. Recognition of sceneries consists of the identification of a scenario perceived through measurements provided by a sensor. The sensor may be any of the previously mentioned. However, vision sensors are the most frequently used in this task [5], [6], [7]. The advantage of a vision sensor is that it provides compound information that may be separated into useful properties like color, edges, texture, shape, spatial relation, etc. Therefore, it is possible to achieve data fusion with the information of a vision sensor. This paper presents the design of a hierarchical neural system for scene recognition using information fusion from indoor scenarios provided by a camera. The problem to solve is constrained to the recognition of 10 indoor scenarios shown in Figure 1. The features used in the design of the hierarchical neural networks are related to door position and corner detection. The paper is organized in the following sections. Section 2 presents the corner detection method. Door detection through color analysis and segmentation is explained in Section 3. Section 4 describes the design of the hierarchical neural network, and the paper concludes with the results and conclusions in Section 5.
Fig. 1. Scenarios to be recognized.
2. Corner detection method
Corner detection is used in areas such as robotics and medicine. In robotics, it is used for data fusion, navigation, and scene recognition. In medicine, it is applied for image registration, such as for x-rays and ultrasound, and for medical diagnostics [8]. In other applications, corner detection is used for object recognition, stereo vision, and motion detection, among many other usages [9]. Corners are the features most abundant in images of the real world, in contrast to straight lines [10]. For this reason, the use of corners is commonly found in tasks such as image matching. One of the advantages that corners offer is that, if we have images of the same scene, although taken from different perspectives, we will find almost the same corners, which is a good feature for image registration. That in turn will provide information for navigation of mobile robots.
There are different definitions for what is considered a 'corner'. F. Mokhtarian and R. Suomela, in [11], explain that the corner points in an image are defined as points where the contour of the image has its maximum curvature. Juan Andrade-Cetto, in [12], mentions that corners are one of the simplest features which can be extracted from an image, and defines corners as those points in an image where there is a change in intensity in more than one direction. Krishnan Rangarajan et al. [13] describe a corner as the point of union of two or more straight lines. In this paper, a corner is considered according to the definition of Rangarajan. Based on this consideration, the proposed method described in this paper locates corners based on corner window analysis. The methods for the detection of corners can be divided into two groups: those which accomplish the detection from the gray-scale image, and those which first detect edges and then detect corners. Among the methods of the first group, the most mentioned in the literature are the SUSAN method [14] and the Harris method [15]. The SUSAN method differs from other methods in that it does not compute the derivative of the image under analysis and it is not necessary to reduce the noise that could be present in the image. It uses a circular mask which scans the whole image, comparing the gray levels of the central pixel in the mask and the rest of the pixels inside the mask. All the pixels with a gray level equal to the central pixel level are considered as part of the same object. This area is called USAN (Univalue Segment Assimilating Nucleus). The USAN area has a maximum value when the center is in a plain region of the image, a mean value when it is on an edge, and a minimum value when it is on a corner. The Harris method is more sensitive to noise because it is based on the first derivative of the image. However, it is invariant to rotation, translation and illumination, which gives it advantages over other methods. This method uses a window which scans the image and determines sudden changes in gray levels which result from rotating the window in several directions. Among the second group of corner detectors, which use an edge detection method first, we can mention the one of X.C. He and N.H.C. Yung [16]. They use the Canny method and indicate the steps to follow for the detection of corners, calculating the curvature for each edge. Other authors use windows for corner detection from edge images, such as K. Rangarajan et al. [13]. In a similar way, G. Aguilar et al. [17] compare images of fingerprints for the identification of persons using 3x3 windows, in which they propose different bifurcations to be found, which we could call 'corners'. W. F. Leung et al. [18] use 23 windows of different bifurcations and 28 different windows of other types of corners for their detection in the fingerprint image using neural networks. The method described in this work is based on the second group of corner detectors, i.e., those which first apply edge detection and then detect corners using windows over the edge image. The general scheme of the corner detection method is shown in Figure 2. The original image I(x,y) is convolved with a Gaussian filter G to remove noise that could be present in the image, yielding the image Is(x,y). A gradient operator and a threshold to determine the edges are applied to the image Is(x,y).
These two operations correspond to edge detection using the Canny method. The Canny method was used because it yielded better results than the Sobel and other common edge operators. The resulting image Ib(x,y) is convolved (*) with 30 corner detection windows wc, of size 3x3, to detect the corners present in the image. The resulting image Ie(x,y) contains the corners found.
Fig. 2. Corner detection process.
2.1. Edge Detection
The corner definition adopted in this work is the one provided by Rangarajan. It is necessary to find the main line intersections of the scene under analysis. These lines are detected through an edge detection procedure. Among the edge detector operators tested in this work were Sobel, Prewitt, Roberts, Canny, and the Laplacian. It was decided to use the Canny edge detection method [18], because it generated the best edges in the experiments performed in this research. It is also one of the most mentioned and used edge detection methods in the literature. The Canny method was developed by John F. Canny in 1986. This method detects edges by searching for maxima of the image gradient. The gradient is obtained using the derivative of a Gaussian filter. The method uses two thresholds, to detect strong and weak edges, and includes the weak edges in the output only if they are connected to strong edges, which makes the method more robust under noisy circumstances. This method is therefore less likely than others to be fooled by noise, and more likely to detect true weak edges. For a complete description of the method, the reader is encouraged to read [19]. The parameters used for the Canny method were σ = 1.2, µH = 0.18, and µL = 0.18. These values were chosen after several experimentation results.
2.2. Corner detection windows
The papers from G. Aguilar [17] and W.F. Leung et al. [18] coincide in that there are different types of bifurcations, or corners as we call them: Y's, V's, T's, L's, and X's, according to the form they take, as shown in Figure 3. Based on the similitude of these corners with fingerprint marks, it was decided to investigate the possibility of using a unified theory between fingerprint recognition and scene recognition. Thus, from the fingerprint recognition works, some windows were chosen to detect corners.
Fig. 3. Type of corners.
These selected windows, in addition to others proposed in this work, make up a set of 30 windows. Each corner detection window, wc, is a 3x3 mask; their structures are illustrated in Figures 4 and 5. The set of windows wc is composed as follows. Windows w1, w2, w3, and w4 are four windows modified from the work of Leung et al. [18]. The modification consists of adding one pixel, because they look for terminal points, whereas in our case we look for crossing lines. The extra pixel is darkened in these windows. Windows w5 to w20 were also taken from Leung. The windows w17 to w20 appear in Aguilar et al. [17]. The subset w21 to w30 consists of windows proposed in this paper. The proposed windows were defined by analyzing the corners usually found in the set of images considered in this work.
Fig. 6. Generic weight matrix, Tn.
Each corner detection window is then associated with an index window Bi:

$T_n : w_c \Rightarrow B_i, \quad c, i = 1, \ldots, 30$    (1)

obtained by

$B_i = w_c \times T_n$    (2)
where the multiplication is element by element and not a matrix multiplication. In this way, each window wc is related to an index window Bi. In the same way, each index window Bi can be associated with a total weighting factor ai obtained by

$a_i = 1 + \sum_{b_i \in B_i} b_i$    (3)
where bi corresponds to the weighting factors in Bi. Corner detection of a scene is accomplished by the following steps. First, the binary Canny result image Ib(x,y) is convolved with the index matrix Bi:

$I_{ci}(x,y) = I_b(x,y) * B_i + 1$    (4)
Fig. 4. Windows w1 to w16 for corner detection.
This step yields the possible corners related to each corner window wc. The next step is to decide which of the candidate pixels in each Ici(x,y) is a corner that corresponds to wc. This process is realized by scanning Ici(x,y) and assigning a pixel value according to
$p_{ei}(x,y) = \begin{cases} 1 & \text{if } p_{ci}(x,y) = a_i \\ 0 & \text{otherwise} \end{cases}$    (5)
to produce a new set of images Iei(x,y), where $p_{ci}(x,y) \in I_{ci}(x,y)$ and $p_{ei}(x,y) \in I_{ei}(x,y)$. The value 1 indicates that the pixel pci(x,y) is a corner of type wc. This process ends up with 30 binary images that indicate the positions of the different types of corners. The final step consists of the union of the Iei(x,y) images to produce the final corners:

$I_{FC}(x,y) = \bigcup_{i=1}^{30} I_{ei}(x,y)$    (6)
Fig. 5. Windows w17 to w30 for corner detection.
2.3. Corner detection
Corner detection is achieved through a window matching process. The image is scanned with the different corner detection windows wc, trying to match the window corner shape with the edge pixels. Assuming a 3x3 neighborhood and two possible values {0,1} for each pixel, the number of permutations is 2^9 = 512. The values 2^n for n = 0, 1, ..., 8 can be considered as weighting values to yield a generic weight matrix Tn, starting at the coordinate p(1,-1), as shown in Figure 6.
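The sketch below illustrates Eqs. (1)-(6): it builds the generic weight matrix Tn, derives an index window Bi and weighting factor ai for each corner window, and marks the pixels whose response equals ai. It is only an illustration: the actual 30 windows w1-w30 are those of Figures 4 and 5 (not reproduced here), the single window shown is a hypothetical example, and correlation is used for the matching step that the text describes as convolution.

```python
import numpy as np
from scipy.ndimage import correlate

def detect_corners(edge_img, windows):
    """Window-matching corner detection (sketch of Eqs. 1-6).

    edge_img : binary (0/1) Canny edge image I_b(x, y).
    windows  : iterable of 3x3 binary arrays w_c describing corner shapes.
    Returns a binary image I_FC with 1 at detected corner positions.
    """
    T = (2 ** np.arange(9)).reshape(3, 3)      # generic weight matrix T_n (Fig. 6)
    corners = np.zeros(edge_img.shape, dtype=bool)
    for w in windows:
        B = (w.astype(int)) * T                # index window B_i        (Eq. 2)
        a = 1 + B.sum()                        # weighting factor a_i    (Eq. 3)
        I_c = correlate(edge_img.astype(int), B, mode='constant') + 1   # (Eq. 4)
        corners |= (I_c == a)                  # pixels matching w_c     (Eq. 5)
    return corners.astype(np.uint8)            # union over all windows  (Eq. 6)

# Hypothetical L-shaped corner window (not one of the paper's w_c).
w_L = np.array([[0, 1, 0],
                [0, 1, 1],
                [0, 0, 0]], dtype=np.uint8)
```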
Experiments of corner detection were performed on images from 16 different scenarios, see Figure 7. Scenarios 1 to 10 represent the environment of interest in this work; they correspond to the real environment where a mobile robot is going to navigate. Scenarios 11 to 16 were added only for evaluation purposes in other settings. Besides, semi-artificial scenarios used to compute the quantitative performance were obtained from some of these real scenarios. The purpose of using semi-artificial scenes is to obtain a correct performance measure of the proposed method, which would be hard to compute in real (noisy) scenes. This type of image allows one to compute false positive and false negative detections in a simpler form, and to achieve
Fig. 7. Scenarios considered to test the corner detection method.
Fig. 8. a) Semi-artificial scenarios 3, 6, 10, 15, 16; b) corners to detect; c) Harris detection; d) detection with the proposed method.
an analysis of these cases. A semi-artificial image is a simplified image extracted from a real scene. Results of the application of the proposed method to the scenarios are shown in Figure 8. Figure 8a is the semi-artificial scenario obtained from a real one; the corners to be detected are in 8b. The corners detected with the Harris and proposed methods are illustrated in 8c and 8d, respectively. A summary of the performance of both the proposed and the Harris methods is shown in Tables 1 and 2. The detection
of real corners is very similar for the two methods, 99% and 98%, respectively. Where there is a more noticeable difference is in the false positives, where the proposed method has 13.43% while the Harris method has 25.86%. Comparison with the SUSAN algorithm is not possible because it requires multiple gray-level information; in the case of the Harris method, two-gray-level images were assumed. A qualitative comparison will be given over the original images later on. Our explanation for the false negatives and false
positives is as follows. False negatives are mainly due to corners that do not match exactly any wc. For example, two false negatives of scene 2 are indicated with circles in Figure 9a. A close view of these cases is illustrated in Figure 9b; it can be observed that the missed left corner does not match exactly with w27. On the other hand, false positives tend to appear as a consequence of multiple corner detections. This happens because more than one wc matches the image edge structure. Figure 10 shows multiple detections in scene 1. It also shows the corner structures w5, w8, w9, w28 that match the w5 and w8 windows. Figure 11 shows the corners detected by Harris (a), SUSAN (b), and the proposed method (c). It can be observed that, in general, Harris and SUSAN tend to detect more corners than the proposed method. However, the false positive rate is assumed to be very high, as shown with the semi-artificial images using the Harris method. Considering that corner information is used for robot navigation, a high rate of false positives may complicate the scene recognition more than the lack of some corners.

Table 1. Performance of the proposed method.
Scenario   Real corners   Corners detected   Hits        False positives   False negatives
3          42             40                 40/95.23%   0/0%              2/5%
6          55             61                 55/100%     6/11%             0/0%
10         32             36                 32/100%     4/12%             0/0%
15         34             46                 34/100%     12/35%            0/0%
16         38             43                 38/100%     5/13%             0/0%
Total      201            226                199/99%     27/13.43%         2/1%
Table 2. Performance of the Harris method.
Scenario   Real corners   Corners detected   Hits       False positives   False negatives
3          42             55                 41/97.6%   14/33%            1/2.4%
6          55             70                 52/94.5%   18/33%            3/5.5%
10         32             43                 32/100%    11/34%            0/0%
15         34             42                 34/100%    8/24%             0/0%
16         38             40                 38/100%    1/2.6%            0/0%
Total      201            250                197/98%    52/25.86%         4/2%
Fig. 9. a) False negatives, b) One false negative zoomed out, c) Pixel view, d) Window w27.
Fig. 10. False positives in scene 1.
Fig. 11. Corner detection by a) Harris b) SUSAN and c) Proposed method.
3. Scene Segmentation
Interior environments are highly structured scenarios, because they are designed under specific conditions related to the size, shape, position and orientation of their components, like doors, walls, and aisles. Therefore, finding landmarks in these scenarios provides important hints for autonomous robot navigation systems. Z. Chen and S.T. Birchfield [19] state that doors are particular landmarks for navigation since they indicate entry and exit points. Besides, doors provide stable structures and they are semantically significant. This section describes the segmentation process used to obtain features related to the doors found in the scenarios under analysis. The process is shown in Figure 12. The RGB image is transformed to the HSV color space to be more tolerant to illumination changes [19]. Then detection of the doors is achieved by color analysis.
Fig. 12. A block diagram of the process for door detection.
3.1. RGB to HSV color space
The HSV color space is one of the spaces commonly used in color analysis. Schwarz, Cowan and Beatty [20] performed an experiment to compare five color-space models. In this experiment they found significant differences. In particular, they found that the RGB model is fast but imprecise. On the other hand, HSV is not as fast as RGB but it is very precise for color analysis. The color space transformation from RGB to HSV is obtained by the following equations:

$S = \dfrac{\max(R,G,B) - \min(R,G,B)}{\max(R,G,B)}$    (7)

$V = \max(R,G,B)$    (8)

$H = \begin{cases} \frac{1}{6}\,\frac{G-B}{\max(R,G,B)-\min(R,G,B)} & \text{if } \max(R,G,B)=R \text{ and } G \ge B \\[4pt] \frac{1}{6}\,\frac{G-B}{\max(R,G,B)-\min(R,G,B)} + 1 & \text{if } \max(R,G,B)=R \text{ and } G < B \\[4pt] \frac{1}{6}\,\frac{B-R}{\max(R,G,B)-\min(R,G,B)} + \frac{1}{3} & \text{if } \max(R,G,B)=G \\[4pt] \frac{1}{6}\,\frac{R-G}{\max(R,G,B)-\min(R,G,B)} + \frac{2}{3} & \text{if } \max(R,G,B)=B \end{cases}$    (9)

3.2. HSV component analysis
Statistics of the HSV component values were determined by a sampling process. The sampling consisted of a set of 11 samples from each door in the scenarios, see Figure 13. Each sample corresponds to a window of 5x5 pixels. The process involved the computation of the mean and variance of the mean distributions of the window samples over the HSV values. Table 3 shows the mean distributions over the scenarios, while Table 4 presents the statistics of the means.
Table 4. Statistic values of the means.
Component   Mean        Standard deviation
H           H = 0.078   Hs = 0.018
S           S = 0.440   Ss = 0.132
V           V = 0.307   Vs = 0.077
3.3. Door segmentation
Door detection in autonomous robot navigation is an important issue because doors appear in many interior environments [21]. Thus, doors are important landmarks that can be used for scene recognition. Door detection is achieved in this work by analysis of the HSV components, implemented in the following condition:

if |H − Hp| < Th and |S − Sp| < Ts and |V − Vp| < Tv, then p(x, y) is a door pixel    (10)
where Hp, Sp, and Vp are the HSV components of the pixel p at coordinates (x,y). The thresholds Th, Ts, and Tv are

Th = |H − Hs|,  Ts = |S − Ss|,  Tv = |V − Vs|    (11)
After the classification of the color pixels, blobs of less than 300 pixels are eliminated since they are not considered doors. Figure 14 illustrates some examples of doors detected by the previous method.
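The following sketch illustrates the door-pixel rule of Eq. (10) with the thresholds of Eq. (11) built from the Table 4 statistics, followed by the removal of blobs smaller than 300 pixels. It is an illustrative implementation, not the authors' code: matplotlib's rgb_to_hsv stands in for Eqs. (7)-(9) and scipy's connected-component labeling stands in for the blob analysis.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv
from scipy.ndimage import label

H_MEAN, S_MEAN, V_MEAN = 0.078, 0.440, 0.307                      # Table 4 means
TH = abs(H_MEAN - 0.018)                                          # Eq. (11)
TS = abs(S_MEAN - 0.132)
TV = abs(V_MEAN - 0.077)

def segment_doors(rgb, min_blob=300):
    """rgb: (rows, cols, 3) uint8 image. Returns a boolean door mask."""
    hsv = rgb_to_hsv(rgb.astype(float) / 255.0)
    mask = ((np.abs(H_MEAN - hsv[..., 0]) < TH) &
            (np.abs(S_MEAN - hsv[..., 1]) < TS) &
            (np.abs(V_MEAN - hsv[..., 2]) < TV))                  # Eq. (10)
    labels, n = label(mask)                                       # blob labeling
    for i in range(1, n + 1):                                     # drop small blobs
        if (labels == i).sum() < min_blob:
            mask[labels == i] = False
    return mask
```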
4. Recognition of scenarios
The recognition of the scenarios is achieved with a hierarchical neural network (HNN). This type of architecture was selected due to the similarity of some of the scenarios. The HNN is composed of two levels, Figure 15. The first level is composed of one neural network and the second of two neural networks.
Fig. 13. Color door sampling in two scenarios.
Fig. 14. Door segmentation.
Table 3. Mean distributions over the scenarios.
Scenario   H Mean   S Mean   V Mean
2          0.078    0.361    0.357
3          0.059    0.471    0.205
4          0.084    0.605    0.252
5          0.075    0.393    0.360
7          0.099    0.367    0.361
9          0.075    0.576    0.243
10         0.078    0.308    0.372
The idea of this HNN is to separate the scenarios into 4 classes, and then use the second level to resolve more specific cases. The first neural network is a feedforward backpropagation network with 32 inputs, 32 neurons in the hidden layer, and 4 output neurons, with sigmoid tangent activation functions in the hidden layer and sigmoid logarithmic functions in the output layer. This first neural network is trained to classify the four classes, see Figure 16, using the following feature vector
$X_1 = \left[ C_{Pxi}, \; C_{Pyi}, \; h_i, \; a_i \right]^T$    (12)
where CPxi and CPyi are the centroids of the blobs that correspond to the doors, while hi and ai are the normalized height and width, respectively, of those blobs. Figure 17 presents examples of the door blobs with their respective centroids. The neural network of the second level is trained to classify classes 2 and 4 into their corresponding scenarios, as shown in Figure 18, using the following feature vector
Fig. 17. Examples of door centroids.
$X_{2,3} = \left[ C_{Ex}, \; C_{Ey}, \; N \right]^T$    (13)

where CEx and CEy are the centroids of the corner coordinates and N is the number of corners found in the scenario. Figure 19 presents examples of the corners with their respective centroids.
Fig. 18. Second level of classification.
Fig. 15. Hierarchical neural network scheme.
Fig. 19. Examples of corner centroids.
Fig. 16. First level of classification.
The neural network of the second level is a feedforward backpropagation network with 3 inputs, 26 neurons in the hidden layer, 2 output neurons, sigmoid tangent activation functions in the hidden layer, and sigmoid logarithmic functions in the output layer.
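To make the two-level routing concrete, the sketch below wires a comparable hierarchy with scikit-learn's MLPClassifier. It is only an illustration under several assumptions: the paper's networks use tansig/logsig activations and scaled conjugate gradient training, whereas MLPClassifier uses a softmax output and a different optimizer, and the training data below are random placeholders, not the door and corner features of the scenarios.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Placeholder data (NOT the paper's): X1 stacks door-blob features into 32 values,
# X23 = [CEx, CEy, N] holds the corner-centroid features of Eq. (13).
X1, y1 = rng.random((120, 32)), rng.integers(0, 4, 120)    # 4 first-level classes
X23, y23 = rng.random((60, 3)), rng.integers(0, 2, 60)     # 2 scenarios in a class

# First level: 32 inputs, 32 hidden neurons (tanh), 4 outputs.
level1 = MLPClassifier(hidden_layer_sizes=(32,), activation='tanh',
                       max_iter=2000, random_state=0).fit(X1, y1)
# One second-level refiner: 3 inputs, 26 hidden neurons, 2 outputs.
level2 = MLPClassifier(hidden_layer_sizes=(26,), activation='tanh',
                       max_iter=2000, random_state=0).fit(X23, y23)

def recognize(x1, x23):
    """Route a sample: only the ambiguous classes are refined by level 2."""
    c = int(level1.predict([x1])[0])
    if c in (1, 3):                       # 0-indexed stand-ins for classes 2 and 4
        return c, int(level2.predict([x23])[0])
    return c, None

print(recognize(X1[0], X23[0]))
```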
Results of the scenario recognition are discussed next. The performance by levels of the HNN over the 10 scenarios, considering the two levels, is illustrated in Figure 20. The first classification was achieved by door detection using the centroids, height and area of the door blobs. The ANN of the first level was trained to classify the 10 scenarios into 4 classes. This ANN had 90% correct classification during training, and 95% in validation. Class 1, which contains the scenarios 2, 5, 7, and 10, was considered as one type of scenario because of the high degree of similarity among the four scenarios. This similarity turns out to be hard to resolve even for human observers. Class 3 was not reclassified because it only contains images of scenario 9. The classification of the scenarios in classes 2 and 4, in the second level, was performed by using corner detection information as well as the number of corners. ANNs 2 and 3 were trained with this information. ANN 2 separated class 2 into scenarios 3 and 4 with a performance of 90.90% in training and 87.5% during validation. Neural network 3, which classifies class 4 into the scenarios 1, 6, and 8, has a performance of 93.75%
in training, and 91.66% in validation. The total performance of the system considering all the scenarios was 86.66% in training and 90% in validation.
AUTHORS Mario I. Chacon-Murguia*, Rafael Sandoval-Rodriguez, Cynthia P. Guerrero-Saucedo - Visual Perception on Robotic Applications Lab Chihuahua Institute of Technology, Chihuahua, Mexico. E-mails: {mchacon, rsandoval, cpguerrero}@itchihuahua.edu.mx. * Corresponding author
Fig. 20. System performance for levels.
5. Results
Two important results are derived from this work. The first one is related to the proposed corner detector method and the second to the recognition of scenarios. Regarding the corner detector method, we can mention that the proposed method has a hit performance in semi-artificial scenarios similar to that of the Harris detector, 99% and 98%, and false negatives of 1% and 2%, respectively. However, the proposed method is better with respect to false positives, 13.43% versus 25.86%. On the other hand, the use of combined information, door and corner information, provides important discriminative data, validated by the hierarchical ANN. The ANN findings present an adequate overall performance of 86.66% in training and 90% in validation.
6. Conclusions
In conclusion, it can be said that the proposed corner detector method shows good corner detection in semi-artificial as well as in real indoor and outdoor scenarios, as validated in the corner detection experiments performed in this research. Besides, the corner detection method provides correct information, which is validated by the performance achieved by ANNs 2 and 3.
Another important conclusion is that the proposed solution to the scene recognition problem, based on the fusion of color and corner features, proved to be effective according to the experimental results obtained in this work. The results of this research confirm that complex problems like scene recognition for robot navigation are well addressed with information fusion, where different types of information complement each other.
ACKNOWLEDGMENTS
The authors thank the Fondo Mixto de Fomento a la Investigación Científica y Tecnológica CONACYT - Gobierno del Estado de Chihuahua for the support of this research under grant CHIH-2009-C02-125358. This work was also supported by SEP-DGEST under Grants 2173.09-P and 2172.09-P.
References
[1] Kemp C., Edsinger A., Torres-Jara D., "Challenges for Robot Manipulation in Human Environments", IEEE Robotics & Automation Magazine, March 2007, pp. 20-29.
[2] Durrant-Whyte H., Bailey T., "Simultaneous Localization and Mapping (SLAM): Part I", IEEE Robotics & Automation Magazine, June 2006, pp. 99-108.
[3] Bailey T., Durrant-Whyte H., "Simultaneous Localization and Mapping (SLAM): Part II", IEEE Robotics & Automation Magazine, September 2006, pp. 108-117.
[4] Addison J., Choong K., "Image Recognition For Mobile Applications". In: International Conference on Image Processing ICIP 2007, 2007, pp. VI-177-VI-180.
[5] DeSouza G., Kak A., "Vision for Mobile Robot Navigation: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 2, February 2002, pp. 237-267.
[6] Kelly A., Nagy B., Stager D., Unnikrishnan R., "An Infrastructure-Free Automated Guided Vehicle Based on Computer Vision", IEEE Robotics & Automation Magazine, September 2007, pp. 24-34.
[7] Srinivasan M.V., Thurrowgood S., Soccol D., "Competent Vision and Navigation Systems", IEEE Robotics & Automation Magazine, September 2009, pp. 59-71.
[8] Maintz J.B.A., Viergever M.A., "A Survey of Medical Image Registration", Medical Image Analysis, vol. 2, no. 1, 1998, pp. 1-36.
[9] Zitova B., Flusser J., "Image Registration Methods: a survey", Image and Vision Computing, vol. 21, 2003, pp. 977-1000.
[10] Tissainayagam P., Suter D., "Assessing the Performance of Corner Detectors for Point Feature Tracking Applications", Image and Vision Computing, vol. 22, 2004, pp. 663-679.
[11] Mokhtarian F., Suomela R., "Curvature Scale Space for Robust Image Corner Detection". In: International Conference on Pattern Recognition, Brisbane, Australia, 1998.
[12] Andrade J., Environment Learning for Indoor Mobile Robots, PhD Thesis, Universidad Politécnica de Catalunya, 2003.
[13] Rangarajan K., Shah M., van Brackle D., "Optimal Corner Detector". In: 2nd International Conference on Computer Vision, December 1988, pp. 90-94.
[14] Smith S.M., Brady J.M., "SUSAN - A New Approach to Low Level Image Processing", International Journal of Computer Vision, vol. 23, no. 1, May 1997, pp. 45-78.
[15] Harris C.G., Stephens M., "A combined corner and edge detector". In: Proceedings of the Alvey Vision Conference, Manchester, 1988, pp. 189-192.
[16] He X.C., Yung N.H.C., "Curvature Scale Space Corner Detector with Adaptive Threshold and Dynamic Region of Support". In: Proceedings of the 17th International Conference on Pattern Recognition, vol. 2, August 2004, pp. 791-794.
[17] Aguilar G., Sanchez G., Toscano K., Salinas M., Nakano M., Perez H., "Fingerprint Recognition". In: 2nd International Conference on Internet Monitoring and Protection - ICIMP 2007, July 2007, pp. 32-32.
[18] Leung W.F., Leung S.H., Lau W.H., Luk A., "Fingerprint Recognition Using Neural Network". In: Neural Networks for Signal Processing, Proceedings of the 1991 IEEE Workshop, 30th Sept.-1st Oct. 1991, pp. 226-235.
[19] Chen Z., Birchfield S.T., "Visual detection of lintel-occluded doors from a single image". In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2008, pp. 1-8.
[20] Schwarz M.W., Cowan W.B., Beatty J.C., "An experimental comparison of RGB, YIQ, LAB, HSV, and opponent color models", ACM Transactions on Graphics, vol. 6, no. 2, 1987, pp. 123-158.
[21] Cariñena P., Regueiro C., Otero A., Bugarín A., Barro S., "Landmark Detection in Mobile Robotics Using Fuzzy Temporal Rules", IEEE Transactions on Fuzzy Systems, vol. 12, no. 4, August 2004, pp. 423-435.
OPTIMIZATION OF A MODULAR NEURAL NETWORK FOR PATTERN RECOGNITION USING PARALLEL GENETIC ALGORITHM
Martha Cárdenas, Patricia Melin, Laura Cruz
Abstract: In this paper, the implementation of a Parallel Genetic Algorithm (PGA) for the training stage and the optimization of monolithic and modular neural networks for pattern recognition is presented. The optimization consists in obtaining the best architecture in terms of layers and neurons per layer, achieving the lowest training error in a shorter time. The implementation was performed on a multi-core architecture, using parallel programming techniques to exploit its resources. We present the results obtained in terms of performance by comparing the results of the training stage for sequential and parallel implementations.
Keywords: modular neural networks, parallel genetic algorithm, multi-core.
1. Introduction
The recognition of individuals from their biometric features has been driven by the need for security applications, mainly in areas such as surveillance systems, employee attendance control, access control to secure places, etc. These systems have been developed with different biometrics, including face recognition, fingerprints, iris, voice, hand geometry, and more [15]. Although there are systems based on classical methods, biometric pattern recognition has also been developed in the area of artificial intelligence with techniques such as fuzzy logic, data mining, neural networks, and genetic algorithms. Real-world problems are complex to solve and require intelligent systems that combine knowledge, techniques and methodologies from various sources. In this case, we are talking about hybrid systems, and these can be observed in some applications already developed in [6], [13], [14]. Artificial neural networks applied to pattern recognition have proved to give good results. However, it is complex to deal with monolithic neural networks. The use of modular neural networks makes it possible to divide a complex problem into several smaller tasks, in order to obtain an efficient system and good results. Artificial neural networks and modular neural networks have a high potential for parallel processing. Their parallel nature makes them ideal for parallel implementation techniques; however, it is difficult to find an optimal network architecture for a given application. The selection of the architecture and the optimal network parameters is the most important part of the problem and is what takes a long time. Genetic Algorithms (GAs) are search techniques that are used to solve difficult problems in a wide range of disciplines. Parallel Genetic Algorithms (PGAs) are
parallel implementations of GAs, which can provide considerable gains in terms of performance and scalability. PGAs can easily be implemented on networks of heterogeneous computers or on parallel machines such as a multi-core architecture. Dongarra et al. [10] describe several interesting applications of parallel computing. In the coming years, computers are likely to have even more processors inside, and in [3] a description of multi-core processor architecture is presented. An introduction to Multi-Objective Evolutionary Algorithms can also be found in [8]. The primary goal of this research is to implement an optimized modular neural network system for multimodal biometrics. In this paper we first describe the implementation of a monolithic neural network optimized with a PGA, and later the first stage of the modular system, which consists of a modular neural network for only one biometric measure; the system is optimized using a synchronous master-slave PGA. The paper is organized as follows: in Section 2 we explain relevant concepts included in this research, Section 3 defines the problem statement and the proposed method, Section 4 presents the results achieved, and finally Section 5 shows the conclusions and future work.
2. Theoretical Concepts
Soft Computing consists of several computing paradigms, including fuzzy logic, neural networks and genetic algorithms, which can be combined to create hybrid intelligent systems; these systems leverage the advantages of each of the techniques involved [15]. In this research, we use the paradigms of neural networks and genetic algorithms.
2.1. Artificial Neural Networks
A neural network is a computational structure capable of discriminating and modeling nonlinear characteristics. It consists of a (usually large) set of interconnected simple processing units which operate together. Neural networks have been widely used because of their versatility for solving problems of prediction, recognition, and approximation [6], [15], [12], [24]. These systems emulate, in a certain way, the human brain. They need to learn how to behave (learning) and someone should be responsible for teaching them (training), based on previous knowledge of the problem environment [16], [26], [25]. The most important property of artificial neural networks is their ability to learn from a training set of patterns, i.e., they are able to find a model that fits the data [22].
2.2. Modular Neural Networks
A review of the physiological structures of the nervous system in vertebrate animals reveals the existence of hierarchical and modular representation and processing of information [18]. Modularity is the ability of a system to be studied, seen or understood as the union of several parts that interact and work towards a common goal, each performing a task necessary to achieve the objective [2]. Depending on the form in which the division of tasks takes place, an integration method makes it possible to combine the results given by each of the constructed modules. Some of the commonly used integration methods are: average, gating network, fuzzy inference systems, voting mechanisms using the softmax function, the winner takes all, among others.
Fig. 1. Modular Neural Network.
Fig. 1 shows a general diagram of a modular neural network. In this model the modules work independently and, in the end, a component commonly called an integrator performs the function of deciding between the different modules to determine which of them has the best solution (examples include a gating network, a fuzzy integrator, etc.) [17].
2.3. Genetic Algorithms
John Holland introduced the Genetic Algorithm (GA) in 1970, inspired by the process observed in the natural evolution of living beings [26], [19]. Genetic Algorithms (GAs) are search methods based on principles of natural selection and genetics. A GA maintains a group of possible solutions called a population; the solutions in the population are called individuals, each individual is encoded into a string, usually binary, called a chromosome, and the symbols forming the string are called genes. The chromosomes evolve through iterations called generations; in each generation the individuals are evaluated using some measure of fitness. The next generation, with new individuals called offspring, is formed from the previous generation using two main operators, crossover and mutation; this representation is shown in Fig. 2.
Fig. 2. Structure of a Simple GA.
These optimization techniques are used in several areas such as business, industry, engineering and computer science, and are also used as a basis for industrial planning, resource allocation, scheduling, decision-making, etc. The GA is commonly used in the area of intelligent systems; some examples of the optimization of fuzzy logic systems and neural networks are shown in [4]. GAs find good solutions in reasonable amounts of time; however, in some cases GAs may require hundreds or more expensive function evaluations, and depending on the cost of each evaluation, the execution time of the GA may take hours, days or months to find an acceptable solution [4], [19]. Computers with a chip multiprocessor (CMP) give the opportunity to solve high-performance applications more efficiently using parallel computing. However, a disadvantage of GAs is that they can be very demanding in terms of computation load and memory. Genetic Algorithms have become increasingly popular for solving difficult problems that may require considerable computing power; to solve these problems, developers use parallel programming techniques. The basic idea of parallel programs is to divide a large problem into smaller tasks and solve them simultaneously using multiple processors. The search for efficient algorithms has led us to implement parallel computing; in this way it is possible to achieve the same results in less time. However, making a GA faster is not the only advantage that can be expected when designing a parallel GA. A PGA has an improved power to tackle problems that are more complex, since it can use more memory and CPU resources [1].
2.4. Parallel Genetic Algorithms
The way in which GAs can be parallelised depends on several elements, such as how the fitness is evaluated and how mutation is applied, whether single or multiple subpopulations (demes) are used, how individuals are exchanged if multiple populations are used, and how selection is applied (globally or locally). There exist different methods for implementing parallel GAs, and they can be classified into the following general classes [20]: master-slave parallelisation (distributed fitness evaluation), static subpopulations with migration, static overlapping subpopulations (without migration), massively parallel genetic algorithms, dynamic demes (dynamic overlapping subpopulations), parallel steady-state genetic algorithms, parallel messy genetic algorithms, and hybrid methods. Our implementation is based on master-slave synchronous parallelisation, and for that reason we describe only this method; other methods can be reviewed in [4], [19]. Master-slave GAs have a single population. One master node executes the selection, crossover and mutation operators, and the evaluation of fitness is distributed among several worker (slave) processors. The workers evaluate the fitness of every individual that they receive from the master and return the results. A master-slave GA, depending on whether it waits to receive the fitness values
Fig. 3. Execution of the parallel master-slave synchronous GA.
for the entire population before proceeding to the next generation, can be synchronous or asynchronous. In Fig. 3 we show a synchronous master-slave GA.
2.5. Chip Multiprocessors (CMPs)
The improvement in current processors is based on the development of Chip Multiprocessors (CMPs), or multi-core processors: to increase the efficiency of a processor, the number of cores inside the processor chip is increased. Multi-core processor technology is the implementation of two or more "execution cores" within a single processor; some of the advantages of multi-core architectures are shown in [10], [7]. These cores are essentially two or more individual processors on a single chip. Depending on the design, these processors may or may not share a large on-chip cache; the operating system perceives each of its execution cores as a discrete logical processor with all the associated execution resources [5]. CMPs can achieve higher performance than would be possible using only a single core. The low inter-processor communication latency between the cores in a CMP makes a much wider range of applications viable candidates for parallel execution. The increasing complexity of parallel multi-core processors necessitates the use of correspondingly complex parallel algorithms. However, to exploit these architectures it is necessary to develop parallel applications that use all the processing units simultaneously. In order to achieve parallel execution in software, the hardware must provide a platform that supports the simultaneous execution of multiple threads. Software threads of execution run in parallel, which means that the active threads run simultaneously on different hardware resources, or processing elements. It is important to understand that the parallelism occurs at the hardware level too. The improvement measure, or speedup, takes as reference the execution time of a program in a mono-processor system with respect to the execution time of the same program in a multiprocessor or multi-core system, which is represented as follows:

$\text{speedup} = \dfrac{t_s}{t_p}$    (1)

where ts is the time it takes to run the program in a mono-processor system and tp is the time it takes to run the same program in a system with p execution units. There are many models of parallel programming; the two main and most common choices are shared-memory programming and distributed memory [5]. Parallelism can also be implemented in two ways: implicit parallelism, which some compilers perform automatically by generating parallel code for the parts of the program that are parallel, and explicit parallelism, which is implemented using parallel languages, where the programmer is responsible for the parallelism and defines the working threads, the code of each thread, the communication, etc.; this last form of parallelism achieves higher performance.

3. Problem Statement
In this case, we focus on parallel genetic algorithms for optimizing the architecture of a monolithic neural network and a modular neural network for the recognition of persons based on face biometry, implemented on multi-core processors. For determining the best architecture and parameters of a neural network there is no particular selection criterion; for example, the number of layers and neurons per layer for a particular application is chosen based on experience, and finding an optimal architecture for the network becomes a task of trial and error. In addition, there are other methods that can calculate and determine the architecture of a neural network for a specific problem using an empirical expression [23]. The database used for this research is The ORL Database of Faces of the Cambridge University Computer Laboratory [9]. This database contains ten different images of 40 persons with gestures; for this implementation no preprocessing was applied. Example pictures are shown in Fig. 4.
Fig. 4. Some images of the ORL database. The database is composed of 400 images of 40 different persons (10 images per person).
For both the monolithic and the modular implementation, the database was the same, and the same chromosome structure was used for the optimization.
3.1. Monolithic Neural Network Implementation
Neural networks were applied to a database of 40 persons, and we used 5 images per person for training and 5 images per person for testing. First, we implemented the traditional monolithic neural network, and then we implemented a Parallel GA for optimizing the layers and neurons per layer. The training method for the neural network is Trainscg (Scaled Conjugate Gradient), with an error goal of 0.01e-006 and between 100 and 150 epochs. The Genetic Algorithm was tested on a multi-core computer with the following characteristics: CPU Intel Core 2 Quad 2.4 GHz, bus 1066 MHz, 8 MB of L2 cache, 6 GB of DDR2 main memory; all the experiments were performed in MATLAB R2009b using the Parallel Computing Toolbox.
Parallel Genetic Algorithm for Optimization
The master-slave Parallel Genetic Algorithm was codified with a binary chromosome of 23 bits: 2 bits for the number of layers, and 7 bits for the number of neurons in each layer. The maximum number of layers is 3 and the maximum number of neurons is 128; this is shown in Figure 5. The proposed algorithm was implemented on a shared-memory multi-core machine with 4 cores, taking one core as master and the remaining cores as slaves.
Fig. 5. Chromosome representation of the problem.
Fig. 6. Parallel GA Implementation.
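As a small illustration of the 23-bit encoding of Figure 5, the snippet below decodes a bit string into a layer count and the neurons per layer. The exact bit-to-value mapping used by the authors is not given in the paper, so the ranges assumed here (1-3 layers, 1-128 neurons per layer) are only for the sketch.

```python
def decode(bits):
    """Decode a 23-bit chromosome: 2 bits for layers, 3 x 7 bits for neurons."""
    assert len(bits) == 23
    n_layers = (int(bits[:2], 2) % 3) + 1                 # assumed mapping to 1..3
    neurons = [int(bits[2 + 7 * i: 9 + 7 * i], 2) + 1     # assumed mapping to 1..128
               for i in range(3)]
    return n_layers, neurons[:n_layers]

print(decode('10' + '0000111' + '1100011' + '0101010'))   # e.g. (3, [8, 100, 43])
```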
The Genetic Algorithm has the following characteristics:
Chromosome size: the number of genes in each individual for this application is 23 binary bits.
Population size: defines the number of individuals that compose the population; population size = 20.
Termination criterion: maximum number of generations for solving the problem; max generations = 50.
Selection: we used Stochastic Universal Sampling, with selection probability = 0.9.
Crossover: the selected individuals have a probability of mating, acting as parents to generate two new individuals that will represent them in the next generation. The crossing point is random, with a probability of 0.7.
Mutation: represents the probability that an arbitrary bit in the individual sequence will be changed from its original state; mutation probability = 0.8.
The flow chart of Fig. 6 shows the parallel GA implementation. Other methods for implementing a parallel GA can be seen in [21], [11], [4].
3.2. Modular Neural Network Implementation
For the modular neural network implementation, we developed a previous normalization stage for the database; this stage is a parallel one, like the training stage. An algorithm to standardize the size of the images is applied to the original database. Depending on the database, this stage is applied only if necessary. Fig. 7 shows the general diagram of the system: the original database is distributed to all the available slaves for normalization, forming a new database that is the input to the modular neural network. The system has a synchronization step executed by the master to help coordinate the process in all the slaves. Once the inputs of the system are available, the PGA starts by creating a random initial population in the synchronization stage; the master divides the population and sends it to the slaves (in this case the cores of the processor), and each slave takes a part of the population to evaluate. In each slave, for each individual of the GA, the architecture of the network is loaded, the images are read and fed as input to the corresponding module. The images are propagated through the network and the training error is calculated. When a slave finishes its part of the population, it sends the evaluation results to the master. The master waits for all slaves to finish in order to collect the entire population, and performs selection based on the fitness of all the individuals. The master then applies the crossover and mutation operators and generates the new population to be divided and evaluated in the slaves, until the maximum number of generations is reached. We used the gating network integration method. The modular architecture consists in dividing the database among the available modules or processor cores. The experimental results achieved with this PGA implementation are presented in the following section.
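The following is a minimal sketch of the synchronous master-slave evaluation loop described in Sections 2.4 and 3.2, written with Python's multiprocessing in place of the MATLAB Parallel Computing Toolbox used by the authors. The function train_and_get_error is a placeholder for decoding an individual, training the corresponding network and returning its training error.

```python
from multiprocessing import Pool
import random

def train_and_get_error(individual):
    # Placeholder fitness: the real system would build the network decoded
    # from the chromosome, train it, and return the training error.
    return sum(individual) / len(individual)

def evaluate_population(population, n_slaves=3):
    # Master distributes individuals to the slave processes and waits for all
    # results before continuing to the next generation (synchronous scheme).
    with Pool(processes=n_slaves) as pool:
        return pool.map(train_and_get_error, population)

if __name__ == '__main__':
    population = [[random.randint(0, 1) for _ in range(23)] for _ in range(20)]
    fitness = evaluate_population(population)
    print(min(fitness))
```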
Fig. 7. General Diagram of the Modular Neural Network System.
4. Results
Different experiments were developed to observe the performance of the parallel genetic algorithms; the results are presented in this section. The reported times represent the average execution time of each neural network in the population of the PGA.
4.1. Monolithic Neural Network
First, we trained the monolithic neural network without optimization, in sequential form. After manually changing the architecture several times, we defined the architecture of the neural network with the expression of Salinas [23] as follows: first hidden layer (2 * (k + 2)) = 84, second hidden layer (k + m) = 45, output layer (k) = 40, where k corresponds to the number of individuals and m to the number of images of each of them. Table 1 shows the average of 20 trainings in sequential form of the network on dual-core and quad-core machines; in this experiment we enabled only one of the available cores and one thread of execution in the processor to simulate sequential execution. Fig. 8 shows the usage of a dual-core machine during the training of the network.
In the experiment of training with implicit parallelism without optimization, all the available cores and threads are enabled. MATLAB R2009b uses implicit parallelism by default for the applications run on it and takes all the cores for execution, automatically generating one thread of execution per processor. The results obtained for MATLAB on dual-core and quad-core machines are shown in Table 2.
Table 2. Average of 20 trainings with implicit parallelism of the network for dual-core and quad-core machines.
No. Cores   Epochs   Error Goal   Error     Total Time
4           150      1.00E-06     0.00113   1:58 min
2           150      1.00E-06     0.00384   1:41 min
The results show that the sequential training of the network is more efficient than the implicit parallelism of MATLAB, because when a single core is working (Fig. 9) all the cache memory is available for it.
Table 1. Average of 20 trainings in sequential form of the network for dual-core and quad-core machines.
No. Cores   Epochs   Error Goal   Error     Total Time
4           150      1.00E-06     0.00178   1:18 min
2           150      1.00E-06     0.00468   1:14 min
Fig. 9. Cores usage in the implicit-parallelism training of the network.
Implicit Parallelism with GA optimization
We optimized the monolithic neural network with a simple GA in the form of implicit parallelism. Table 3 shows the average of 20 training tests for 2 and 4 cores. Figures 10 and 11 show the processor usage on dual-core and quad-core machines.
Fig. 8. Cores usage in the sequential training of the network.
Table 3. Average of 20 trainings with implicit parallelism of the simple GA for optimization of the network, for dual-core and quad-core machines.
N. Cores   Ind   Gen   Cross   Mut   Error         Time/network   Average time
2          20    30    0.7     0.8   3.0121e-004   1:51 min       33 min
4          20    30    0.7     0.8   9.7361e-005   4:10 min       83 min
Fig. 13. CPU performance, explicit parallelism with 4 cores.
Fig. 10. CPU performance, implicit parallelism with 2 cores.
Table 5 shows a comparison between all the training experiments.
Fig. 11. CPU Performance Implicit Parallelism with 4 cores.
4.2. Modular Neural Network
The results presented in this section are in Table 6, which shows the average of 10 trainings in sequential form of the modular network on dual-core and quad-core machines.
Explicit Parallelism with Parallel GA optimization
In this experiment, we utilize the MATLAB pool, which enables the parallel language features within the MATLAB language by starting a parallel job that connects the MATLAB client with a number of labs. The average results for 20 executions are shown in Table 4. Figs. 12 and 13 show the usage of the processor for training with explicit parallelism on dual-core and quad-core machines.
Fig. 14. CPU performance, explicit parallelism with 2 cores, modular neural network.
In the parallel implementation of the modular neural network we utilize the MATLAB pool, starting a parallel job. The average results for 10 executions are shown in Table 7. Fig. 15 shows the CPU performance for the parallel training of the modular neural network.
Fig. 12. CPU performance, explicit parallelism with 2 cores.
Table 4. Average of 20 trainings with explicit parallelism of the PGA for optimization of the network, for dual-core and quad-core machines.
N. Cores   Ind   Gen   Cross   Mut   Error         Time/network   Average time
2          20    30    0.7     0.8   3.3121e-004   1:03 min       21 min
4          20    30    0.7     0.8   4.5354e-004   0:35 sec       12 min
Table 5. Results of the sequential and parallel experiments, average time per network.
Method                               2 cores    4 cores
RNA Sequential                       1:18 min   1:14 min
GA-RNA, implicit parallelism         1:58 min   1:41 min
GA-RNA, explicit parallelism (PGA)   1:03 min   0:35 sec
Table 6. Average of 10 trainings in sequential form of the modular network for dual-core and quad-core machines.
N. Cores   Epochs   Error Goal   Error    Time/network   Time/generation
2          150      1.00E-06     0.0005   2:13 min       20:11 min
4          150      1.00E-06     0.0031   2:05 min       20:49 min
Table 7. Average of 10 trainings with explicit parallelism of the PGA for optimization of the modular network, for dual-core and quad-core machines.
N. Cores   Epochs   Error Goal   Error     Time/network   Time/generation
2          150      1.00E-06     0.00012   0:58 min       9:37 min
4          150      1.00E-06     0.00025   1:40 min       16:36 min
Table 8. Results of the sequential and parallel experiments per network and the speedup.
No. Cores   Modular Network Sequential Time   PGA-Modular Network (Explicit Parallelism) Time   Speedup
2           2:05 min                          1:40 min                                          1.46
4           2:13 min                          0:58 sec                                          3.67
Fig. 15. CPU performance, explicit parallelism with 4 cores, modular neural network.
In Table 8 we show the results of the modular neural network with sequential training and with parallel training. For this experiment the modules of the modular network are trained in parallel, and the optimization of the network is done in about half the time (dual core) or about a quarter of the time (quad-core); we obtain a speedup of 1.46 on a dual-core machine and 3.67 on a quad-core machine. These are the first results obtained, and we continue with experiments.
5. Conclusions and Future Work
We have presented the experiments with training of the monolithic and modular neural networks for a face database; we used different implementations of parallelism to show that a parallel GA using a multi-core processor offers better results in the search for optimal neural network architectures in less time. Genetic algorithms take considerable time to converge successfully, depending on the application, but most of the time they achieve satisfactory solutions. Genetic algorithms can be parallelized to speed up their execution; and if we use explicit parallelization we can achieve much better speedup than when using implicit parallelization, although it is necessary to make more tests. Multi-core computers can help us solve high-performance applications in a more efficient way by using parallel computation. Future work consists in considering larger databases and implementing a modular neural network on a multi-core cluster, applying different techniques of parallel processing.
AUTHORS Martha Cárdenas*, Patricia Melin, Laura Cruz - Tijuana Institute of Technology, Tijuana, México. E-mails: mc.marthacardenas@gmail.com, pmelin@tectijuana.mx. * Corresponding author
References
[1] E. Alba, A. Nebro, J. Troya, "Heterogeneous Computing and Parallel Genetic Algorithms", Journal of Parallel and Distributed Computing, vol. 62, 2002, pp. 1362-1385.
[2] C. Baldwin, K. Clark, Design Rules, Vol. 1: The Power of Modularity, MIT Press, Cambridge, 2000.
[3] T.W. Burger, Intel Multi-Core Processors: Quick Reference Guide, http://cachewww.intel.com/cd/00/00/20/57/205707_205707.pdf
[4] E. Cantu-Paz, Efficient and Accurate Parallel Genetic Algorithms, Kluwer Academic Publishers, 2001.
[5] M. Cárdenas, J. Tapia, O. Montiel, R. Sepúlveda, "Neuro-fuzzy system implementation in Multicore Processors", IV Regional Academic Encounter, CITEDI-IPN, 2008.
[6] O. Castillo, P. Melin, "Hybrid intelligent systems for time series prediction using neural networks, fuzzy logic and fractal theory", IEEE Transactions on Neural Networks, vol. 13, no. 6, 2002.
[7] L. Chai, Q. Gao, D.K. Panda, "Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System". In: The 7th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), May 2007, pp. 471-478.
[8] C.A. Coello, G.B. Lamont, D.A. Van Veldhuizen, Evolutionary Algorithms for Solving Multi-Objective Problems, Springer: Heidelberg, 2004.
[9] The Database of Faces, Cambridge University Computer Laboratory, http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
[10] J. Dongarra, I. Foster, G. Fox, W. Gropp, K. Kennedy, L. Torczon, A. White, Sourcebook of Parallel Computing, Morgan Kaufmann Publishers: San Francisco, 2003.
[11] S. González, Optimization of Artificial Neural Network Architectures for time series prediction using Parallel Genetic Algorithms, Master thesis, 2007.
[12] K. Hornik, "Some new results on neural network approximation", Neural Networks, vol. 6, 1993, pp. 1069-1072.
[13] A. Jeffrey, V. Oklobdzija, The Computer Engineering Handbook, Digital Systems and Applications, 2nd edition, CRC Press, 1993.
[14] P. Kouchakpour, A. Zaknich, T. Bräunl, Population Variation in Genetic Programming, Elsevier Science Inc., ISSN: 0020-0255, 2007.
[15] P. Melin, O. Castillo, Hybrid Intelligent Systems for Pattern Recognition using Soft Computing: An Evolutionary Approach for Neural Networks and Fuzzy Systems, Springer, 2005.
[16] P. Melin, O. Castillo, Hybrid Intelligent Systems for Pattern Recognition Using Soft Computing, Springer, Heidelberg, 2005.
[17] B. Morcego, Study of modular neural networks for modeling nonlinear dynamic systems, PhD thesis, Universitat Politecnica de Catalunya, Barcelona, Spain, 2000.
[18] I. Quiliano, Sistemas Modulares, Mezcla de Expertos y Sistemas Híbridos, February 2007, in Spanish, http://lisisu02.usal.es/~airene/capit7.pdf
[19] M. Mitchell, An introduction to genetic algorithms, MIT Press, 1998.
[20] M. Nowostawski, R. Poli, "Parallel Genetic Taxonomy". In: Proceedings of the Third International Conference on Knowledge-Based Intelligent Information Engineering Systems, December 1999, pp. 88-92. DOI: 10.1109/KES.1999.820127.
[21] A. Ross, K. Nandakumar, A.K. Jain, Handbook of Multibiometrics, Springer, 2006.
[22] P. Salazar, "Biometric recognition using techniques of hand geometry and voice with computer vision for feature extraction, neural networks and fuzzy logic", Master thesis, Division of Graduate Studies and Research in Computer Science, ITT, 2008, p. 57.
[23] R. Salinas, Neural Network Architecture Parametric Face Recognition, University of Santiago of Chile, 2000, pp. 5-9. http://cabierta.uchile.cl/revista/17/artículos/paper4/index.html
[24] R. Serrano, Multicore computing applied to Genetic Algorithms, Master Thesis, CITEDI-IPN, 2008.
[25] M. Shorlemmer, "Basic Tutorial of Neural Networks". In: Artificial Intelligence Research Institute, Barcelona, Spain, 1995.
[26] M. Soto Castro, Face and Voice Recognition in real time using Artificial Neural Networks, Master Thesis, Tijuana Institute of Technology, 2006.
ADAPTIVE ANT-COLONY ALGORITHM FOR SEMANTIC QUERY ROUTING
Claudia Gómez Santillán, Laura Cruz Reyes, Elisa Schaeffer, Eustorgio Meza, Gilberto Rivera Zarate
Abstract: The most prevalent P2P application today is file sharing, both among scientific users and the general public. A fundamental process in file sharing systems is the search mechanism. The unstructured nature of real-world largescale complex systems poses a challenge to the search methods, because global routing and directory services are impractical to implement. This paper presents a new antcolony algorithm, Adaptive Neighboring-Ant Search (AdaNAS), for the semantic query routing problem (SQRP) in a P2P network. The proposed algorithm incorporates an adaptive control parameter tuning technique for runtime estimation of the time-to-live (TTL) of the ants. AdaNAS uses three strategies that take advantage of the local environment: learning, characterization, and exploration. Two classical learning rules are used to gain experience on past performance using three new learning functions based on the distance traveled and the resources found by the ants. The experimental results show that the AdaNAS algorithm outperforms the NAS algorithm where the TTL value is not tuned at runtime. Keywords: parameter tuning, search algorithm, peer-topeer, adaptive algorithm, local environment, ant-colony algorithms.
1. Introduction Although popular for other uses, the World Wide Web is impractical for user-to-user file sharing as it requires centralized infrastructure such as an HTTP server. In the past decade, a new class of networks called peer-to-peer (P2P) systems began to spread as a solution to the increasing demand of file sharing among Internet users. In P2P networks, the users interconnect to offer their files to one another [1]. The participants, called peers, may connect and disconnect freely, and do so constantly, which triggers frequent changes in the network structure [2]. One of the main advantages is that peers are equal in terms of functionality and tasks which are developed. This produces high fault tolerance and auto-organization: peers form unstructured networks with an acceptable connectivity and performance. The Semantic Query Routing Problem (SQRP) consists in deciding, based on a set of keywords, to which neighbor to forward the query to search files related with the keywords [2], [3]. The lack of global structure caused that flooding-based search mechanisms have been mainly employed. Flooding-based mechanisms are simple, but unfortunately generate vast amounts of traffic in the network and may produce congestion on Internet. Existing approaches for
SQRP in P2P networks range from simple broadcasting techniques to sophisticated methods [1], [4], [5]. The latter includes proposals based on ant-colony systems [6] that are specifically suited for handling routing tables in telecommunications. There exist few algorithms used for SQRP, including SemAnt [3] and Neighboring-Ant Search (NAS) [7], the latter based on the former. In this work we propose an algorithm as an extension to NAS, called the Adaptive Neighboring-Ant Search (AdaNAS). AdaNAS is hybridized with three local strategies: learning, structural characterization and exploration. These strategies are aimed to produce a greater amount of results in a lesser amount of time. The time-to-live (TTL) parameter is tuned at runtime based on the information acquired by these three local strategies.
2. Background In order to place the research in context, this section is divided in four parts. The first part models a P2P network with graph theory, and in the second part we continue with structural characterization. The third part describes the basic ant-colony algorithms for SQRP algorithms and the last part explains parameter tuning and adaptation. 2.1. Graph Theory A P2P network is a distributed system that can be modeled mathematically as a graph, G = (V,E), where V is a set of nodes and E Í V x V is a set of (symmetrical) connections. For more information on graph theory, we recommend the textbook by Diestel [8]. Each peer in the network is represented by a node (also called a vertex) of the graph. The direct communications among the peers are represented by the connections (also called the edges) of the graph. We denote by n the number of nodes in the system and identify the nodes by integers, V = 1, 2, 3, ..., n. Two nodes that are connected are called neighbors; the set of all neighbors of a node i is denoted by G(i). The number of neighbors of a node i is called degree and is denoted by ki. Two nodes i and j are said to be connected if there exists at least one sequence of connections that begins at i, traverses from node to node through the connections of the graph, and ends at j. Such sequences of connections are called routes or paths and the number of connections traversed is the length of the route. 2.2.
Structural Characterization using Degree Distribution For the purpose of analyzing the structure and behavior of complex systems modeled as graphs, numerous characterization functions have been proposed [9]. There are two main types of these functions: those based on global inforArticles
85
Journal of Automation, Mobile Robotics & Intelligent Systems
mation that require information on the entire graph simultaneously and those based on local information that only access the information of a certain node i and its neighborhood G(i) at a time. The degree ki of a node i is a local measure of network. P(k) denotes the number of nodes that have degree k, normalized by n. The measure of P(k) can be interpreted as the probability that a randomly chosen node i has degree k. The values of P(k) for k Î [0, n-1] (supposing that there can only be at most one connection between each pair of distinct nodes) form the degree distribution of the graph. Whereas the degrees themselves are local properties, obtaining the degree distribution is a global computation. The degree distribution is widely used to classify networks according to the generation models that produce such distributions. Among the first and most famous generation models are the uniform random graphs of Erdös and Rényi [10], and Gilbert [11] that yield a binomial distribution that at the limit, approaches the Poisson distribution and most of the nodes in the graph have similar degrees [12]. In the past decade, another type of generation models became popular as various studies revealed that the degree distribution of some important real-world networks (including the WWW, the Internet, biological and social systems) was not Poisson distribution at all, but rather a power-law distribution [13], [14], [15], P(k) ~ k-g with values of g typically ranging between two and three. The models that produce such distributions are called scalefree network models. The notable structural property in networks with power law distribution is the presence of a small set of extremely well-connected nodes that are called hubs, whereas a great majority of the nodes has a very low degree [16], [17]. This property translates into high fault tolerance under random flaws, but high vulnerability under deliberate attack [14]. 2.3. Parameter Tuning andAdaptation Metaheuristics offer solutions that are often close to the optimum, but with a reasonable amount of resources used when compared to an exact algorithm. Unfortunately, the metaheuristics are usually rich in parameters. The choice of the values for the parameters is nontrivial and in many cases the parameters should vary during the runtime of the algorithm [18], [19]. The process of selecting the parameter values is known as tuning. The goal of offline tuning is to provide a static initial parameter configuration to be used throughout the execution of the algorithm, whereas online tuning, also known as parameter control or adaptation, is the process of adjusting the parameter values at runtime. We design a discrete model for adaptation based on the proposed by Holland in 1992 [20]. We assume that the system takes actions at discrete steps t = 1, 2, 3,..., as this assumption applies to practically all computational System. The proposed model is described in section four..
3. SQRP Search Strategies
In this section we present the problem addressed in this work. First, we describe the semantic query routing problem (SQRP) as a search process. Then, strategies for solving SQRP are presented, including our proposed algorithm
which uses an adaptive strategy for adjusting an important parameter of the search process: the TTL.
3.1. SQRP Description
SQRP is the problem of locating information in a network based on a query formed by keywords. The goal in SQRP is to determine short routes from a node that issues a query to those nodes of the network that can appropriately answer the query by providing the requested information. Each query traverses the network, moving from the initiating node to a neighboring node and then to a neighbor of a neighbor and so forth, until it locates the requested resource or gives up in its absence. Due to the complexity of the problem [2], [3], [5], [21], [22], [23], solutions proposed for SQRP typically restrict themselves to special cases.
The general strategies of SQRP algorithms are the following. Each node maintains a local database of documents ri called the repository. The search mechanism is based on nodes sending messages to the neighboring nodes to query the contents of their repositories. The queries qi are messages that contain keywords describing the searched resource; each node that receives a query examines its repository for possible matches. If this examination produces results for the query, the node responds by creating another message informing the node that launched the query of the resources available in the responding node. If there are no results, or too few results, the node that received the query forwards it to one or more of its neighbors. This process is repeated until some predefined stopping criterion is reached. An important observation is that in a P2P network the connection pattern varies across the network (heterogeneous topology); moreover, the connections may change over time, and this may alter the routes available for messages to take.
3.2. SQRP Algorithms
The most popular technique for searching in P2P systems is flooding, where each message is assigned a positive integer parameter known as the time-to-live (TTL) of the message. As the message propagates from one node to another, the value of TTL is decreased by one by each forwarding node. When TTL reaches zero, the message is discarded and no longer propagated in the system. The main disadvantage of flooding is the rapid congestion of the communication channels [24]. Another widely used search strategy is the random walk [21]. A random walk in a graph is a route where the node following the initiating node is chosen uniformly at random among its neighbors.
3.2.1. AntSearch
Wu et al. [23] propose an algorithm called AntSearch. The main idea of the AntSearch algorithm is to use pheromone values to identify the free-riders and prevent sending messages to those peers, in order to reduce the amount of redundant messages. The estimation of a proper TTL value for a query flooding is based on the popularity of the resources. Wu et al. use three metrics to measure the performance of AntSearch. One is the number of searched files for a query with a required number of results R: a good search algorithm should retrieve a number of results above but close to R. The second one is the cost per result, defined as the total number of query messages divided by the number of retrieved results; this metric measures
how many query messages, on average, are generated to obtain one result. Finally, search latency is defined as the total time taken by the algorithm.
3.2.2. SemAnt
Algorithms that incorporate information on past search performance include the SemAnt algorithm [3], [25], which uses an ant-colony system to solve SQRP in a P2P environment. SemAnt seeks to optimize the response to a certain query according to the popularity of the keywords used in the query. The algorithm takes into account network parameters such as bandwidth and latency. In SemAnt, the queries are the ants, which operate in parallel and place pheromone on successful search routes. This pheromone evaporates over time to gradually eliminate old or obsolete information. Michlmayr [3] also considers parameter tuning for the SemAnt algorithm, including manual adjustment of the TTL parameter over a set of possible values {15, 20, 25, 30, 35}, and concludes that 25 is the best value for the parameter. The adjustment of the TTL is made without simultaneous tuning of the other parameters.
3.2.3. Neighboring-Ant Search
NAS [7] is also an ant-colony system, but it incorporates a local structural measure to guide the ants toward nodes that have better connectivity. The algorithm has three main phases: an evaluation phase that examines the local repository and incorporates the classical lookahead technique [4], a transition phase in which the query propagates in the network until its TTL is exhausted, and a retrieval phase in which the pheromone tables are updated. The most relevant aspects of earlier works have been incorporated into the NAS algorithm. The framework of the AntNet algorithm is modified to match the problem conditions: in AntNet the final addresses are known, whereas NAS has no a priori knowledge of where the resources are located. On the other hand, unlike AntSearch, the SemAnt algorithm and NAS focus on the same problem conditions, and both are based on the AntNet algorithm. However, the difference between SemAnt and NAS is that SemAnt only learns from past experience, whereas NAS also takes advantage of the local environment. This means that the search in NAS is guided by the classic local exploration method of lookahead [4], by the local structural metric DDC [26], which measures the differences between the degree of a node and the degrees of its neighbors, and by three local functions of past algorithm performance. NAS outperforms methods proposed in the literature, such as Random-Walk and SemAnt [7].
3.2.4. Adaptive Neighboring-Ant Search
The algorithm proposed in this work, Adaptive Neighboring-Ant Search (AdaNAS), is largely based on the NAS algorithm, but it includes the adaptation of the TTL parameter at runtime, in addition to other changes. The mechanism that may extend the TTL of an ant is called the survival rule. It incorporates information on past queries, relying on the learning strategies included in AdaNAS, basic characteristics of SQRP, and a set of parameters that are adjusted according to the results found when using the survival
rule itself. The rule evaluates the length of the shortest known route that begins with the connection (i, j) from the current node i to a node that contains good results for the query t. The form in which the algorithm operates is explained in detail later in Sections 4 and 5.
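For readers unfamiliar with the baseline strategies mentioned in Section 3.2, the following sketch (our illustration, not code from the paper) shows TTL-limited flooding and a TTL-limited random walk over an adjacency-list graph; the predicate matches() stands in for "the node's repository answers the query" and, like all names here, is an assumption of ours.

import random
from collections import deque

def flood(graph, start, matches, ttl):
    """Breadth-first flooding limited by a TTL; returns matching nodes."""
    found, frontier, seen = [], deque([(start, ttl)]), {start}
    while frontier:
        node, t = frontier.popleft()
        if matches(node):
            found.append(node)
        if t == 0:
            continue
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, t - 1))
    return found

def random_walk(graph, start, matches, ttl):
    """Follow one random path until a match is found or the TTL runs out."""
    node = start
    for _ in range(ttl):
        if matches(node):
            return node
        node = random.choice(list(graph[node]))
    return None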
4. AdaNAS Model
In this section, we present a multi-agent model in order to describe the adaptive behavior of AdaNAS.
4.1. The General Model
The environment is the P2P network, in which two stimuli or inputs are observed: I1, the occurrences of the documents being searched, and I2, the degree ki of the node i. The environment sends the stimuli in the following order: observing I1 has a higher priority than observing I2. AdaNAS is an ant-colony system where each ant is modeled as an agent. AdaNAS has four agent types. The query ant is responsible for attending to the users' queries and creating the Forward Ant; moreover, it updates the pheromone table by means of evaporation. There is a query ant for each node in the network, and it stays there while the algorithm is running. The Forward Ant uses the learning strategies for steering the query and, when it finds resources, creates a backward ant. It finishes the routing process when its TTL is zero or when the number of found resources reaches the required amount, denoted by R; it then creates an update ant. The backward ant is responsible for informing the query ant of the number of resources found at a node by the Forward Ant. In addition, it updates the values of some learning structures that are the basis of the survival rule, which will be explained later (Section 4.2.3). The update ant drops pheromone on the nodes of the path generated by the Forward Ant. The amount of pheromone deposited depends on the quantity of found resources (hits) and the number of edges traveled (hops) by the Forward Ant.
The system is subdivided into four parts: the structures A to adapt to the environment (called agents), the adaptation plan P, the memory M, and the operators O. Typically, A has various alternative states A1, A2, A3, ..., among which one is to be chosen for the system according to the observations made on the environment. On the other hand, P is typically a set of rules, one or more of which can be applied. These rules apply the operations in the set O. An operator is either a deterministic function, denoted as (Ai, Pj) → Ak, or a stochastic function that maps to a probability distribution over a set of states from which Ak is selected. The memory M permits the system to collect information on the condition of the environment and of the system itself, to use it as a basis for decision making. The observations of the environment are taken as stimuli that trigger the operators. The routing process implemented in the Forward Ant is required to be adaptive; thus A is defined in terms of this agent. The possible states for A are five:
A1: No route has been assigned and the Forward Ant is at the initial node. The ant can only be activated when the query ant sends it a query, and it can receive each stimulus only once.
A2: A route has been assigned and the TTL has not reached zero.
A3: The TTL is zero.
A4: The Forward Ant used the survival rule to extend its TTL.
A5 = X: The Forward Ant has reached its terminal state.
Figure 1 shows the AdaNAS adaptive model. According to the stimuli, the number of documents found (dotted line, I1) and the degree of the node (solid line, I2), an operator is selected. The line style of the state transitions follows that of the stimuli: dotted for transitions triggered by I1 and solid for those triggered by I2.
The memory M is basically composed of four structures that store information about previous queries. The first of these structures is the three-dimensional pheromone table τ. The element τi,j,t is the preference for moving from node i to a neighboring node j when searching for a keyword t. In this work, we assume that each query contains one keyword, and the total number of keywords (or concepts) known to the system is denoted by C. The pheromone table M1 = τ is split into n two-dimensional tables, one for each node. These tables only contain the entries τj,t for a fixed node i and hence have dimensions of at most C × |G(i)|. The other three structures are also three-dimensional tables, M2 = D, M3 = N and M4 = H, each split into n local two-dimensional tables in the same manner. The information in these structures is of the following kind: currently being at node i and searching for t, there is a route of length Di,j,t starting at the neighbor j that leads to a node, identified in Ni,j,t, that contains Hi,j,t hits or matching documents.
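A possible in-memory layout for these per-node structures, written by us purely for illustration, is a set of nested dictionaries keyed by neighbor and keyword; the default values below are assumptions (the pheromone default matches the τ0 = 0.009 initialization reported later in Table 2).

# Illustrative layout of the local learning structures of one node i (not the
# authors' code). For a neighbor j and keyword t:
#   tau[j][t] -> pheromone preference
#   D[j][t]   -> length of the best known route through j
#   N[j][t]   -> node at the end of that route
#   H[j][t]   -> number of hits stored at that node
from collections import defaultdict

def make_node_memory(tau0=0.009):
    tau = defaultdict(lambda: defaultdict(lambda: tau0))
    D = defaultdict(lambda: defaultdict(lambda: float("inf")))
    N = defaultdict(dict)
    H = defaultdict(lambda: defaultdict(int))
    return tau, D, N, H

tau, D, N, H = make_node_memory()
tau[2]["music"] += 0.05   # reinforce the edge toward neighbor 2 for keyword "music"
D[2]["music"] = 3         # best known route through neighbor 2 has length 3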
Fig. 1. General AdaNAS adaptive model.
The adaptive plans P are the following:
P1: The backward ant. Created when a resource is found; it modifies M2, M3 and M4.
P2: The update ant. It modifies the pheromone table M1 = τ when the TTL of the Forward Ant has reached 0 and the survival rule cannot proceed.
P3: The transition rule. It selects the next node by applying the learning stored in the pheromone trails and in the memory structure M2 = D.
P4: The survival rule. It proceeds when the learning stored in M2, M3 and M4 permits extending the TTL, and it determines by how much the TTL must be extended.
P5: The modified transition rule. A variation of the transition rule that eliminates the pheromone and degree effects.
The operators O of AdaNAS are the following:
O1: (A1, I1) - P1 → A1: The ant has just been created and documents were found at the initial node (I1), so a backward ant updates the short-term memory (P1) and no change in the agent state is produced.
O2: (A1, I2) - P3 → A2: The ant has just been created and must select a neighbor node (I2) according to the transition rule (P3); this produces a route tracked by the ant (A2).
O3: (A2, I1) - P1 → A2: The ant has been assigned a route (A2) and documents were found at the current node (I1), so a backward ant updates the short-term memory (P1) and no change in the agent state is produced.
O4: (A2, I2) - P3 → An | An ∈ {A2, A3}: When the ant has a partial route (A2), it must select the next node from the neighborhood (I2), so it applies the transition rule (P3). The application of the transition rule can cause the TTL to run out (A3) or not (A2).
O5: (A3, I1) - P1 → A3: As in O3, but now the ant has TTL = 0 (A3).
O6: (A3, I2) - P4 → A4: The ant has run out of TTL (A3) and, with neighborhood information (I2), it applies the survival rule (P4). When P4 is applied, the ant continues its activity within an extended time-to-live (A4).
O7: (A3, I2) - P2 → X: The ant has run out of TTL (A3) and, with neighborhood information (I2), it decides that the path must end. In order to reach its final state (X), the Forward Ant creates an update ant, which performs the pheromone update (P2).
O8: (A4, I1) - P1 → A4: As in O3, but now the ant performs its activity within an extended time-to-live (A4).
O9: (A4, I2) - P5 → An | An ∈ {A3, A4}: Within an extended TTL, using the modified transition rule (P5), the ant must choose a vertex from the neighborhood (I2) as the next node in the route. The application of the rule can cause the extended TTL to run out (A3) or not (A4).
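Read as a table, the operators define a small state machine. The sketch below (ours, purely illustrative) encodes that mapping from (state, stimulus) to (plan, possible next states); where the successor depends on whether the TTL runs out, both possibilities are recorded.

# Illustrative encoding of the (state, stimulus) -> (plan, next states) table.
OPERATORS = {
    ("A1", "I1"): ("P1", ["A1"]),          # O1
    ("A1", "I2"): ("P3", ["A2"]),          # O2
    ("A2", "I1"): ("P1", ["A2"]),          # O3
    ("A2", "I2"): ("P3", ["A2", "A3"]),    # O4
    ("A3", "I1"): ("P1", ["A3"]),          # O5
    ("A3", "I2"): ("P4", ["A4"]),          # O6 (survival rule applicable)
    # O7 also matches ("A3", "I2") but leads to the terminal state X when the
    # survival rule cannot proceed; that choice is made by the rule itself.
    ("A4", "I1"): ("P1", ["A4"]),          # O8
    ("A4", "I2"): ("P5", ["A3", "A4"]),    # O9
}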
The general model is illustrated in Figure 1, where the transitions among the states of the Forward Ant can be observed.
4.2. Behavior Rules
An ant-colony algorithm has rules that determine its behavior. These rules define how the ants construct and evaluate the solution and how the pheromone is updated and used. Although the pheromone is the main learning structure, AdaNAS has three more, D, N and H, which record the distances toward the nodes whose repositories contain matching documents. AdaNAS has several behavior rules: the transition rule, the update rules, the survival
rule and the modified transition rule.
4.2.1. Transition Rule
The transition rule P3 considers two structures to determine the next state: τ and D. This transition rule is used by an ant x that is searching for the keyword t and is located at the node r. The rule is formulated in Equation 1:
(1)
where p is a pseudo-random number, q is an algorithm parameter that defines the probability of using the exploitation technique, G(r) is the set of neighbor nodes of r, Lx is the set of nodes previously visited by x, and the exploitation strategy is defined by Equation 2:
(2)
where ωd is the parameter that defines the importance of the degree, ωi defines the importance of the distance toward the nearest node with matching documents (Dr,i,t), β1 intensifies the contribution of the local metrics (degree and distance), β2 intensifies the contribution of the pheromone (τr,i,t), and k(r, i) is a normalized degree measure expressed in Equation 3:
(3)
N° 1
2011
and L is the exploration technique, expressed in Equation 4:
(4)
where f({px,r,i,t | i ∈ G(r)}) is a roulette-wheel random selection function that chooses a node i according to its probability px,r,i,t, which indicates the probability that the ant x moves from r to i when searching for keyword t, and is defined in Equation 5:
(5)
The tables D and τ were described in the previous section. The exploration strategy L is activated when p ≥ q and stimulates the ants to search for new paths. In case p < q, the exploitation strategy is selected: it prefers nodes that provide a greater amount of pheromone and better connectivity with a smaller number of hops toward a resource. As shown in the transition rule, β2 is the intensifier of the pheromone trail and β1 is the intensifier of the local metrics; this means that the algorithm is steered only by the local metrics when β2 = 0, or only by the pheromone when β1 = 0. In this work the initial values are β1 = 2 and β2 = 1.
4.2.2. Update Rules
There are two basic update rules in an ant-colony algorithm: the evaporation and the increment of pheromone. The evaporation method of AdaNAS is based on the technique used in SemAnt [3], while the increment strategy is based on the one proposed in NAS [7]. Both update rules are described below.
Pheromone Evaporation Rule. Pheromone evaporation is a strategy whose purpose is to avoid the edges taking very large pheromone-trail values, which would cause a greedy behavior of the algorithm. Each time unit, the query ant reduces the pheromone trail of the node where it resides by multiplying the trail by the evaporation rate ρ, which is a number between zero and one. To avoid very low pheromone values, the rule incorporates a second term consisting of the product ρτ0, where τ0 is the initial pheromone value. Equation 6 expresses the pheromone evaporation rule mathematically.
(6)
Pheromone Increment Rule. When a Forward Ant finishes, it must express its performance in terms of pheromone by means of an update ant, whose function is to increase the quantity of pheromone depending on the number of documents found and the number of edges traversed by the Forward Ant. This is done each time an update ant passes through a node. Equations 7 and 8 describe the pheromone increment rule.
(7)
where τr,s,t is the preference for going to s when the Forward Ant is at r and is searching for keyword t, and Δτr,s,t(x) is the amount of pheromone dropped on τr,s,t by a backward ant generated by the Forward Ant x, which can be expressed as:
(8)
where hits(x, s) is the number of documents found by the Forward Ant x from s to the end of its path, and hops(x, r) is the length of the trajectory traversed by the Forward Ant x from r to the final node of its route, passing through s.
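The extracted text does not preserve Equations 1–5 themselves, so the sketch below is only our reading of the prose: an ACS-style pseudo-random proportional choice that exploits the best-scored unvisited neighbor with probability q and otherwise explores via roulette-wheel selection. The scoring function score() stands in for the combination of pheromone, degree and distance described above and is an assumption; the default q = 0.9 follows the q0 value in Table 2.

import random

def choose_next_node(neighbors, visited, score, q=0.9):
    """Pick the next node: exploit the best score with probability q, else explore."""
    candidates = [j for j in neighbors if j not in visited]
    if not candidates:
        return None
    if random.random() < q:                      # exploitation (p < q)
        return max(candidates, key=score)
    weights = [score(j) for j in candidates]     # exploration (roulette wheel)
    return random.choices(candidates, weights=weights, k=1)[0]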
4.2.3. Survival Rule
P1 (the backward ant) updates the memory structures M2 = D, M3 = N, and M4 = H. These structures are used in the survival rule (P4) to increase the time-to-live. The survival rule can only be applied when the TTL is zero. It can be expressed mathematically in terms of the structures H, D and N as seen in Equation 9:
(9)
where ΔTTL(x, i, t) is the increment assigned to the TTL of ant x (that is, the number of additional steps that the ant will be allowed to take) when searching for resources that match t, currently being at node i. The number of additional steps Di,w(x,i,t),t needed to arrive at the node w(x, i, t) is determined from the shortest paths generated by previous ants, and it is granted when its associated efficiency W(x, i, t) is better than Zx, which is a measure of the current performance
of the ant x. The auxiliary functions are shown in Equations 10 and 11:
(10)
(11)
where G(i) is the set of neighbors of node i and Lx is the set of nodes previously visited by the ant x. The tables of hits H, of distances D, and of nodes N were explained in the previous section. The function w(x, i, t) determines which neighbor of the current node i, not yet visited, has previously produced the best efficiency in serving a query on t, where the efficiency is measured by W(x, i, t).
4.2.4. Modified Transition Rule
The modified transition rule is a special case of the transition rule (see Equations 4 and 5) where β2 = 0, ωd = 0 and q = 1. This rule is greedy and causes the replication of paths generated by previous ants. It takes effect when the TTL has been extended, replacing the normal transition rule. Mathematically, it can be expressed by Equations 12 and 13:
(12)
where lm is the modified transition rule, r is the current node in the path, t is the searched keyword, Lx is the set of nodes visited by the Forward Ant x, and
(13)
where ωi is a parameter that defines the influence of Dr,i,t, the distance needed to arrive at the known nearest node with documents matching the keyword t, from r passing through i, and β1 is the distance intensifier.
5. AdaNAS Algorithm
AdaNAS is a metaheuristic algorithm in which a set of independent agents called ants cooperate indirectly and sporadically to achieve a common goal. The algorithm has two objectives: it seeks to maximize the number of resources found by the ants and to minimize the number of steps taken by the ants. AdaNAS guides the queries toward nodes that have better connectivity using the local structural metric degree [26]; in addition, it uses the well-known lookahead technique [25], which, by means of data structures, allows a node to know the repository contents of its neighboring nodes. The AdaNAS algorithm performs all the queries in parallel using query ants. The process carried out by a query ant is shown in Algorithm 1. Each node has only one query ant, which generates a Forward Ant x for attending a single user query, assigning the searched keyword t to the Forward Ant. Moreover, each query ant periodically performs the local pheromone evaporation of the node where it resides. Algorithm 2 shows the process performed by the Forward Ant; as can be observed, all Forward Ants act in parallel. In an initial phase (lines 4-8), the ant checks the local repository and, if it finds matching documents, it creates a backward ant y. Afterwards, it performs the search process (lines 9-25) while it is alive and has not found R documents. The search process has three sections: evaluation of results, evaluation and application of the TTL extension, and selection of the next node (lines 24-28).
Algorithm 1: Query ant algorithm
1  in parallel for each query ant w located in the node r
2    while the system is running do
3      if the user queries to find R documents with keyword t then
4        create Forward Ant x(r, t, R)
5        activate x
6      end
7      apply pheromone evaporation
8    end
9  end of in parallel

The first section, the evaluation of results (lines 10-15), implements the classical lookahead technique. That is, the ant x, located at a node r, checks the lookahead structure, which indicates how many matching documents are stored at each neighbor node of r. This function needs three parameters: the current node (r), the keyword (t) and the set of known nodes (known) of the ant. The set known indicates which nodes the lookahead function should ignore, because their matching documents have already been taken into account. If some resource is found, the Forward Ant creates a backward ant and updates the number of found matching documents. The second section (lines 16-23) is the evaluation and application of the TTL extension. In this section the ant verifies whether the TTL has reached zero; if so, the ant attempts to extend its life and, if it succeeds, it changes the normal transition rule by modifying some parameters (line 21) in order to obtain the modified transition rule. The third section (lines 24-30) of the search-process phase is the selection of the next node. Here, the transition rule (normal or modified) is applied to select the next node r, and some structures are updated. The final phase occurs when the search process finishes; then the Forward Ant creates an update ant z to perform the pheromone update. Algorithm 3 presents the parallel behavior of each backward ant, which traverses in reverse the path given by the Forward Ant. In each node that it visits, it tries to update the structures D, H and N, which will be used for future queries (lines 7-11). The update is performed if the new values point to a nearer node (line 7). After that, it informs the query ant of the initial node of the path how many documents the Forward Ant found and which path it used (line 13). Algorithm 4 presents the concurrent behavior of each update ant, which traverses in reverse the path given by the Forward Ant. In each node that it visits, it updates the pheromone trail using Equation 6 (line 5).
6. Experiments
In this section, we describe the experiments we carried out to compare the AdaNAS and NAS algorithms.
Algorithm 2: Forward ant algorithm
1   in parallel for each Forward Ant x(r, t, R)
2     initialization: TTL = TTLmax, hops = 0
3     initialization: path = r, L = r, known = r
4     results = get_local_documents(r)
5     if results > 0 then
6       create backward ant y(path, results, t)
7       activate y
8     end
9     while TTL >= 0 and results < R do
10      la_results = look_ahead(r, t, known)
11      if la_results > 0 then
12        create backward ant y(path, la_results, t)
13        activate y
14        results = results + la_results
15      end
16      if TTL > 0 then
17        TTL = TTL - 1
18      else
19        if (results < R) and (ΔTTL(x, results, hops) > 0) then
20          TTL = TTL + ΔTTL(x, results, hops)
21          change parameters: q = 1, ωd = 0, β2 = 0
22        end
23      end
24      hops = hops + 1
25      known = known ∪ (r ∪ G(r))
26      L = L ∪ r
27      r = l(x, r, t)
28      add to path(r)
29    end
30    create update ant z(x, path, t)
31    activate z
32    kill x
33  end of in parallel
6.1. Generation of the Test Data
An SQRP instance is formed by three separate files: topology, repositories, and queries. We generated the experimental instances largely following those of NAS reported by Cruz et al. [7] in order to achieve comparable results.
The structure of the environment in which the described process is carried out is called the topology, and it refers to the pattern of connections that the nodes form on the network. The generation of the topology (T) was based on the method of Barabási et al. [27] to create a scale-free network. We created topologies with 1024 nodes; the number of nodes was selected based on recommendations in the literature [3], [28]. The local repository (R) of each node was generated using "topics" obtained from the ACM Computing Classification System taxonomy (ACMCCS). This database contains a total of 910 distinct topics. The contents are also scale-free: the nodes contain many documents in their repositories on the same topic (identified by keywords) and only a few documents on other topics.
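As an illustration of this setup (not the authors' generator), a scale-free topology of comparable size can be produced with the Barabási-Albert model available in the networkx library; the attachment parameter m and the repository sizes below are our own assumptions, not values reported in the paper.

import networkx as nx
import random

n_nodes = 1024                                          # as in the experiments above
G = nx.barabasi_albert_graph(n=n_nodes, m=2, seed=42)   # m = 2 is an assumption

# Assign each node a small repository of topic ids (illustrative only; the paper
# uses a scale-free content distribution over the 910 ACMCCS topics).
topics = list(range(910))
repository = {v: random.choices(topics, k=random.randint(1, 20)) for v in G.nodes}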
Algorithm 3: Backward ant algorithm
1   initialization: hops = 0
2   in parallel for each backward ant y(path, results, t)
3     for i = |path| - 1 to 1 do
4       r = path(i - 1)
5       s = path(i)
6       hops = hops + 1
7       if Dr,s,t > hops then
8         Dr,s,t = hops
9         Hr,s,t = results
10        Nr,s,t = path(h)
11      end
12    end
13    send (results, path) to the query ant located in path(1)
14    kill y
15  end of in parallel
Algorithm 4: Update ant algorithm
1   in parallel for each update ant z(path, t, x)
2     for i = |path| - 1 to 1 do
3       r = path(i - 1)
4       s = path(i)
5       τr,s,t = τr,s,t + Δτr,s,t(x)
6     end
7     kill z
8   end of in parallel
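A compact Python rendering of what Algorithms 3 and 4 do with the per-node tables, written by us for illustration: each node r is assumed to hold dictionary tables nodes[r]["D"], ["H"], ["N"] and ["tau"] analogous to the structures of Section 4.1; the pheromone increment delta is passed in rather than derived from Equation 8, which is not preserved in this text, and recording the last node of the path in N is our reading of the garbled line 10.

def backward_update(nodes, path, results, keyword):
    """Walk the path backwards and record shorter routes in D, H, N (Algorithm 3)."""
    hops = 0
    for i in range(len(path) - 1, 0, -1):
        r, s = path[i - 1], path[i]
        hops += 1
        D, H, N = nodes[r]["D"], nodes[r]["H"], nodes[r]["N"]
        if D[s][keyword] > hops:          # a shorter known route through s
            D[s][keyword] = hops
            H[s][keyword] = results
            N[s][keyword] = path[-1]      # node where the resources were found

def pheromone_update(nodes, path, keyword, delta):
    """Walk the path backwards and reinforce the pheromone table (Algorithm 4)."""
    for i in range(len(path) - 1, 0, -1):
        r, s = path[i - 1], path[i]
        nodes[r]["tau"][s][keyword] += delta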
For the generation of the queries (Q), each node was assigned a list of possible topics to search. This list is limited by the total number of topics of the ACMCCS. During each step of the experiment, each node has a probability of 0.1 of launching a query, selecting the topic uniformly at random within the list of possible topics of the node repository. The probability distribution of Q determines how often a query is repeated in the network; when the distribution is uniform, each query is duplicated 100 times on average. The topology and the repositories were kept static, whereas the queries were launched randomly during the simulation. Each simulation was run for 15,000 queries during 500 time units, each unit lasting 100 ms. The average performance was studied by computing three performance measures for every 100 queries:
Average hops, defined as the average number of links traveled by a Forward Ant until its death, that is, until either the maximum number of required results R is reached or the TTL runs out.
Average hits, defined as the average number of resources found by each Forward Ant until its death.
Average efficiency, defined as the average number of resources found per traversed edge (hits/hops).
6.2. Parameters
The configuration of the algorithms used in the experimentation is shown in Tables 1 and 2. The first column is the parameter, the second column is the parameter value, and the third column is a description of the parameter. These parameter values were based on recommendations from the literature [3], [6], [7], [29], [30].
6.3. Results
The goal of the experiments was to examine the effect of the strategies incorporated in the AdaNAS algorithm and to determine whether there is a significant contribution to the average efficiency. The main objective of SQRP is to find a set of paths between the nodes launching the queries
Table 1. Parameter configuration of the NAS algorithm.
PARAMETER | VALUE | DEFINITION
α         | 0.07  | Global pheromone evaporation factor
ρ         | 0.07  | Local pheromone evaporation factor
β         | 2     | Intensifier of the pheromone trail
τ0        | 0.009 | Pheromone table initialization
q0        | 0.9   | Relative importance between exploration and exploitation
R         | 10    | Maximum number of results to retrieve
TTLmax    | 10    | Initial TTL of the Forward Ants
ω         | 0.5   | Relative importance of the resources found and the TTL
Table 2. Parameter configuration of the AdaNAS algorithm.
PARAMETER | VALUE | DEFINITION
ρ         | 0.07  | Local pheromone evaporation factor
β1        | 2     | Intensification of the local measurements (degree and distance) in the transition rule
β2        | 1     | Intensification of the pheromone trail in the transition rule
τ0        | 0.009 | Pheromone table initialization
q0        | 0.9   | Relative importance between exploration and exploitation in the transition rule
R         | 10    | Maximum number of results to retrieve
TTLmax    | 10    | Initial TTL of the Forward Ants
ωh        | 0.5   | Relative importance of the hits and hops in the increment rule
ωd        | 1     | Influence of the degree in the transition rule
ωi        | 1     | Influence of the distance in the transition rule
Fig. 2. Learning evolution in terms of the number of resources found for the AdaNAS and NAS algorithms.
and the nodes containing the resources, such that the efficiency is as high as possible; that is, the quantity of found resources is maximized and the quantity of steps taken to find the resources is minimized. Figure 2 shows the average hits obtained during 15,000 queries with the AdaNAS and NAS algorithms on an example instance. NAS starts off at approximately 13.4 hits per query; at the end, the average increases to 14.7 hits per query. For AdaNAS the average hit count starts at 16 and after 15,000 queries ends at 18.3. On the other hand, Figure 3 shows the average hops performed during a set of queries with NAS and AdaNAS. NAS starts at approximately 17.4 hops per query; at the end, the average decreases to 15.7 hops per query. For AdaNAS the average hop count starts at 13.7 and after 15,000 queries ends at 9.1. Finally, Figure 4 shows the average efficiency obtained during a set of queries. NAS starts at approximately 0.76 hits per hop; at the end, it increases to 0.93 hits per hop. For AdaNAS the average efficiency starts at 1.17 hits per hop and after 15,000 queries
the average efficiency ends at 2 hits per hop. The adaptive strategies of AdaNAS yield an increase of 24.5% in found documents, but the biggest contribution is a 40% reduction in hops, giving a final efficiency approximately twice that of NAS. This observation suggests that the use of the degree instead of the DDC was profitable. In addition, the incorporation of the survival rule improves the efficiency, because it guides the Forward Ants to nodes that can satisfy the query. Moreover, in future work it will be important to study adaptive strategies for other parameters, as well as the initial parameter configuration of the algorithm, in search of further improvements in efficiency. Figure 5 shows the results of the different experiments applied to NAS and AdaNAS over thirty runs of each of ninety different instances generated with the characteristics described in Section 6.1. It can be seen that on all the instances the AdaNAS algorithm outperforms NAS. On average, AdaNAS had an efficiency 81% better than NAS.
Fig. 3. Learning evolution in terms of the length of the route taken for AdaNAS and NAS algorithms.
Fig. 4. Learning evolution in terms of the efficiency (hits/hop) for AdaNAS and NAS algorithms.
Fig. 5. Comparison between NAS and AdaNAS experimenting with 90 instances.
7. Conclusions
For the solution of SQRP, we proposed a novel algorithm called AdaNAS, which is based on existing state-of-the-art ant-colony algorithms. The AdaNAS algorithm incorporates adaptive parameter-control techniques to estimate a proper TTL value for dynamic text-query routing. In addition, it incorporates local strategies that take advantage of the environment at the local level; three functions were used to learn from past performance. This combination resulted in a lower hop count and an improved hit count, outperforming the NAS algorithm. Our experiments confirmed that the proposed techniques are effective at improving the search efficiency. Specifically, the
AdaNAS algorithm showed an improvement of 81% in efficiency over the NAS algorithm. As future work, we plan to study more profoundly the relation among the SQRP characteristics, the configuration of the algorithm and the local environment strategies employed in the learning of ant-colony algorithms, as well as their effect on the hop and hit count measures.
ACKNOWLEDGMENTS
This research was supported in part by CONACYT, DGEST and IPN.
AUTHORS Claudia Gómez Santillán - Instituto Tecnológico de Ciudad Madero (ITCM). 1ro. de Mayo y Sor Juana I. de la Cruz s/n CP. 89440, Tamaulipas, México. Tel.: (52) 833 3574820 Ext. 3024 and Instituto Politécnico Nacional, Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada (IPN-CICATA). Carretera Tampico-Puerto Industrial Alt., Km.14.5. Altamira, Tamps., México. Tel.: 018332600124. E-mail: cggs71@hotmail.com. Laura Cruz Reyes*, Gilberto Rivera Zarate - Instituto Tecnológico de Ciudad Madero (ITCM). 1ro. de Mayo y Sor Juana I. de la Cruz s/n CP. 89440, Tamaulipas, México. Tel.: (52) 833 3574820 Ext. 3024. E-mails: lcruzreyes@prodigy.net.mx and riveragil@gmail.com. Eustorgio Meza, Elisa Schaeffer - CIIDIT & FIME, Universidad Autónoma de Nuevo León (UANL), Av. Universidad s/n, Cd. Universitaria, CP.66450, San Nicolás de los Garza, N.L., México. Phone: (52)8113404000 Ext.1509. E-mails: elisa@yalma.fime.uanl.mx, emezac@ipn.mx. * Corresponding author
References
[1] Sakarayan G., A Content-Oriented Approach to Topology Evolution and Search in Peer-to-Peer Systems. PhD thesis, University of Rostock, 2004.
[2] Wu L.-S., Akavipat R., Menczer F., "Adaptive query routing in peer Web search". In: Proc. 14th International World Wide Web Conference, 2005, pp. 1074-1075.
[3] Michlmayr E., Ant Algorithms for Self-Organization in Social Networks. PhD thesis, Vienna University of Technology, 2007.
[4] Mihail M., Saberi A., Tetali P., "Random walks with lookahead in power law random graphs", Internet Mathematics, no. 3, 2004.
[5] Tempich C., Staab S., Wranik A., "REMINDIN': Semantic Query Routing in Peer-to-Peer Networks based on Social Metaphors". In: 13th World Wide Web Conference (WWW), 2004.
[6] Dorigo M., Gambardella L.M., "Ant colony system: A cooperative learning approach to the traveling salesman problem", IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, 1997, pp. 53-66.
[7] Cruz L., Gómez C., Aguirre M., Schaeffer S., Turrubiates T., Ortega R., Fraire H., "NAS algorithm for semantic query routing systems in complex networks". In: DCAI, Advances in Soft Computing, vol. 50, 2008, pp. 284-292.
[8] Diestel R., Graph Theory, Graduate Texts in Mathematics, vol. 173, Springer-Verlag: New York, USA, 2000.
[9] Costa L., Rodríguez F.A., Travieso G., Villas P.R., "Characterization of complex networks: A survey of measurements", Advances in Physics, no. 56, 2007, pp. 167-242.
[10] Erdős P., Rényi A., On the evolution of random graphs, vol. 2, Akadémiai Kiadó, Budapest, Hungary, 1976. First published in MTA Mat. Kut. Int. Kozl., 1960, pp. 482-525.
[11] Gilbert E., "Random graphs", Annals of Mathematical Statistics, 30(4), Dec. 1959, pp. 1141-1144.
[12] Bollobás B., Random Graphs, Cambridge Studies in Advanced Mathematics, vol. 73, Cambridge University Press, Cambridge, UK, 2nd edition, 2001.
[13] Adamic L., Huberman B., "Power-law distribution of the World Wide Web", Science, no. 287(5461), 2000, p. 2115.
[14] Albert R., Jeong H., Barabási A., "Error and attack tolerance of complex networks", Nature, no. 506, 2000, pp. 378-382.
[15] Faloutsos M., Faloutsos P., Faloutsos C., "On power-law relationships of the Internet topology", ACM SIGCOMM Computer Communication Review, 1999, pp. 251-262.
[16] Barabási A., Emergence of scaling in complex networks, Wiley-VCH, 2003, pp. 69-82.
[17] Newman M.E.J., "Power laws, Pareto distributions and Zipf's law", Contemporary Physics, vol. 46(5), 2005, pp. 323-351.
[18] Birattari M., The Problem of Tuning Metaheuristics as Seen From a Machine Learning Perspective. PhD thesis, Bruxelles University, 2004.
[19] Glover F., Laguna M., Tabu Search, Kluwer Academic Publishers, 1986.
[20] Holland J.H., Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, MA, USA, 1992.
[21] Amaral L., Ottino J., "Complex systems and networks: Challenges and opportunities for chemical and biological engineers", Chemical Engineering Science, no. 59, 2004, pp. 1653-1666.
[22] Liu L., Xiao Long J., Kwock C.C., "Autonomy oriented computing from problem solving to complex system modeling". In: Springer Science + Business Media Inc., 2005, pp. 27-54.
[23] Wu C.-J., Yang K.-H., Ho J.M., "AntSearch: An ant search algorithm in unstructured peer-to-peer networks". In: ISCC, 2006, pp. 429-434.
[24] Goldberg P., Papadimitriou C., "Reducibility among equilibrium problems". In: Proceedings of the 38th Annual ACM Symposium on Theory of Computing, New York, NY, USA, 2005, pp. 61-70.
[25] Michlmayr E., Pany A., Kappel G., "Using Taxonomies for Content-based Routing with Ants". In: Proceedings of the Workshop on Innovations in Web Infrastructure, 15th International World Wide Web Conference (WWW2006), May 2006.
[26] Ortega R., Estudio de las Propiedades Topológicas en Redes Complejas con Diferente Distribución del Grado y su Aplicación en la Búsqueda de Recursos Distribuidos. PhD thesis, Instituto Politécnico Nacional, México, 2009. (In Spanish)
[27] Barabási A., Albert R., Jeong H., "Mean-field theory for scale-free random networks", Physical Review Letters, no. 272, 1999, pp. 173-189.
[28] Babaoglu O., Meling H., Montresor A., "Anthill: A framework for the development of agent-based peer-to-peer systems". In: 22nd International Conference on Distributed Computing Systems, ACM, 2002.
[29] Ridge E., Design of Experiments for the Tuning of Optimization Algorithms. PhD thesis, University of York, 2007.
[30] Ridge E., Kudenko D., "Tuning the Performance of the MMAS Heuristic". In: Engineering Stochastic Local Search Algorithms. Designing, Implementing and Analyzing Effective Heuristics, Lecture Notes in Computer Science, vol. 4638, T. Stützle, M. Birattari, Eds., Springer: Berlin/Heidelberg, 2007, ISBN 978-3-540-74445-0, pp. 46-60.
A NEW OPTIMIZATION ALGORITHM BASED ON A PARADIGM INSPIRED BY NATURE Leslie Astudillo, Patricia Melin, Oscar Castillo
Abstract: In this paper, we propose a new optimization algorithm for soft computing problems, which is inspired by a paradigm from nature: the reaction methods existing in chemistry and the way the elements combine with each other to form compounds; in other words, quantum chemistry. This paper is the first approach to the proposed method, and it presents the background, main ideas, desired goals and preliminary results in optimization.
Keywords: natural computing, novel optimization method, chemical reactions paradigm.
1. Introduction
Several works have proved the relevance of computing techniques for solving diverse kinds of problems, including forecasting, control and pattern recognition, among others [1], [2], [3]. These techniques not only fulfill their objective, but they also promote the creation of new ways of providing solutions and improve the current methods as well [4], [5], [6]. One of the main difficulties when designing the structure of a solution method is the tuning of the parameters, which are the key to the success of these applications. These parameters will vary depending on the complexity of the problem and the method used to find the solution; in some cases, they stem from our own ability to conceptualize the problem itself, taking into account the inputs of the system and the expected output values. Due to these facts, several optimization strategies based on nature paradigms have arisen. From Ant Colony Optimization to Particle Swarm Optimization, among others, these strategies have emerged as an alternative way to solve problems [7], [8], [9], [10], [11], [12], [13]. For this work, we will be observing the process in which the different elements existing in nature are created, behave and interact with each other to form chemical compounds. The structure of this paper is the following. Section 2 gives a brief description of the chemical method that inspired this investigation; Section 3 describes the proposed method and first approach; Section 4 shows the preliminary experimental results; in Section 5 we describe the current and future work, and Section 6 lists the references.
2. Chemical Paradigm
In order to have a better understanding of the process that we intend to model, we present some general definitions [14], [15]. Chemistry is the study of matter and energy and the interaction between them, including the composition, the properties, the structure, the changes which it undergoes, and the laws governing those changes. A substance is a form of matter that has a defined composition and characteristic properties. There are two kinds of substances: elements and compounds. An element is a substance that cannot be broken down into simpler substances by ordinary means. It is apparent from the wide variety of different materials in the world that there are a great many ways to combine elements. Compounds are substances formed by two or more elements combined in definite proportions through a chemical reaction. There are millions of known compounds, and thousands of new ones are discovered or synthesized each year. A chemical reaction is a change in which at least one substance changes its composition and its set of properties; chemical reactions are classified into four types.
Type 1: combination reactions (B + C → BC). A combination reaction is a reaction of two reactants to produce one product. The simplest combination reactions are the reactions of two elements to form a compound. After all, if two elements are treated with each other, they can either react or not.
Type 2: decomposition reactions (BC → B + C). The second type of simple reaction is decomposition. This reaction is also easy to recognize. Typically, only one reactant is given. A type of energy, such as heat or electricity, may also be indicated. The reactant usually decomposes into its elements, into an element and a simpler compound, or into two simpler compounds. Binary compounds may yield two elements or an element and a simpler compound. Ternary (three-element) compounds may yield an element and a compound or two simpler compounds. These possibilities are shown in Figure 1.
Fig. 1. Decomposition possibilities.
Type 3: substitution reactions (C + AB → AC + B). Elements have varying abilities to combine. Among the most reactive metals are the alkali metals and the alkaline earth metals. On the opposite end of the scale of reactivities, among the least active, or most stable, metals are silver and gold, prized for their lack of reactivity. Reactive means the opposite of stable, but means the same as active. When a free element reacts with a compound of different elements, the free element will replace one of the elements in the compound if the free element is more reactive than the element it replaces. In general, a free metal will replace the metal in the compound, or a free nonmetal will replace the nonmetal in the compound. A new compound and a new free element are produced.
Type 4: double-substitution reactions (AB + CD → CB + AD). Double-substitution or double-replacement reactions, also called double-decomposition reactions or metathesis reactions, involve two ionic compounds, most often in aqueous solution. In this type of reaction, the cations simply swap anions. The reaction proceeds if a solid or a covalent compound is formed from ions in solution. All gases at room temperature are covalent. Some reactions of ionic solids plus ions in solution also occur. Otherwise, no reaction takes place. Just as with replacement reactions, double-replacement reactions may or may not proceed. They need a driving force. In replacement reactions the driving force is reactivity; here it is insolubility or covalence.
3. Modeling the Chemical Paradigm
Now that we have described the natural paradigm that we intend to mimic, the next step is to define the general structure of our optimization algorithm, which will initially be developed in 5 phases: a combination algorithm, a decomposition algorithm, a substitution algorithm, a double-substitution algorithm, and the final algorithm, which will be the combination of all the previous four. The steps to consider in this optimization method are as follows:
1. First, we need to generate an initial pool of elements/compounds.
2. Once we have the initial pool, we have to evaluate it.
3. Based on the previous evaluation, we select some elements/compounds to "induce" a reaction.
4. Given the result of the reaction, we evaluate the obtained elements/compounds.
5. Repeat the steps until the algorithm meets the stopping criteria (the desired result or the maximum number of iterations is reached).
In order to start testing the phases of the algorithm, we will be applying them to the following (but not restricted to) functions: De Jong's and Rosenbrock's functions [16], [12].
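The five steps above can be read as a generic evaluate-react loop. The sketch below is our own minimal rendering of that loop under stated assumptions: the "reaction" is abstracted as a user-supplied operator, the pool members are plain numbers as in the decomposition experiments of Section 4, and the search interval follows De Jong's usual domain.

import random

def chemical_optimize(objective, react, pool_size=5, iterations=10, bounds=(-5.12, 5.12)):
    """Generic evaluate-react loop sketched from the five steps above."""
    pool = [random.uniform(*bounds) for _ in range(pool_size)]   # step 1
    best = min(pool, key=objective)                               # step 2
    for _ in range(iterations):                                   # step 5
        pool = [react(x) for x in pool]                           # steps 3-4
        candidate = min(pool, key=objective)
        if objective(candidate) < objective(best):
            best = candidate
    return best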
4. Preliminary Experimental Results
Figure 2 shows De Jong's first function, also called the sphere model, which is continuous, convex, unimodal
and is represented by the equation:
f(x) = Σ x(i)^2, i = 1, ..., n    (1)
The domain is given by:
-5.12 <= x(i) <= 5.12    (2)
and it has a global minimum represented by:
f(x) = 0; x(i) = 0; i = 0 : n.
(3)
Fig. 2. De Jong's first function in 2D.
The first approach to solving this equation is given by applying a minimization algorithm based on the decomposition reactions. The main idea of this particular algorithm is, given a random set of initial numbers, to decompose each one into smaller numbers in a way that can be represented by a binary tree, where each ensuing node will be decomposed as well into smaller numbers, leading the result toward the minimum of the function. To start from the simplest option, in these early experiments all decomposed elements are considered to have the same value, and they are given by:
Decomposed_Element(n) = Element/m
(4)
where n is the element index and m is the number of decomposed elements generated. Because the resulting values are the same for each decomposed element, only one will be selected to be evaluated in the function. Let us consider an initial pool of 5 elements (randomly generated); each one will be decomposed into 3 sub-elements throughout 10 iterations. Table 1 shows the final and average values of the best and worst results reached by the algorithm throughout 30 experiments.
Table 1. Worst and best results throughout 30 experiments evaluating the first De Jong's function.
Experiment number | Minimum value | Average value | Comments
1                 | 3.799e-14     | 1.656e-06     | Best result
13                | 7.829e-09     | 0.341         | Worst result
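A minimal sketch of the decomposition scheme just described, written by us for illustration: the division into m equal parts follows Equation 4 and is applied repeatedly over the 10 iterations; since all m parts are equal, only one is evaluated, as the text notes.

def sphere(x):
    """De Jong's first function evaluated on a single variable."""
    return x * x

def decomposition_search(element, m=3, iterations=10):
    """Repeatedly decompose an element into m equal parts (Equation 4)."""
    history = []
    for _ in range(iterations):
        element = element / m          # Decomposed_Element = Element / m
        history.append(sphere(element))
    return element, history

best, trace = decomposition_search(4.7)
print(trace[-1])   # value of the sphere function after 10 decompositions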
Figure 3 shows the minimized values through the 10 iterations of experiment number 1, which reached the minimum value over all the 30 experiments.
Fig. 5. Plot of the average and standard deviations per iteration in 30 experiments evaluating the De Jong’s first function: the sphere model.
Fig. 3. Minimum value reached in experiment no. 1. In Figure 4 we can see the behavior of the algorithm along the 30 experiments, where every experiment is represented by “Cycles” of 10 iterations each.
Fig. 6. The weighted sphere model in 2D.
Figure 6 shows the axis-parallel hyper-ellipsoid, also known as the weighted sphere model. It is a continuous, convex and unimodal function and is represented by the equation:
f(x) = Σ i · x(i)^2, i = 1, ..., n    (5)
The domain is given by:
-5.12 <= x(i) <= 5.12    (6)
Table 2 shows the standard deviation calculated by iteration throughout 30 experiments.
Table 2. Standard deviation per trial in 30 experiments evaluating the first De Jong's function.
Trial | Standard deviation
1     | 0.769404299
2     | 0.085489367
3     | 0.009498819
4     | 0.001055424
5     | 0.000117269
6     | 1.30E-05
7     | 1.45E-06
8     | 1.61E-07
9     | 1.79E-08
10    | 1.99E-09
Figure 5 shows the plot of the average and standard deviations calculated per iteration in 30 experiments.
and it has a global minimum represented by:
f(x) = 0; x(i) = 0; i = 0 : n.
(7)
Table 3 shows the final and average values of the best and worst results reached by the algorithm throughout 30 experiments.
Table 3. Worst and best results throughout 30 experiments evaluating the weighted sphere model.
Experiment number | Minimum value | Average value | Comments
28                | 8.85e-17      | 7.00e-05      | Best result
23                | 7.91e-13      | 0.62          | Worst result
Figure 7 shows the minimized values through the 10 iterations of experiment number 28, which reached the minimum value over all the 30 experiments.
Fig. 7. Minimum value reached in experiment no. 28, evaluating the weighted sphere model.
Fig. 9. Plot of the average and standard deviations per iteration in 30 experiments evaluating the weighted sphere model.
In Figure 8 we can see the behavior of the algorithm along the 30 experiments, where every experiment is represented by “Cycles” of 10 iterations each.
Fig. 10. The Rosenbrock's Valley in 2D.
Fig. 8. Minimum values reached 30 experiments evaluating weighted sphere model.
Table 4 shows the standard deviation calculated by iteration throughout 30 experiments.
Table 4. Standard deviation per trial in 30 experiments evaluating the weighted sphere model.
Trial | Standard deviation
1     | 1.525
2     | 0.056
3     | 0.0020
4     | 7.75e-05
5     | 2.87e-06
6     | 1.06e-07
7     | 3.93e-09
8     | 1.45e-10
9     | 5.40e-12
10    | 2.00e-13
Figure 9 shows the plot of the average and standard deviations calculated per iteration in 30 experiments, evaluating the weighted sphere model.
Figure 10 shows the Rosenbrock's Valley function, also known as the banana function or the second function of De Jong. The global optimum lies inside a long, narrow, parabolic-shaped flat valley. It is represented by the equation:
f(x) = Σ [100 · (x(i+1) - x(i)^2)^2 + (1 - x(i))^2], i = 1, ..., n-1    (8)
The test area is usually restricted to the hypercube:
-2.048 <= x(i) <= 2.048    (9)
and it has a global minimum represented by:
f(x) = 0; x(i) = 1; i = 1 : n.
(10)
Table 5 shows the final and average values of the best and worst results reached by the algorithm throughout 30 experiments.
Table 5. Worst and best results throughout 30 experiments evaluating the Rosenbrock's Valley function.
Experiment number | Minimum value | Average value | Comments
24                | 0.99966       | 1.053         | Best final result
30                | 0.99997       | 1.013         | Worst final result
17 (iteration 2)  | 6.88e-06      | 1.23          | Best result (minimum value)
Figure 11 shows the minimized values through the 10 iterations of experiment number 24, which reached the minimum final value over all the 30 experiments.
Table 6 shows the standard deviation calculated by iteration throughout 30 experiments.
Table 6. Standard deviation per trial in 30 experiments evaluating the Rosenbrock's Valley function.
Trial | Standard deviation
1     | 13.443
2     | 0.796
3     | 0.205
4     | 0.024
5     | 0.004
6     | 0.0014
7     | 0.0012
8     | 0.00053
9     | 0.00018
10    | 6.44e-05
Fig. 11. Minimum value of experiment no. 24, evaluating the Rosenbrock's Valley function.
In Figure 12 we can see the behavior of the algorithm along the 30 experiments, where every experiment is represented by "cycles" of 10 iterations each.
Fig. 12. Minimum values reached in 30 experiments evaluating the Rosenbrock's Valley function.
As we can see in Table 5 and Figure 12, the last value reached in each experiment may not be the "best value" found by the algorithm. Figure 13 shows the behavior of the trial with the minimum result among the 30 experiments. In this experiment, the final value was 0.9997, but the minimum value (6.88e-06) was found in iteration no. 2.
Figure 14 shows the plot of the average and standard deviations calculated per iteration in 30 experiments, evaluating the Rosenbrock's Valley function.
Fig. 14. Plot of the average and standard deviations per iteration in 30 experiments evaluating the Rosenbrock's Valley function.
5. Conclusions
In this paper, we introduced the first stage of a new optimization method that tries to mimic chemical reactions. The Decomposition Reaction Method was applied to 3 benchmark functions to evaluate the first development phase of the optimization algorithm. The Decomposition Reaction Method by itself finds or guides the result to a certain minimum value but, due to the nature of some functions, it is necessary to introduce the second phase of this optimization method: the Combination Reactions Method, which will be able to guide the algorithm to find an optimum value when it is not necessarily the "smallest" one. At the time of writing, more functions are being evaluated in order to tune the algorithm itself.
ACKNOWLEDGMENTS
The authors would like to thank CONACYT and Tijuana Institute of Technology for the facilities and resources granted for the development of this research.
Fig. 13. Experiment 17, which had the minimum value in 30 experiments.
AUTHORS Leslie Astudillo*, Patricia Melin, Oscar Castillo - Tijuana Institute of Technology, Tijuana México, E-mails: leslie.astudillo@suntroncorp.com, epmelin@hafsamx.org, ocastillo@tectijuana.mx. * Corresponding author
References
[1] Hidalgo D., Melin P., Licea G., "Optimization of Modular Neural Networks with Interval Type-2 Fuzzy Logic Integration Using an Evolutionary Method with Application to Multimodal Biometry", Bio-inspired Hybrid Intelligent Systems for Image Analysis and Pattern Recognition, 2009, pp. 111-121.
[2] Astudillo L., Castillo O., Aguilar L., "Hybrid Control for an Autonomous Wheeled Mobile Robot Under Perturbed Torques", Foundations of Fuzzy Logic and Soft Computing, IFSA Proceedings, 2007, pp. 594-603.
[3] Melin P., Mancilla A., Lopez M., Solano D., Soto M., Castillo O., "Pattern Recognition for Industrial Security Using the Fuzzy Sugeno Integral and Modular Neural Networks", Advances in Soft Computing, vol. 39, no. 1, 2007, pp. 105-114.
[4] Kennedy J., Eberhart R.C., "Particle swarm optimization". In: Proceedings of the IEEE International Conference on Neural Networks, Piscataway, NJ, 1995, pp. 1942-1948.
[5] Valdez F., Melin P., "Parallel Evolutionary Computing using a cluster for Mathematical Function Optimization", NAFIPS, San Diego, CA, USA, June 2007, pp. 598-602.
[6] Fogel D.B., "An introduction to simulated evolutionary optimization", IEEE Transactions on Neural Networks, vol. 5, no. 1, 1994, pp. 3-14.
[7] Man K.F., Tang K.S., Kwong S., Genetic Algorithms: Concepts and Designs, Springer-Verlag, 1999.
[8] Eberhart R.C., Kennedy J., "A new optimizer using particle swarm theory". In: Proceedings of the 6th International Symposium on Micromachine and Human Science, Nagoya, Japan, 1995, pp. 39-43.
[9] Goldberg D., Genetic Algorithms, Addison-Wesley, 1988.
[10] Angeline P.J., "Using Selection to Improve Particle Swarm Optimization". In: Proceedings of the World Congress on Computational Intelligence, Anchorage, Alaska, IEEE, 1998, pp. 84-89.
[11] Montiel O., Castillo O., Melin P., Rodriguez A., Sepulveda R., "Human evolutionary model: A new approach to optimization", Inf. Sci., no. 177(10), 2007, pp. 2075-2098.
[12] Haupt R.L., Haupt S.E., Practical Genetic Algorithms, 2nd edition, Wiley-Interscience, 2004.
[13] Rotar C., "A New Evolutionary Algorithm for Multiobjective Optimization Based on the Endocrine System". In: Proceedings of the International Conference on Theory and Applications of Mathematics and Informatics ICTAMI, Alba Iulia, 2003.
[14] Chang R., General Chemistry, 5th edition, McGraw-Hill, 2004.
[15] Goldberg D., Schaum's Outline of Beginning Chemistry, 3rd edition, Schaum's Outline Series, McGraw-Hill, 2009.
[16] GEATbx: Example Functions (single and multi-objective functions), http://www.geatbx.com/docu/fcnindex01.html#P89_3085.
VOLUME 5,
Journal of Automation, Mobile Robotics & Intelligent Systems
N° 1
2011
APPLICATION OF THE BEE SWARM OPTIMIZATION BSO TO THE KNAPSACK PROBLEM Marco Aurelio Sotelo-Figueroa, Rosario Baltazar, Juan Martín Carpio
Abstract:
Swarm Intelligence is the part of Artificial Intelligence that studies the actions of individuals in various decentralized systems. Optimization algorithms inspired by the intelligent behavior of honey bees are among the most recently introduced population-based techniques. In this paper, a novel hybrid algorithm based on the Bee Algorithm and Particle Swarm Optimization is applied to the Knapsack Problem. The Bee Algorithm is a population-based search algorithm inspired by the natural foraging behavior of honey bees; it performs a kind of exploitative neighborhood search combined with random explorative search to scan the solution space, but the results it obtains on the Knapsack Problem are not very good. The combination of BA and PSO yields BSO, Bee Swarm Optimization; this algorithm uses the velocity vector and the collective memories of PSO together with the neighborhood search of the BA, and its results are much better. We use the Greedy Algorithm, which is an approximate algorithm, as a reference point for comparing the results of these metaheuristics and determining which one gives better results.
Keywords: swarm optimization, PSO, BA, BSO, Knapsack Problem.
1. Introduction
Evolutionary and meta-heuristic algorithms have been extensively used as search and optimization tools during the last decade in several domains, from science and engineering to the video game industry and others. Many demanding applications involve the solution of optimization problems of high complexity, many of them belonging to a special class of problems called NP-hard, and they have been solved by various methods [8]. Metaheuristic algorithms are now considered among the best tools to find good solutions with a reasonable investment of resources. As described by Eberhart and Kennedy [4], Particle Swarm Optimization, or PSO, is part of Swarm Intelligence and is a metaheuristic that uses a social-psychological metaphor: a population of individuals, referred to as particles, adapts by returning stochastically toward previously successful regions. PSO simulates a society where each individual contributes its knowledge to the society. These metaheuristics have proved their ability to deal with very complicated optimization and search problems. The behavior of a single ant, bee, termite or wasp is often too simple, but their collective and social behavior is of
paramount significance. The collective and social behavior of living creatures motivated researchers to undertake the study of what is today known as Swarm Intelligence [5]. Two fundamental concepts, self-organization and division of labor, are necessary and sufficient properties to obtain swarm intelligent behavior. The Bee Algorithm, or BA [10], is also part of Swarm Intelligence; it mimics the way honey bees forage for their food. This algorithm is based on a random search in the neighborhood, for combinatorial and functional optimization. The Knapsack Problem is a classical combinatorial problem [3], [15] that can be described as follows: "Imagine taking a trip on which you can only carry a backpack that, logically, has a limited capacity. Given a set of items, each with a weight and a value, determine the number of each item to include in the bag so that the total weight is less than a given limit and the total value is as large as possible." This problem can be considered an NP-easy problem, but some studies show that the Knapsack Problem is an NP-Hard problem [2]. The Knapsack Problem can actually be modeled in different ways [16], for example as the multi-dimensional and multiple Knapsack Problem [1], [14], [17] or the quadratic Knapsack Problem [6], [13]. In the present paper we introduce the Bee Swarm Optimization, or BSO. This algorithm is a hybrid metaheuristic between the PSO and the BA. The BSO uses the better characteristics of both algorithms, the social metaphor from the PSO and the random search in the neighborhood from the BA, and gives us a better result. The experiments were made on seven types of instances, from uncorrelated to uncorrelated similar weight. All these instances test the algorithm by varying the parameters of the profits and the weights. The algorithms were tested with different methods for generating the initial populations [9], namely random solutions, uniformly distributed solutions and greedy-based solutions. Another important element is the fitness Confidence Interval, used to say which metaheuristic with which initial population is better. It was necessary to have a comparison point for the metaheuristics, and the Greedy Algorithm [3] was used; this is a deterministic algorithm that gives an approximate result for the Knapsack Problem.
2. Knapsack Problem
The Knapsack Problem [15] is a typical combinatorial problem that has been studied for many years and has been proved to be an NP-Hard problem [12]. The basic version is the 0-1 Knapsack Problem, or Binary Knapsack Problem, and it has a search space of 2ⁿ − 1
possible solutions. The Knapsack Problem can be described as follows: "there are n objects, each of these objects has a profit and a weight, and one needs to select those objects whose sum of benefits is maximized, subject to the constraint that the sum of the weights of the same objects does not exceed a determined amount". It can be formulated mathematically by numbering the objects or items from 1 to n and introducing a vector of binary variables xj, j = 1, 2, 3, ..., n, where each variable takes the value 1 or 0 depending on whether the corresponding object is selected or not. The solution to the Knapsack Problem is to select a subset of objects, given by the binary solution vector x, that satisfies the constraint in Equation (2) and at the same time maximizes the objective function in Equation (1).
  z = Σ(j=1..n) pj·xj                               (1)

  Σ(j=1..n) wj·xj ≤ c                               (2)

  xj = 1 if the j-th object is selected, 0 otherwise

where:
z represents the profit.
j represents the j-th object.
xj indicates whether the j-th object is part of the solution.
pj is the j-th object profit.
wj is the j-th object weight.
c is the volume or capacity of the knapsack.

2.1. Types of Knapsack Problem Instances
The Knapsack Problem is affected by the relationship between the profit and the weight of the objects; the types of instances are the following:

Uncorrelated: the profits and weights are distributed uniformly between one and a maximum number V.

  pj ∈ [1, V]; wj ∈ [1, V]                          (3)

Weakly correlated: the weights are distributed uniformly between one and a maximum number V, and the profits are distributed uniformly around the weight within a ratio R.

  wj ∈ [1, V]; pj ∈ [wj − R, wj + R]                (4)

Strongly correlated: the weights are distributed uniformly between one and a maximum number V; the profits are the weights plus a constant K.

  wj ∈ [1, V]; pj = wj + K                          (5)

Inverse strongly correlated: the profits are distributed uniformly between one and a maximum number V, and the weights are the profits plus a constant K.

  pj ∈ [1, V]; wj = pj + K                          (6)

Almost strongly correlated: the weights are distributed uniformly between one and a maximum number V, and the profits are the weights plus a random number S drawn between one and a maximum number U.

  S ∈ [1, U]; wj ∈ [1, V]; pj = wj + S              (7)

Subset-sum: the profits and weights have the same value and are distributed uniformly between one and a maximum number V.

  wj ∈ [1, V]; pj = wj                              (8)

Uncorrelated similar weight: the profits are distributed uniformly between one and a maximum number V1, and the weights are distributed uniformly between one and a maximum number V2 plus a constant K.

  pj ∈ [1, V1]; wj ∈ [1, V2] + K                    (9)

3. The Greedy Algorithm
This algorithm takes an intuitive approach based on the profit and weight of each item; their ratio is known as the efficiency, defined in Equation (10), and the main idea is to put the items with the highest efficiency into the Knapsack first. It is necessary to sort all the items by efficiency, as in Equation (11), before applying the Greedy Algorithm to the problem.

  ej = pj / wj                                      (10)

  p0/w0 ≥ p1/w1 ≥ ... ≥ pn/wn                       (11)
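The authors' implementation is not listed in the paper, but the formulation of Equations (1)-(2) and the efficiency-based Greedy Algorithm of Equations (10)-(11) can be sketched in a few lines of Java (the language mentioned later for the random initial population). The class, method and data values below are our own illustrative choices, not the authors' code.

import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch only: evaluates a 0-1 knapsack solution (Eq. 1-2) and
// builds a greedy solution by decreasing efficiency e_j = p_j / w_j (Eq. 10-11).
public class GreedyKnapsack {

    // Objective value z = sum of p_j * x_j, or -1 if the capacity constraint is violated.
    static int evaluate(int[] p, int[] w, int[] x, int c) {
        int profit = 0, weight = 0;
        for (int j = 0; j < x.length; j++) {
            if (x[j] == 1) { profit += p[j]; weight += w[j]; }
        }
        return (weight <= c) ? profit : -1;
    }

    // Greedy: visit items by decreasing efficiency and insert them while they still fit.
    static int[] greedy(int[] p, int[] w, int c) {
        Integer[] order = new Integer[p.length];
        for (int j = 0; j < order.length; j++) order[j] = j;
        Arrays.sort(order, Comparator.comparingDouble(j -> -(double) p[j] / w[j]));
        int[] x = new int[p.length];
        int weight = 0;
        for (int j : order) {
            if (weight + w[j] <= c) { x[j] = 1; weight += w[j]; }
        }
        return x;
    }

    public static void main(String[] args) {
        int[] p = {10, 7, 5, 8};   // hypothetical profits
        int[] w = {4, 3, 2, 5};    // hypothetical weights
        int c = 9;                 // hypothetical capacity
        int[] x = greedy(p, w, c);
        System.out.println(Arrays.toString(x) + " -> z = " + evaluate(p, w, x, c));
    }
}

For the small hypothetical instance above, the greedy pass selects the three most efficient items (total weight 9, profit 22), which illustrates why the Greedy Algorithm is used as the deterministic baseline in the experiments.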
4. Bee Algorithm (BA)
The Bee Algorithm [10], or BA, is a metaheuristic inspired by the behavior of honey bees and the way they search for plants to obtain the pollen needed for honey production. This algorithm is based on a random search in the neighborhood, for combinatorial and functional optimization. A colony of bees searches a large territory looking for new sources of food and begins to thrive on discovering new food sources. When these sources hold much pollen they are visited by large numbers of bees, and when the pollen decreases the number of bees collecting from them decreases too. When the season for collecting pollen starts, the colony sends out many bees, called scout bees, to reconnoiter the territory randomly and inform the colony where the best food sources are. Once the harvesting season has started, the colony maintains a certain percentage of its scout bees in order to detect new and better sources of food. When a scout bee returns to the colony having found a better food source than the one the colony is currently using, it performs the dance by means of which it transmits the exact position of the food source, and the colony then begins to send more bees to that food source.

Fig. 1. The BA Algorithm applied to the 0-1 Knapsack Problem.
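Fig. 1 (the BA pseudocode) is not reproduced here. As a rough, hedged reading of the random neighborhood search described above, applied to a binary knapsack solution, one scout step might look like the following Java sketch; the bit-flip neighborhood and all names (BeeScoutSketch, scout, step) are our own assumptions, not the authors' implementation.

import java.util.Random;

// Illustrative sketch of a BA-style random neighborhood step on a binary
// knapsack solution: a scout flips a few randomly chosen bits of the current
// vector and the colony keeps a feasible, improving neighbor.
public class BeeScoutSketch {
    static final Random RND = new Random();

    // Profit of a solution, or -1 if the knapsack capacity c is exceeded.
    static int value(int[] p, int[] w, int c, int[] x) {
        int profit = 0, weight = 0;
        for (int j = 0; j < x.length; j++) {
            profit += p[j] * x[j];
            weight += w[j] * x[j];
        }
        return weight <= c ? profit : -1;   // infeasible neighbors are discarded
    }

    // One scout: flip `flips` randomly chosen positions of x and report the neighbor.
    static int[] scout(int[] x, int flips) {
        int[] neighbor = x.clone();
        for (int k = 0; k < flips; k++) {
            int j = RND.nextInt(neighbor.length);
            neighbor[j] = 1 - neighbor[j];
        }
        return neighbor;
    }

    // Keep the neighbor only when it improves on the current solution.
    static int[] step(int[] p, int[] w, int c, int[] x, int flips) {
        int[] neighbor = scout(x, flips);
        return value(p, w, c, neighbor) > value(p, w, c, x) ? neighbor : x;
    }
}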
5. Particle Swarm Optimization (PSO)
Particle Swarm Optimization [4], [7], or PSO, is a metaheuristic inspired by flocks of birds and schools of fish. It was developed by J. Kennedy and R. Eberhart based on a concept called the social metaphor: the metaheuristic simulates a society where all individuals contribute their knowledge in order to obtain a better solution. There are three factors that influence the change in the status or behavior of an individual. The knowledge of the environment, or adaptation, speaks of the importance given to the experience of the individual. Its experience, or local memory, is the importance given to the best result found by the individual itself. The experience of its neighbors, or global memory, refers to how important the best result obtained by its neighbors or other individuals is. In this metaheuristic each individual is called a particle and moves through a multidimensional space that represents the social or search space; its dimension depends on the variables used to represent the problem. Each particle is updated using a velocity vector, which indicates how fast the particle will move in each of the dimensions; the PSO velocity update is given by Equation (12), and the position is updated by Equation (13).

  vi = φ0·vi + φ1·(xi − BGlobal) + φ2·(xi − BLocal)   (12)

  xi = xi + vi                                        (13)

where:
vi is the velocity of the i-th particle.
φ0 is the adjustment factor to the environment.
φ1 is the memory coefficient of the neighborhood (global memory).
φ2 is the memory coefficient of the particle (local memory).
xi is the position of the i-th particle.
BGlobal is the best position found so far by all particles.
BLocal is the best position found by the i-th particle.

Fig. 2. The PSO Algorithm applied to the 0-1 Knapsack Problem.

6. Bee Swarm Optimization (BSO)
The Bee Swarm Optimization, or BSO, is a hybrid population metaheuristic between the PSO and the BA. The main idea of BSO is to take the best of each metaheuristic in order to obtain better results than either would obtain alone. The BSO uses the velocity vector and the way it is updated, Equation (12), applying the social metaphor of the PSO to get better results, and it uses the exploration and the search radius of the BA to indicate where to look for a better result. The first thing the BSO does is update the position of the particles through the velocity vector; it then selects a specified number of particles, in this case the best ones, treating them as the new food sources discovered by the scout bees, and conducts a search of the area enclosed by the search radius. If a better solution is found in this area than the particle around which the search was carried out, the particle is replaced with the position of the best solution found in that area. For this metaheuristic to work properly, a way to determine the next and previous particle positions is needed; if this cannot be determined, then it is impossible to apply the metaheuristic, because it would not be known which are the next and previous solutions to search in that area. We defined the next and previous solutions as binary operations, adding and subtracting one to the solution vector. For example, if we have a Knapsack Problem with 5 elements, the solution vector will have 5 positions, each holding 0 or 1; an example of the next and previous solution vectors is shown in Figure 3.

Fig. 3. Example of the previous and next solution vector.

The algorithm of the BSO applied to the Knapsack Problem implemented in the present paper is the following.

Fig. 4. The BSO Algorithm applied to the 0-1 Knapsack Problem.
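Figs. 2-4 (the PSO and BSO pseudocode listings) are not reproduced here, so the following Java sketch only illustrates our reading of the BSO step described above, under explicit assumptions: the solution vector is read as a binary number, so that the "next" and "previous" solutions of Fig. 3 correspond to adding or subtracting 1; Equations (12)-(13) are applied literally to that integer encoding; and the best particles receive a BA-style scan of the surrounding search radius. All names and details (BsoStepSketch, move, neighborhoodSearch, the clamping to the valid range) are hypothetical, not the authors' implementation.

// Rough sketch of the BSO building blocks for a small 0-1 knapsack.
// Assumes n <= 62 items so a solution vector fits in a long.
public class BsoStepSketch {

    // Fitness of an encoded solution: total profit, or -1 if the capacity c is exceeded.
    static long fitness(long x, int[] p, int[] w, int c) {
        long profit = 0, weight = 0;
        for (int j = 0; j < p.length; j++) {
            if (((x >> j) & 1L) == 1L) { profit += p[j]; weight += w[j]; }
        }
        return weight <= c ? profit : -1;
    }

    // Eq. (12)-(13) applied to the integer encoding of particle i; maxX = 2^n - 1.
    static long move(long x, double[] v, int i, long bGlobal, long bLocal,
                     double phi0, double phi1, double phi2, long maxX) {
        v[i] = phi0 * v[i] + phi1 * (x - bGlobal) + phi2 * (x - bLocal);
        long moved = x + Math.round(v[i]);
        return Math.max(0L, Math.min(maxX, moved));   // keep the encoding in range
    }

    // BA-style scan of the `radius` "next"/"previous" solutions around a particle (Fig. 3).
    static long neighborhoodSearch(long x, int radius, long maxX, int[] p, int[] w, int c) {
        long best = x;
        for (long d = -radius; d <= radius; d++) {
            long candidate = x + d;
            if (candidate >= 0 && candidate <= maxX
                    && fitness(candidate, p, w, c) > fitness(best, p, w, c)) {
                best = candidate;
            }
        }
        return best;
    }
}

In a full iteration, every particle would first be moved with move(), and only the selected best particles would then be refined with neighborhoodSearch(), mirroring the two-stage update described in the text.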
7. Experiments
To test the BSO, the Generator of Knapsack Test Instances [11] was used; it requires the number of elements and the coefficient range to generate a test instance. The seven types of test instances described above were generated, and the same parameters were used for each metaheuristic in order to find which of them is better for the Knapsack Problem. Each metaheuristic was run 100 times to obtain its average and standard deviation, and those data were used to calculate the Confidence Interval at the 97% confidence level. We used 3 different initial populations, because it is important [9] to start the metaheuristics with a good solution;
the initial populations used were the following:
Greedy: we use the Greedy Algorithm.
Uniformly Distributed: we use a uniformly distributed number generator.
Random: we use the random number generator of Java.
We determine the size of the Knapsack by obtaining the average weight of the instance, taking that average as the typical element size, and multiplying it by 20 so that about 20 elements fit.

Table 1. Parameters used in the Generator of Knapsack Test Instances.
  Parameters           Values
  Elements             50
  Coefficients Range   [0,100]
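The generator of [11] itself is not reproduced here. As an illustration of how an instance of the types in Equations (3)-(9) and the knapsack capacity described above (20 times the average weight) can be produced, a minimal Java sketch follows; the class name, the constants V and K, and the choice of a strongly correlated instance are our own hypothetical assumptions.

import java.util.Random;

// Illustrative sketch (not the generator of [11]): builds one strongly
// correlated instance following Eq. (5) and fixes the knapsack capacity
// as 20 times the average weight, as described in Section 7.
public class InstanceSketch {
    public static void main(String[] args) {
        Random rnd = new Random();
        int n = 50, V = 100, K = 10;          // 50 elements, coefficients up to 100
        int[] p = new int[n], w = new int[n];

        for (int j = 0; j < n; j++) {
            w[j] = 1 + rnd.nextInt(V);        // w_j in [1, V]
            // Uncorrelated variant, Eq. (3), would be: p[j] = 1 + rnd.nextInt(V);
            p[j] = w[j] + K;                  // strongly correlated, Eq. (5)
        }

        double avgWeight = 0;
        for (int wj : w) avgWeight += wj;
        avgWeight /= n;
        int capacity = (int) Math.round(20 * avgWeight);  // room for about 20 "average" items

        System.out.println("capacity = " + capacity);
    }
}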
Table 2. Parameters used in the BA.
  Parameters                         Values
  Elements                           50
  Iterations                         100
  Elements for Neighborhood Search   10
  Radio                              3

Table 3. Parameters used in the PSO.
  Parameters    Values
  Elements      50
  Iterations    100
  φ0            1
  φ1            0.5
  φ2            0.8

Table 4. Parameters used in the BSO.
  Parameters                         Values
  Elements                           50
  Iterations                         100
  Elements for Neighborhood Search   10
  Radio                              3
  φ0                                 1
  φ1                                 0.5
  φ2                                 0.8

8. Results
In this work we show the results obtained by testing each type of instance with the different metaheuristics and the three different kinds of initial population: the average profit, the best and worst profits, the fitness Standard Deviation for each metaheuristic, and the fitness Confidence Interval. We also show the behavior of the metaheuristics through their graphics. In the graphics, the red squares represent the results obtained by the Greedy Algorithm, the blue dots represent the Greedy-based initial population, the green triangles represent the Uniformly Distributed initial population, and the yellow diamonds represent the Random initial population.

8.1. Uncorrelated
To test the Uncorrelated instance, the Knapsack size was defined at 1037. The results of each metaheuristic with the different initial populations are shown in Table 5 and their behavior in Figures 5, 6 and 7. The result provided by the Greedy Algorithm was 1751. As can be seen, the BA gives the worst results, always equal to or below the Greedy's result, while the PSO and BSO give better results; the Standard Deviation of the BSO is smaller than that of the PSO, and the average of the BSO is better.
Table 5. Results obtained from the Uncorrelated Knapsack Problem.
  Metaheuristic  Initial Population     Average   Best  Worst  s      Confidence Interval
  PSO            Greedy                 1753.79   1758  1751   2.29   [1753.08, 1754.49]
  PSO            Uniformly Distributed  1751.39   1758  1725   6.51   [1749.39, 1753.38]
  PSO            Random                 1750.96   1758  1716   6.3    [1749.02, 1752.89]
  BA             Greedy                 1751      1751  1751   0      [1751.00, 1751.00]
  BA             Uniformly Distributed  1397.43   1522  1291   51.08  [1381.75, 1413.10]
  BA             Random                 1440.25   1550  1296   50.66  [1424.70, 1455.79]
  BSO            Greedy                 1753.52   1758  1751   2.1    [1752.87, 1754.16]
  BSO            Uniformly Distributed  1753.07   1758  1733   4.12   [1751.80, 1754.33]
  BSO            Random                 1752.65   1758  1733   4.74   [1751.19, 1754.10]
Fig. 5. Graphic behavior of the BA metaheuristic applied to the Uncorrelated problem.
Fig. 6. Graphic behavior of the PSO metaheuristic applied to the Uncorrelated problem.
Fig. 7. Graphic behavior of the BSO metaheuristic applied to the Uncorrelated problem.
8.2. Weakly correlated
To test the Weakly Correlated instance, the Knapsack size was defined at 914. The results of each metaheuristic with the different initial populations are shown in Table 6 and their behavior in Figures 8, 9 and 10. The result provided by the Greedy Algorithm was 1044. As we can see, the BA gives the worst results, improving on the Greedy's result only with the Greedy-based initial population; the BSO and PSO give better results, but the BSO improves on the PSO's results and gives a smaller Standard Deviation.

Table 6. Results obtained from the Weakly Correlated Knapsack Problem.
  Metaheuristic  Initial Population     Average   Best  Worst  s     Confidence Interval
  PSO            Greedy                 1049.83   1053  1046   2.11  [1049.18, 1050.48]
  PSO            Uniformly Distributed  1040.34   1053  1027   5.52  [1038.64, 1042.03]
  PSO            Random                 1040.03   1053  1023   6.14  [1038.14, 1041.91]
  BA             Greedy                 1046      1046  1046   0     [1046, 1046]
  BA             Uniformly Distributed  965.14    987   941    9.23  [962.30, 967.97]
  BA             Random                 973.1     1006  957    9.46  [970.19, 976.00]
  BSO            Greedy                 1050.87   1053  1046   1.77  [1050.32, 1051.41]
  BSO            Uniformly Distributed  1045.18   1053  1038   3.71  [1044.04, 1046.31]
  BSO            Random                 1045.19   1053  1033   4.14  [1043.91, 1046.46]

Fig. 8. Graphic behavior of the BA metaheuristic applied to the Weakly Correlated problem.
Fig. 9. Graphic behavior of the PSO metaheuristic applied to the Weakly Correlated problem.
Fig. 10. Graphic behavior of the BSO metaheuristic applied to the Weakly Correlated problem.

8.3. Strongly correlated
To test the Strongly Correlated instance, the Knapsack size was defined at 1026. The results of each metaheuristic with the different initial populations are shown in Table 7 and their behavior in Figures 11, 12 and 13. The result provided by the Greedy Algorithm was 1332. As we can see, the BA gives the worst results, improving on the Greedy's result only with the Greedy-based initial population; the BSO and PSO give better results, but the BSO improves on the PSO's results and gives a smaller Standard Deviation.

Table 7. Results obtained from the Strongly Correlated Knapsack Problem.
  Metaheuristic  Initial Population     Average   Best  Worst  s      Confidence Interval
  PSO            Greedy                 1345.82   1346  1342   0.67   [1345.61, 1346.02]
  PSO            Uniformly Distributed  1328.83   1343  1314   6.28   [1326.90, 1330.75]
  PSO            Random                 1331.78   1336  1317   4.69   [1330.33, 1333.22]
  BA             Greedy                 1332      1332  1332   0      [1332, 1332]
  BA             Uniformly Distributed  1265.39   1295  1245   10.42  [1262.19, 1268.58]
  BA             Random                 1276.78   1296  1260   7.99   [1274.32, 1279.23]
  BSO            Greedy                 1345.8    1346  1342   0.8    [1345.55, 1346.04]
  BSO            Uniformly Distributed  1337.16   1346  1326   3.46   [1336.09, 1338.22]
  BSO            Random                 1337.24   1346  1331   3.47   [1336.17, 1338.30]
Fig. 11. Graphic behavior of the BA metaheuristic applied to the Strongly Correlated problem.
Fig. 12. Graphic behavior of the PSO metaheuristic applied to the Strongly Correlated problem.
Fig. 13. Graphic behavior of the BSO metaheuristic applied to the Strongly Correlated problem.

8.4. Inverse strongly correlated
To test the Inverse Strongly Correlated instance, the Knapsack size was defined at 1182. The results of each metaheuristic with the different initial populations are shown in Table 8 and their behavior in Figures 14, 15 and 16. The result provided by the Greedy Algorithm was 1051. As we can see, the BA gives the worst results, improving on the Greedy's result only with the Greedy-based initial population; the BSO and PSO give better results, but the BSO improves on the PSO's results and gives a smaller Standard Deviation.

Table 8. Results obtained from the Inverse Strongly Correlated Knapsack Problem.
  Metaheuristic  Initial Population     Average   Best  Worst  s     Confidence Interval
  PSO            Greedy                 1051.34   1052  1051   0.47  [1051.19, 1051.48]
  PSO            Uniformly Distributed  1036.42   1052  1017   5.86  [1034.62, 1038.21]
  PSO            Random                 1034.65   1049  1021   5.82  [1032.86, 1036.43]
  BA             Greedy                 1051      1051  1051   0     [1051, 1051]
  BA             Uniformly Distributed  1011.7    1036  998    7.06  [1009.53, 1013.86]
  BA             Random                 1003.55   1021  990    5.84  [1001.75, 1005.34]
  BSO            Greedy                 1051.87   1052  1051   0.33  [1051.76, 1051.97]
  BSO            Uniformly Distributed  1044.81   1052  1038   3.99  [1043.58, 1046.03]
  BSO            Random                 1044.01   1052  1033   3.79  [1042.84, 1045.17]
Fig. 14. Graphic behavior of the BA metaheuristic applied to the Inverse Strongly Correlated problem.
Fig. 15. Graphic behavior of the PSO metaheuristic applied to the Inverse Strongly Correlated problem.
Fig. 16. Graphic behavior of the BSO metaheuristic applied to the Inverse Strongly Correlated problem.
8.5. Almost strongly correlated
To test the Almost Strongly Correlated instance, the Knapsack size was defined at 124. The results of each metaheuristic with the different initial populations are shown in Table 9 and their behavior in Figures 17, 18 and 19. The result provided by the Greedy Algorithm was 1524. As we can see, the BA gives the worst results, improving on the Greedy's result only with the Greedy-based initial population; the BSO and PSO give better results, but the average of the PSO is better than that of the BSO, while the Standard Deviation of the BSO is smaller than that of the PSO.

Table 9. Results obtained from the Almost Strongly Correlated Knapsack Problem.
  Metaheuristic  Initial Population     Average   Best  Worst  s     Confidence Interval
  PSO            Greedy                 1554.62   1555  1552   0.78  [1554.37, 1554.86]
  PSO            Uniformly Distributed  1538.19   1555  1521   6.39  [1536.22, 1540.15]
  PSO            Random                 1541.33   1555  1529   5.54  [1539.62, 1543.03]
  BA             Greedy                 1552      1552  1552   0     [1552, 1552]
  BA             Uniformly Distributed  1491.92   1508  1474   7.38  [1489.65, 1494.18]
  BA             Random                 1501.51   1517  1488   6.73  [1499.44, 1503.57]
  BSO            Greedy                 1554.97   1555  1554   0.17  [1554.91, 1555.02]
  BSO            Uniformly Distributed  1550.06   1555  1542   5.01  [1548.52, 1551.59]
  BSO            Random                 1550.84   1555  1543   4.83  [1549.35, 1552.32]

Fig. 17. Graphic behavior of the BA metaheuristic applied to the Almost Strongly Correlated problem.
Fig. 18. Graphic behavior of the PSO metaheuristic applied to the Almost Strongly Correlated problem.
Fig. 19. Graphic behavior of the BSO metaheuristic applied to the Almost Strongly Correlated problem.

8.6. Subset-sum
To test the Subset-sum instance, the Knapsack size was defined at 960. The results of each metaheuristic with the different initial populations are shown in Table 10 and their behavior in Figures 20, 21 and 22. The result provided by the Greedy Algorithm was 949; in this instance the PSO and the BSO give the same results with zero Standard Deviation, better than the Greedy's result, and the BA comes close to these results.

Table 10. Results obtained from the Subset-sum Knapsack Problem.
  Metaheuristic  Initial Population     Average   Best  Worst  s     Confidence Interval
  PSO            Greedy                 960       960   960    0     [960, 960]
  PSO            Uniformly Distributed  960       960   960    0     [960, 960]
  PSO            Random                 960       960   960    0     [960, 960]
  BA             Greedy                 959.82    960   959    0.38  [959.70, 959.93]
  BA             Uniformly Distributed  959.85    960   958    0.38  [959.73, 959.96]
  BA             Random                 959.99    960   959    0.1   [959.95, 960.02]
  BSO            Greedy                 960       960   960    0     [960, 960]
  BSO            Uniformly Distributed  960       960   960    0     [960, 960]
  BSO            Random                 960       960   960    0     [960, 960]

Fig. 20. Graphic behavior of the BA metaheuristic applied to the Subset-sum problem.
Fig. 21. Graphic behavior of the PSO metaheuristic applied to the Subset-sum problem.
Fig. 22. Graphic behavior of the BSO metaheuristic applied to the Subset-sum problem.

8.7. Uncorrelated Similar Weight
To test the Uncorrelated Similar Weight instance, the Knapsack size was defined at 200118. The results of each metaheuristic with the different initial populations are shown in Table 11 and their behavior in Figures 23, 24 and 25. The result provided by the Greedy Algorithm was 1558. As we can see, the BA gives the worst results, equalling the Greedy's result only with the Greedy-based initial population; the BSO and PSO give better results, but the average of the BSO is better than that of the PSO, while the Standard Deviation of the PSO is smaller than that of the BSO.

Table 11. Results obtained from the Uncorrelated Similar Weight Knapsack Problem.
  Metaheuristic  Initial Population     Average   Best  Worst  s      Confidence Interval
  PSO            Greedy                 1563.79   1564  1558   1.03   [1563.47, 1564.10]
  PSO            Uniformly Distributed  1562.18   1564  1548   3.15   [1561.21, 1563.14]
  PSO            Random                 1561.38   1564  1551   3.63   [1560.26, 1562.49]
  BA             Greedy                 1558      1558  1558   0      [1558, 1558]
  BA             Uniformly Distributed  1303.6    1406  1243   35.45  [1292.71, 1314.48]
  BA             Random                 1312.81   1404  1242   35.8   [1301.82, 1323.79]
  BSO            Greedy                 1563.72   1564  1558   1.23   [1563.34, 1564.09]
  BSO            Uniformly Distributed  1562.22   1564  1551   3.35   [1561.19, 1563.24]
  BSO            Random                 1562.28   1564  1548   3.26   [1561.27, 1563.28]
Fig. 23. Graphic behavior of the BA metaheuristic applied to the Uncorrelated Similar Weight problem.
Fig. 24. Graphic behavior of the PSO metaheuristic applied to the Uncorrelated Similar Weight problem.
Fig. 25. Graphic behavior of the BSO metaheuristic applied to the Uncorrelated Similar Weight problem.
9. Conclusions
There are many metaheuristics for solving the Knapsack Problem; in this work we introduced the Bee Swarm Optimization, a hybrid metaheuristic between the Bee Algorithm and the Particle Swarm Optimization. The experiments were designed with the same parameters for the three metaheuristics, to give them the same characteristics and make the comparison fair, and the initial population was generated by a pseudorandom number generator (the one provided by Java), by a uniformly distributed generator, and by a greedy generator using the Greedy Algorithm. After applying the BSO with the three initial populations to the different test instances, we can see that this metaheuristic gives a small Confidence Interval, and in most cases its Confidence Interval is better than those of the PSO and the BA; in the worst case the Confidence Interval is the same as the PSO's. In the Uncorrelated, Subset-sum and Uncorrelated Similar Weight instances the results were similar between PSO and BSO; in the Weakly Correlated and Strongly Correlated instances the results were better for BSO than for PSO. Finally, in the Inverse Strongly Correlated and Almost Strongly Correlated instances the results were much better for BSO than for PSO. The graphics show the behavior of each metaheuristic, and we can observe that the BA is the worst metaheuristic because it always yields highly variable results, which is why its Standard Deviation is so high; only with the Greedy-based initial population does it give the same result as the Greedy Algorithm, or improve on it slightly. The PSO yields more consistent results; its graphs show that in most cases this metaheuristic gives good results. We can conclude that having a good initial population is essential for the metaheuristics to obtain good results, and we can say that the BSO is an effective metaheuristic for solving the Knapsack Problem with all the initial populations used, because each time it is run it gives a good solution, and this solution is better than the solutions obtained by the other metaheuristics. Overall the results present a low Standard Deviation and thus a short Confidence Interval.
AUTHORS
Marco Aurelio Sotelo-Figueroa*, Rosario Baltazar, Juan Martín Carpio - Instituto Tecnológico de León, Av. Tecnológico S/N, 37290, León, Guanajuato, México. E-mails: masotelof@gmail.com, charobalmx@yahoo.com.mx, jmcarpio61@hotmail.com.
* Corresponding author

References
[1] Forrest J.J.H., Kalagnanam J., Ladanyi L., "A Column-Generation Approach to the Multiple Knapsack Problem with Color Constraints", INFORMS Journal on Computing, 2006, pp. 129-134.
[2] Garey M.R., Johnson D.S., Computers and Intractability: A Guide to the Theory of NP-Completeness, 1979.
[3] Kellerer H., Pferschy U., Pisinger D., Knapsack Problems, Springer, Berlin, 2004.
[4] Kennedy J., Eberhart R.C., "Particle Swarm Optimization". In: IEEE International Conference on Neural Networks, vol. 4, 1995, pp. 1942-1948.
[5] Kennedy J., Eberhart R.C., Swarm Intelligence, Academic Press, USA, 2001.
[6] Kiwiel K.C., "Breakpoint searching algorithms for the continuous quadratic knapsack problem", Math. Program., Springer-Verlag, 2008, pp. 473-491.
[7] Clerc M., Particle Swarm Optimization, ISTE Ltd, USA, 2006.
[8] McDuff-Spears W., Using neural networks and genetic algorithms as heuristics for NP-complete problems, Master of Science thesis in Computer Science, George Mason University, Virginia, 1989.
[9] Parsopoulus K.E., Vrahatis M.N., "Initializing the particle swarm optimizer using nonlinear simplex method", Advances in Intelligent Systems, Fuzzy Systems, Evolutionary Computation, 2002, pp. 216-221.
[10] Pham D., Ghanbarzadeh A., Koç E., Otris S., Rahim S., Zaidi M., "The bee algorithm - a novel tool for complex optimization problems", Intelligent Production Machines and Systems, 2006.
[11] Pisinger D., "Core Problems in Knapsack Algorithms", Operations Research, vol. 47, 1999, pp. 570-575.
[12] Pisinger D., "Where Are The Hard Knapsack Problems?", Computers & Operations Research, vol. 32, 2005, pp. 2271-2282.
[13] Pisinger D., Rasmussen A., Sandvik R., "Solution of Large Quadratic Knapsack Problems Through Aggressive Reduction", INFORMS Journal on Computing, INFORMS, 2007, pp. 280-290.
[14] Shahriar A.Z.M. et al., "A multiprocessor based heuristic for multi-dimensional multiple-choice knapsack problem", J. Supercomput., 2008, pp. 257-280.
[15] Martello S., Toth P., Knapsack Problems: Algorithms and Computer Implementations, John Wiley and Sons, New York, USA, 1990.
[16] Yamada T., Watanabe K., Kataoka S., "Algorithms to solve the knapsack constrained maximum spanning tree problem", International Journal of Computer Mathematics, Taylor & Francis Ltd, 2005, pp. 23-34.
[17] Zemel E., "An O(n) Algorithm for the linear multiple choice Knapsack problem and related problems", Information Processing Letters, North Holland, 1984, pp. 123-128.