pISSN 1897-8649 (PRINT) / eISSN 2080-2145 (ONLINE)
VOLUME 7
N° 4
2013
www.jamris.org
Journal of Automation, Mobile Robotics & Intelligent Systems
Editor-in-Chief Janusz Kacprzyk (Systems Research Institute, Polish Academy of Sciences; PIAP, Poland)
Associate Editors: Jacek Salach (Warsaw University of Technology, Poland) Maciej Trojnacki (Warsaw University of Technology and PIAP, Poland)
Co-Editors: Oscar Castillo (Tijuana Institute of Technology, Mexico) Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA) Kaoru Hirota (Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Japan) Witold Pedrycz (ECERF, University of Alberta, Canada) Roman Szewczyk (PIAP, Warsaw University of Technology, Poland)
Statistical Editor: Małgorzata Kaliczynska (PIAP, Poland)
Executive Editor: Anna Ladan aladan@piap.pl
Editorial Office: Industrial Research Institute for Automation and Measurements PIAP Al. Jerozolimskie 202, 02-486 Warsaw, POLAND Tel. +48-22-8740109, office@jamris.org
Copyright and reprint permissions: Executive Editor. The reference version of the journal is the printed one.
Editorial Board: Chairman: Janusz Kacprzyk (Polish Academy of Sciences; PIAP, Poland)
Patricia Melin (Tijuana Institute of Technology, Mexico)
Mariusz Andrzejczak (BUMAR, Poland)
Fazel Naghdy (University of Wollongong, Australia)
Plamen Angelov (Lancaster University, UK)
Zbigniew Nahorski (Polish Academy of Sciences, Poland)
Zenn Bien (Korea Advanced Institute of Science and Technology, Korea)
Antoni Niederlinski (Silesian University of Technology, Poland)
Adam Borkowski (Polish Academy of Sciences, Poland)
Witold Pedrycz (University of Alberta, Canada)
Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany)
Duc Truong Pham (Cardiff University, UK)
Chin Chen Chang (Feng Chia University, Taiwan)
Lech Polkowski (Polish-Japanese Institute of Information Technology, Poland)
Jorge Manuel Miranda Dias (University of Coimbra, Portugal)
Alain Pruski (University of Metz, France)
Bogdan Gabrys (Bournemouth University, UK)
Leszek Rutkowski (Czestochowa University of Technology, Poland)
Jan Jablkowski (PIAP, Poland)
Klaus Schilling (Julius-Maximilians-University Würzburg, Germany)
Stanisław Kaczanowski (PIAP, Poland)
Ryszard Tadeusiewicz (AGH University of Science and Technology in Cracow, Poland)
Tadeusz Kaczorek (Warsaw University of Technology, Poland)
Stanisław Tarasiewicz (University of Laval, Canada)
Marian P. Kazmierkowski (Warsaw University of Technology, Poland)
Piotr Tatjewski (Warsaw University of Technology, Poland)
Józef Korbicz (University of Zielona Góra, Poland)
Władysław Torbicz (Polish Academy of Sciences, Poland)
Krzysztof Kozłowski (Poznan University of Technology, Poland)
Leszek Trybus (Rzeszów University of Technology, Poland)
Eckart Kramer (Fachhochschule Eberswalde, Germany)
René Wamkeue (University of Québec, Canada)
Piotr Kulczycki (Cracow University of Technology, Poland)
Janusz Zalewski (Florida Gulf Coast University, USA)
Andrew Kusiak (University of Iowa, USA)
Marek Zaremba (University of Québec, Canada)
Mark Last (Ben–Gurion University of the Negev, Israel)
Teresa Zielinska (Warsaw University of Technology, Poland)
Tadeusz Missala (PIAP, Poland)
Publisher: Industrial Research Institute for Automation and Measurements PIAP
If in doubt about the proper edition of contributions, please contact the Executive Editor. Articles are reviewed, excluding advertisements and descriptions of products. All rights reserved ©
JOURNAL OF AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS, VOLUME 7, N° 4, 2013, DOI: 10.14313/JAMRIS_4-2013

CONTENTS

3  Editorial
Cezary Zieliński, Krzysztof Tchoń

5  On the Application of Elastic Band Method to Repeatable Inverse Kinematics in Robot Manipulators
Ignacy Dulęba, Michał Opałka
DOI: 10.14313/JAMRIS_4-2013/5

13  Impedance Controllers for Electric-Driven Robots
Edward Jezierski, Artur Gmerek
DOI: 10.14313/JAMRIS_4-2013/13

21  Safe Strategy of Door Opening with Impedance Controlled Manipulator
Tomasz Winiarski, Konrad Banachowicz, Maciej Stefańczyk
DOI: 10.14313/JAMRIS_4-2013/21

27  3D Camera and Lidar Utilization for Mobile Robot Navigation
Maciej Stefańczyk, Konrad Banachowicz, Michał Walęcki, Tomasz Winiarski
DOI: 10.14313/JAMRIS_4-2013/28

34  Optimization-Based Approach for Motion Planning of a Robot Walking on Rough Terrain
Dominik Belter
DOI: 10.14313/JAMRIS_4-2013/35

42  Experimental Verification of a Walking Robot Self-Localization System with the Kinect Sensor
Michał Nowicki, Piotr Skrzypczyński
DOI: 10.14313/JAMRIS_4-2013/43

52  Extensive Feature Set Approach in Facial Expression Recognition in Static Images
Mateusz Żarkowski
DOI: 10.14313/JAMRIS_4-2013/53

59  Speech Emotion Recognition System for Social Robots
Łukasz Juszkiewicz
DOI: 10.14313/JAMRIS_4-2013/59
Editorial
Interaction of Robots with the Environment

The intrinsic property of robots is their ability to interact with the environment. The type of environment in which robots operate can serve as one of many classification criteria, dividing them into broad classes: industrial, service, and field robots. This distinction reflects the degree of structuring that the environment exhibits. Industrial robots operate in highly structured environments, such as factories; service robots populate offices and homes, where the location of furniture or people changes; field robots operate in forests or fields, where a structure is hard to define at all. The structure of the environment in which a robot operates determines how it interacts with its surroundings. This issue of the Journal of Automation, Mobile Robotics and Intelligent Systems (JAMRIS) is devoted to how robots interact with the environment depending on its structure. A closer inspection of this topic should reveal commonalities influencing the design of future robot control and perception systems.

The idea of publishing an issue of JAMRIS devoted to the interaction of robots with the environment emerged from discussions during the 12th National Conference on Robotics, organized by the Department of Fundamental Cybernetics and Robotics, Institute of Computer Engineering, Control and Robotics, Wrocław University of Technology, in Świeradów-Zdrój, from the 12th to the 16th of September 2012. The conference showed that the Polish robotics community has produced significant results in this field, so the Program Committee decided it would be beneficial to put those achievements together within one publication, where they could complement each other. Hence, a selected group of authors working on diverse aspects of robot-environment interaction was invited to submit papers describing their research results. The papers contained in this issue of JAMRIS are extended and more comprehensive English versions of the conference presentations, subjected to the regular JAMRIS review procedure. Gratitude is due to all of the reviewers, whose in-depth comments enabled many clarifications and an overall improvement of the papers presented in this topical issue.

This volume of JAMRIS is composed of 8 papers that provide insight into four aspects of robot-environment interaction:
• interaction of redundant robots executing repeatable tasks within the factory environment,
• physical interaction of manipulators with rigid objects in diverse environments, including home and office settings,
• interaction of mobile robots or walking machines with semi-structured and unstructured environments,
• interaction of robot companions with people.

The first paper of this volume is devoted to the interaction of redundant robots with a well-structured environment, in which usually repeatable tasks are executed. Ignacy Dulęba and Michał Opałka authored the paper On the Application of Elastic Band Method to Repeatable Inverse Kinematics in Robot Manipulators, in which they tackle the problem of generating closed-loop trajectories for the end-effector motion of a redundant manipulator in such a way that the initial and terminal manipulator configurations are the same. This problem arises when a redundant manipulator has to execute repeatable motions in the operational space and the corresponding configuration space. For that purpose the elastic band method is used to design a repeatable inverse kinematics algorithm for robot manipulators. The final solution is obtained through optimization techniques. By simulating two robots, the authors compare their solution with a standard pseudo-inverse Jacobian algorithm, which does not preserve repeatability of inverse kinematics.

The next two papers are devoted to manipulators physically interacting with rigid objects. This requires force and torque control, which can be obtained by changing the mechanical impedance properties of the robot. The papers present both how such controllers should be designed and how they are used in task execution. Edward Jezierski and Artur Gmerek, in the paper entitled Impedance Controllers for Electric-Driven Robots, provide an insight into the design of robot actuation control exhibiting compliance. Based on the information
about approaching an obstacle, obtained from a proximity sensor, the mechanical impedance of the manipulator is adjusted, inducing the required compliance in the device and hence reducing the collision impact. Experimental results obtained by controlling a brushless DC motor, together with a simulation of a 2-dof manipulator, illustrate and validate the proposed control algorithms. The article entitled Safe Strategy of Door Opening with Impedance Controlled Manipulator, by Tomasz Winiarski, Konrad Banachowicz and Maciej Stefańczyk, concentrates on a single activity of service robots, namely their ability to open doors of rooms or cupboards. To that end, an impedance control law taking into account the force and velocity constraints imposed by the manipulated object was employed. Those constraints are due to the safety of the performed operation and directly influence the control law. The structure and operation of the designed controller are described formally, and the conducted experiments validate the proposed control strategy.

The third group of papers focuses on the interaction of diverse mobile robots with their environments. As their autonomous navigation requires both adequate perception of the environment and motion planning, these are the subjects investigated by those papers. The results of research conducted by Maciej Stefańczyk, Konrad Banachowicz, Michał Walęcki and Tomasz Winiarski are presented in the paper 3D Camera and Lidar Utilization for Mobile Robot Navigation. The authors deal with a navigation system based on the Kinect sensor and a laser scanner, capable of detecting obstacles in indoor environments. The control system behaviour is specified in terms of transition functions. The localization of the robot relative to the map is performed by an extended Kalman filter fusing three sources of data: wheel encoders, a gyroscope, and the laser scanner. The global robot position relative to the provided approximate map is calculated using a particle filter. The functioning of the system is verified in a cluttered office environment in which the mobile robot operates.

The paper Optimization-Based Approach for Motion Planning of a Robot Walking on Rough Terrain, authored by Dominik Belter, describes an algorithm generating leg-end and body trajectories for a hexapod robot walking on uneven terrain. For that purpose it uses the Rapidly Exploring Random Tree Connect and Particle Swarm Optimization algorithms. Static stability of the machine and motion constraints imposed by obstacles are taken into account. The quality of the elaborated algorithms was tested by simulation. The work of Michał Nowicki and Piotr Skrzypczyński, entitled Experimental Verification of a Walking Robot Self-Localization System with the Kinect Sensor, concerns the identification of the egomotion of an RGB-D sensor mounted on a hexapod robot. The presented algorithm relies purely on the depth information, i.e. a point cloud. The considerations fall into a broad category related to the design of Simultaneous Localization and Mapping algorithms. A two-stage point cloud matching procedure exploiting salient features in the range data is employed: first a feature-based matching procedure utilising Normal Aligned Radial Feature detectors/descriptors, followed by the Iterative Closest Points algorithm.

The fourth group of papers deals with natural interaction of robots with people, who inhabit the environment and thus, from the point of view of an autonomous robot, are simply a part of it. Acceptable interaction of robots and humans will be possible only if robots can socialize, so the study of emotions is a prerequisite. Hence, the identification of emotions from facial expressions as well as from speech tone and prosody is the topic of the next two papers. Mateusz Żarkowski, in the paper entitled Extensive Feature Set Approach in Facial Expression Recognition in Static Images, concentrates on facial emotion recognition of a person interacting with a robot. The system recognizes seven human emotions. It relies on feature extraction and selection, which facilitates defining new features and adding them to the feature set. Face tracking and parametrization are performed by the FaceTracker software, while feature selection is done by data mining software. The paper Speech Emotion Recognition System for Social Robots, authored by Łukasz Juszkiewicz, focuses on social robots, e.g. robot companions. Emotions of the person interacting with the robot are recognised using global acoustic features of speech. Four phases of Polish or German speech processing are described: calculation of speech parameters, acoustic feature extraction, feature selection, and classification. The paper makes the interesting observation that a purely acoustic recognition system has problems distinguishing anger from joy, and thus could result in a grave misjudgement of response by a social robot.

All of the mentioned aspects of robot-environment interaction are at the forefront of currently ongoing research into this subject. Each of the papers gives a valuable insight into a particular problem, providing its formulation and deriving a solution. This selection of papers reveals the wide scope and diversity of contemporary robotics.

Cezary Zieliński
Krzysztof Tchoń
On the Application of Elastic Band Method to Repeatable Inverse Kinematics in Robot Manipulators
Submitted: 20th January 2013; accepted: 4th June 2013
Ignacy Dulęba, Michał Opałka

DOI: 10.14313/JAMRIS_4-2013/5

Abstract: In this paper an idea of the elastic band method was exploited to design a repeatable inverse kinematics algorithm for robot manipulators. The method incorporates an optimization process at many stages of its performance and admits some extensions. The performance of the algorithm is illustrated on models of the three-DOF planar pendulum and the PUMA robot. A comparison with a standard pseudo-inverse Jacobian algorithm, which does not preserve repeatability of inverse kinematics, is also provided.

Keywords: inverse kinematics, repeatability, algorithm
1. Introduction

Robot manipulators are frequently used to perform repeatable tasks (assembling, drilling, painting) on a factory floor. From a technological point of view, it is desirable that a manipulator follow exactly the same trajectory when performing a repeatable task. In this case the problem of avoiding obstacles can be significantly reduced, as only one trajectory has to be tested for collision avoidance. Therefore, one of the most popular tasks for robot manipulators is the repeatable inverse kinematic task: to find a loop (closed path) in the configuration space that corresponds to a prescribed cycle of the end-effector in the taskspace [1]. This task is meaningful only for redundant manipulators, because for non-redundant manipulators a solution of the inverse kinematic task is determined uniquely as long as the trajectory does not meet any singular configuration.

A standard approach to constructing repeatable inverse kinematic algorithms relies on extending the original (redundant) kinematics by some extra functions (n − m items) to get extended kinematics which is non-redundant [10]. Outside singular configurations of the original kinematics, the added functions should preserve the full rank of the square Jacobian matrix of the extended kinematics (in other words, they should not introduce extra singular configurations not present in the original kinematics). This constraint is not too restrictive, so there exist many functions fulfilling the aforementioned minimal requirement. Naturally, one then wants to get an optimal set of functions with respect to a prescribed quality function [5]. This line was exploited in Karpińska's PhD thesis [4]. In this paper we follow a quite different approach that does not add any extra functions and avoids a
computationally expensive problem of their optimization. The proposed approach is based on the elastic band method, developed by Quinlan and Khatib [8], which was primarily designed to optimize a path of a mobile robot [2]. The idea of this method is the following: an initial admissible path is assumed, and then it is deformed iteratively to optimize a given quality function until a stop condition is satisfied. The deformation was originally based on a potential field method that tried to avoid obstacles by maximizing the distance from the path to obstacles, while making the path short and smooth. This idea is very close to a more general approach known in mathematics as a continuation method [9]. The continuation method also constructs a final solution in an iterative process. The first approximation of the solution (maybe not admissible, i.e. not satisfying all requirements imposed on the original task) should be known, and usually it is simple to determine. Then, a sequence of solutions is constructed which decreases a certain quality function measuring how far the current solution is from one satisfying all the requirements. In the limit, a solution of the original task satisfying all imposed requirements is obtained.

In our adaptation of the elastic band method to the repeatable inverse kinematic task, the role of the deformed path is played by a cyclic trajectory in the configuration space. It is modified in such a way that the modified trajectory remains cyclic and its image, via forward kinematics, is closer to the prescribed cyclic path in the taskspace. In this paper we use the term trajectory as a synonym for a function in the configuration space parametrized with the same variable that parametrizes the prescribed path in the taskspace (usually a trajectory is described as a function of time).

The paper, being an extended version of [3], is organized as follows. In Section 2 the repeatable inverse kinematic task is defined formally, and an algorithm based on the elastic band method to solve the task is discussed in detail. Simulation results of the proposed algorithm are collected in Section 3. Section 4 concludes the paper.
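Viewed as pseudocode, the pattern shared by the elastic band and continuation methods is a short loop. The sketch below is generic only: deform, quality, and the threshold tol are placeholders for the problem-specific choices introduced in Section 2.

```python
def elastic_band(initial_path, deform, quality, tol, max_iter=1000):
    """Iteratively deform an initial (possibly trivial) path until the
    quality measure drops below tol -- the continuation-method pattern."""
    path = initial_path
    for _ in range(max_iter):
        if quality(path) < tol:
            return path
        path = deform(path)  # local modification decreasing the quality measure
    return path
```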
2. Elastic band method in the repeatable inverse kinematic task

Forward kinematics of a robot manipulator is a mapping [11]

k : Q ∋ q → x = k(q) ∈ X ⊂ SE(3),   (1)
where q is a configuration living in the configuration space Q and x denotes a point in the taskspace X. For redundant manipulators the dimensionality of the configuration space is larger than the dimensionality of the taskspace,

dim Q = n > m = dim X.   (2)
With the kinematics (1) a Jacobian matrix is associated, given by J(q) = ∂k/∂q. Let a loop (a closed path)

{x(s), s ∈ [0, smax], x(0) = x(smax)}   (3)

be given in the taskspace, and let an initial configuration q0 be known, corresponding to the start point of the loop, k(q0) = x0. The task of repeatable inverse kinematics is to find a trajectory q(s) in the configuration space satisfying

∀ s ∈ [0, smax]:  k(q(s)) = x(s),  q(0) = q(smax) = q0.   (4)

Additionally, the sum of squared lengths of the sub-trajectories in the configuration space,

∫₀^smax (∂q/∂s)ᵀ (∂q/∂s) ds = ∫₀^smax ⟨∂q/∂s, ∂q/∂s⟩ ds,   (5)
should be minimized. In (5), T stands for transposition and ⟨·,·⟩ denotes the inner product. We prefer to evaluate a trajectory according to Eq. (5), as it is faster to compute than the total length of the trajectory.

The basic algorithm exploited in solving the aforementioned task is the Newton algorithm, designed to solve the inverse kinematics when the goal is to reach a single point in the taskspace. The algorithm is described by the iteration process [7, 11], with a possible optimization in the null space of the Jacobian:

q_{i+1} = q_i + ξ1 · J#(q_i)(xf − k(q_i)) + ξ2 (I − J#(q_i) J(q_i)) ∂f(q)/∂q |_{q=q_i},   (6)
where J# = Jᵀ(JJᵀ)⁻¹ is the pseudo-inverse of the Jacobian matrix, q0 is the initial configuration of the manipulator, xf denotes the current goal point in the taskspace, I is the (n × n) identity matrix, and f(q) is a scalar function to be optimized (the optimization can be switched off by setting the value of ξ2 to 0).

The main steps of the algorithm based on the elastic band method, applied to the repeatable inverse kinematic task, are the following.

Step 1. Select an initial loop in the configuration space q(s), s ∈ [0, smax], with q(0) = q(smax). The loop becomes the current trajectory.

Step 2. Define an error function that assigns to a cyclic trajectory q(·) the accumulated distance of its image (via forward kinematics) to the traced path x(·) in the taskspace:

err(q(·)) = ∫₀^smax ||x(s) − k(q(s))|| ds,   (7)

where ||·|| is an assumed norm.
Step 3. Compute the error (7) for the current trajectory.

Step 4. If the error is below an acceptable threshold, complete the computations and output the resulting current trajectory. Otherwise proceed with Step 5.

Step 5. Modify the current trajectory (either at a single point or on an interval), preserving cyclicity of the trajectory, with the aim of decreasing the value of the error (7). The modified trajectory becomes the new current trajectory. Continue with Step 3.

More details concerning the algorithm follow.

- A few terms and notations used extensively later on:
  - a node point is any point placed on the given path x(·);
  - a node configuration is a configuration whose image via forward kinematics is a node point;
  - a node configuration q, when included into a trajectory q(s), gets the same argument s as the corresponding node point x(s); the argument s of x(s) is determined uniquely;
  - (q(s), x(s)) = Newton(q0, x(s), f(q)) denotes that the Newton algorithm (6), with the initial configuration q0, the fixed goal point x(s) (as s is fixed), and minimizing the quality function f(q), was run and produced as its output the node configuration–node point pair (q(s), x(s)).
- The selection of the initial cyclic trajectory (passing through q0) is unrestricted. However, this selection impacts both the speed of convergence and the optimality of the result. The most natural (and simplest) choice is a trivial loop, i.e. ∀ s ∈ [0, smax]: q(s) = q0.
- Consecutive node configurations are interpolated linearly, although higher-order polynomials are also possible.
- An invariant of the algorithm (a characteristic that does not change from one iteration to another) is the cyclicity of the current trajectory. Initially there is a trivial loop q(0) → q(smax) = q(0). After adding a single node configuration, say q(smax/2) corresponding to x(smax/2), the closed loop is q(0) → q(smax/2) → q(smax) = q(0), etc.
- The norm used in Eq. (7) depends on the nature of the taskspace. When all coordinates are positional, the Euclidean norm is the most appropriate. When the taskspace also includes angular coordinates, some norm inherited from the special Euclidean group SE(3) [12] should be used. To simplify computations, the integral (7) can be replaced by a sum of local errors for arguments s between currently existing node configurations.
- The key step of the algorithm, Step 5, can be implemented in at least two ways:

Scheme 1: on the prescribed path x(·), the node point x(s⋆) furthest from x(0) is determined.
Then, the Newton algorithm is run: (q(s⋆), x(s⋆)) = Newton(q(0), x(s⋆), f(q)), where f(q) = ||q − q(0)||. The following computations are defined recursively. Assume that a sequence S = ((q(si), x(si)), si < si+1) of node configuration–node point pairs has already been determined. After the first stage, S = ((q(0), x(0)), (q(s⋆), x(s⋆)), (q(smax), x(smax))). Note that the sequence is cyclic (the last pair in the sequence is the same as the first one; the only difference is that the argument equals smax instead of 0). In each (or some) of the sub-intervals [si, si+1], the node x(s⋆) furthest from the line segment x(si)x(si+1) is selected. The Newton algorithm is run:

(q(s⋆), x(s⋆)) = Newton(qini, x(s⋆), f(q)),
initialized at the configuration

qini = λ q(si) + (1 − λ) q(si+1),   (8)

with

λ = (s⋆ − si)/(si+1 − si) ∈ (0, 1),   (9)

i.e. qini is the weighted sum of the two neighbouring node configurations, while the optimized function is given as

f(q) = ||q − q(si)|| + ||q − q(si+1)||.   (10)

The resulting pair (q(s⋆), x(s⋆)) is added to S, and S is sorted so that ∀i: si < si+1. This idea is illustrated graphically in Fig. 1a.

Scheme 2: let a sequence of node points (x(s1), …, x(sk)) be given, with 0 < si < si+1 < smax. Simultaneously, a set of k Newton algorithms, Newton(q(0), x(si), f(q)) for i ∈ {1, …, k}, is run with the quality function (cf. Fig. 2)

f(q) = ||q − qʲ(si−1)|| + ||q − qʲ(si+1)||,   (11)
Fig. 1. a – generation of node points with Scheme 1 (numbers correspond to the phase of adding the nodes); b – generation of node points with Scheme 2
Fig. 2. Evolution of configurations from q⁰(sm) to q(sm) to reach the point k(q(sm)) = x(sm)
where qʲ(si) denotes the configuration obtained in the j-th iteration of the Newton algorithm run for the i-th node point. Obviously, not all tasks i ∈ {1, …, k} will be completed after the same number of iterations. If the i-th task has been finished after ji iterations, it is assumed that ∀ j > ji: qʲ(si) = q^ji(si). Graphically this idea is visualized in Fig. 1b.

The two schemes rely on the same idea: generate a new configuration that approaches the required node point while increasing the length of the trajectory as little as possible. The first scheme, however, realizes it sequentially, while the second one does so in parallel. The procedure of generating new node configurations is repeated until their number is large enough to consider the resulting trajectory a solution of the repeatable inverse kinematic task. To this
aim, let us define, for the set of node configurations SQ = {q(si), i = 1, …, K} extracted from the sequence S, the distance

d(k(SQ), x(·)) = max_{i=1,…,K−1} || k( (q(si) + q(si+1))/2 ) − x( (si + si+1)/2 ) ||.   (12)
If the condition

d(k(SQ), x(·)) ≤ δ   (13)
(where δ denotes an assumed acceptable error) is satisfied, the loop in the configuration space has been found (with the linear interpolation of consecutive node configurations as the output). Otherwise, in those segments where condition (13) was violated, the computation progresses according to the selected scheme.
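The stopping test (12)–(13) is straightforward to implement once the forward kinematics k and the prescribed path x are available as functions; the sketch below assumes node configurations and their arguments are stored in parallel arrays.

```python
import numpy as np

def midpoint_deviation(k, q_nodes, s_nodes, x):
    """Distance d of Eq. (12): worst taskspace error at segment midpoints,
    with node configurations interpolated linearly."""
    return max(
        np.linalg.norm(k(0.5 * (q_nodes[i] + q_nodes[i + 1]))
                       - x(0.5 * (s_nodes[i] + s_nodes[i + 1])))
        for i in range(len(q_nodes) - 1)
    )

# Refine only while criterion (13) is violated:
#   while midpoint_deviation(k, q_nodes, s_nodes, x) > delta:
#       ...add node configurations according to Scheme 1 or Scheme 2
```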
One more aspect of the algorithm (6) with optimization in the null space of the Jacobian matrix (ξ2 ≠ 0) should be considered, because this algorithm is run many times and greatly impacts the overall computational complexity. The simplest choice of the coefficients ξ1, ξ2 is to set their values small and constant. However, this is probably not the best possible choice. To support this claim, notice that: 1) the computational scheme (6) is valid only for a small configuration increase Δqi = ξ1 Δqi¹ + ξ2 Δqi², because Eq. (6) is based on a linearization of the kinematics around the current configuration; 2) the nature of the components

Δqi¹ = J#(qi)(xf − k(qi))   (14)

and

Δqi² = (I − J#(qi) J(qi)) ∂f/∂q |_{q=qi}   (15)

is quite different, although both generate values in ℝⁿ. Their values may strongly depend on the stage of computations. For example, the first component (14) takes smaller and smaller values as the point corresponding to the current configuration gets closer and closer to the goal point in the taskspace. The first component (14) is responsible for the convergence of the algorithm (6), so it should get higher priority than the other component (15), responsible for the optimization of f(q). Therefore, in a practical implementation of the Newton algorithm (6) we propose the following procedure. Assume that the length of the admissible change of configuration in a single iteration equals Δ. In the current configuration qi, the components Δqi¹, Δqi² are computed. Then, the values of ξ1, ξ2 are set to get ||Δqi|| = Δ. Finally, a one-dimensional optimization process is invoked with the quality function f(qi + ξ1 Δqi¹ + ξ2 Δqi²) and the independent variable ξ2, where ξ1(ξ2) is computed from

||ξ1 Δqi¹ + ξ2 Δqi²|| = Δ,   (16)

while the convergence condition

||k(qi + ξ1 Δqi¹ + ξ2 Δqi²) − xf|| < ||k(qi) − xf||   (17)

must hold. If the condition (17) cannot be satisfied (i.e. there does not exist an admissible ξ2), the value of Δ is decreased and the procedure is run once again. Obviously, in a close vicinity of the goal point the value of Δ is decreased obligatorily, Δ = ||k(qi) − xf||. Because the forward kinematics k(q) as well as the quality function f(q) are continuous functions of q, the aforementioned optimization process is convergent outside singular configurations (where the matrix J# becomes ill-posed), and the convergence is relatively fast, because the one-dimensional optimization invokes only the computationally cheap kinematics. It is worth noticing that in a close vicinity of singular configurations a robust pseudo-inverse should be used: J# is temporarily replaced by Jᵀ(JJᵀ + γI)⁻¹ with a small parameter γ. Consequently, the loop in the configuration space can be generated even in this case.
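A compact sketch of a single iteration of (6) is given below. It folds the step-length control into a simple rescaling (the fixed weight w2 on the null-space term stands in for the one-dimensional search over ξ2 described above, and is an assumption of this sketch), and it uses the damped pseudo-inverse Jᵀ(JJᵀ + γI)⁻¹ throughout, which reduces to J# for γ → 0. The callables k, J, and grad_f are the forward kinematics, the Jacobian, and the gradient of the optimized function.

```python
import numpy as np

def newton_step(q, x_goal, k, J, grad_f=None,
                delta=np.deg2rad(0.5), gamma=1e-9, w2=0.5):
    """One iteration of Eq. (6), with the robust pseudo-inverse and the step
    rescaled so that ||dq|| <= delta (shrunk near the goal)."""
    e = x_goal - k(q)                                  # taskspace error
    Jq = J(q)
    J_pinv = Jq.T @ np.linalg.inv(Jq @ Jq.T + gamma * np.eye(Jq.shape[0]))
    dq = J_pinv @ e                                    # component (14)
    if grad_f is not None:
        P = np.eye(len(q)) - J_pinv @ Jq               # null-space projector
        dq = dq + w2 * (P @ grad_f(q))                 # component (15)
    n = np.linalg.norm(dq)
    step = min(delta, np.linalg.norm(e))               # Delta = ||k(q) - xf|| near the goal
    return q + (step / n) * dq if n > 1e-12 else q
```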
3. Simulations

An illustration of the elastic band method applied to the repeatable inverse kinematic task was performed on two robots. The first model is the 3-dof planar pendulum visualized in Fig. 3, with kinematics given by

x = a1 c1 + a2 c12 + a3 c123,
y = a1 s1 + a2 s12 + a3 s123,   (18)

where ai denote the lengths of the manipulator's links. The standard robotic convention is exploited to abbreviate trigonometric functions: c12 = cos(q1 + q2), s123 = sin(q1 + q2 + q3). In all simulations, all link lengths of the planar pendulum were assumed equal to 1.
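Equation (18) and its analytic Jacobian translate directly into code; the sketch below (with the unit link lengths assumed above) provides the k and J callables usable by a Newton-type iteration.

```python
import numpy as np

def k_pendulum(q, a=(1.0, 1.0, 1.0)):
    """Forward kinematics (18) of the 3-dof planar pendulum."""
    c = np.cumsum(q)                      # q1, q1+q2, q1+q2+q3
    return np.array([sum(ai * np.cos(ci) for ai, ci in zip(a, c)),
                     sum(ai * np.sin(ci) for ai, ci in zip(a, c))])

def J_pendulum(q, a=(1.0, 1.0, 1.0)):
    """Jacobian of (18): column j sums the terms with index >= j."""
    c = np.cumsum(q)
    J = np.zeros((2, 3))
    for j in range(3):
        J[0, j] = -sum(a[i] * np.sin(c[i]) for i in range(j, 3))
        J[1, j] = sum(a[i] * np.cos(c[i]) for i in range(j, 3))
    return J
```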
Fig. 3. Kinematic structure of the 3-dof planar pendulum

As the second model, the industrial PUMA robot was chosen (Fig. 4). PUMA's positional kinematics (x, y, z) is given by [6]

x = c1 A − s1 (d2 + d6 s4 s5),
y = s1 A + c1 (d2 + d6 s4 s5),
z = −a2 s2 + d4 c23 + d6 (c5 c23 − c4 s5 s23),   (19)

where A = a2 c2 + d4 s23 + d6 (c4 s5 c23 + c5 s23), and the geometric parameters are a2 = 0.432 m, d2 = 0.0745 m, d4 = 0.432 m, d6 = 0.056 m.
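The positional kinematics (19) is equally direct to transcribe (angles in radians, lengths in meters); for a Newton-type iteration its Jacobian can be obtained, for instance, by finite differences.

```python
import numpy as np

def k_puma(q, a2=0.432, d2=0.0745, d4=0.432, d6=0.056):
    """Positional kinematics (19) of the PUMA robot; q = (q1, ..., q6)."""
    s, c = np.sin(q), np.cos(q)
    s23, c23 = np.sin(q[1] + q[2]), np.cos(q[1] + q[2])
    A = a2 * c[1] + d4 * s23 + d6 * (c[3] * s[4] * c23 + c[4] * s23)
    return np.array([c[0] * A - s[0] * (d2 + d6 * s[3] * s[4]),
                     s[0] * A + c[0] * (d2 + d6 * s[3] * s[4]),
                     -a2 * s[1] + d4 * c23 + d6 * (c[4] * c23 - c[3] * s[4] * s23)])
```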
Fig. 4. Kinematic structure of the PUMA robot

In order to compare the performance of both schemes, two closed loops in the taskspace were designed (cf. Fig. 5). The first path (Path 1),
(x1(s), y1(s), z1(s)), is a circle:

x1(s) = xc + R cos(2πs),
y1(s) = yc + R sin(2πs),
z1(s) = zc + R cos(2πs),  s ∈ [0, 1].   (20)
The second path (Path 2), (x2(s), y2(s), z2(s)), is a Lissajous-like curve:

x2(s) = xc + R sin(2πs),
y2(s) = yc + R sin(πs) − (1/4) R cos(8πs),
z2(s) = zc + R cos(2πs),  s ∈ [0, 1].   (21)

For both loops the radius equals R = 0.9 and the center point is placed at (xc, yc, zc) = (1, 1, 0). For the 3-dof pendulum, the third component of the vector (x, y, z) is neglected everywhere.
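In code, the two test loops (20)–(21) read as follows (for the planar pendulum only the first two components are used):

```python
import numpy as np

def path1(s, R=0.9, xc=1.0, yc=1.0, zc=0.0):
    """Circle (20)."""
    return np.array([xc + R * np.cos(2 * np.pi * s),
                     yc + R * np.sin(2 * np.pi * s),
                     zc + R * np.cos(2 * np.pi * s)])

def path2(s, R=0.9, xc=1.0, yc=1.0, zc=0.0):
    """Lissajous-like curve (21)."""
    return np.array([xc + R * np.sin(2 * np.pi * s),
                     yc + R * np.sin(np.pi * s) - 0.25 * R * np.cos(8 * np.pi * s),
                     zc + R * np.cos(2 * np.pi * s)])
```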
Fig. 5. a – Path 1 – the circle; b – the Lissajous-like Path 2

The admissible error for reaching each consecutive goal point in the Newton algorithm was set to 0.001, while the maximal change of configuration in a single iteration was Δ = 0.5°. The parameter ξ1 of the basic Newton algorithm (6) was derived from Δ, i.e. ξ1 = Δqi¹/Δ, cf. Eq. (14). When the null-space optimization was performed (cf. Eqs. (16)–(17)), the scales ξ1, ξ2 of the lengths of Δqi¹, Δqi², Eqs. (14)–(15), were determined in a one-dimensional optimization process, preserving the final length of the configuration change equal to Δ. Two versions of the elastic band method were run. For Path 1 and Path 2, the initial configurations were selected as q1(0) = (−18.96°, 37.93°, 70.54°)ᵀ
Fig. 6. Trajectories of the 3-dof pendulum generated for Path 1 with: a) the pseudo-inverse Jacobian method, b) the elastic band method (Scheme 1), c) the elastic band method (Scheme 2)
and q2(0) = (−26.05°, 58.80°, 104.92°)ᵀ, respectively, which correspond to the initial point of the cyclic path (x(0), y(0))ᵀ₁,₂. The results for the 3-dof pendulum are visualized in Figs. 6–7(b–c), while a stroboscopic view of several postures of the manipulator is depicted in Fig. 8.

Tab. 1. Numerical results for the 3-dof pendulum

Path 1:
method          | length [°] | num. of pts. | time [s]
pseudo-inverse  | 253.05     | 201          | 0.10
Scheme 1        | 241.41     | 201          | 2.10
Scheme 2        | 241.08     | 257          | 0.50

Path 2:
method          | length [°] | num. of pts. | time [s]
pseudo-inverse  | 253.78     | 201          | 0.11
Scheme 1        | 254.48     | 201          | 2.30
Scheme 2        | 254.04     | 257          | 0.59
Fig. 7. Trajectories of the 3-dof pendulum generated for Path 2 with: a) the pseudo-inverse Jacobian method, b) the elastic band method (Scheme 1), c) the elastic band method (Scheme 2)
To compare the efficiency of the elastic band method with the Jacobian pseudo-inverse method (i.e. the algorithm (6) with ξ2 = 0), some more tests were carried out. The first task of the pseudo-inverse Jacobian method was run with the initial configuration q0 and the goal xf = x(Δs), where Δs = 0.005. Consecutive goals were selected as xf = x(iΔs), i = 2, …, smax/Δs, and the node configuration from the (i−1)-st task became the initial one for the i-th task. The pseudo-inverse algorithm does not guarantee repeatability: the final configurations for Path 1 and Path 2, q1(1) = (−6.03°, 12.30°, 88.81°)ᵀ and q2(1) = (−24.32°, 56.56°, 106.96°)ᵀ respectively, did not meet the initial configurations q0. The lengths of the resulting trajectories, computed as Σ_{k=0}^{K−1} ||q(s_{k+1}) − q(s_k)|| where K is the final number of node configurations, the numbers of iterations needed to complete the algorithms, and the elapsed times (the algorithms were implemented in the Mathematica package and run on a 2 × 3.3 GHz computer) for Schemes 1 and 2,
Fig. 8. Stroboscopic view of the 3-dof pendulum following Path 2 with: a) the pseudo-inverse Jacobian method, b) the elastic band method (Scheme 1), c) the elastic band method (Scheme 2)
as well as for the pseudo-inverse Jacobian method, are gathered in Table 1. The results may seem rather unexpected, as the more constrained method (the repeatable inverse) generated trajectories shorter than (or of almost the same length as) the unconstrained method (pseudo-inverse). To explain this phenomenon, notice that the pseudo-inverse method optimizes the change of configuration locally (from one iteration to another), while the elastic band method exploits the global distribution of consecutive node points. For this particular robot and Path 1, it appears that the global knowledge in the constrained task impacts the final trajectory more than the local optimization in the unconstrained task. It can easily be noticed (cf. Table 1) that the pseudo-inverse method without optimization within the null space of the Jacobian matrix is the fastest among the methods tested, and that Scheme 2 generated results much faster than Scheme 1.

Finally, the two schemes were tested on the PUMA robot using the cyclic Path 1 and Path 2. Once again, a comparison with the pseudo-inverse Jacobian method was performed. For Path 1, the initial configuration was chosen as q(0) = (−0.84°, 62.40°, −67.05°, 46.58°, 44.06°, 45°)ᵀ, while the final configuration was computed as q(1) = (9.29°, 66.17°, −64.37°, −24.97°, 15.93°, 45°)ᵀ. The pseudo-inverse method run for Path 2 also did not satisfy the repeatability condition, as q(0) = (−11.25°, 57.87°, −78.52°, 41.93°, 34.17°, 45°)ᵀ and q(1) = (−6.67°, 61.80°, −77.17°, 43.16°, 21.58°, 45°)ᵀ. Numerical characteristics are gathered in Tab. 2, while the computed trajectories are plotted in Figs. 9–10. As expected, the pseudo-inverse method did not generate a closed-loop trajectory in the configuration space; when the elastic band method was applied, for both schemes, the resulting trajectory remained cyclic. The ordering of the tested methods with respect to the elapsed time, observed for the model of the 3-dof pendulum, remains valid also for the PUMA robot. One can notice that the lengths of the trajectories for the 3-dof planar pendulum are almost the same for both initial paths given in ℝ², while the lengths of the trajectories for the PUMA robot, whose paths are given in ℝ³, differ significantly. This observation should draw the attention of path designers to locating the path appropriately within the taskspace (usually the shape of the path is determined by technological requirements, but its location not necessarily).
4. Conclusions

In this paper an adaptation of the elastic band method to the repeatable inverse kinematic task was presented. Advantages of the proposed approach include: the application of known and simple methods (the Newton algorithm with an optimization in the null space of the Jacobian matrix); the simplicity of kinematic computations, which allows one to implement the algorithm also in an on-line regime; and the admission of many variants of changing the current cyclic trajectory, as well
Fig. 9. Trajectories of the PUMA robot generated for Path 1 with: a) the Jacobian pseudo-inverse method, b) the elastic band method (Scheme 1), c) the elastic band method (Scheme 2)

Tab. 2. Numerical results for the PUMA robot

Path 1:
method          | length [°] | num. of pts. | time [s]
pseudo-inverse  | 347.59     | 201          | 0.39
Scheme 1        | 384.79     | 201          | 9.06
Scheme 2        | 344.26     | 257          | 1.62

Path 2:
method          | length [°] | num. of pts. | time [s]
pseudo-inverse  | 236.62     | 201          | 0.37
Scheme 1        | 249.12     | 201          | 7.14
Scheme 2        | 237.84     | 257          | 1.47
as optimization at many stages of the algorithm (the Newton algorithm, the selection of the number and distribution of node points). The algorithm also admits an extension consisting in optimizing the initial node configuration (in the current version it was assumed to be given as initial data). One more way to extend the proposed algorithm is to adapt it to obstacle-cluttered environments, possibly via redefining the quality function f to penalize approaching obstacles.

Fig. 10. Trajectories of the PUMA robot generated for Path 2 with: a) the Jacobian pseudo-inverse method, b) the elastic band method (Scheme 1), c) the elastic band method (Scheme 2)

AUTHORS
Ignacy Dulęba* – Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, 50-372 Wroclaw, Janiszewski St. 11/17, e-mail: ignacy.duleba@pwr.wroc.pl.
Michał Opałka – Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, 50-372 Wroclaw, Janiszewski St. 11/17, e-mail: michal.opalka@pwr.wroc.pl.
* Corresponding author

REFERENCES
[1] Chiaverini S., Oriolo G., Walker I.D., "Kinematically redundant manipulators". In: Handbook of Robotics, Springer Verlag, Berlin, 2008, 245–268.
[2] Dulęba I., Methods and algorithms of motion planning for manipulators and mobile robots, EXIT, Academic Science Publisher, Warsaw, 2001, in Polish.
[3] Dulęba I., Opałka M., "Using the elastic band method to repeatable inverse kinematic algorithm for manipulators", Science Works, Warsaw Univ. of Technology, Electronics series, vol. 182, 2012, 351–356, in Polish.
[4] Karpińska J., Approximation of algorithms for robot motion planning, PhD thesis, Wroclaw University of Technology, 2012, in Polish.
[5] Klein C.A., Chu-Jeng C., Ahmed S., "A new formulation of the extended Jacobian method and its use in mapping algorithmic singularities for kinematically redundant manipulators", IEEE Trans. Robot. Autom., vol. 11, 1995, 50–55.
[6] Lee C., "Robot arm kinematics, dynamics, and control", Computer, vol. 15, no. 12, 1982, 62–80.
[7] Nakamura Y., Advanced Robotics: Redundancy and Optimization, Addison Wesley, New York, 1991.
[8] Quinlan S., Khatib O., "Elastic bands: Connecting path and control", IEEE Int. Conf. on Robotics and Automation, vol. 2, 1993, 802–807.
[9] Richter S., DeCarlo R., "Continuation methods: Theory and applications", IEEE Trans. on Automatic Control, vol. 28, no. 6, 1983, 660–665.
[10] Roberts R., Maciejewski A.A., "Repeatable generalized inverse control strategies for kinematically redundant manipulators", IEEE Trans. on Automatic Control, vol. 38, 1993, 689–699.
[11] Spong M., Vidyasagar M., Introduction to Robotics: Robot Dynamics and Control, MIT Press, Cambridge, 1989.
[12] Tchoń K., Dulęba I., "Definition of a kinematic metric for robot manipulators", Journal of Robotic Systems, vol. 11, no. 3, 1994, 211–221.
Impedance Controllers for Electric-Driven Robots

Submitted: 15th January 2013; accepted: 3rd June 2013

Edward Jezierski, Artur Gmerek
Abstract: This article presents a proposal of an impedance controller which is able to influence the compliance of a robot, based on information obtained from a low-level controller about the robot's interaction with the environment. The proposed system consists of a low-level controller with a proximity sensor, based on which the mechanical impedance is adjusted. The intention of the creators was to develop a universal impedance regulator in the sense of both the hardware and software layers. Because of the TCP/IP architecture, the designed regulator can easily be adapted to different robot controllers. Experiments were conducted on a real 1-DOF manipulator driven by a BLDC motor, as well as in simulation on a 2-DOF planar robot.

Keywords: impedance controller, admittance controller, 2-DOF manipulator, BLDC drive
1. Introduction

Nowadays, in an industrial environment, we often deal with positional robot control. This type of control is applied when the manipulator is highly rigid, i.e. the kinematic chain is characterized by high impedance. Sometimes, however, the kinematic chain needs to be softened; in this situation impedance control can be used. A robot's mechanical impedance is usually equated with the impedance of its mechanical part. The concept of impedance control was first introduced to the field of robotics by Hogan [3]. If we assume the velocity–current analogy [5, 6], then mechanical impedance can be defined with greater precision: after adopting an incremental linear dynamical model in the vicinity of a chosen operating point, the mechanical impedance of the kinematic chain is defined as the ratio of the Laplace transform of the force exerted by the end-effector of the manipulator to the Laplace transform of the effector's velocity.

Proper control of a robot's actuators can change the impedance of its kinematic chain. This is relatively easily manageable in the case of pneumatic or hydraulic actuators [2], whose rigidity is naturally related to the energy-transferring media. In the case of electric drives, kinematic chain stiffening or softening has to be realized with the help of feedback mechanisms. In this process it is necessary to anticipate how the manipulator will interact with the environment, as changes in impedance cannot be arbitrarily fast. This is due to the fact that electric actuators are stiff by the very configuration of the drives (usually high velocity and a high gear ratio).
In most cases interaction with the environment must be detected early enough to enable the controller to respond effectively. This can be achieved by gathering information about the environment from the robot's external sensors. This article proposes a system that makes it easy and efficient to connect the manipulator controller with a proximity sensor. It consists of a low-level driver which works with the designed computer system. The data collected by the sensor is sent to the impedance controller software, which processes the data and controls the electric drive associated with the controller in order to obtain the desired flexibility. The manipulator can thus be prepared to interact with an external object so as to avoid harmful collisions. Experiments were done on a robot with one degree of freedom (DOF), as well as in simulation on a 2-DOF planar robot.

Impedance control is currently under development in a growing number of research centers. Both stiffness and damping adjusters can be viewed as primitive impedance drivers. There are only a few studies describing how to couple the control system with external sensors in order to optimize manipulator impedance. Interaction with the environment is mostly sensed with force sensors [1, 7]. However, force sensors are effective only when the manipulator's movements are sufficiently slow. If the manipulator velocity is high, potential collisions with a rigid environment should be detected in advance. This can be accomplished with the help of proximity sensors, radars, or vision systems. An example of this type of solution is described in [9]: the authors used information from the vision system to influence the manipulator impedance, which was switched abruptly when the robot approached an object.
2. Impedance Controller

The manipulator's desirable properties are determined by the environment and the features of the robot controller. Usually it is assumed that if the robot is close to a person, or detects an undesirable object on its planned trajectory, the susceptibility of the kinematic chain should increase. When the sensor detects an approaching object, the controller changes the manipulator's impedance, depending on the distance between the robot's arm and the object. The system presented in this article consists of a brushless DC motor (BLDC) which drives the robot link, an inductive proximity sensor with a low-level controller, and a computer system that processes the data and controls the motor. The computer system
consists of two programs. The first captures and processes data from the low-level driver, while the second controls the motor (Fig. 1).
Fig. 1. The overall block diagram of the control system for the BLDC motor
2.1. The Impedance Driver

The dynamics of the manipulator in contact with the environment can be described as follows:

M(q)q̈ + C(q, q̇)q̇ + g(q) + f(q, q̇) = τ − τext,   (1)

where M is the inertia matrix, C is the matrix of velocity couplings, g is the vector of gravitational forces, f is the friction vector, τ is the vector of control torques, and τext is the vector of external forces. The external forces acting on the manipulator can be approximated by means of a mass-spring-damper system, which gives

τext = m(q̈ − q̈0) + kd(q̇ − q̇0) + ks(q − q0),   (2)

where m is the environment's inertia, kd its damping, and ks its stiffness. Transforming equation (2) to determine the joint acceleration, we get

q̈ = q̈0 + m⁻¹(τext − kd(q̇ − q̇0) − ks(q − q0)).   (3)

Substituting (3) into (1), we get

M(q)(q̈0 + m⁻¹(τext − kd(q̇ − q̇0) − ks(q − q0))) + C(q, q̇)q̇ + g(q) + f(q, q̇) + τext = τ.   (4)

In the case of a 1-DOF manipulator, the last equation reduces to its scalar counterpart, with the inertia matrix replaced by the link's moment of inertia and the gravity vector by a gravity term proportional to sin(q); this scalar form is referred to as Eq. (5) below. Equation (5) can be used to control the robot by means of computed-torque feedforward control. It shows that the response of a conventional driver's impedance depends on the values of acceleration, velocity and position, as well as on the value of the force controller, the set controller parameters (m, kd and ks) and the object's dynamic parameters.

However, the equation fulfills its function only when the manipulator actively interacts with the environment. Due to the fact that the force can be recorded only when the manipulator establishes contact with the environment, it was decided to use a virtual force instead of the actual force. The modified (virtual) force can be expressed as follows:

τvirt = κ a,   (6)

where a is the estimated position based on information obtained from the position sensor, and κ [Nm] is a proportionality parameter (obtained by heuristic methods). The equation shows that the reaction of the manipulator does not depend on the torque value but is associated with the distance from the object to the manipulator. The proportionality factor has been selected to provide an appropriate response when the outer arm approaches the object.
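A minimal numerical sketch of the virtual-force idea follows. The gains ks, kd, m and κ below are illustrative values, not those identified in the experiments, and the function names are hypothetical.

```python
def virtual_torque(a, kappa=0.2):
    """Virtual force of Eq. (6): tau_virt = kappa * a, with `a` the object
    position estimated from the proximity sensor (kappa set heuristically)."""
    return kappa * a

def impedance_accel(q, dq, q0, dq0, ddq0, tau_virt,
                    k_s=5.0, k_d=0.5, m=0.1):
    """Joint acceleration of Eq. (3), driven here by the virtual torque;
    the result is fed to the computed-torque feedforward loop of Eq. (5)."""
    return ddq0 + (tau_virt - k_d * (dq - dq0) - k_s * (q - q0)) / m
```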
3. Implementation of Hardware and Software

The aim of the project was to design a system that will enable efficient implementation of impedance control with the help of different types of physical hardware. Problems with the integration of various systems occur frequently. In the case being described, a BLDC electric motor was controlled in real time using the Python interpreter, which cooperates with a data acquisition card, as well as with drivers which are dedicated to work with C-compiled programs or other compilers. It was therefore necessary to develop software which would ensure proper operation of the controller.

The problem of software integration occurs very often when creating systems of greater complexity. There are several methods of combining various programs to form a common system on platforms such as Windows or Linux. One of the common ways is to implement multithreading functionality, which means running several routines at the same time and exchanging data via, for example, a swap file. The common resource in this case may reside in a computer's internal memory or on a hard disk drive. However, this type of cooperation between two programs is difficult to implement correctly across various operating systems. In addition, one of the project's assumptions was that the computer programs be easily integratable.

Due to the above-mentioned issues, a technique of combining two programs using TCP/IP was used. One of the programs acts as a server and the second as a client, and both can exchange data. It was decided that the server is a program written in Python: it controls the motor and accepts the connection from the client. The client is written in C++ and is responsible for processing data from the proximity sensor (Fig. 2).

Fig. 2. An example of communication between the server, written in Python, and the compiled C client

Experiments were performed using several types of acquisition cards in order to demonstrate the versatility of the solution. We have successfully used cards manufactured, inter alia, by Advantech (USB-4704, PCI-1710), as well as drivers based on the popular ATmega8 Atmel microcontrollers. The drivers communicated with the computer via a USB-UART protocol.
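The server side of the scheme in Fig. 2 fits in a few lines of Python. The address, the plain-text message format, and the framing below are assumptions for illustration; the authors' actual protocol is not specified in the text.

```python
import socket

HOST, PORT = "127.0.0.1", 5005  # assumed address; the real setup may differ

# Minimal TCP server: accepts the (C++) sensor client and receives readings.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.bind((HOST, PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    with conn:
        while True:
            raw = conn.recv(64)
            if not raw:
                break
            distance = float(raw.decode().strip())  # one reading per message
            # ...pass `distance` to the impedance controller / motor loop
```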
VOLUME 7,
N° 4
the popular ATmega8 Atmel microcontrollers. ers communicated with the computer via a the popular ATmega8 Atmel microcontrollers. UART protocol. the popular ATmega8 Atmel ers communicated with the microcontrollers. computer via a ers communicated with the computer via a UART protocol. 4. Experiments and conclusions UART protocol.
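The server side of this pattern can be sketched as follows; the port number and the one-reading-per-line message format are assumptions made for illustration, not the project's actual protocol:

    import socket

    HOST, PORT = "127.0.0.1", 5005   # assumed endpoint

    def serve_once(handle_distance):
        # Accept one compiled client and feed each received proximity
        # reading (sent as an ASCII float per line) to the control callback.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind((HOST, PORT))
            srv.listen(1)
            conn, _ = srv.accept()
            with conn, conn.makefile("r") as stream:
                for line in stream:
                    handle_distance(float(line))

The compiled client then only needs to open a socket to the same port and stream its sensor readings, which keeps both programs independent of the operating system and of each other's tool chains.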
Fig. 2. The example of communication between a server written in Python and a compiled C client
Experiments were performed using several types of acquisition cards in order to demonstrate the versatility of the solution. We successfully used cards manufactured, inter alia, by Advantech (USB-4704, PCI-1710), as well as drivers based on the popular Atmel ATmega8 microcontrollers. The drivers communicated with the computer via a USB-UART protocol.
4. Experiments and Conclusions
4.1. Experiments on an Object with One Degree of Freedom
One of the final experiments was carried out on the model with one degree of freedom (Fig. 3). In order to determine the optimal control parameters, several experiments were performed to illustrate how changes in the stiffness and damping coefficients affect the system's properties. When the manipulator stiffness decreases, it no longer tracks the desired trajectory, as opposed to situations when rigidity is high. This principle was used to determine the values of the stiffness and damping parameters that cause a desired change in the robot's operation.
Experiments consisted of altering the values of the stiffness and damping parameters while the motor was running at a steady-state speed.
Fig. 3. The laboratory setup with the BLDC motor
Fig. 4. Graph of the desired virtual speed of the manipulator when changing the damping coefficient. The virtual speed was 300 RPM. During the experiments the motor was rotating
One of the conducted experiments was to suddenly change the force which acts on the manipulator (meaning the distance between the object and the proximity sensor), while also suddenly changing the damping values (Fig. 5). From the graph it can be concluded that a high damping value makes the system so rigid that it does not respond to the operating force. If the damping is small, the drive reacts, causing even a change in the direction of rotation. In some situations this is the manipulator's most desirable response.
The second experiment was to increase the manipulator's impedance when the object is closer to the robot. The graph shows that when the value of the virtual force is in the range of 0.18 to 0.19 Nm (the time interval from 45 to 47.5 s), the motor stops. In some cases this is the optimal response. After crossing this threshold, the motor begins to slowly rotate in the opposite direction (Fig. 6). If the object moved away from the motor, the velocity would return to its original value.
Fig. 5. Velocity of the manipulator with discrete changes of virtual forces (the distance of the object from the proximity sensor) and changes in the controller's damping parameter. The virtual velocity was 300 RPM
Fig. 6. Changes in velocity when the object approaches the sensor: when the distance from the object to the sensor is less than 0.5 cm, the motor direction of rotation changes
4.2. Simulation Study of the Cartesian Impedance Controller
The second experiment was conducted in Cartesian space in order to test the properties of the control system when the impedance controller processes task variables. This part of the experiments was done on a 2-DOF planar manipulator in the MATLAB/Simulink environment. The manipulator was modeled in the second edition of the SimMechanics toolbox (Fig. 7).
Fig. 7. Manipulator with 2 degrees of freedom modeled in the SimMechanics environment
The dynamic model of the manipulator was simplified to point masses placed at the ends of the rigid links. This can be done without limiting the generality of the achieved results. The dynamics of the manipulator can be written as:
$$\begin{bmatrix} (m_1 + m_2) l_1^2 + m_2 l_2^2 + 2 m_2 l_1 l_2 \cos\theta_2 & m_2 l_2^2 + m_2 l_1 l_2 \cos\theta_2 \\ m_2 l_2^2 + m_2 l_1 l_2 \cos\theta_2 & m_2 l_2^2 \end{bmatrix} \begin{bmatrix} \ddot\theta_1 \\ \ddot\theta_2 \end{bmatrix} + \begin{bmatrix} -m_2 l_1 l_2 \sin\theta_2 \,\big(2\dot\theta_1\dot\theta_2 + \dot\theta_2^2\big) \\ m_2 l_1 l_2 \sin\theta_2\, \dot\theta_1^2 \end{bmatrix} + \begin{bmatrix} (m_1 + m_2)\, g\, l_1 \cos\theta_1 + m_2\, g\, l_2 \cos(\theta_1 + \theta_2) \\ m_2\, g\, l_2 \cos(\theta_1 + \theta_2) \end{bmatrix} = \begin{bmatrix} \tau_1 \\ \tau_2 \end{bmatrix} \qquad (7)$$

Forward kinematics of the manipulator is of the form:

$$x_c = l_1 \cos\theta_1 + l_2 \cos(\theta_1 + \theta_2), \qquad y_c = l_1 \sin\theta_1 + l_2 \sin(\theta_1 + \theta_2) \qquad (8)$$

while the inverse kinematics is given by:

$$\theta_2 = \arctan\frac{\pm\sqrt{1 - c_2^2}}{c_2}, \quad c_2 = \frac{x_c^2 + y_c^2 - l_1^2 - l_2^2}{2\, l_1 l_2}, \qquad \theta_1 = \arctan\frac{y_c}{x_c} - \arctan\frac{l_2 \sin\theta_2}{l_1 + l_2 \cos\theta_2} \qquad (9)$$
From (8) the Jacobian matrix of the manipulator can be obtained:

$$J(\theta) = \begin{bmatrix} \frac{\partial x_c}{\partial \theta_1} & \frac{\partial x_c}{\partial \theta_2} \\ \frac{\partial y_c}{\partial \theta_1} & \frac{\partial y_c}{\partial \theta_2} \end{bmatrix} = \begin{bmatrix} -l_1 \sin\theta_1 - l_2 \sin(\theta_1 + \theta_2) & -l_2 \sin(\theta_1 + \theta_2) \\ l_1 \cos\theta_1 + l_2 \cos(\theta_1 + \theta_2) & l_2 \cos(\theta_1 + \theta_2) \end{bmatrix} \qquad (10)$$

and finally the inverse of the Jacobian is of the form:

$$J^{-1}(\theta) = \frac{1}{l_1 l_2 \sin\theta_2} \begin{bmatrix} l_2 \cos(\theta_1 + \theta_2) & l_2 \sin(\theta_1 + \theta_2) \\ -l_1 \cos\theta_1 - l_2 \cos(\theta_1 + \theta_2) & -l_1 \sin\theta_1 - l_2 \sin(\theta_1 + \theta_2) \end{bmatrix} \qquad (11)$$

The other subsystems of the controller were modeled in the standard Simulink environment. The view of the whole system is presented in Figs. 8 and 9.
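Equations (8)-(11) can be checked numerically with a short script; the link lengths below are arbitrary example values, not parameters of the simulated manipulator:

    import math

    l1, l2 = 0.5, 0.4   # link lengths [m]; arbitrary example values

    def forward(theta1, theta2):
        # forward kinematics, Eq. (8)
        x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
        y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
        return x, y

    def inverse(x, y, elbow=1.0):
        # closed-form inverse kinematics, Eq. (9);
        # elbow = +/-1 selects one of the two solutions
        c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
        s2 = elbow * math.sqrt(max(0.0, 1.0 - c2 * c2))
        theta2 = math.atan2(s2, c2)
        theta1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
        return theta1, theta2

    def jacobian(theta1, theta2):
        # manipulator Jacobian, Eq. (10)
        s1, c1 = math.sin(theta1), math.cos(theta1)
        s12, c12 = math.sin(theta1 + theta2), math.cos(theta1 + theta2)
        return [[-l1 * s1 - l2 * s12, -l2 * s12],
                [ l1 * c1 + l2 * c12,  l2 * c12]]

    # round-trip check: inverse kinematics recovers the joint angles
    q = (0.3, 0.9)
    assert all(abs(a - b) < 1e-9 for a, b in zip(inverse(*forward(*q)), q))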
Fig. 8. The overall scheme of the Simulink simulation
Fig. 9. The view of the decoupling model subsystem (model of manipulator control by the computed torque method). The model was decoupled with a PD regulator
Fig. 10. Simulink scheme of the admittance controller
It can be seen from Fig. 8 that the impedance controller was applied to the Y coordinate in the manipulator control system, while in the X axis a standard PD regulator was used.
The experiments were designed to see how the manipulator behaves with different impedance controller settings. The simulation consisted of controlling the 2-DOF robot along the desired trajectory. During the 70 seconds of simulation a high external force was applied in the Y axis. This situation can be likened to a collision with an obstacle (Fig. 11).
Fig. 11. Schematic representation of the 2-DOF manipulator and the force acting in the Y axis
The results of the research are presented in the graphs below. Three runs show the trajectory tracking in the case of a rigid manipulator, in the case when the impedance was lowered, and with a very low impedance of the manipulator.
In the last case, as was shown in the previous experiments, the manipulator reacts by yielding to the force. The disturbances that occurred at the end of the workspace are associated with the non-stationarity of the boundary conditions.
Fig. 12. The illustration of changing the impedance of the manipulator. The figures from the left present decreasing values of impedance (left figure – very rigid manipulator, middle figure – impedance of the controller slightly reduced, right figure – very low value of impedance)
5. Future Research
During experimentation it was discovered that choosing the most adequate parameters for the impedance controller is a complicated process. The set of parameters is closely related to the numerical methods used to solve the differential equations, with a limited speed of calculation, as well as to the selection of the scaling factors for information obtained from the position sensors. This is probably one of the main reasons why impedance/admittance controllers are rarely applied in practice.
Upcoming research will therefore be associated with the creation of an adaptive model, probably based on the methods of artificial intelligence, by means of which it will be possible to effectively select appropriate controller settings, depending on the desired behavior of the robot. The next scheduled project will also involve the integration of an impedance controller to control robots with a higher number of degrees of freedom.
Acknowledgements
This work is financially supported by the Ministry of Science and Higher Education of Poland (Grant No. N N514 469339).
Note: This is an expanded version of the paper presented during The Polish National Conference on Robotics, 2012.
AUTHORS
Edward Jezierski*, Artur Gmerek – Institute of Automatic Control, Lodz University of Technology, e-mail: edward.jezierski@p.lodz.pl, tel. 0-42 631-25-40.
*Corresponding author
References
[1] Caccavale F., Siciliano B., “Six-DOF impedance control based on angle/axis representations”, IEEE Transactions on Robotics and Automation, vol. 15, no. 2, 1999, 289–300.
[2] Granosik G., Jezierski E., Kaczmarski M., “Modelling, simulation and control of pneumatic jumping robot”. In: Proc. European Robotics Symposium 2008, Springer Tracts in Advanced Robotics, Berlin: Springer, 2008, 155–164.
[3] Hogan N., “Impedance control: An approach to manipulation”, Parts I, II, III, ASME Journal of Dynamic Systems, Measurements, and Control, vol. 107, 1985, 1–23.
[4] Jezierski E., „Od sterowania pozycyjnego robotów do sterowania impedancyjnego”. In: Tchoń K. (ed.), Postępy robotyki – sterowanie, percepcja, komunikacja, Warszawa: WKiŁ, 2006, 13–36. (In Polish)
[5] Jezierski E., “On electrical analogues of mechanical systems and their using in analysis of
robot dynamics”. In: Kozłowski K.R. (ed.), Robot Motion and Control – Recent Developments, Berlin: Springer, 2006, 391–404.
[6] Jezierski E., Dynamika robotów, WNT, Warszawa, 2006. (In Polish)
[7] Ott C., Albu-Schäffer A., Kugi A., Hirzinger G., “On the passivity-based impedance control of flexible joint robots”, IEEE Transactions on Robotics, vol. 24, no. 2, 2008, 416–429.
[8] Siciliano B., Sciavicco L., Villani L., Oriolo G., Robotics – Modelling, Planning and Control, Springer, 2009.
[9] Tsuji T., Akamatsu H., Hatagi M., Kaneko M., “Vision-Based Impedance Control for Robot Manipulators”. In: IEEE/ASME International Conference on Advanced Intelligent Mechatronics, 1997.
[10] Winiarski T., Zieliński C., „Podstawy sterowania siłowego w robotach”, PAR, no. 6, 2008, 5–10. (In Polish)
[11] Jezierski E., Gmerek A., “Admittance control of a 1-DoF robotic arm actuated by BLDC motor”. In: 17th International Conference on Methods and Models in Automation and Robotics (MMAR), 2012, 633–638.
Safe Strategy of Door Opening with Impedance Controlled Manipulator
Submitted: 20th January 2013; accepted: 4th June 2013
Tomasz Winiarski, Konrad Banachowicz, Maciej Stefańczyk
DOI: 10.14313/JAMRIS_4-2013/21
Abstract:
Modern robotic systems are able to localize doors and their handles or knobs, grasp the handle and open the door. Service robots need to open doors and drawers to operate autonomously in human environment. The mechanical properties of doors lead to the incorporation of force and velocity constraints into the control law, to avoid damage to the environment. In the article the impedance control law was expanded with these factors to achieve safe behavior of the direct (explicit) position-force controller of a KUKA LWR4+ robot that opens the door.
Keywords: manipulation, impedance control, service robots
1. Introduction
Recent years have resulted in a number of robotic manipulator structures [4, 9] mounted on mobile bases. The main scientific motivation for their creation was to build a hardware and software base for service robots operating in human environment. The research covers many issues related to human's everyday existence, such as working in the kitchen [11] or movement between rooms [7]. Both of these tasks require opening doors. Depending on the needs it can be a little door of a kitchen cabinet as well as a massive room door.
One of the first works in this matter considers door opening with a four degrees of freedom robotic manipulator [8]. The control law assumed that the manipulator exerts the desired general force set in its end–effector while the system simultaneously measures the position in the same coordinate system. Taking into account the real trajectory of the end–effector, the control law parameters were chosen to generate the desired force that opens the door.
Modern applications [4, 7, 9, 11] are much more comprehensive and take into account several aspects, e.g.: location of the door, location of the handle or knob, grip generation and grip execution. It is worth noting that the final outcome of the experiments presented so far related to door opening is comparable for various structures of the controllers.
The work [4] presents a robotic system that is controlled indirectly, where the general force readings came not from a six-axis sensor mounted in the manipulator wrist [15] but from sensors located in the phalanges of the gripper fingers. The control law applied in the direction of door opening was constructed by superposition of a reference velocity and a force-error-dependent correction velocity. Similarly, the robot described in [11] had an indirect control structure, but in this case the force feeling was the fusion of measurements from a six-axis transducer placed in the manipulator wrist and force data from the phalanges of a two-finger gripper. The PR2 robot [7] control system does not measure the general force in the manipulator wrist but, on the basis of the velocity command specified in the end–effector, computes the additional desired joint torque component for the gravitationally balanced manipulator. The system described in [9] utilized one of the first generations of the directly controlled DLR-LWR robot, where the manipulator tip contact force computation is based on the torques measured in the joints.
It is impossible to pass over in silence the fact that the opening of doors and drawers is also important in medical research. For example, [13] studied the opening of a drawer by healthy people and by patients with an impaired cerebellum. It turns out that patients with dysfunction of the cerebellum have a different drawer opening strategy than healthy patients. Typically, the fingers' clamping force on the handle is proportional to the pulling force, to avoid handle slip. Similarly, healthy patients, shortly after the end of the movement is felt, tighten the handle more strongly so as not to lose their grip. Patients with cerebellar dysfunction also perform the whole task, but they clench tighter in the first phase and sometimes lose the grip at the end of the motion, when the drawer reaches its mechanical limits. The forces exerted by humans and the speed of movement during similar operations are well known and tested. This is reflected in the construction of doors and drawers, in particular in the durability of handles, hinges etc.
There are software drivers [10, 16] for robots performing service tasks, as well as methods of image acquisition and analysis [6] necessary to detect objects (e.g. handles or doors [3]). The latter task can also utilize fused information from depth sensors and color cameras [14], which can further improve detection results.
So far, there has been a lack of a control law relating to some of the top-imposed limitations resulting from the construction of doors adapted for use by humans. The handle has a limited resistance, so it is important to limit the force with which the door is pulled. On the other hand, the rotational speed of the door cannot be arbitrarily high, because the accumulated kinetic energy may be potentially dangerous in the case of encountering an obstacle (e.g. a human being).
In this paper we present in a formal way a robotic system (sections 2, 3), where the control law allows
both opening a door with previously unknown kinematic and dynamic models and taking into account the constraints derived from safety requirements. Moreover, the algorithm allows to open the door without prior knowledge of the opening direction and without a known position of the hinge (the same algorithm opens both the "right" and "left" doors). In the experimental stage (section 4) the door and the manipulator end–effector were rigidly connected to test the control law by exposing the system to fast-growing strain. The paper finishes with conclusions in section 5.
2. Notation
The controller description is formal, hence it starts with the notation. The position of frame Q relative to U can be expressed either as a homogeneous matrix ${^U_Q T}$ or as a column vector ${^U_Q r}$. In the latter case, the position in Cartesian coordinates is extended by a rotation angle around the axis represented by a directional vector. The rotational angle and the directional vector (after multiplication) are aggregated into three coordinates. The operator $A$ transforms this column vector into a homogeneous matrix, and $A^{-1}$ defines the inverse transformation:

$$A({^U_Q r}) = {^U_Q T}, \qquad A^{-1}({^U_Q T}) = {^U_Q r} \qquad (1)$$

Manipulator configuration can also be expressed with a joint position vector $q$. The column vector of size $6 \times 1$

$$^U\dot r = \left[{^U v^T},\ {^U \omega^T}\right]^T \qquad (2)$$

represents the generalized velocity of frame U moving relative to frame 0, expressed in U. It consists of the linear velocity $v$ and the rotational velocity $\omega$.
3. Controller
Research has been done on a system based on the KUKA LWR4+ robot. To present the system structure, the formal notation presented earlier in [6, 16] was used in a modified form. The graphical representation (fig. 1) was supplemented by a direct definition of the data exchanged within a single agent (between its control subsystem c, virtual effector e and real effector E). The transition functions that generate outputs based on inputs and internal state are defined for particular components and indicate inputs and outputs for the described component: $[S]^t_x$ are inputs, $[S]^t_y$ are outputs, where S is the input/output value and t denotes the discrete time index. Other variables (the internal state of the agent in particular) are written without the additional symbols.
The column vector ${^U F}$ of the same size $6 \times 1$ expresses generalized force and consists of a 3-element force vector and a torque vector of the same size. In this case, the force is applied to the origin of the U coordinate frame, and is expressed in the same frame. Some other useful transformations ${^U_Q \xi_F}$, ${^U_Q \xi_V}$ express generalized force or velocity in one coordinate frame relative to the other, fixed to it:

$$^U F = {^U_Q \xi_F}\, {^Q F}, \qquad {^U \dot r} = {^U_Q \xi_V}\, {^Q \dot r} \qquad (3)$$

In the case when free vectors are used (such as increments of position, orientation, velocity or force) there exists a need to express them relative to a frame with an orientation other than the one in which they were formerly expressed. In this case, one can use the ${^C({^U \dot r})}$ notation, in which the generalized velocity of frame U in relation to 0 is expressed in frame C. For the transformation purpose the matrix $\xi_*$ [16] is used:

$$^C({^U \dot r}) = {^C_U \xi_*}\, {^U \dot r} \qquad (4)$$
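For the positional part of the notation, the operator pair from Eq. (1) can be sketched as follows; the axis-angle convention (Rodrigues' formula) matches the definition above, while the helper names are assumptions of this sketch rather than code from the described controller:

    import numpy as np

    def A(r):
        # column vector [x, y, z, kx, ky, kz] -> 4x4 homogeneous matrix;
        # the rotational part uses Rodrigues' formula
        p, k = np.asarray(r[:3], float), np.asarray(r[3:], float)
        angle = np.linalg.norm(k)
        T = np.eye(4)
        if angle > 1e-12:
            a = k / angle
            K = np.array([[0, -a[2], a[1]],
                          [a[2], 0, -a[0]],
                          [-a[1], a[0], 0]])
            T[:3, :3] = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * K @ K
        T[:3, 3] = p
        return T

    def A_inv(T):
        # 4x4 homogeneous matrix -> column vector (inverse of A);
        # the singular configuration near a rotation of pi is ignored here
        R = T[:3, :3]
        angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
        if angle < 1e-12:
            k = np.zeros(3)
        else:
            k = angle / (2.0 * np.sin(angle)) * np.array(
                [R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
        return np.concatenate([T[:3, 3], k])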
In the notation below, d means the desired value and m the measured value (if d or m is placed in the bottom right index). Similarly, the bottom right index with square brackets denotes a coordinate associated with a vector or a matrix. We employ the convention of using x, y and z to indicate the linear parts, and ax, ay, az to indicate the rotational parts. In fig. 2 some selected coordinate frames are marked: 0 – base, W – wrist, E – end–effector (task frame).
Fig. 1. The general structure of the designed single-agent controller: E – real effector, e – virtual effector, c – control subsystem. The two components presented are an impedance controller in task space and a trajectory generator for the impedance controller
The detailed description of the system is presented in the following part of the article. In section 3.1 the real effector E (the KUKA LWR4+ arm) is described with its industrial controller adapted to work with research systems. Later on, section 3.2 characterizes the developed research controller (virtual effector e) implemented in the Orocos [2] system. This particular framework was chosen because of the availability of a communication interface with KUKA and since its structure is universal (so it can be used in many works, going beyond the scope of this project). The further part of the paper (section 3.3) presents a strategy for opening a door, especially the developed control law embedded in the control subsystem c. This is implemented in the ROS [10] system, which was chosen due to the simplicity of the formulation of tasks using scripting
languages and a ready-to-use communication interface with the Orocos system.
Fig. 2. KUKA LWR4+ arm with its coordinate frames and the transformation between them, while executing the task of opening the door
3.1. Industrial controller KUKA LWR4+
KUKA LWR4+ is a serial, lightweight (16 kg), redundant robotic arm with seven degrees of freedom. Its construction (fig. 2) is similar to the human arm in terms of size, lifting capacity and manipulation capabilities, which enables it to safely operate in human environment. Just like typical industrial robots, LWR4+ is equipped with motor encoders, but it also has sensors of joint position and torque. It can be controlled in joint space and in external space; moreover, the desired torques can be set in the joints. KUKA provided the possibility to use LWR4+ with external controllers, particularly for research purposes. The controller is equipped with the Fast Research Interface (FRI) [12] and uses the Ethernet and UDP protocols, which allow to read the robot's state and to send control commands. In this work the impedance control law extended by the desired torque is used for joint control (5), in the same way as a parallel position-force controller in external space extends position regulators with the desired force [16]:

$$\tau_c^{\kappa+1} = [K_j]^\iota_x \left([q_d]^\iota_x - q_m^\kappa\right) + [B_j]^\iota_x\, \dot q_m^\kappa + [\tau_d]^\iota_x + f(q_m^\iota, \dot q_m^\iota, \ddot q_m^\iota) \qquad (5)$$
The control law above presents the external control loop of the real effector (∆κ ≈ 0.3 ms), with the desired torque $\tau_c^\kappa$ as output, which is then used as an input to the internal loop with a torque controller running with a frequency of tens of kHz. Some variables are updated less frequently, typically with the period ∆ι = 1 ms. The input values for the extended impedance controller are as follows: $[K_j]^\iota_x$ – stiffness, $[B_j]^\iota_x$ – damping, $[q_d]^\iota_x$ – desired position, $[\tau_d]^\iota_x$ – desired torque. The impedance controller needs the measured position $q_m^\kappa$ and velocity $\dot q_m^\kappa$, and the dynamic model itself $f(q_m^\iota, \dot q_m^\iota, \ddot q_m^\iota)$. The structure of this regulator as well as its stability analysis were presented in [1].
3.2. KUKA LWR4+ research controller
The research controller (virtual effector e) consists of a number of Orocos components and an XML file that defines the connection between them. The individual components are responsible for communication within the entire system (including the use of FRI), internal diagnostics, as well as the implementation of the control law and trajectory generation. In this article, we focus on the latter two aspects.
Trajectory generator component. Two key components used in the research controller are the trajectory generator and the impedance controller. The trajectory generator receives commands from the control subsystem c in discrete time instants t. On the other side, in each step ι of the real effector regulator E, the trajectory generator has to prepare the command for the impedance controller in external space (implemented in the cartesian impedance controller component). For two consecutive time instants i, the interval is defined by $[t]^i_x$. In the initial phase, the desired position is copied from the measured position (6):
$$[{^0_D T}]^{\iota_0}_y = {^0_E T}^{\iota_0}_m \qquad (6)$$
It is assumed that the stiffness, damping, desired force and the geometric matrix of the tool are simply copied (7):

$$[K_c]^\iota_y = [K_c]^i_x, \quad [B_c]^\iota_y = [B_c]^i_x, \quad [{^E F_d}]^\iota_y = [{^E F_d}]^i_x, \quad [{^W_E T}]^\iota_y = [{^W_E T}]^i_x \qquad (7)$$
The desired position $[{^0_D T}]^\iota_y$ is computed based on the desired velocity $[{^E({^D \dot r_d})}]^i_x$ and the previous control (8), according to the presented notation:

$$[{^0_D T}]^\iota_y = {^0_D T}^{(\iota-1)}\, A\!\left({^D_E \xi_*^\iota}\, [{^E({^D \dot r_d})}]^i_x\, \Delta\iota\right) \qquad (8)$$
where the matrix ${^D_E \xi_*^\iota}$ is computed in the way presented in [16], based on the matrix ${^0_E T^\iota_m}$ from (9) and the matrix ${^0_D T^{(\iota-1)}}$:

$$^0_E T^\iota_m = [{^0_W T_m}]^\iota_x\, [{^W_E T}]^i_x \qquad (9)$$
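Under the same assumptions, one step of the trajectory generator, Eq. (8), reduces to integrating the commanded velocity over one regulator period; this sketch builds on the A helper from section 2, and the names are again illustrative:

    DELTA_IOTA = 0.001   # regulator period [s], 1 ms as stated in the text

    def desired_pose_step(T_0D_prev, r_dot_d, xi_star):
        # Eq. (8): propagate the desired pose 0_D T by one period;
        # r_dot_d is the 6-element desired velocity, xi_star the 6x6
        # re-expression matrix computed as in [16]
        return T_0D_prev @ A(xi_star @ r_dot_d * DELTA_IOTA)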
Impedance controller component. The cartesian impedance controller component implements the control law in task-related space. The column vector ${^E_D r}$ representing the length of the virtual spring (10) is computed taking into account the measured effector position ${^E_0 T_m}$ and the desired position ${^0_D T}$:

$$^E_D r^\iota = A^{-1}\!\left({^E_0 T^\iota_m}\, [{^0_D T}]^\iota_x\right) \qquad (10)$$

The velocity ${^E \dot r_m}$ of the end–effector (11) is computed as:

$$^E \dot r^\iota_m = \frac{A^{-1}\!\left({^E_0 T^{(\iota-1)}_m}\, {^0_E T^\iota_m}\right)}{\Delta\iota} \qquad (11)$$

In the next step, the force ${^E F_c}$ (12) is computed taking into account the spring length ${^E_D r}$, the stiffness $K_c$, the end-effector velocity ${^E \dot r_m}$, the damping $B_c$ and the desired force ${^E F_d}$. The force ${^E F_c}$ is then transformed to the wrist frame (13):

$$^E F_c^\iota = [K_c]^\iota_x\, {^E_D r^\iota} + [B_c]^\iota_x\, {^E \dot r_m^\iota} + [{^E F_d}]^\iota_x \qquad (12)$$

$$^W F_c^\iota = {^W_E \xi_F^\iota}\, {^E F_c^\iota} \qquad (13)$$
where ${^W_E \xi_F^\iota}$ is based on $[{^W_E T}]^\iota_x$. Finally, according to (14), the joint torques vector $\tau_d$ is computed using the Jacobian $J$:

$$[\tau_d]^\iota_y = [J^T]^\iota_x\, {^W F_c^\iota} \qquad (14)$$

The stiffness and damping in the output of the component are set to zero, $[K_j]^\iota_y = 0$, $[B_j]^\iota_y = 0$, in order to command the joint level controller implemented in the real effector E (5) to work as a torque controller with compensation of gravity and, partially, of dynamics.
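Equations (10)-(14) chain into a few lines of code. This sketch reuses the A_inv helper from section 2 and assumes that the stiffness and damping gains and the force transformation are given as 6x6 matrices; the argument names mirror the notation but are assumptions of the sketch:

    def impedance_torques(T_E0_m, T_0D_d, r_dot_m, K_c, B_c, F_d, xi_F, J):
        # Eq. (10): length of the virtual spring between frames E and D
        r_ED = A_inv(T_E0_m @ T_0D_d)
        # Eq. (12): task-space force from stiffness, damping and set force
        F_E = K_c @ r_ED + B_c @ r_dot_m + F_d
        # Eq. (13): transform the force to the wrist frame
        F_W = xi_F @ F_E
        # Eq. (14): joint torques through the transposed Jacobian
        return J.T @ F_W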
3.3. Door opening strategy
The control subsystem c includes the outermost control loop running over the existing impedance controller implemented in the virtual effector e. This solution enables the incorporation of the door opening task in a larger application, consisting of multiple phases of motion, by adjusting the impedance characteristics for a smooth transition from movement without any contact with the environment to the contact phase, as is possible in pneumatic-powered robots [5].
The matrix $[{^W_E T}]^i_y$ defines the pose of the manipulator tool. It depends on the mechanical construction of the end–effector and it is sent to the virtual effector. This matrix does not change during task execution. Similarly, the general set force $[{^E F_d}]^i_y$ is always zero (i.e. all of the force and torque coordinates are zeros). In addition, the time $[t]^i_y$ of the single trajectory segment execution is sent. The other output parameters are defined in terms of the direction of motion in the task space. For the directions x, y, ax, ay and az it is assumed that the manipulator should be compliant and move in order to eliminate tension, so the corresponding coordinates of the reference velocity, stiffness and damping (15) become zero:

$$[{^E({^D \dot r_{d[x,y,ax,ay,az]}})}]^i_y = 0, \quad [K_{c[x,y,ax,ay,az]}]^i_y = 0, \quad [B_{c[x,y,ax,ay,az]}]^i_y = 0 \qquad (15)$$
Towards the z axis the control law is more elaborate, since it sets the velocity $[{^E({^D \dot r_{d[z]}})}]^i_y$ taking into account the measured position ${^0_E T^i_m}$, the maximum speed $\dot r_{[z]}$, the measured force ${^E F^i_{m[z]}}$ and two force thresholds: the braking threshold $F_0$ and the motion termination threshold $F_l$. The stiffness $[K_{c[z]}]^i_y$ and damping $[B_{c[z]}]^i_y$ are also defined. In each iteration, the controller first checks the completion of the predicate of the movement (16), (17). The predicate P() returns true if the force measured along the z axis of the task coordinate system exceeds the threshold value (the door is blocked), or if the dot product of the vector that is normal to the plane of the door at the beginning of the motion and the same vector at the current time instant is less than zero (the door turn angle exceeded 90 degrees):

$$N^i = \left[{^0_E T^i_{m[1][3]}},\ {^0_E T^i_{m[2][3]}},\ {^0_E T^i_{m[3][3]}}\right] \qquad (16)$$

$$P() = \begin{cases} \text{true} & \text{if } N^{i_0} \cdot N^i \le 0\ \vee\ [F_{m[z]}]^i_x \ge F_l \\ \text{false} & \text{otherwise} \end{cases} \qquad (17)$$
If the predicate P() is satisfied, the desired velocity $[{^E({^D \dot r_{d[z]}})}]^i_y = 0$. The robot motion is not commanded, but the manipulator end–effector is still impedance controlled in the z axis. Otherwise, the velocity is determined according to formula (18), so as to continuously slow down the robot, starting from the moment when the measured force values exceed the braking threshold $F_0$:

$$[{^E({^D \dot r_{d[z]}})}]^i_y = \begin{cases} \dot r_{[z]} & \text{for } {^E F^i_{m[z]}} \le F_0 \\ \dot r_{[z]} \left(1.0 - \frac{{^E F^i_{m[z]}} - F_0}{F_l - F_0}\right) & \text{for } F_0 < {^E F^i_{m[z]}} \le F_l \\ 0 & \text{for } {^E F^i_{m[z]}} > F_l \end{cases} \qquad (18)$$
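The strategy of this subsection can be summarized in a short sketch; the threshold values are those used later in the experiments, while the function names are assumptions of this sketch:

    import numpy as np

    R_DOT_Z = 0.04         # velocity limit [m/s], as in section 4
    F_0, F_L = 10.0, 20.0  # braking / termination thresholds [N], as in section 4

    def door_normal(T_0E):
        # Eq. (16): third column of the rotation part of the measured pose
        return T_0E[:3, 2]

    def motion_finished(N_start, N_now, f_z):
        # Eq. (17): the door is blocked, or it has turned past 90 degrees
        return np.dot(N_start, N_now) <= 0.0 or f_z >= F_L

    def commanded_velocity(f_z):
        # Eq. (18): continuous braking between the two force thresholds
        if f_z <= F_0:
            return R_DOT_Z
        if f_z <= F_L:
            return R_DOT_Z * (1.0 - (f_z - F_0) / (F_L - F_0))
        return 0.0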
4. Experiments
The verification of the door opening strategy consists of a series of experiments conducted on the test-bed presented in figure 2 and in the video1. The end–effector of the manipulator is rigidly fixed to the kitchen cabinet door. The experiments consisted of alternately closing and opening the door for different sets of controller parameters and limitations of the contact force. The control subsystem time period was constantly set to $[t]^i_y = 10$ ms. In addition, the door motion was disturbed by an obstacle in the form of a heavy object standing in its way. The experiments have confirmed the correctness of the approach. For the further presentation the door opening case was chosen (fig. 3), where the following coefficients were chosen experimentally: the velocity limit was set as $\dot r_{[z]} = 0.04$ m/s, the braking threshold $F_0 = 10$ N, the force limit $F_l = 20$ N, the stiffness $[K_{c[z]}]^i_y = 4000$ N/m and the damping $[B_{c[z]}]^i_y = 80$ Ns/m.
Initially, the manipulator is slightly pushing the door along the z axis of the current task coordinate system (it can be seen at the beginning of the forces graph), because the door opening was preceded by its closure, which ended up with a little stretch of the impedance controller „spring” while the door was closed. Then, at the time instant a, the motion is ordered, the force along the z axis changes direction and, finally, after crossing the static friction of the hinge, the door begins to move, as illustrated by the significant increase in speed along the z axis. A slight oscillation of the forces is visible while the manipulator is in motion. The force in the z direction, which moves the door, is clearly dominant over the values of the forces in the xy plane, where the manipulator is compliant. The velocities in the xy plane are adequately small.
Fig. 3. The measured forces and velocities recorded during door opening
At the time instant b the door encounters an obstacle in the form of a box weighing about 3 kg, which lies directly on the surface below the cabinet. The force along the z axis grows rapidly, but in this case it does not exceed the limit value. The speed drops dramatically and starts to oscillate, while the manipulator is pulling the door together with the obstacle. Later, after passing through the phase of oscillation caused by the impedance controller behaviour, the measured velocity stabilizes slightly below the limit. At this stage, a noticeably higher force is seen, with which the manipulator pushes the door against the resistance of the obstacle. The movement stops at time instant c, when the desired angle is reached. The force in the z direction is non-zero, as the impedance controller „spring” is stretched due to the non-zero static friction inside the door hinge and the static friction of the obstacle with the ground.
5. Conclusions
In the article a research oriented controller of the KUKA LWR4+ robotic manipulator, executing a safe door opening strategy, is presented in a formal way. This approach takes into account the limitations of both the speed of the door, from its handle point of view, and the contact force recorded in the end–effector of the manipulator. Experimental verification confirmed the correct behavior of the system.
Future research will rely on the solution already worked out and will include the selection of the controller parameters (stiffness and damping) in the face of the experimentally identified constraints on the speed and strength of the contact. The work will begin with the study of the method of opening the door by a representative group of people, and then the implementation of a similar strategy for the robot.
Notes
1 http://vimeo.com/rcprg/door-opening-stiff-contact
AUTHORS
Tomasz Winiarski, Konrad Banachowicz, Maciej Stefańczyk – Institute of Control and Computation Engineering, Warsaw University of Technology, e-mail: tmwiniarski@gmail.com, www: http://robotics.ia.pw.edu.pl.
ACKNOWLEDGEMENTS The authors gratefully acknowledge the support of this work by The National Centre for Research and Development grant PBS1/A3/8/2012. Tomasz Winiarski has been supported by the European Union in the framework of European Social Fund through the Warsaw University of Technology Development Programme.
REFERENCES
[1] A. Albu-Schäffer, C. Ott, and G. Hirzinger, “A unified passivity-based control framework for position, torque and impedance control of flexible joint robots”, The International Journal of Robotics Research, vol. 26, no. 1, 2007, pp. 23–39.
[2] H. Bruyninckx, “Open robot control software: the Orocos project”. In: International Conference on Robotics and Automation (ICRA), vol. 3, 2001, pp. 2523–2528.
[3] M. Chacon-Murguia, R. Sandoval-Rodriguez, and C. Guerrero-Saucedo, “Fusion of door and corner features for scene recognition”, Journal of Automation Mobile Robotics and Intelligent Systems, vol. 5, no. 1, 2011, pp. 68–76.
[4] W. Chung, C. Rhee, Y. Shim, H. Lee, and S. Park, “Door-opening control of a service robot using the multi-fingered robot hand”, IEEE Transactions on Industrial Electronics, vol. 56, no. 10, 2009, pp. 3975–3984.
[5] E. Jezierski, G. Granosik, and M. Kaczmarski, “Impedance control of jumping robot (in Polish)”, Publishing House of Warsaw University of Technology, Electronics, vol. 166, 2008, pp. 185–194.
[6] T. Kornuta, T. Bem, and T. Winiarski, “Utilization of the FraDIA for development of robotic vision subsystems on the example of checkers’ playing robot”, Machine GRAPHICS & VISION, 2013, (accepted for publication).
[7] W. Meeussen, M. Wise, S. Glaser, S. Chitta, C. McGann, P. Mihelich, E. Marder-Eppstein, M. Muja, V. Eruhimov, T. Foote, et al., “Autonomous door opening and plugging in with a personal robot”. In: IEEE International Conference on Robotics and Automation (ICRA), 2010, pp. 729–736.
[8] G. Niemeyer and J. Slotine, “A simple strategy for opening an unknown door”. In: International Conference on Robotics and Automation (ICRA), vol. 2, 1997, pp. 1448–1453.
[9] C. Ott, B. Bäuml, C. Borst, and G. Hirzinger, “Employing cartesian impedance control for the opening of a door: A case study in mobile manipulation”. In: IEEE/RSJ international conference on
intelligent robots and systems workshop on mobile manipulators: Basic techniques, new trends & applications, 2005. [10] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs, E. Berger, R. Wheeler, and A. Ng, “ROS: an open-source Robot Operating System”. In: Proceedings of the Open-Source Software workshop at the International Conference on Robotics and Automation (ICRA), 2009. [11] A. Schmid, N. Gorges, D. Goger, and H. Worn, “Opening a door with a humanoid robot using multi-sensory tactile feedback”. In: International Conference on Robotics and Automation (ICRA), 2008, pp. 285–291. [12] G. Schreiber, A. Stemmer, and R. Bischoff, “The fast research interface for the kuka lightweight robot”. In: IEEE ICRA Workshop on Innovative Robot Control Architectures for Demanding (Research) Applications–How to Modify and Enhance Commercial Controllers. Anchorage, 2010. [13] D. Serrien and M. Wiesendanger, “Grip-load force coordination in cerebellar patients”, Experimental brain research, vol. 128, no. 1, 1999, pp. 76–80. [14] M. Stefańczyk and W. Kasprzak, “Multimodal segmentation of dense depth maps and associated color information”. In: Proceedings of the International Conference on Computer Vision and Graphics, vol. 7594, 2012, pp. 626–632. [15] T. Winiarski and A. Woźniak, “Indirect force control development procedure”, Robotica, vol. 31, 2013, pp. 465–478. [16] C. Zieliński and T. Winiarski, “Motion generation in the MRROC++ robot programming framework”, International Journal of Robotics Research, vol. 29, no. 4, 2010, pp. 386–413.
3D Camera and Lidar Utilization for Mobile Robot Navigation
Submitted: 20th January 2013; accepted: 4th June 2013
Maciej Stefańczyk, Konrad Banachowicz, Michał Walęcki, Tomasz Winiarski DOI: 10.14313/JAMRIS_4-2013/28
Abstract:
The article presents a navigation system based on a 3D camera and a laser scanner, capable of detecting a wide range of obstacles in indoor environment. First, existing methods of 3D scene data acquisition are presented. Then the new navigation system gathering data from various sensors (e.g. 3D cameras and laser scanners) is described in a formal way, as well as exemplary applications that verify the approach.
Keywords: mobile robotics, navigation, RGB-D sensing
1. Introduction
An autonomous mobile robot navigation requires robust methods of environment analysis and obstacle detection. Typically, to achieve this goal, whole sets of various sensors are used, some of them giving point-like readings (e.g. infrared or ultrasound range finders) [13], some scanning in a plane (e.g. lidars). That kind of sensor configuration allows a safe movement of the robot in an unknown environment, but there is still a broad list of obstacles that are hard to detect and avoid, most of them being either small (but dangerous) things lying on ground level or things hanging from the top, that can hit upper parts of the robot maneuvering beneath them.
Current solutions, based on the aforementioned sets of sensors, have some shortcomings. Sometimes there is a necessity of special preparation of the environment to make it reachable for robots (by, for example, clear marking of dangerous zones), restriction of the common space used simultaneously by people and robots, or making some strong assumptions about the environment (like riding only on a smooth, flat surface without bumps or depressions [8]).
In the article a navigation method is presented that, thanks to the utilization of data from both a lidar and a 3D camera, overcomes most of the above-mentioned shortcomings and is capable of detecting a wide range of obstacles in indoor environment. The robot control system (sec. 2, 3) is formally specified for the clearness of presentation and to simplify its further re-usage. On the other hand, the presented control system still allows one to apply existing methods of securing the robot's surroundings, giving also the possibility of easy fusion of data from a diversity of sensors (which was proven in the experiments described in section 4).
Fig. 1. Control system structure
2. The method of system specification
Design of robot control systems requires a specification method that would facilitate its subsequent implementation. A multitude of robot controller architectures was proposed in the past, e.g. the subsumption architecture [2], the blackboard architecture [17] or an integrated architecture presented in [1]. In this paper we utilize a method based on earlier works fostering decomposition of the system into agents and description of the behaviour of each agent in terms of transition functions [19, 21, 22]. An embodied agent, being a central tenet of this method, is defined as any device or program having the ability to perceive its surroundings to subsequently influence the environment state, that may communicate with other agents and has an internal imperative to achieve its goal. In recent works [6, 20] the authors facilitated the inner structure of an embodied agent a, distinguishing five types of internal subsystems in two groups: the agent's corporeal body (com-
posed of its effector E and receptor R) and its control system (containing the virtual effector e, the virtual receptor r and a control subsystem c). Virtual effectors e and virtual receptors r form a hardware abstraction layer, presenting to the control subsystem both the state of the effectors and the receptors' readings in a form that is convenient from the point of view of control. They also make it easy to replace effectors or receptors without changes in the control subsystem, as long as the new modules are compatible with the created interface. All those subsystems communicate with each other via buffers and, besides that, it is assumed that they possess internal memory. Hence the evolution of the state of each of those subsystems is defined in terms of a transition function, transforming data taken from input buffers and the internal memory into values written to output buffers (and back to the internal memory as well) and sent subsequently to the associated subsystems. Aside from the briefly described decomposition, the authors also introduced data flow diagrams, supplementing the transition functions in the task of describing the behaviour of each of the subsystems constituting the control system. They also introduced a consistent notation, simplifying the description of such systems, as follows. A one-letter symbol located in the center (i.e. E, R, e, r, c) designates a subsystem. To reference its subcomponents or to single out the state of this subsystem at a certain instant of time, extra indices are placed around the central symbol. A left superscript designates the referenced buffer of the subsystem or, in the case of a function, its type. A right superscript designates the time instant at which the state is being considered. A left subscript tells us whether this is an input (x) or an output (y) buffer. A right subscript refers to the number of a subcomponent (if we have only one agent and one control system, the rest of the original symbols can be omitted), an internal memory cell or the ordinal number of a function. In the following we will utilize this method for the description of the designed control system of the Elektron [12] mobile robot.
3. General system specification
The control system of the Elektron mobile robot was designed in a way that simplifies its usage not only in a variety of applications on this specific robot, but also on different hardware platforms, without many assumptions about the hardware structure. The whole system was designed as a single agent (depicted in fig. 1), without any communication buffers with other agents, with a control subsystem c as its vital part. Aside from controlling the system, to adapt the system to a particular robotic platform, one must provide virtual receptor and effector implementations that interface the control system and the real hardware. The analysis of 3D scene acquisition methods [11] was the basis for the choice of an adequate set of sensors for this task. Virtual receptors responsible for obstacle detection (in the described application r1 and r2) gather data from either the MS Kinect or the SICK lidar and convert those readings into a common representation (a cloud of 3D
points). The transition functions of both modules are responsible only for changing the representation, so they are not described in detail. In short, in both virtual receptors the data read from the input buffer ${}^{R}_{x}r_k$ is transferred to ${}^{c}_{y}r_k$. Virtual receptor r3 is responsible for communicating with a gyroscope and generating messages with the current angular speed. Buffer ${}^{c}_{y}r_3$ contains this speed expressed in SI units (rad/s). On the other side of the control system is the virtual effector, which translates commands from the common format (linear and angular speed in ${}^{c}_{x}e_1$) to velocity values for both motors (${}^{E}_{y}e_1$) and returns the current robot position (wheel odometry in ${}^{c}_{y}e_1$) calculated from the readings of the wheel-mounted encoders (${}^{E}_{x}e_1$). The control subsystem is divided into a set of modules responsible for different tasks: (i) robot localization (sec. 3.1) (local localization provided by the f1 transition function and global by f3), (ii) data filtering (sec. 3.2) (f2), (iii) obstacle mapping (sec. 3.2) and path planning (sec. 3.3) (f4). Those parts are independent of the hardware used, so the whole control system is portable between platforms. On the other hand, each of its subparts can be freely replaced by another one (conforming to the proper interface), so the path planning algorithms, data filtering, environment mapping etc. can be tuned or even replaced by other implementations more suitable for a specific task (without an impact on the overall structure of the agent).
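The effector-side translation amounts to standard differential-drive kinematics. The sketch below illustrates the idea; the wheel radius and track width are placeholder values, and the function names are ours, not part of the described system:

```python
import math

WHEEL_RADIUS = 0.05  # [m], placeholder value (not given in the paper)
TRACK_WIDTH = 0.30   # [m], placeholder distance between the wheels

def command_to_motor_velocities(v, w):
    """Translate a (linear, angular) speed command into wheel speeds [rad/s]."""
    v_left = (v - 0.5 * w * TRACK_WIDTH) / WHEEL_RADIUS
    v_right = (v + 0.5 * w * TRACK_WIDTH) / WHEEL_RADIUS
    return v_left, v_right

def update_odometry(pose, d_left, d_right):
    """Integrate encoder increments [m] into the (x, y, theta) odometry pose."""
    x, y, theta = pose
    d_center = 0.5 * (d_left + d_right)
    d_theta = (d_right - d_left) / TRACK_WIDTH
    x += d_center * math.cos(theta + 0.5 * d_theta)  # midpoint integration
    y += d_center * math.sin(theta + 0.5 * d_theta)
    return (x, y, theta + d_theta)
```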
3.1. Robot localization
There are three independent sources producing information about the relative localization of our mobile robot. Wheel odometry (available through the ${}^{e}_{x}c_1$ buffer), based on incremental optical encoders, and position calculation based on matching consecutive laser scans [3] both give full information about the robot position and orientation (x, y, θ). Gyroscope readings (${}^{r}_{x}c_3$), on the other hand, can be used only to calculate a change in the orientation. The three sources return data with different frequencies, but the Extended Kalman Filter [15], widely used in robotics, makes it possible to incorporate all available readings into the final estimate of the local robot position. The transition function f1 performing the corrected local localization is depicted in Fig. 2.
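A minimal sketch of such a fusion step is given below: the pose is predicted from wheel odometry and corrected with a pose measurement (e.g. from scan matching). The noise covariances Q and R are illustrative assumptions, and the actual filter in f1 also incorporates the gyroscope readings:

```python
import numpy as np

def ekf_predict(x, P, u, dt, Q):
    """Predict the (x, y, theta) state from odometry velocities u = (v, w)."""
    v, w = u
    theta = x[2]
    x = x + np.array([v * dt * np.cos(theta), v * dt * np.sin(theta), w * dt])
    F = np.array([[1.0, 0.0, -v * dt * np.sin(theta)],
                  [0.0, 1.0,  v * dt * np.cos(theta)],
                  [0.0, 0.0, 1.0]])               # Jacobian of the motion model
    return x, F @ P @ F.T + Q

def ekf_update_pose(x, P, z, R):
    """Correct the state with a full pose measurement (e.g. scan matching)."""
    H = np.eye(3)                                  # measurement = state
    y = z - x                                      # innovation
    y[2] = np.arctan2(np.sin(y[2]), np.cos(y[2]))  # wrap the angle difference
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    return x + K @ y, (np.eye(3) - K @ H) @ P
```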
[Fig. 2 diagram: transition function f1 – wheel odometry ${}^{e}_{x}c_1^{i}$, laser scans ${}^{r}_{x}c_2^{i}$ and gyroscope readings ${}^{r}_{x}c_3^{i}$ feed the Polar scan matching and Extended Kalman filter blocks, producing the local position estimate stored in c1.]
Fig. 2. Local robot localization using Kalman filter
The global robot position (relative to a provided map) is calculated using a particle filter with laser scans as input data [16]. Additionally, the current estimated position described earlier can be used to speed up the calculations. It is worth mentioning that the provided map can be only a rough representation of the environment, containing only walls and other most important
features, such as doors or immovable obstacles. Additional obstacles that appear in the laser readings have virtually no impact on the localization results, as long as they do not cover most of the robot's field of view. Fig. 3 shows the structure of the transition function f3 executing global localization.
[Fig. 3 diagram: transition function f3 – the AMCL localization block combines laser scans, the local position estimate and the provided map, and writes the global position estimate back to the control subsystem memory.]
Fig. 3. Transition function of a global localization on provided map
3.2. Mapping of the obstacles
During robot movement in an environment which is not fully known (or sometimes even totally unknown), detecting and mapping the current configuration of obstacles plays a vital role. In a static environment it is sufficient to mark obstacles on a map after detecting them, but when obstacles can move or disappear, clearing obstacles from the map is what becomes essential (this part is much harder, since it is difficult to tell whether an obstacle really disappeared or simply cannot be detected at the moment). The whole process of obstacle detection and mapping consists of consecutive steps: aggregating data from the available sensors, filtering it, marking obstacles in an intermediate 3D representation and finally creating a 2D costmap for the motion planner.
3D representation of the environment
Out of the mentioned steps, the intermediate 3D representation of the environment had to be considered first, because it affects all the other tasks. Methods based on geometric modeling of the surroundings [14] are successfully used in tasks like the creation of 3D plans of buildings or open spaces, additionally giving the possibility to filter out moving objects from the final result. Octrees [5] are also widely used as a base structure when implementing obstacle maps (in this case the biggest advantage is memory efficiency). Both approaches share a common drawback – the difficulty of removing information about obstacles. In octrees, when a cell is marked as occupied, it is hard to perform the inverse operation while keeping the results consistent. An intermediate solution, combining features of both two- and three-dimensional representations, is a layered map. In this case the whole space is split into horizontal layers, each representing a fixed range of values along the vertical axis (usually the z axis). On each layer the information can be stored either geometrically [10] or in a fixed rectangular grid (voxel representation). In the described system the key part is robust obstacle marking and clearing in the close surroundings of the robot, and there is less need for accurate mapping of whole corridors (because of the dynamic character
of the environment, long-range maps with old information about obstacles could have a negative impact on the global path planner). Taking the mentioned facts into account, a voxel representation (a layered map with a discrete rectangular grid) has been chosen as the base structure for the obstacle map implementation.
Data aggregation and filtering
The first step of processing the sensor readings and creating a motion costmap is the data aggregation itself. In the presented solution there are two sources of information about obstacles – a SICK LMS100 lidar and a Microsoft Kinect. Both sensors produce a high amount of data, which can easily overload the processing unit. Taking into account the rather slow movement speed, only 5 readings per second from the lidar and 2 per second from the Kinect are used for obstacle detection. Data from the sensors, read from the input buffers ${}^{r_k}_{x}c$ (for k = 1, 2), is throttled to the desired frequency, giving the throttled pointcloud TPk.
[Fig. 4 diagram: transition function f2 – the sensor input ${}^{r_k}_{x}c^{i}$ passes through the Frequency limitation block (giving TPk), the Outlier removal block (giving FPk) and the Depression detection block; the result is written to memory cell c2 (state i+1).]
Fig. 4. Filtering readings from available sensors
This pointcloud is then passed through a series of filters (Fig. 4), which remove incorrect points and produce new, virtual ones. Incorrect points, being sensor noise, could lead to detecting non-existing obstacles, which sometimes makes the goal unreachable. To remove that kind of noise, every data point that has too few neighbours in its surroundings is removed (both sensors produce dense sets of points, so there should be no solitary ones), producing the filtered cloud FPk.
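A neighbour-count filter of this kind can be sketched as follows; the search radius and the neighbour threshold are illustrative assumptions, as the paper does not give exact values:

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_outliers(points, radius=0.05, min_neighbors=5):
    """Drop points with too few neighbours within `radius` [m].

    `points` is an (N, 3) array; the count excludes the point itself."""
    tree = cKDTree(points)
    counts = np.array([len(tree.query_ball_point(p, radius)) - 1
                       for p in points])
    return points[counts >= min_neighbors]
```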
[Fig. 5 diagram: sensor origin Osensor, a real point P (sensor reading) below the ground, its reflection P′, and the generated virtual point P′′ raised to z = 10 cm at the intersection of the sensor ray with the base plane z = 0.]
Fig. 5. Detecting dangerous slopes
Afterwards, when the false readings are removed, the next filter processes the cloud of points in order to detect possibly dangerous floor slopes (like stairs leading down or holes in the ground). The mechanism is simple, but it gives very good results. All points with a z coordinate above a set threshold (e.g. −0.02 m) pass through this filter without any change; the rest are reflected by the xy plane, so a point P = (x, y, z) becomes P′ = (x, y, −z). This way a "virtual wall" is produced (Fig. 5) at the distance of the closest detected depressed obstacle, but it may still be far from the
real beginning of the dangerous zone, so in the next step another set of virtual points is produced. A line connecting the detected negative point P with the sensor origin O = (ox, oy, oz) intersects the ground plane at coordinates (x′′, y′′, 0) (1). A point P′′ is produced by raising the intersection point 10 cm above the ground, so that it creates a virtual obstacle for the robot. Those virtual points force the path planners to avoid the detected dangerous zones, making the whole system safer. The filtered cloud is then stored in the internal memory cell c2 of the control subsystem.
\[
P'' = \left(o_x + (x - o_x)\,s,\; o_y + (y - o_y)\,s,\; 0.1\right), \qquad s = \frac{o_z}{o_z - z}. \tag{1}
\]
Voxel map
All the points that passed the filtering stage are put into the occupancy map creating module. Its internal representation is based on a voxel grid (a three-dimensional cuboid grid). Manipulating the grid parameters allows balancing between a higher resolution (leading to better mapping and allowing the robot to ride through smaller gaps) and a lower computational load (for coarse-grained maps) leading to faster path computation. A large cell size, on the other hand, could make some places unreachable, because even small obstacles, like thin table legs, are represented by whole cells marked as occupied. When a preprocessed pointcloud is passed to this stage, a removal procedure is executed prior to marking new obstacles. For every point in the current measurement the same algorithm is used – every cell that intersects the line connecting the measured point with the sensor origin is marked as free. Afterwards all cells in which points from the current measurement are placed are marked as occupied (this task is much simpler than the clearing stage). The order of those two passes is important – if both clearing and marking were done in a single pass (i.e. clearing and marking executed immediately for every point), a situation would be possible where just-created obstacles are cleared by subsequently processed points from the same pointcloud. The voxel representation and the described algorithm allow easy incorporation of readings from a multitude of sensors; in this case the lidar and the Kinect are used without any change in the algorithm, with one important difference between them – the points from the Kinect are used for both clearing and marking, while the lidar, returning only planar readings, is used only for marking.
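The two-pass update can be sketched as below. For brevity the ray traversal is approximated by dense sampling; an actual implementation would rather use a 3D Bresenham/DDA grid traversal, and the data structure here (a set of integer cell indices) is only an assumption:

```python
import numpy as np

def update_voxel_map(occupied, points, sensor_origin, cell_size, n_samples=100):
    """Two-pass voxel update: clear along every ray, then mark measured cells."""
    origin = np.asarray(sensor_origin, dtype=float)
    pts = np.asarray(points, dtype=float)
    # Pass 1: clearing -- every cell crossed by a sensor ray is marked free.
    for p in pts:
        for t in np.linspace(0.0, 1.0, n_samples, endpoint=False):
            c = tuple(np.floor((origin + t * (p - origin)) / cell_size).astype(int))
            occupied.discard(c)
    # Pass 2: marking -- executed only after all clearing, so obstacles just
    # created cannot be erased by later rays from the same pointcloud.
    for p in pts:
        occupied.add(tuple(np.floor(p / cell_size).astype(int)))
```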
Final costmap
After marking all obstacles in the intermediate voxel grid, all cells are projected onto a two-dimensional costmap. All columns containing occupied cells are marked as occupied; otherwise, the column's value depends on the number of unknown cells. If more than 4 cells are unknown, the entire column is considered unknown; in every other case it is marked as free to move. After marking, all the obstacles are inflated by a specified value, creating safety zones around objects,
which gradually increase the movement cost of cells. Lowering the inflation radius makes the robot go closer to obstacles, but can possibly lead to a situation in which the robot rides along a wall so closely that turning is impossible without hitting it. A larger radius, on the other hand, will make some places unreachable.
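The projection rules described above can be expressed compactly; the numeric cost values are assumptions, and the inflation step is omitted for brevity:

```python
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 254, 255  # cost encoding is an assumption

def project_to_costmap(voxel_grid):
    """Project an (nx, ny, nz) voxel grid onto a 2D costmap, column by column."""
    nx, ny, _ = voxel_grid.shape
    costmap = np.full((nx, ny), FREE, dtype=np.uint8)
    for i in range(nx):
        for j in range(ny):
            column = voxel_grid[i, j, :]
            if (column == OCCUPIED).any():
                costmap[i, j] = OCCUPIED          # any occupied cell wins
            elif (column == UNKNOWN).sum() > 4:
                costmap[i, j] = UNKNOWN           # too many unknown cells
    return costmap
```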
Fig. 6. Local costmap. In the background a global map is visible; darker cells are occupied, safety zones are marked light gray and white cells are empty zones. Additionally, laser scan readings are marked as white points.
A local costmap has a fixed size and its origin is always placed at the origin of the robot's mobile base (Fig. 6). In unstructured environments with a dynamically changing configuration of obstacles, keeping a map of all obstacles may cause the troubles mentioned in earlier sections. A smaller costmap also has much better memory efficiency, because it has a fixed size instead of continuously growing as the robot proceeds.
3.3. Path planning
When a goal is commanded to the robot (for example the operator sets it manually), the global path planner is executed. This module is exchangeable, so any implementation following the interface rules can be used at this point (it is thus one of the few places where future experiments using the already created hardware and software platform can be carried out, even concerning such complicated issues as chaotic space exploration [4]). In the test applications the Dijkstra algorithm was used as the global planning method, searching for a path in a graph built upon the current global map. Initially only the global map is used (when no obstacles have been detected yet) and, if a plan cannot be executed (for example a door is closed), the currently detected obstacles are included in the global path planning. When a global plan GR is created, its parts are sent to the mobile base controller, where a local path planner (using the Dynamic Window Approach) tries to command the robot in a way that best follows the given trajectory part. The local planner uses only a local costmap around the robot LM and the current odometry readings ${}^{e}_{x}c_1$. The parameters of the local planner are set so that each move ends as near as possible to the followed trajectory. After experiments with this module and tweaking of the parameter values, the robot followed the given global path with a rather high average speed (approx. 17 cm/s, which is 75% of its maximum speed), smoothly avoiding obstacles. The general structure of the navigation subsystem is depicted in Fig. 7.
[Fig. 7 diagram: transition function f4 – the Global planner produces the global plan GR, the Obstacle mapping block produces the local costmap LM, and the Local planner together with the Emergency behaviours block generates commands for the virtual effector.]
Fig. 7. General structure of navigation subsystem
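For reference, a grid-based Dijkstra search of the kind used by the global planner is sketched below (the costmap is assumed to be a dictionary mapping cells to costs; this is an illustration, not the actual implementation):

```python
import heapq

def dijkstra_plan(costmap, start, goal, lethal=254):
    """4-connected Dijkstra search; safety-zone costs add to the travel cost."""
    queue = [(0.0, start)]
    came_from, best = {start: None}, {start: 0.0}
    while queue:
        cost, cur = heapq.heappop(queue)
        if cur == goal:
            break
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in costmap or costmap[nxt] >= lethal:
                continue                       # outside the map or occupied
            new_cost = cost + 1.0 + costmap[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                came_from[nxt] = cur
                heapq.heappush(queue, (new_cost, nxt))
    if goal not in came_from:
        return None                            # no path, e.g. a door is closed
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = came_from[node]
    return path[::-1]
```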
3.4. Deadlock detection and solving
When the robot moves through an environment where obstacles are potentially moving, even the best planning algorithms cannot avoid getting stuck in certain situations (contrary to static environments). When such a situation emerges, a series of possible solutions is applied. First of all, the robot stops. Then all obstacles further than a set threshold (half a meter) are cleared from the costmap and the robot makes a full turn (360°, if possible) to refresh the obstacle configuration. If at this point the robot is still stuck, more aggressive algorithms are used, until all possible solution methods have been tried. If at any point the robot can continue following its route, the deadlock is over.
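The escalation logic can be summarized as follows; the robot and costmap interfaces are hypothetical placeholders for whatever the control subsystem provides:

```python
def recover_from_deadlock(robot, costmap):
    """Escalating recovery: try behaviours until the route becomes feasible."""
    behaviours = [
        lambda: costmap.clear_beyond(0.5),     # drop obstacles beyond 0.5 m
        lambda: robot.rotate_in_place(360.0),  # refresh obstacle configuration
        # ... progressively more aggressive behaviours follow here ...
    ]
    robot.stop()
    for behaviour in behaviours:
        behaviour()
        if robot.can_follow_route():
            return True                        # the deadlock is over
    robot.abort_goal()
    return False
```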
4. Experiments
The experiments were conducted using the Elektron mobile robot [12], equipped with control hardware developed according to the methodology presented in [18]. As a set of research-oriented controllers, it provides access to all the essential parameters and, since the communication protocol is open, it allows replacing any hardware module or plugging in an additional one. To make the high-level software portable, the authors decided to base the control system implementation on the ROS software framework [9], which is widely used in robotics. In general, the virtual effector e1 and the virtual receptors r1..3 are implemented as separate nodes, whilst the control subsystem c is divided into multiple nodes, one for each transition function fi. Operations inside the transition functions (i.e. outlier removal or the local planner) are implemented as nodelets running inside the corresponding transition function's node. Communication between nodes and nodelets (both the transmission buffers between subsystems and the internal transmissions) is mapped onto ROS topics. The experiments were conducted in office and workshop rooms, where many objects were constantly carried around. Those experiments aimed at confirming the propriety of the chosen approach. In some experiments the robot moved through a corridor where students moved chairs and tables from place to place particularly often (Fig. 8), doors
Fig. 8. Elektron robot navigating in target environment
were opened and closed, and, of course, like in [7], there were many people walking along (Fig. 8). According to the assumptions, the robot had to drive safely, avoiding all obstacles, and try to reach a given goal. The basic scenario of each experiment was the same – the robot had to reach a goal (set by the operator) autonomously and safely, without any collision with obstacles. Various sensor configurations were tested: only Kinect, only lidar and a fusion of data from both. Each experiment was repeated a number of times to calculate average statistics. In cases when only one sensor was used, the advantage of the lidar over the Kinect in terms of overall movement speed was visible. The wide field of view of the SICK LMS100 (270°) makes it possible to plan the path faster, without the need for too many in-place rotations of the robot, which were necessary with the narrow 57° angle of the Kinect sensor. Those rotations were used in order to find obstacles at the sides of the robot. On the other hand, the Kinect detected many obstacles that were invisible to the lidar, like the bottoms of office chairs, which allowed setting a much lower critical distance to obstacles during the ride (20 cm in the case of the lidar, 5 cm for the Kinect). This in turn made many more goals reachable for the robot, which precisely maneuvered through narrow spots (like almost-closed doors). The biggest disadvantage of the Kinect in terms of obstacle detection was its inability to detect narrow table legs, which was potentially dangerous. When both data sources were enabled at the same time, the drawbacks of each were eliminated by the capabilities of the other sensor, which improved both the overall robot speed and the precision of obstacle avoidance. Table 1 presents the summarized statistics of the conducted experiments.
Tab. 1. Summarized results for 20 repeats of each experiment

Sensors          Avg. speed   Obstacle crit. dist.
Lidar            16 cm/s      20 cm
Kinect           12 cm/s      5 cm
Kinect + lidar   17 cm/s      5 cm
The key issue for the control system of a mobile robot operating in a space shared with humans is safety. During the tests there were many people working and walking along, influencing the robot's actions
(sometimes even on purpose). This led to new obstacles appearing and, in extreme cases when the robot was surrounded by them, the system stopped the movement and, after a few rotations to find a clear path, aborted the goal. In every case the system properly recognized stairs (when using the Kinect) and, if commanded to go to a point unreachable because of them, returned an appropriate error message.
5. Conclusions
In an unstructured and dynamic environment, like office and laboratory spaces, three-dimensional sensors seem to be the most appealing method for fast and reliable detection of a wide variety of obstacles. Augmenting this data with wide-angle readings from a lidar further improves the detection results, making it possible to avoid almost every solid obstacle (e.g. small objects lying on the ground, objects hanging from above, or narrow table legs). The additional CPU load caused by the processing of three-dimensional point clouds is compensated by the extended set of detectable obstacles, which makes the whole system robust and safe. The presented system can be used in unstructured and cluttered environments without their prior preparation; in particular there is no need to remove any obstacles or to mark safe zones. Thanks to this, a robot can be quickly deployed in a new environment, and during its work those places can simultaneously be used by people. This makes a range of new applications possible (e.g. a mobile robot assistant for humans), which we consider as future work.
Acknowledgements
The authors gratefully acknowledge the support of this work by The National Centre for Research and Development grant PBS1/A3/8/2012. Tomasz Winiarski has been supported by the European Union in the framework of the European Social Fund through the Warsaw University of Technology Development Programme.
AUTHORS
Maciej Stefańczyk∗ – Institute of Control and Computation Engineering, Warsaw University of Technology, 00–665 Warsaw, ul. Nowowiejska 15/19, e-mail: stefanczyk.maciek@gmail.com.
Konrad Banachowicz – Bionik Robotics Scientific Club, Warsaw University of Technology, 00–665 Warsaw, Nowowiejska 15/19, e-mail: konradb3@gmail.com, www: http://bionik.ia.pw.edu.pl.
Michał Walęcki – Institute of Control and Computation Engineering, Warsaw University of Technology, 00–665 Warsaw, Nowowiejska 15/19, e-mail: mwalecki@elka.pw.edu.pl.
Tomasz Winiarski – Institute of Control and Computation Engineering, Warsaw University of Technology, 00–665 Warsaw, Nowowiejska 15/19, e-mail: tmwiniarski@gmail.com.
∗
Corresponding author
REFERENCES
[1] R. Alami, R. Chatila, S. Fleury, M. Ghallab, and F. Ingrand, "An architecture for autonomy", Int. J. of Robotics Research, vol. 17, no. 4, 1998, pp. 315–337.
[2] R. A. Brooks, “A robust layered control system for a mobile robot”, IEEE Journal of Robotics and Automation, vol. 2, no. 1, 1986, pp. 14–23.
[3] A. Diosi and L. Kleeman, “Laser Scan Matching in Polar Coordinates with Application to SLAM”. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005.
[4] A. A. Fahmy, “Implementation of the chaotic mobile robot for the complex missions”, Journal of Automation Mobile Robotics and Intelligent Systems, vol. 6, no. 2, 2012, pp. 8–12.
[5] D. Jung and K. Gupta, “Octree-based hierarchical distance maps for collision detection”. In: International Conference on Robotics and Automation (ICRA), vol. 1, 1996, pp. 454–459.
[6] T. Kornuta and C. Zieliński, "Behavior-based control system of a robot actively recognizing hand postures". In: 15th IEEE International Conference on Advanced Robotics, ICAR, 2011, pp. 265–270.
[7] B. Kreczmer, "Mobile robot navigation in the presence of moving obstacles (in Polish)", Advances in Robotics: Robot Control with Surrounding Perception, 2005, pp. 177–186.
[8] E. Marder-Eppstein, E. Berger, T. Foote, B. Gerkey, and K. Konolige, "The office marathon: Robust navigation in an indoor office environment". In: International Conference on Robotics and Automation (ICRA), 2010, pp. 300–307.
[9] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, "ROS: an open-source robot operating system". In: ICRA Workshop on Open Source Software, vol. 3, no. 3.2, 2009.
[10] P. Skrzypczynski, “2D and 3D world modelling using optical scanner data”, Intelligent robots: sensing, modeling, and planning, vol. 27, 1997, p. 211.
[11] M. Stefańczyk and W. Kasprzak, “Multimodal segmentation of dense depth maps and associated color information”. In: Proceedings of the International Conference on Computer Vision and Graphics, vol. 7594, 2012, pp. 626–632. [12] W. Szynkiewicz, R. Chojecki, A. Rydzewski, M. Majchrowski, and P. Trojanek. “Modular mobile robot – Elektron (in Polish)”. In: K. Tchoń, ed., Advances in Robotics: Control, Perception and Communication, pp. 265–274. Transport and Communication Publishers, Warsaw, 2006.
[13] S. Thongchai, S. Suksakulchai, D. Wilkes, and N. Sarkar, “Sonar behavior-based fuzzy control for a mobile robot”. In: Systems, Man, and Cybernetics, 2000 IEEE International Conference on, vol. 5, 2000, pp. 3532–3537.
[14] S. Thrun, W. Burgard, and D. Fox, “A real-time algorithm for mobile robot mapping with applications to multi-robot and 3d mapping”. In: International Conference on Robotics and Automation (ICRA), vol. 1, 2000, pp. 321–328.
[15] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics, The MIT Press, 2005.
[16] S. Thrun, D. Fox, W. Burgard, and F. Dellaert, "Robust Monte Carlo localization for mobile robots", Artificial Intelligence, vol. 128, no. 1, 2001, pp. 99–141.
[17] H. Van Brussel, R. Moreas, A. Zaatri, and M. Nuttin, "A behaviour-based blackboard architecture for mobile robots". In: Industrial Electronics Society, 1998. IECON'98. Proceedings of the 24th Annual Conference of the IEEE, vol. 4, 1998, pp. 2162–2167.
[18] M. Walęcki, K. Banachowicz, and T. Winiarski, "Research oriented motor controllers for robotic applications". In: K. Kozłowski, ed., Robot Motion and Control 2011, Lecture Notes in Control & Information Sciences, vol. 422, 2012, pp. 193–203.
[19] C. Zieliński and T. Winiarski, "Motion generation in the MRROC++ robot programming framework", International Journal of Robotics Research, vol. 29, no. 4, 2010, pp. 386–413.
[20] C. Zieliński, T. Kornuta, and M. Boryń, "Specification of robotic systems on an example of visual servoing". In: 10th International IFAC Symposium on Robot Control (SYROCO 2012), vol. 10, no. 1, 2012, pp. 45–50.
[21] C. Zieliński, T. Kornuta, P. Trojanek, and T. Winiarski, "Method of Designing Autonomous Mobile Robot Control Systems. Part 1: Theoretical Introduction (in Polish)", Pomiary Automatyka Robotyka, no. 9, 2011, pp. 84–87.
[22] C. Zieliński, T. Kornuta, P. Trojanek, and T. Winiarski, "Method of Designing Autonomous Mobile Robot Control Systems. Part 2: An Example (in Polish)", Pomiary Automatyka Robotyka, no. 10, 2011, pp. 84–91.
Optimization-based Approach for Motion Planning of a Robot Walking on Rough Terrain
Submitted: 20th January 2013; accepted: 4th June 2013
DOI: 10.14313/JAMRIS_4-2013/35
Dominik Belter
Abstract: The paper presents a motion planning algorithm for a robot walking on rough terrain. The motion planner is based on the improved RRT-Connect (Rapidly-exploring Random Tree) algorithm. The Particle Swarm Optimization (PSO) algorithm is proposed to solve a posture optimization problem. The steepest descent method is used to determine the positions of the robot's feet during the swing phase. The gradient descent method is used for smoothing the final path. The properties of the motion planning algorithm are presented in four cases: motion planning over a bump, a concavity, a step and a rough terrain mockup. The maximal sizes of particular obstacle types traversable by the Messor robot with the new, optimized motion planner are given.
Keywords: walking robot, motion planning, rough terrain locomotion
[Fig. 1 diagram: kinematic structure with reference frame O, body position [xR, yR, zR]T, body axes XR, YR, ZR, legs L1–L6 and link dimensions in m.]
Fig. 1. Kinematic structure of the Messor robot (with dimensions in [m])
1. Introduction
Most outdoor mobile robots are equipped with tracks or wheels. This kind of locomotion is very efficient on flat terrain. When the terrain is rough, the mobility of such robots is limited by the size of obstacles (rocks, logs, etc.). Moreover, a tracked drive destroys the ground, especially when the robot is turning round. Walking is more efficient on rough terrain. Thus, to improve the mobility of legged mobile robots on various types of terrain, new motion planning methods are needed. Recently quadrupeds and bipedal humanoid robots have been gaining popularity, but during Urban Search and Rescue (USAR) missions multi-legged robots are more suitable due to their inherent static stability. The nature of legged locomotion is discrete – different than wheeled locomotion. Usually the wheels of a robot should have continuous contact with the ground, whereas each leg of a walking robot has a sequence of swing and stance phases. The feet of the robot have point-like contact with the ground. As a result, the locomotion task includes a sequence of robot postures and transitions between these postures (interpreted as discrete states). Motion planning for a robot walking on rough terrain is a complex task. A walking robot can't be represented as a point in an environment with obstacles. While walking on rough terrain the robot should take into account the full shape of the terrain. Collisions with the ground, as well as the static stability and the workspace of the robot (the space which the feet of the robot can reach), should be considered. Finally, motion planning for a walking robot is a problem with many constraints, and the solution has to be determined in a multidimensional solution space.
The posture of our walking robot Messor [24] (the state of the robot) is determined in a 24-dimensional space. The kinematic structure of the robot is presented in Fig. 1. The position of the robot's body in the reference frame O is determined by the 3D vector [xR, yR, zR]T. The orientation of the robot's body is given by three RPY (roll-pitch-yaw) angles θR, φR, γR. Each joint position is given by the value αi. The numbering of joints starts from the shoulder servomotor of the first leg and ends at the joint between the tibia and femur of the sixth leg (Fig. 1). The posture is thus determined by 18 reference values for the servomotors located in the robot's legs and six values representing the position and inclination of the robot's platform (q = [α1, ..., α18, xR, yR, zR, θR, φR, γR]T). We are looking for methods which can deal with such a multi-dimensional solution space to find a path for the robot walking on rough terrain. Moreover, the sought algorithm should be fast enough to operate in real time. To find a feasible motion of the robot, the motion planner should find a way around obstacles and, if necessary, a way to climb them. The kinematic structure of a robot influences the approach to motion generation. Various approaches are used to obtain efficient locomotion on rough terrain. A rimless wheel structure is used in the RHex robot [21]. The rotational motion of the shaft is converted into walking behavior through flexible spokes. Robust locomotion on rough terrain can also be gen-
erated by using reactive controllers. An example of such a robot is BigDog [29]. Both methods will fail whenever the robot has to deal with a very risky environment, e.g. while crossing a stream: without a foothold selection method the robot will fall down. In our approach we precisely plan the full path of the robot. The robot determines its posture, selects footholds, and plans the motion of the feet and the body. It also checks stability and workspace constraints, and avoids collisions with the ground as well as self-collisions. By planning the full motion of the robot it is also possible to avoid deadlocks (situations when the robot can't move because of the lack of stability or the lack of kinematic margin). Planning the full motion of the robot also allows avoiding local minima (here understood as a loop of moving backwards and forwards, oscillating around what is in these circumstances a lower-dimensionality local minimum [1]), and avoiding obstacles. Such looping behavior is possible when a reactive controller is used: a reactive controller computes the next action taking into account only the current context, and thus might cause 'infinite loops' in the robot's behavior, e.g. according to the current context the next motion of the robot is a step forward, then a step back, and again forward, back, etc. To avoid the robot getting trapped in local minima, the deliberative paradigm is used in the control software of the machine. The paper presents recent improvements of the motion planning method for the Messor six-legged robot. The method is based on the Rapidly-exploring Random Trees Connect algorithm (RRT-Connect) [16]. During execution of the planned path the robot creates an elevation map of the environment using the Hokuyo URG-04LX laser range finder [3]. Then the obtained map is used to plan the motion of the robot. The methods presented in this paper increase the efficiency of the motion planning method. With the recent improvements the robot is capable of climbing higher obstacles by avoiding deadlocks. The deadlocks are caused by the lack of static stability or the lack of a kinematic margin. The results are obtained using various optimization methods: to find the paths of the feet during the swing phase, to optimize the posture of the robot and to smooth the obtained path.
1.1. Related work
Autonomous walking on rough terrain is much more challenging than indoor locomotion [20]. Many impressive results were obtained using the LittleDog robot [6, 19, 32]. In this case the robot also uses various optimization methods during motion planning. The trajectory of the robot is optimized to maximize stability and kinematic reachability. In the work of Ratliff et al. [19] the trajectory optimization is performed using the gradient-based Covariant Hamiltonian Optimization and Motion Planning (CHOMP) algorithm. Similarly, Kalakrishnan et al. proposed the use of optimization methods to improve the trajectory of the robot's body with respect to the initial Zero Moment Point (ZMP) trajectory [7]. To plan the motion of the robot the Anytime Repairing A* (ARA*) algorithm [18] is used. The algorithm maximizes foothold
rewards from the initial to the goal position. Most walking robots, as well as their control methods, are inspired by biological systems. Inspirations for six-legged robots are taken from the behavior of stick insects [15]. A robot can use a dynamic gait based on the results of observations of Nature [30]. Periodic signals controlling the gait of the robot can be obtained using a Central Pattern Generator (CPG). The concept of CPGs is based on the real nervous structures found in living creatures [31]. This concept has also been successfully applied in multi-legged walking robots [9]. In our research we also take inspiration from biology. Our robot uses statically stable gaits and the leg coordination patterns of stick insects. It walks on rough terrain using a secure and rather slow type of locomotion. The control system of the Messor robot strongly depends on the exteroceptive sensory system. Dynamic control techniques require high-torque drive units and precise proprioceptive sensing [14], which are not available to the Messor robot walking on rough terrain; thus the dynamic capabilities of our robot are limited. Not only optimization and inspirations from Nature can be used to obtain a control system for a walking robot – the robot can also learn how to walk. In [26] a modified Reinforcement Learning algorithm is presented. It was shown that the robot can utilize past experience and learn efficient locomotion on flat terrain faster. Similarly, the gait of a humanoid robot can be optimized to obtain maximal linear velocity [27]. Some procedures for the control of six-legged robots which allow them to climb high but structured obstacles, like steps [5] and stairs [25], are known from the literature. However, in our work we don't assume particular shapes of the obstacles – the robot can autonomously deal with various obstacles. This behavior is obtained using an RRT-based planner and optimization methods which support motion planning. In our work we utilize the hybrid deliberative/reactive paradigm. When the initial path of the robot is planned, the robot can execute the planned motion using a reactive and fast controller which guarantees more robust locomotion. Moreover, during execution of the planned motion the robot uses a simple leg compliance strategy: the robot stops the leg's motion whenever the contact force at the leg's tip exceeds a given threshold. The compliance strategy is very important for robust locomotion [10, 31] or the motion of a robot arm [13, 28].
2. Motion planning algorithm
[Fig. 2 flowchart: START with k := 1 → find the maximal possible length of the robot's step dSTEP → select footholds for the robot's configuration qsimp(k·dSTEP) → optimize posture → optimize trajectories of the feet → check stability, workspace and collisions for each point of the path → if motion is possible: FINISH SUCCESS; otherwise k := k − 0.2 and retry while k > 0, else FINISH FAILURE.]
Fig. 2. Extend procedure used for integrated motion planning
The RRT-Connect-based integrated motion planner [2] creates two trees, which are extended alternately. The root of the first tree is located at the initial position of the robot; the root of the second tree is located at the goal position. At the beginning the algorithm tries to add a new node to the first tree (the Extend procedure). The position of the new node is determined randomly. If the robot can reach the next goal position q(k · dSTEP), the node is added to the tree. Then the second tree is extended in the direction of the new node. In the following iteration of the algorithm the sequence is swapped, and the second tree is extended in a random direction first. The core of the RRT algorithm is the Extend procedure (Fig. 2). The procedure checks if a transition in the desired direction is possible; if so, it plans the precise path for the body and each foot. The previous version of the procedure has been modified [2]. First, the new procedure determines the maximal step length in the desired direction (dSTEP). The maximal step length is found using kinematic and collision constraints. To determine it, we create the initial posture of the robot qsimp. The procedure assumes an initial horizontal orientation of the robot's body. The distance to the ground is set to a value that guarantees a secure clearance between the robot's body and the ground (in the experiments the clearance is set to 0.1 m). The subprocedure which plans the path for the next robot step is executed up to five times for various step lengths k · dSTEP (k ∈ {0.2, 0.4, ..., 1.0}). For the next potential node of the tree the robot finds footholds and optimizes the posture [3, 4]. Next, the procedure plans the paths of the feet during the swing phase. The path of the body is straight between two neighboring nodes. Then the procedure checks whether the execution of the planned path is possible: the workspaces of the robot's feet, collisions and static stability are checked. The procedure returns the planned path for the maximal possible step length.
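In outline, the modified Extend procedure can be summarized as follows (all helper functions are placeholders standing for the operations described above and in secs. 2.1–2.2):

```python
def extend(tree, q_near, direction):
    """Try decreasing step lengths until a fully verified step plan is found."""
    d_step = maximal_step_length(q_near, direction)  # kinematic/collision limit
    for k in (1.0, 0.8, 0.6, 0.4, 0.2):              # k := k - 0.2 on failure
        q_simp = simple_posture(q_near, direction, k * d_step)
        footholds = select_footholds(q_simp)         # foothold selection [3]
        posture = optimize_posture(footholds)        # PSO, sec. 2.1
        feet_paths = plan_swing_paths(posture)       # sec. 2.2
        if motion_feasible(posture, feet_paths):     # workspace, collisions,
            tree.add(posture, feet_paths)            # static stability
            return posture
    return None                                      # extension failed
```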
2.1. Posture optimization strategy
The RRT algorithm determines a new (xR, yR) position of the robot in the global coordinate system. For this position the planning procedure determines the full state (posture) of the robot (vertical position of the robot's platform zR, inclination and leg configurations). To this end, the elevation map of the environment is used. The direction of motion is represented by the vector from the previous position of the robot piR = [xiR, yiR] to the next position pi+1R = [xi+1R, yi+1R]. The orientation of the robot on the surface γR is equal to the angle between this vector and the x axis of the global coordinate system. If the desired rotation around the z axis can't be executed with the given kinematics of the robot (the reference angle γR is not reachable), the reference rotation angle is limited to the maximal rotation which can be executed by the robot in a single step. In all experiments presented in the paper the maximal rotation angle is set to 0.2 rad. For the new position pi+1R = [xi+1R, yi+1R] the robot
selects footholds [3]. The inclination of the robot's platform (θR, φR) and the distance to the ground zR are determined by the posture optimization procedure. The optimization searches for the vector pR = [θR, φR, zR]T which determines the posture of the robot with the maximal kinematic margin dKM:
\[
\arg\max_{p_R} \{ d_{KM}(q(p_R)) \}, \qquad q(p_R) \in C_{free}, \quad q(p_R) \in C_{stab}, \tag{1}
\]
where: q(pR) – posture of the robot for the given configuration pR and the positions of the feet determined using the foothold selection algorithm, dKM(q(pR)) – kinematic margin of the robot, Cfree – configurations of the robot which are collision free (lack of collisions with the ground and between parts of the robot) and inside of the robot's workspace, Cstab – statically stable configurations of the robot.
To find the optimal posture of the robot the Particle Swarm Optimization (PSO) algorithm is used [4, 12]. The kinematic margin diKM of the i-th leg is defined as the distance from the current position of the foot to the boundary of the reachable area of the leg [17]. To compute the kinematic margin dKM of the robot, the distances from the current positions of the feet to the boundaries of the legs' workspaces are computed; the smallest distance min(d1KM, ..., d6KM) determines the dKM value [4]. The computation of the kinematic margin should be fast – it is computed hundreds of times during a single optimization run. An analytical relation between the configuration of the leg and the value of the kinematic margin was found to speed up the computations [4]. To obtain this relation, approximation with a mixture of Gaussian functions is used.
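A generic PSO loop of the kind used here is sketched below; the swarm parameters are textbook defaults, not values from the paper, and `fitness` stands for the kinematic margin evaluation under the constraints of eq. (1):

```python
import random

def pso(fitness, bounds, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    """Maximize `fitness` over a box-bounded space, e.g. p_R = [theta, phi, z]."""
    dim = len(bounds)
    xs = [[random.uniform(lo, hi) for lo, hi in bounds]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(x) for x in xs]                   # per-particle best position
    pbest_f = [fitness(x) for x in xs]
    g = max(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = list(pbest[g]), pbest_f[g]     # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vs[i][d] = (w * vs[i][d]
                            + c1 * random.random() * (pbest[i][d] - xs[i][d])
                            + c2 * random.random() * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            f = fitness(xs[i])
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = list(xs[i]), f
                if f > gbest_f:
                    gbest, gbest_f = list(xs[i]), f
    return gbest, gbest_f
```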
2.2. Leg-end optimization during swing phase
When the posture of the robot is determined, the robot plans the path of each foot from the current to the next foothold. The initial paths are located on a plane which is perpendicular to the gravity vector; the initial and the goal footholds lie on this plane. The shape of the initial path is created according to the vertical cross-section of the obstacles between the footholds [2]. A safety margin is added to decrease the risk of collision. However, the found path does not guarantee proper execution of the planned motion. Some points of the planned path might be unreachable for the robot (the positions of the feet are outside of the robot's workspace). Moreover, the legs of the robot might collide with other parts of the robot. Thus, it is necessary to modify the positions of the feet in the direction perpendicular to the direction of the foot motion. We call this leg-end optimization during swing phase. During the optimization the inverse and forward kinematics of the leg are known, so we can switch easily between the leg configuration and the position of the foot.
We use both representations in the optimization procedure: for example, we use the leg configuration to compute the kinematic margin, and the position of the foot to check collisions with the ground.
Fig. 3. Cross-section of the leg workspace and progress of the gradient-based optimization method during swing phase
Fig. 4. Optimization of the leg-end position during swing phase for the following points of the initial path
The goal of the optimization (in our case sub-optimization) is to find the position of the i-th foot during swing phase pif = [xif, yif, zif]T (expressed in the global coordinate system O) which satisfies the requirement on the i-th leg kinematic margin diKM:
\[
d^{i}_{KM}(p^{i}_{f}) > d^{safe}_{KM}, \qquad p^{i}_{f} \in C_{free}, \tag{2}
\]
where Cfree are the leg configurations which are collision free and inside of the workspace of the robot, and dsafeKM = 8 cm. We decided to use sub-optimization instead of full optimization because finding a leg-end position which satisfies the safety requirements is sufficient. The sub-optimization stops searching as soon as the kinematic margin is bigger than dsafeKM = 8 cm; this approach speeds up the optimization procedure. The procedure is run for each leg-end position and all points of the initial path during swing phase (Fig. 4). To find the optimal foot position the method of steepest descent is used [11]. The cross-section of the leg workspace and the progress of the gradient-based optimization method during swing phase are shown in Fig. 3. If the initial position of the foot FSTART is outside of the workspace of the leg, the robot searches for an acceptable leg configuration (reference values for the servomotors) which has a positive value of the kinematic margin. To this end, the current position of the foot is moved iteratively in eight main directions (Fig. 3). When the foot is inside the workspace, the procedure continues with the gradient-based optimization. The position of the foot is modified in the direction perpendicular to the direction of the foot motion (a vertical cross-section through the leg's workspace). The search space is limited by the shape of the obstacles increased by the safety margin (in the simulations presented in this paper the safety margin is 3 cm).
2.3. Path smoothing
The final path found by the RRT-based algorithm is made of straight lines which connect optimal postures of the robot. Thus, the motion of the robot is not smooth: in the nodes of the tree (positions of the robot where all feet are located on the ground and the optimal posture is determined) the robot rapidly changes the direction of motion. To avoid rapid movements we apply path smoothing to the results obtained from the RRT-based planner [8]. The goal of path smoothing is the minimization of the fitness function Fsm. The arguments of Fsm are the positions of the points of the initial path p found by the RRT-based algorithm and the positions of the points of the smoothed path p′:
\[
\arg\min_{p'} F_{sm}(p') = \sum_{i=0}^{n} \left\{ \frac{\alpha}{2}(p_i - p'_i)^2 + \frac{\beta}{2}(p'_i - p'_{i-1})^2 + \frac{\beta}{2}(p'_i - p'_{i+1})^2 \right\}. \tag{3}
\]
The element of the fitness function Fsm which depends on the β coefficient is responsible for path smoothing. Additionally, the element which depends on the α coefficient is introduced; without this element the optimal solution would be a straight line (the positions of the initial and final points are not optimized). To find the optimal value of the function (3) gradient-based optimization is used [11]. At the beginning of the optimization the smoothed path p′ is the same as the initial path p. In each following step the algorithm modifies the positions of the new path according to the equation:
\[
p'^{new}_{i} = p'_i + \alpha(p_i - p'_i) + \beta(p'_{i-1} + p'_{i+1} - 2p'_i). \tag{4}
\]
The positions of the footholds are not modified by the optimization. Similarly, the points of the path are not modified if the new position is outside of the robot's workspace, causes collisions, or is not statically stable. The procedure ends when the total modification of the path Δp is below the pre-determined value Δmax:
\[
\Delta p = \sum_{i=0}^{n} \left| p'^{new}_{i} - p'_i \right| < \Delta_{max}. \tag{5}
\]
3. Results
The results of the procedure which optimizes the positions of the feet during swing phase are presented in Fig. 5. During the experiment the robot was climbing a step. When the optimization is used the robot modifies the initial path of the feet. Without the proposed method the desired position of the foot is outside of the robot's workspace and the execution of the planned motion is not possible. With the proposed optimization procedure the robot moves its leg sideways to increase the kinematic margin and to avoid collisions with the obstacle. This optimization procedure allows the robot to climb higher obstacles and to avoid deadlocks.
[Fig. 5 plot: initial path, final path and footholds over the step obstacle (axes x, y, z in m); the path before optimization is not executable, the path after optimization is executable.]
Fig. 5. Results of the foot path optimization procedure during swing phase on the step obstacle
Tab. 1. Maximal size of the obstacles which can be covered by the robot using various variants of the motion planning approach. The table presents three variants: motion planning without optimization (1), with posture optimization (2) and with all optimization procedures (3).

variant   bump [m] (Fig. 7a)   concavity [m] (Fig. 7b)   step [m] (Fig. 7c)
   1            0.13                  -0.17                    0.15
   2            0.16                  -0.185                   0.18
   3            0.16                  -∞                       0.25
[Fig. 6 plot: the RRT path and the smoothed paths for α=0.8, β=0.2; α=0.1, β=0.9; and α=0.01, β=0.99 (axes x, y in cm).]
Fig. 6. Results of the path smoothing procedure for various α and β parameters
Before execution of the planned motion the path is smoothed. The sequence of body positions [xR, yR, zR, θR, φR, γR] as well as the path of each foot is modified. Example results of path smoothing for various parameter values are shown in Fig. 6. When the parameter β is increased and the parameter α is decreased the obtained path is smoother; when α is increased and β is decreased the obtained path stays close to the initial one (Fig. 6). The parameters are set to α = 0.01 and β = 0.99 when the path of the body is smoothed, and to α = 0.8 and β = 0.2 when the paths of the feet are smoothed. The goal is to obtain a smooth path for the robot's body to limit unwanted sudden motions. The paths of the feet are more constrained (by the workspace of the robot and collisions with obstacles); the footholds can't be modified at this stage, so those paths are only slightly smoothed.
The proposed methods were verified in a realistic simulator of the Messor walking robot [24]. Dynamic properties of the robot are simulated using the Open Dynamics Engine [22]. The software library detects collisions between parts of the robot's body and simulates the behaviour of the colliding objects (static and dynamic friction). The simulator also contains the software controller of the real robot. The environment, the state of the robot, and the paths obtained using the motion planning algorithm are visualized using OpenGL procedures. The properties of the proposed methods were presented in simulations using three various obstacles¹: a bump (Fig. 7a), a concavity (Fig. 7b) and a step (Fig. 7c). Using standard shapes of the obstacles we can present the efficiency of the proposed methods. We found the maximal size of the obstacles which can be covered by the robot using various variants of the motion planning approach (Table 1). It should also be mentioned that the proposed method is universal and does not assume any particular model of the obstacle.
To show the efficiency of the presented optimization procedures we performed simulations with three different obstacles: a bump, a concavity and a step. In the first simulation the robot plans its motion using only the RRT algorithm. The position of the robot's platform is determined using a simple strategy – the orientation of the robot's body is the same as the average slope of the terrain, and the distance to the ground is set to a value which is minimal but safe and guarantees that the motion is collision free [2]. In the second simulation the robot uses the posture optimization algorithm to determine its posture in the nodes of the RRT trees. In the third simulation the robot uses all methods presented in this paper (posture optimization, feet path planning during swing phase and path smoothing).
[Fig. 7 plots a1–c2: the body path, feet paths and RRT branches while climbing a bump (a), a concavity (b) and a step (c); axes x, y, z in m.]
Fig. 7. The planned path of the robot (1) and execution of the planned motion in the simulator (2) while climbing: a – bump, b – concavity, c – step
When the robot does not use optimization methods the maximal height of the bump which can be covered is 13 cm (Tab. 1). Using the posture optimization procedure the robot is capable of climbing a bump which is 3 cm higher. The maximal height of the obstacle is the same when the optimization during swing phase is used: to deal with the obstacle the robot has to put its legs on the ground level and on the obstacle at the same time, and if the obstacle is too high the reference positions of the feet are outside of the workspace of the robot, so the optimization during swing phase does not bring additional advantages here. The path obtained using all optimization methods is shown in Fig. 7a. When the posture optimization is used during motion planning the maximal depth of the concavity is 18.5 cm. If necessary the robot decreases the distance to the ground to reach the bottom of the concavity and safely moves to the other side of the ditch. When all optimization methods are used the depth of the concavity does not play any role: the Extend procedure (Fig. 2) checks various step lengths, and for the given width of the ditch the robot does not have to place its feet on the bottom of the concavity – it can make longer steps and place the feet on the other side. The Extend procedure proposed in this paper thus allows the robot to autonomously modify the length of its steps to deal with various obstacles. In this case path optimization during swing phase does not bring additional advantages; the initial path is secure and guarantees a proper kinematic margin.
The advantages of using the optimization procedure during the swing phase can be observed during step climbing. The obtained path is shown in Fig. 7c. The motion of the feet is modified to increase the kinematic margin during swing phase; as a result the robot moves its feet sideways when they are above the edge of the step. The proposed strategy allows the robot to deal with steps 25 cm high, whereas if only the posture optimization procedure is used the maximal height of the step is 18 cm (Tab. 1). To check the efficiency of the algorithm on terrain with irregular obstacles, a simulation on a rough terrain mockup was performed. The results are presented in Fig. 8. The distance between the initial and final position of the robot is 3 m. Because the algorithm is random-based, a series of 10 simulations was performed. In each trial the robot was able to find a secure path to the goal position. The average time of path planning is 389 s with a standard deviation of 44 s (simulations were performed on a computer with an Intel i7 2.7 GHz processor).
4. Conclusions
The article presents various optimization strategies to plan the motion of a six-legged walking robot on rough terrain. The PSO algorithm is used to find the posture of the robot during the phase when all feet of the robot are placed on the ground. The gradient-based optimization method is used to determine the feet positions during swing phase. A similar method is used for smoothing the path returned by the RRT-based motion planner. The presented methods are used as core elements of the new Extend procedure which determines the following step of the robot.
Fig. 8. Planned path during simulation on rough terrain mockup
The goal of using the optimization methods is to create an algorithm which is capable of planning the motion of the robot on rough terrain with high obstacles. The capabilities of the algorithm were presented in the simulator; we determined the maximal size of the obstacles which can be covered by the robot using the presented methods. The goal of the optimization can be easily modified: it could be a maximal security margin (a sum of three weighted coefficients: the distance between feet and obstacles, the kinematic margin and the stability margin) or a combination of a few coefficients. In the future we are going to implement the proposed optimization methods on the real Messor robot. To this end, we are going to integrate the mapping algorithm which uses the Hokuyo laser range finder and a self-localization algorithm.
5. Acknowledgement
This work was supported by NCN grant no. 2011/01/N/ST7/02080. The author would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper.
Notes
1 A video with simulation experiments is available on http://lrm.cie.put.poznan.pl/jamris2013.wmv
AUTHOR Dominik Belter∗ – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail: Dominik.Belter@put.poznan.pl. ∗
Corresponding author
REFERENCES
[1] G. B. Bell, M. Livesey, ”The Existence of Local Minima in Local-Minimum-Free Potential Surfaces”. In: Towards Autonomous Robotic Systems (TAROS 2005), London, UK, 2005, pp. 9–14. [2] D. Belter, P. Skrzypczyński, ”Integrated motion planning for a hexapod robot walking on rough terrain”. In: 18th IFAC World Congress, Milan, Italy, 2011, pp. 6918–6923.
[3] D. Belter, P. Skrzypczyński, ”Rough terrain mapping and classification for foothold selection in a walking robot”, Journal of Field Robotics, vol. 28(4), 2011, pp. 497–528. [4] D. Belter, P. Skrzypczyński, ”Posture optimization strategy for a statically stable robot traversing rough terrain”. In: Proc. IEEE Int. Conf. on Intelligent Robots and Systems, Villamoura, Portugal, 2012, pp. 2204–2209. [5] S. Fujii, K. Inoue, T. Takubo, T. Arai, ”Climbing up onto Steps for Limb Mechanism Robot “ASTERISK””. In: Proc. of the 23rd International Symposium on Automation and Robotics in Construction, 2006, pp. 225–230. [6] M. Kalakrishnan, J. Buchli, P. Pastor, M. Mistry, S. Schaal, ”Fast, robust quadruped locomotion over challenging terrain”. In: Proc. IEEE Int. Conf. on Robotics and Automation, Anchorage, USA, 2010, pp. 2665–2670.
[7] M. Kalakrishnan, J. Buchli, P. Pastor, M. Mistry, S. Schaal, ”Learning, planning, and control for quadruped locomotion over challenging terrain”, Int. Journal of Robotics Research, vol. 30(2), 2010, pp. 236–258.
[8] D. Dolgov, S. Thrun, ”Detection of principal directions in unknown environments for autonomous navigation”. In: Proc. of Robotics: Science and Systems, 2008. [9] A. Gmerek, ”Wykorzystanie quasi-chaotycznych oscylatorów do generowania rytmu chodu robotów kroczących”. In: Postępy robotyki 2012, K. Tchoń, C. Zieliński (Eds.) (in Polish).
[10] A. Gmerek, E. Jezierski, ”Admittance control of a 1-DoF robotic arm actuated by BLDC motor”. In: 17th International Conference on Methods and Models in Automation and Robotics (MMAR), 2012, pp. 633–638. [11] C.T. Kelley, Iterative methods for optimization, SIAM, Philadelphia, 1999.
[12] J. Kennedy, R.C. Eberhart, ”Particle swarm optimization”. In: Proc. IEEE Int. Conf. on Neural Networks, Perth, Australia, 1995, pp. 1942–1948.
[13] M. Kordasz, R. Madoński, M. Przybyła, P. Sauer, Active Disturbance Rejection Control for a Flexible-Joint Manipulator, Lecture Notes in Control and Information Sciences, Robot Motion and Control 2011, K. Kozłowski (Ed.), Springer-Verlag, 2011, pp. 247–256.
[14] M. Kowalski, M. Michalski, D. Pazderski, ”Quadruped walking robot WR-06 – design, control and sensor subsystems”. In: Lecture Notes in Control and Information Sciences, Robot Motion and Control 2009, Springer-Verlag, 2009, pp. 175–184. [15] S. Krenich, M. Urbanczyk, ”Six-Legged Walking Robot for Inspection Tasks”, Solid State Phenomena, vol. 180, 2011, pp. 137–144.
[16] J.J. Kuffner, S.M. LaValle, ”RRT-Connect: An efficient approach to single-query path planning”. In: Proc. IEEE Int. Conf. on Robotics and Automation, San Francisco, USA, 2000, pp. 995–1001.
[17] P. D. Kumar, D. Kalyanmoy, G. Amitabha, ”Optimal path and gait generations simultaneously of a six-legged robot using a GA-Fuzzy approach”, Robotics and Autonomous Systems, vol. 41, no. 1, 2002, pp. 1–20. [18] M. Likhachev, G. Gordon, S. Thrun, ”ARA*: anytime A* with provable bounds on suboptimality”. In: Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference, 2003.
[19] N. Ratliff, M. Zucker, J. A. Bagnell, S. Srinivasa, ”CHOMP: Gradient optimization techniques for efficient motion planning”. In: Proc. IEEE Int. Conf. on Robotics and Automation, Kobe, Japan, 2009, pp. 489–494.
[20] R.B. Rusu, A. Sundaresan, B. Morisset, K. Hauser, M. Agrawal, J.C. Latombe, M. Beetz, ”Leaving flatland: efficient real-time 3D perception and motion planning”, Journal of Field Robotics, vol. 26(10), 2009, pp. 841–862.
[21] U. Saranli, M. Buehler, D.E. Koditschek, ”RHex: a simple and highly mobile hexapod robot”, International Journal of Robotics Research, vol. 20, 2001, pp. 616–631.
[22] R. Smith, Open Dynamics Engine, http://www.ode.org, 2012.
[23] K. Walas, D. Belter, ”Supporting locomotive functions of a six-legged walking robot”, International Journal of Applied Mathematics and Computer Science, vol. 21, no. 2, 2011, pp. 363–377. [24] K. Walas, D. Belter, ”Messor – Versatile walking robot for search and rescue missions”, Journal of Automation, Mobile Robotics & Intelligent Systems, vol. 5, no. 2, 2011, pp. 28–34. [25] K. Walas, A. Kasiński, ”Controller for Urban Obstacles Negotiation with Walking Robot”. In: Proc. IEEE Int. Conf. on Intelligent Robots and Systems, Villamoura, Portugal, 2012, pp. 181–186.
[26] P. Wawrzyński, ”Real-Time Reinforcement Learning by Sequential Actor-Critics and Experience Replay”, Neural Networks, vol. 22, no. 10, Elsevier, 2009, pp. 1484-1497.
[27] P. Wawrzyński, ”Autonomous Reinforcement Learning with Experience Replay for Humanoid Gait Optimization”, Proceedings of the International Neural Network Society Winter Conference, Procedia, 2012, pp. 205–211.
[28] T. Winiarski, A. Woźniak, ”Indirect force control development procedure”, Robotica, vol. 31, no. 4, 2013, pp. 465–478. [29] D. Wooden, M. Malchano, K. Blankespoor, A. Howard, A.A. Rizzi, M. Raibert, ”Autonomous navigation for BigDog”. In: Proc. IEEE Int. Conf. on Robotics and Automation, Anchorage, USA, 2010, pp. 4736–4741.
[30] T. Zielińska, ”Biological inspiration used for robots motion synthesis”, Journal of Physiology – Paris, vol. 103(3–5), September 2009, pp. 133–140. [31] T. Zielińska, A. Chmielniak, ”Biologically inspired motion synthesis method of two-legged robot with compliant feet”, Robotica, vol. 29, no. 7, 2011, pp. 1049–1057. [32] M. Zucker, J.A. Bagnell, C.G. Atkeson, J. Kuffner, ”An optimization approach to rough terrain locomotion”. In: Proc. IEEE Int. Conf. on Robotics and Automation, Anchorage, USA, 2010, pp. 3589–3595.
Experimental Verification of a Walking Robot Self-Localization System with the Kinect Sensor
Submitted: 15th January 2013; accepted: 5th June 2013
Michał Nowicki, Piotr Skrzypczyński
DOI: 10.14313/JAMRIS_4-2013/43
Abstract: In this paper we investigate methods for self-localization of a walking robot with the Kinect 3D active range sensor. The Iterative Closest Point (ICP) algorithm is considered as the basis for the computation of the robot rotation and translation between two viewpoints. As an alternative, a feature-based method for matching of 3D range data is considered, using the Normal Aligned Radial Feature (NARF) descriptors. Then, it is shown that NARFs can be used to compute a good initial estimate for the ICP algorithm, resulting in convergent estimation of the sensor egomotion. Experimental results are provided.
Keywords: 3D perception, salient features, iterative closest point, visual odometry, walking robot
1. Introduction
In recent years the research on Simultaneous Localization and Mapping (SLAM) in robotics heads towards more challenging scenarios, mostly in three-dimensional (3D) environments. Mobile robots moving in 3D require three-dimensional positioning with regard to six degrees of freedom (DOF). The 6-DOF pose contains the position in three dimensions and the orientation (yaw, pitch and roll angles): xR = [xr yr zr θr ϕr ψr]^T. Self-localization with regard to 6 DOF is particularly important in the case of walking robots, because the discrete nature of their motion causes sudden and frequent changes in the robot’s trunk roll, pitch and yaw angles with regard to the global reference frame [2]. Various sensing modalities can yield the data necessary for 6-DOF SLAM or self-localization. Recent SLAM algorithms focus on the use of passive vision sensors (cameras) [20]. However, passive vision requires specific photometric features to be present in the environment. Encountering areas with ill-textured surfaces and few visually salient features, systems relying on passive vision usually cannot determine the pose of the robot and easily get lost. Therefore, active range sensors seem to be a more practical solution for many mobile robots, particularly those which have to work in highly unpredictable environments, such as disaster sites explored by robots during search and rescue missions. Often point clouds yielded by 3D laser scanners are matched against each other in order to obtain the displacement between two poses of the sensor. This approach, known as scan matching, is characterized by high reliability and relative independence from the
characteristics of the environment, because in scan matching dense point clouds are matched against each other, not against a particular type of features. The scan matching procedure is typically implemented with the Iterative Closest Points (ICP) algorithm. However, implementation of 6-DOF scan matching with data from laser scanners is hardly possible in most walking robots because of the tight size, mass, energy and computing resource limits in such robots – there are no 3D laser scanners that could be placed on board a typical walking robot. A solution to this problem is to use a compact 3D range sensor which has no moving parts. In this context an interesting device is the integrated 3D sensor developed and patented by PrimeSense [21], then introduced to the market by Asus as Xtion and by Microsoft as Kinect. These sensors are compact and affordable and, as shown by recent research [31], their range measurement accuracy up to a distance of about 3.5 m is comparable to that achieved by a 3D laser scanner. This paper addresses the issues of reliable incremental self-localization for the six-legged walking robot Messor equipped with the Kinect sensor. Only the range data from Kinect are used, because we would like to have a procedure that is portable to other 3D range sensors, such as the VidereDesign STOC (STereo On a Chip) camera, which yields a dense depth map in real time using hardware-embedded image processing. This camera was already successfully used on the Messor robot for terrain perception [15]. Another motivation for using only the range data from Kinect is the fact that the Messor robot is designed for patrol and search missions in buildings [32], and it should not rely much on the natural lighting of the scene, which in such an application scenario may be insufficient. We evaluated the performance of the standard ICP algorithm on Kinect range data in tests that simulated a scenario typical of a walking robot traversing rough terrain. Because these tests revealed that matching consecutive point clouds obtained from Kinect by using the ICP algorithm requires a good initial estimate of the robot displacement, we propose a two-stage point cloud matching procedure exploiting salient features in the range data. The feature-based matching procedure employs the NARF detectors/descriptors [29] and yields a reliable initial alignment, which is then refined using the ICP. This work is an extension of our previous conference papers [16, 17]. The major difference is the evaluation of the proposed approach on more realistic
indoor datasets acquired with the Messor robot, and on a publicly available dataset. Moreover, the presentation of the state of the art has been significantly extended. In the remainder of this paper we briefly review the state of the art (Section 2), then we present the concept of the incremental self-localization system for the Messor robot with Kinect (Section 3). Next, we examine theoretical foundations of the point cloud matching algorithms (Section 4), and in Section 5 we show how the sought rotation and translation between two viewpoints can be computed robustly. We demonstrate experimentally the performance of the proposed self-localization system, at first using an industrial manipulator to move the sensor, and then on the actual Messor robot (Section 6). We conclude the paper in Section 7 with some remarks on the directions of further research.
2. State of the art
Nowadays passive vision is considered the most affordable exteroceptive sensing modality for mobile robots, and vision systems receive much attention in SLAM research. In particular, the monocular EKF-based SLAM algorithm [6], with its numerous extensions, is considered one of the most successful solutions to the 6-DOF self-localization problem. However, monocular vision provides only bearing information, without range to the detected features, which leads to data association problems. Although it was demonstrated that real-time monocular SLAM can be implemented on a walking robot [28], in a search and rescue scenario, which is considered for our Messor robot, the machine often explores new areas and avoids returning to previously visited places. Therefore a self-localization system that is rather a form of visual odometry with some mapping capabilities, like in [30], seems to be more appropriate for this task than a SLAM algorithm maintaining a global map of the environment. Active sensors, such as laser scanners and 3D range cameras, do not depend so much on the natural illumination of the scene or the ubiquity of salient photometric features. As shown in our previous work [27], the pose of a walking robot can be precisely estimated by using 2D laser scan matching and data from an inertial measurement unit. However, this approach works only in man-made environments, where flat walls are available. In an unstructured environment, lacking features such as long flat walls, point clouds yielded by a 3D laser scanner can be used for self-localization with regard to 6 DOF [18], but a 3D laser scanner cannot be used on Messor due to the mass and size limits. Recently, a number of research teams demonstrated the use of Kinect/Xtion sensors for self-localization or SLAM. The 3D range data stream obtained from a Kinect-like sensor is very intensive. Therefore the ICP algorithm, which tries to establish correspondences between all the available points, requires much time for each iteration. Such an amount of data can be processed using a parallel
implementation of ICP on GPU [19]. Also the approach presented in [11] heavily uses the GPU to run the ICP algorithm on the Kinect range data. It is demonstrated that this method is sufficient for real-time tracking of the sensor with unconstrained motion, but such a solution is not available for on-board processing on Messor, which uses a single-core x86 processor. The amount of data being processed can be reduced substantially if salient features are detected in the range or photometric data obtained from the Kinect sensor. Endres et al. [8] recently demonstrated a SLAM system using Kinect RGB and range information. This solution relies mostly on the photometric data for salient feature detection, and then uses the depth image to locate the keypoints in 3D space. While this system shows good performance on large data sets, it is unclear how a solution based on photometric features (SIFT, SURF, ORB) will work on the walking robot, which has the sensor pointed partially towards the ground, where the number of photometric features can be much smaller. This system also needs small displacements between the consecutive frames. Therefore, the RGB-D images from Kinect have to be processed at the frame rate, which is impossible with the on-board computer of our robot. Also in [10] features extracted from the PrimeSense camera’s RGB data are used. Feature matching initializes a combined feature and ICP optimization. The RGB-D ICP algorithm proposed in [10] uses photometric features and their associated range values to obtain an initial alignment, and then jointly optimizes over the sparse feature matches and dense 3D point correspondences. Range-only features are used in [23] by an initial alignment algorithm that transforms the 3D point clouds into the convergence basin of an iterative registration algorithm, such as ICP. In this case persistent feature histograms are used. They are scale- and pose-invariant multi-value features which generalize the mean surface curvature at a point. The approach is shown to work indoors and outdoors, but only with rather good quality data from mechanically scanning 3D laser sensors.
3. Concept of the Kinect-based self-localization system
3.1. Kinect as a sensor for the walking robot
The 3D ranging technology used in Kinect is patented by PrimeSense, and thus few details concerning the characteristics of the sensor are publicly available. Therefore, we present a short overview of the principle of operation and the basic uncertainty characteristics of Kinect which are relevant to our application. Kinect is a device consisting of several components, and is intended to be used for natural interaction with computer games. The basic functionality of the Kinect sensor makes it possible to obtain color images (RGB) and range images of 640 × 480 resolution, which gives 307200 points in 3D space. By combining information from the range image and the RGB image the RGB-D image can be obtained, in which RGB values
are allocated to the points in the 3D space. The range measurements are obtained by the principle of triangulation, using structured light. The laser projector operates in the infrared range, with a wavelength of about 830 nm. The laser beam passes through a diffraction grating to form a constant pattern of spots projected onto the observed surfaces. The image of this pattern is captured by the CMOS infrared camera and compared with a stored reference pattern, obtained by projecting the spots onto a plane located at a known distance from the sensor (Fig. 1).
Fig. 1. Principle of operation of the structured light sensor. Inset image shows the actual light spots projected by Kinect on a flat surface
Spatial density of the measured points is defined by the number of projected spots per unit area. It decreases with the square of the measured distance. For the measured range of 2 m the distance between the dots is about 3 mm. Because the number of spots in the projected pattern is smaller than the number of pixels in the captured infrared image, some pixels have interpolated values. The depth resolution decreases with the increasing distance to the sensor. For the measured range of 2 m the depth resolution is about 1 cm, and it falls exponentially for longer distances. Kinect is also very prone to errors due to environmental conditions, especially the ambient light. Too strong ambient light, and the sunlight in particular, reduces the contrast of the infrared spots in the captured image, often rendering the range measurements impossible.
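For illustration, the triangulation relation z = b·f/d and the depth resolution quoted above can be reproduced in a few lines. This is a sketch only: the baseline, focal length and 1/8-pixel disparity quantization used below are commonly quoted Kinect figures assumed here, not values given in the paper.

```python
# Depth from disparity for a structured-light triangulation sensor.
BASELINE_M = 0.075   # assumed projector-camera baseline (~7.5 cm)
FOCAL_PX = 580.0     # assumed IR camera focal length in pixels
DISP_STEP = 0.125    # assumed disparity quantization (1/8 pixel)

def depth(disparity_px):
    """Triangulated depth z = b*f/d for a disparity given in pixels."""
    return BASELINE_M * FOCAL_PX / disparity_px

d_at_2m = BASELINE_M * FOCAL_PX / 2.0            # disparity observed at 2 m
resolution = depth(d_at_2m - DISP_STEP) - 2.0    # depth step per quantum
print(round(resolution, 4))  # ~0.0116 m, i.e. about 1 cm, as quoted above
```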
Fig. 2. Geometric configuration of the Kinect-based perception system for the Messor robot
On the walking robot Messor the Kinect sensor is mounted on a mast (Fig. 2). The sensor is located at an elevation of 46 cm above the ground, assuming the neutral (default) posture of the robot. Since the main task of Kinect in this configuration is to observe the area in front of the robot, it was tilted down by 27°. This configuration of the perception system makes it possible both to measure the shape of the terrain in front of the robot and to perceive larger objects located farther from the sensor (Fig. 3). These objects are important for the self-localization task, because they often provide more salient features than the terrain itself.
Fig. 3. Messor walking robot with the Kinect sensor
3.2. Implementation of the self-localization method
The most important limitation imposed on the self-localization method by the resources available on our walking robot is the processing speed. The single-board PC used by the robot cannot process the range data at full frame rate, no matter which algorithm is applied for matching of point clouds. Thus, 3D points can be acquired only at selected locations. On the other hand, the ICP with Kinect 3D data is effective only for small values of translation (a few centimeters) and rotation (up to about 5°) between the consecutive point clouds, as shown in our preliminary experiments (see Section 6.1). For bigger translations and/or rotations the ICP algorithm usually finds only a local minimum. In ICP scan matching on wheeled robots this problem is usually alleviated by a fairly good initial estimate of the displacement obtained from odometry, but in walking robots proprioceptive sensing is unreliable [9]. Therefore, we are looking for a self-localization system which uses range data and can process isolated point clouds, tolerating larger displacements (both translations and rotations) between the points of data acquisition. Moreover, we would like to avoid using any additional hardware (such as a GPU board), which is not available on the Messor robot. Finally, we decided to use feature-based point cloud matching to find a better initial guess for the ICP algorithm. The implementation uses the open source Point Cloud Library (PCL) [24] to manipulate the 3D points and to detect the salient features. The implementation of ICP is also taken from the PCL. The whole procedure of self-localization is implemented in several steps. At first, the range data are trimmed to discard the measurements shorter than 50 cm or longer than 350 cm, i.e., unreliable or of low
depth resolution. Then, the VoxelGrid filter from the PCL library is used on both data sets with a voxel size of 2 cm. This filter reduces the number of points and decreases the influence of small range measurement errors. We do not use other filters available in the PCL, as they did not improve the results, while consuming additional processing time. NARF keypoints are detected in both point clouds under consideration, and then their descriptors are computed. The next step is matching of the descriptors, as described in Section 4.2. The kinematic constraints are imposed in this step. If a sufficient matching is found, the transformation between the two point clouds is estimated, and data set A is transformed to the new coordinates. Then, the standard ICP procedure from the PCL library is applied to the two data sets in order to obtain a more precise transformation.
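A minimal sketch of this preprocessing, assuming a NumPy array of points instead of the PCL data structures: the 50–350 cm band and the 2 cm voxel follow the values given above, and the centroid-per-voxel reduction mimics the behaviour of the VoxelGrid filter.

```python
import numpy as np

def preprocess(cloud, r_min=0.5, r_max=3.5, voxel=0.02):
    """cloud: (N, 3) array of points in the sensor frame, in meters."""
    dist = np.linalg.norm(cloud, axis=1)
    cloud = cloud[(dist >= r_min) & (dist <= r_max)]   # trim by range
    keys = np.floor(cloud / voxel).astype(np.int64)    # voxel indices
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    sums = np.zeros((inv.max() + 1, 3))                # centroid per voxel
    np.add.at(sums, inv, cloud)
    counts = np.bincount(inv).reshape(-1, 1)
    return sums / counts
```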
4. Methods for range data matching
4.1. The ICP algorithm
There are many variants of the ICP algorithm described in the literature. The basic version was proposed in [4], while the improvements introduced later involve speeding up the algorithm [22], improving its robustness to noise in range data, and improving robustness to a wrong initial guess of the transformation [25]. In this paper the standard ICP algorithm is used, in the implementation available in the PCL library. This choice makes it possible to assess how much the proposed feature-based matching method improves the self-localization with regard to the baseline method. Generally, the ICP algorithm is aimed at matching two sets of spatial data (usually represented by points) describing the same object or part of a scene, and then representing these sets in a common coordinate system. In each iteration, the ICP algorithm selects the closest points (in the sense of Euclidean distance) in both clouds as the matching points. For the matched points, the rotation R and translation t are computed which minimize the criterion:
$$(\mathbf{R},\mathbf{t}) = \underset{\mathbf{R},\mathbf{t}}{\arg\min} \sum_{i=1}^{n_A}\sum_{j=1}^{n_B} w_{i,j}\,\lVert \mathbf{p}_{A_i} - (\mathbf{R}\,\mathbf{p}_{B_j} + \mathbf{t})\rVert^2, \qquad (1)$$
where nA and nB are the numbers of points in the clouds A and B, respectively, and wi,j are weights defined as wi,j = 1 when points pAi and pBj are closest neighbors, or wi,j = 0 otherwise. Assuming that the points are matched correctly, the transformation (R, t) is computed by using the SVD method. When the transformation that minimizes (1) is computed, the set of points A is transformed to the new position and new correspondences of points are established. Due to the search for the closest neighbors, the computational complexity of ICP is quadratic with regard to the number of points in the two clouds. A more effective way of searching for neighbors can be achieved using k-d trees to represent the data. It is necessary to introduce a limit for the maximum Euclidean distance between neighbors,
because the clouds may not represent the same part of the scene, and an unlimited search for correspondences could yield a completely wrong spatial transformation. This limit, expressed by the dmax distance, prevents the algorithm from matching points that are too far away from each other. The ICP algorithm is only guaranteed to converge to a local minimum, and may not reach the global one [4]. Because of this, it is extremely important to have a good initial guess for the sought (R, t) transformation.
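The loop below sketches this baseline ICP variant: nearest-neighbour association accelerated by a k-d tree, correspondences gated by dmax, and a closed-form SVD update of (R, t) in every iteration. It is an illustrative NumPy/SciPy rendition, not the PCL implementation used here.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(A, B, d_max=0.25, iters=50):
    """Align cloud A (n_A, 3) to cloud B (n_B, 3); returns (R, t)."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(B)                      # k-d tree for neighbour search
    for _ in range(iters):
        A_cur = A @ R.T + t
        dist, idx = tree.query(A_cur)
        mask = dist < d_max                # w_ij = 1 only for close pairs
        if mask.sum() < 3:
            break
        P, Q = A_cur[mask], B[idx[mask]]
        # Closed-form minimizer of (1) for the matched pairs (SVD/Kabsch).
        Pc, Qc = P - P.mean(0), Q - Q.mean(0)
        U, _, Vt = np.linalg.svd(Pc.T @ Qc)
        S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ S @ U.T
        t_step = Q.mean(0) - R_step @ P.mean(0)
        R, t = R_step @ R, R_step @ t + t_step
    return R, t
```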
4.2. Feature-based matching of range data
An alternative approach to matching 3D data exploits salient features of various types. This approach is similar to matching 2D photometric images [14]. It is based on finding some characteristic keypoints in both data sets, then describing them with unique descriptors, and finding the matching between the computed descriptors. These features are salient points which do not represent pre-specified structures, and commonly appear in both man-made and natural environments. Because we are interested in the range data from Kinect, which we consider to be a richer representation of the environment than the photometric (RGB) data, we choose the Normal Aligned Radial Feature (NARF) detector/descriptor for our implementation of the feature-based point cloud matching procedure. The NARF concept was introduced for object recognition from 3D range data [29]. The NARF detector searches for stable areas with significant changes in the neighborhood, which can be keypoints looking similar from various viewpoints. Detection is done by determining the plane minimizing the mean square error between the processed point and other points contained in a predefined sphere around this point. When the mean square error does not exceed the given threshold, the tested point is considered to be a NARF keypoint. The most important parameter of the NARF detector is the diameter of the sphere around the processed point. This sphere contains points whose dominant directions are used to determine the ”coefficient of interest”, and then to detect the keypoints. Then, descriptors are built upon the areas around the keypoints. The NARF descriptor characterizes the keypoint by determining depth changes in each direction, calculating an aligned range value patch, and identifying the dominant orientation of this patch. Thus, descriptors are the normalized values of the gradient in n directions with regard to the keypoint. The number of directions is also the length of the NARF descriptor; here n = 36 is used. The number of NARF descriptors in a dataset depends on the environment characteristics, but in some cases hundreds of descriptors can be established in one point cloud.
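The detection rule described above can be mimicked as follows. This is only a rough sketch of the simplified account given in the text (a plane fit over a spherical neighbourhood plus a mean-square-error test); the real NARF detector in PCL is considerably more involved, and the radius and threshold values are placeholders.

```python
import numpy as np
from scipy.spatial import cKDTree

def keypoint_candidates(cloud, radius=0.1, mse_max=1e-4):
    """Return indices of points whose neighbourhood passes the MSE test."""
    tree = cKDTree(cloud)
    candidates = []
    for i, p in enumerate(cloud):
        nb = cloud[tree.query_ball_point(p, radius)]
        if len(nb) < 5:
            continue
        centered = nb - nb.mean(0)
        # The smallest singular value squared equals the residual of the
        # best-fit plane through the centroid; divide by n for the MSE.
        mse = np.linalg.svd(centered, compute_uv=False)[-1] ** 2 / len(nb)
        if mse <= mse_max:
            candidates.append(i)
    return candidates
```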
5. Transformation between two point clouds with salient features
Once the keypoints and NARF descriptors are determined in both point clouds under consideration, it
is necessary to compute the rotation R and translation t between these two data sets. In this step of the self-localization method the input data consist of a point set A (the first cloud) and a point set B (the second cloud). The set of all possible pairs of descriptors is defined as Z. For each descriptor of the first point cloud all possible pairings with the descriptors of the second cloud are established. For the set Z the matchings are defined by the descriptor dissimilarity function, which in our application is given as:
$$f_k = \min_{j=0,1,\ldots,m_B-1} \sum_{i=1}^{36} \left| p_k[i] - q_j[i] \right|, \qquad (2)$$
where fk is the value of dissimilarity, which is the sum of absolute differences between the 36 values describing the k-th descriptor of cloud A and the corresponding values of a descriptor of cloud B; the descriptor of cloud B is chosen to minimize the value of fk. In equation (2), pk[i] is the i-th value of the k-th descriptor of the first point cloud, qj[i] is the i-th value of the j-th descriptor of the second point cloud, and mB is the number of descriptors of the second cloud. Then again, all pairings of the given descriptor of A are investigated, and the pairs for which the dissimilarity value is greater than 3 times fk are discarded as incorrect. Unfortunately, among the accepted pairings there can be incorrect correspondences which minimize (2) but lead to a clearly incorrect estimate of the egomotion. To solve this problem we exploit the fact that the walking robot is a physical system, having limited motion capabilities with regard to acceleration, velocity, and direction of motion. Taking this into account, it is possible to introduce a set of constraints on the maximum translations and rotations with regard to the particular axes of the robot’s coordinate frame. These constraints are used to discard pairs of descriptors for which the distance between their keypoints is bigger than the maximum distance:
$$d_{des} = \sqrt{d_{desX}^2 + d_{desY}^2 + d_{desZ}^2}, \qquad (3)$$
where ddesX is the maximum distance along the xr axis, ddesY is the maximum distance along the yr axis, and ddesZ is the maximum distance along the zr axis. Formula (3) is simplified in order to allow fast computation, and does not explicitly take the rotation of the robot into account, but for the limited range of Kinect measurements it is a reasonable approximation, as we do not expect to have keypoints on objects located far away. This condition makes it possible to discard most of the incorrectly matched pairs from the Z set. However, the set of matched NARF pairs can still contain some outliers. Therefore, the point cloud transformation algorithm is applied within a robust estimation framework based on the RANSAC scheme. In each of its iterations RANSAC defines a subset N of the set Z, which contains 3 unique descriptor pairs. The next step of the algorithm computes the transformation between those pairs:
$$\underset{\mathbf{T}}{\arg\min}\, \lVert \mathbf{T}\mathbf{P} - \mathbf{Q} \rVert, \qquad (4)$$
where T denotes the transformation composed of the rotation R and translation t, while P and Q are matrices built from the coordinates of the keypoints as follows:
$$\mathbf{P} = \begin{bmatrix} x_{11} & y_{11} & z_{11} \\ x_{12} & y_{12} & z_{12} \\ x_{13} & y_{13} & z_{13} \end{bmatrix}, \qquad (5)$$
$$\mathbf{Q} = \begin{bmatrix} x_{21} & y_{21} & z_{21} \\ x_{22} & y_{22} & z_{22} \\ x_{23} & y_{23} & z_{23} \end{bmatrix}, \qquad (6)$$
where x1i, y1i, z1i are the coordinates of the i-th keypoint from cloud A belonging to the set N, while x2j, y2j, z2j are the coordinates of the j-th keypoint from cloud B, also belonging to the set N. The 3D rigid body transformation that aligns two sets of points for which the correspondence is known can be computed in several ways. In [7] four algorithms are compared, each of which computes the rotation R and translation t in closed form, as the solution to a formulation of the problem stated by (4). The results of the comparison presented in [7] show that for the algorithms under study there is no difference in the robustness of the final solutions, while the computing efficiency depends mainly on the implementation. Therefore, to solve (4) we chose the algorithm based on Singular Value Decomposition (SVD), using the standard (R, t) representation of the transformation. Although this method was first proposed by Kabsch [12], a much later work on computer vision [1] is often cited as its origin. After computing the transformation T, new coordinates for the points of cloud A are calculated. The next step involves computing the matching error:
$$\epsilon = \sum_{i=1}^{n_A} \lVert \mathbf{h}_i - \mathbf{d}_j \rVert^2 \quad \text{for} \quad j = \underset{j=0,1,\ldots,n_B-1}{\arg\min}\, \lVert \mathbf{h}_i - \mathbf{d}_j \rVert^2, \qquad (7)$$
where hi is the vector of coordinates of the i-th keypoint of point cloud A transformed by T, and dj is the vector of coordinates of the j-th keypoint of cloud B which is the closest to hi. When the value of ∥hi − dj∥² is greater than a fixed parameter, the keypoint is treated as an outlier and the computed distance is not counted in (7). RANSAC is stopped when the error ϵ falls below a preset value, when the best model has not improved for a fixed number of iterations, or when the maximum number of iterations is exceeded [17]. Alternatively, in the new version of our software the necessary number of RANSAC iterations can be estimated using a simple probabilistic model [5], which slightly improves the computing time. When RANSAC is finished, the transformation with the minimal error is applied to all points of cloud A. This method transforms the point clouds into the convergence basin of the ICP algorithm, thus improving the chance of finding the global minimum. The
points belonging to cloud A are transformed to the new coordinates by the formula:
$$\mathbf{p}'_A[i] = \mathbf{R}\,\mathbf{p}_A[i] + \mathbf{t} \quad \text{for} \quad i = 1, 2, \ldots, n_A, \qquad (8)$$
where pA[i] means the i-th point of the first point cloud. The newly computed set A′ is then the point cloud for the ICP algorithm searching for a more precise transformation of A′ into B.
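The matching and robust estimation steps of Sections 4.2 and 5 combine into the condensed sketch below. It is illustrative only: the pairing keeps just the single best match per descriptor (the text additionally keeps all pairings within 3·fk), the thresholds are placeholders, and the resulting transform is meant to be applied to cloud A as in (8) and then refined by ICP.

```python
import numpy as np

def kabsch(P, Q):
    """Closed-form (R, t) aligning the rows of P onto the rows of Q."""
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    return R, Q.mean(0) - R @ P.mean(0)

def match_clouds(kp_A, desc_A, kp_B, desc_B,
                 d_des=0.6, n_iter=500, inlier=0.05):
    """kp_*: (m, 3) keypoints; desc_*: (m, 36) NARF descriptors."""
    # Dissimilarity of eq. (2): sum of absolute descriptor differences.
    diff = np.abs(desc_A[:, None, :] - desc_B[None, :, :]).sum(-1)
    j_best = diff.argmin(1)
    # Kinematic gate of eq. (3): keypoints of a pair cannot be too far apart.
    pairs = [(i, j_best[i]) for i in range(len(kp_A))
             if np.linalg.norm(kp_A[i] - kp_B[j_best[i]]) < d_des]
    best_R, best_t, best_err = np.eye(3), np.zeros(3), np.inf
    rng = np.random.default_rng(0)
    for _ in range(n_iter):                   # RANSAC over 3-pair samples
        sel = rng.choice(len(pairs), 3, replace=False)
        P = np.array([kp_A[pairs[k][0]] for k in sel])
        Q = np.array([kp_B[pairs[k][1]] for k in sel])
        R, t = kabsch(P, Q)                   # solves (4) for the sample
        H = kp_A @ R.T + t
        d2 = ((H[:, None, :] - kp_B[None, :, :]) ** 2).sum(-1).min(1)
        good = d2 < inlier ** 2               # outliers skipped, as in (7)
        if good.sum() >= 3 and d2[good].sum() < best_err:
            best_R, best_t, best_err = R, t, d2[good].sum()
    return best_R, best_t
```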
6. Experimental evaluation of the method
6.1. Preliminary tests
Testing Kinect as a sensor for walking robot self-localization, we conducted several experiments to evaluate the robustness of the basic ICP algorithm and our NARF-based method for initial alignment of the point clouds. These initial tests were performed on pairs of point clouds obtained for various data sampling rates along the path of the sensor motion. The first tests were aimed at determining the approximate maximum translational and rotational distance for which two Kinect point clouds can be successfully aligned by the ICP algorithm. Then, we tested whether these limits can be relaxed by applying the NARF-based method.
Experiments were carried out using the Kuka KR200 industrial robot, which allowed us to obtain ground truth for the sensor’s trajectory. The Kinect sensor was moved over a mockup of rocky terrain. The sensor was attached to the robot’s wrist and tilted down at an angle of 27°, as in the configuration used on the walking robot. An RGB-D rendered image of the experimental setup is shown in Fig. 4a. The processing times mentioned further in the paper were measured under Linux on a low-end PC notebook with an Intel Core2 Duo T6500 2.1 GHz processor and 4 GB of RAM. Although the notebook had a dual-core processor, the single-threaded program did not use this feature, effectively running in a single-core configuration similar to the one used on the actual Messor robot. In the experiment shown in Fig. 4 the robot’s end effector moved 20 cm along a straight line. Two datasets were obtained, at the beginning and at the end of the trajectory (Fig. 4b). The first dataset (A) is depicted by darker points, while the second one (B) is shown in a lighter shade of grey. The small squares mark NARF keypoints for each of the point clouds, while the descriptors that were used to determine the transformation between the clouds are marked with larger squares.
Fig. 4. Experiment with the industrial robot: terrain mockup as seen by Kinect (a), point clouds with NARFs prior to matching (b), matching result with ICP (c), matching result with NARF+ICP (d)
Fig. 5. Rotational motion of the sensor: point clouds with NARFs prior to matching (a), matching result with NARF+ICP (b)
Results of this simple test have shown that for a displacement of 20 cm the standard ICP algorithm was unable to find any satisfying estimate of the egomotion of the sensor (Fig. 4c), despite the fact that we tested various values of the dmax parameter and allowed for thousands of iterations of the algorithm. In our experiments, for large initial displacements between the point clouds the ICP algorithm usually converged to a local minimum, yielding a wrong egomotion estimate. However, the NARF-based matching algorithm was able to find a good initial transformation of the A dataset, located in the convergence basin of the ICP algorithm. Thus, the ICP used as the second step of the self-localization method was able to find an acceptable estimate of the displacement, 17.3 cm, and a good alignment of the two clouds
(Fig. 4d). The time required by the particular parts of the registration procedure was: detection and description of the NARFs 0.84 s, matching with the RANSAC procedure 0.25 s, and the ICP algorithm 0.27 s. A similar experiment was performed with rotational motion of the sensor. The robot wrist was rotated by 10° between the two poses of point cloud acquisition (Fig. 5a). Again, the standard ICP could not find a satisfying transformation between the clouds. The initial displacement found by the NARF-based method was quite imprecise – a rotation by 4.7° and small translations along all axes. However, this initial transformation turned out to provide a good enough starting point for the ICP algorithm, transforming the clouds to its convergence basin. As a result of the two-step registration procedure a good alignment was achieved (Fig. 5b), with the rotation value close to the actual 10°.
Displacement [m]       ∆xr     ∆yr     ∆zr
Ground truth           0.040  -0.028  -0.011
NARF-based alignment   0.069  -0.051  -0.076
NARF+ICP estimate      0.046  -0.048  -0.040

Tab. 1. Estimated egomotion for the publicly available data
6.2. Indoor tests on the Messor robot
Next, the proposed method was verified on the Messor robot. Because we do not have an external motion capture system to measure the pose of the robot during the experiments, we compare the results of the Kinect-based self-localization to the results obtained from another self-localization system – the Parallel Tracking and Mapping (PTAM) software [13], which uses monocular vision data from another camera mounted on Messor. The PTAM system offers precise tracking of the camera pose, and has already been evaluated on the Messor robot with positive results [3]. However, PTAM works well only in environments rich in photometric features, and is limited to rather small workspaces. Therefore, PTAM was unable to provide self-localization results for longer trajectories, and we decided to show here only the results of pair-wise matching of two point clouds, similarly to the industrial robot experiments.
Fig. 6. Results for the publicly available dataset: a view of the environment from the Kinect camera (a), point clouds with NARFs prior to matching (b), NARF-based initial alignment (c), matching result with NARF+ICP (d)
Unfortunately, we do not have a motion capture system that can provide reliable ground truth for environments more extended than the relatively small mockup used with the industrial robot. Therefore, we evaluated the performance of our self-localization system also on a publicly available dataset¹. This dataset, which contains RGB-D data from Kinect and time-synchronized ground truth poses of the sensor obtained from a motion capture system, was recently used to evaluate a SLAM algorithm [8]. Figure 6a shows an example RGB image taken from this dataset, while Fig. 6b presents two point clouds taken from this sequence (the first one is synchronized with the shown RGB image) with detected NARF keypoints. As can be seen in Fig. 6c, the NARF-based matching procedure provides quite a good initial alignment of the two clouds, which is further refined by the ICP method (Fig. 6d). Quantitative results (in meters) for this experiment are presented in Tab. 1. For this dataset the detection and description of NARFs took 0.735 s, and the estimation of the initial alignment using RANSAC required 0.291 s.
Fig. 7. Results for the Messor robot traversing a simple mockup: a view from the Kinect camera (a), point clouds with NARFs prior to matching (b), NARF-based initial alignment (c), matching result with NARF+ICP (d)
Although Messor does not have reliable odometry, the knowledge of the motion direction and the estimated length of the step was used to impose kinematic
constraints on the egomotion estimates computed by the self-localization algorithm. The constraints are not very tight: a translation of 50 cm along each axis and a rotation of 30° around each axis are allowed. These values are not violated even by sudden and unpredictable motions related to slippages of the walking robot [2]. They simply reflect the fact that the robot cannot be “teleported” during the relatively short time between the two moments of range data acquisition.

Displacement [m]       ∆xr     ∆yr     ∆zr
PTAM results           0.043   0.188   0.003
NARF-based alignment   0.047   0.186   0.069
NARF+ICP estimate      0.039   0.187   0.007
Tab. 2. Estimated egomotion for the simple mockup experiment
The first dataset was obtained for translational motion of the Messor robot on a simple terrain mockup, which is visible in Fig. 7a. This environment contained few larger geometric features, so the NARF descriptors were mostly located on the borders of the mockup and on more distant objects outside the mockup (Fig. 7b). Table 2 provides the egomotion estimation results for two versions of the range-data-based point cloud registration procedure: the initial transformation with NARF keypoints (Fig. 7c), and the two-step procedure applying the ICP algorithm to the initially aligned point clouds (Fig. 7d). These results are compared to the results provided by the PTAM system, which can be considered very accurate in this experiment, because the surface of the mockup is rich in photometric features. The NARF detection, description, and computation of the initial alignment required 1.055 s, while the whole self-localization procedure between two point clouds, with the NARF-based alignment and ICP, took 1.543 s on a single-core PC. A similar experiment was conducted in one of the corridors in our building. This corridor is rather wide and contains some furniture, as seen in Fig. 8a. The camera yielding images for the PTAM algorithm is mounted on Messor similarly to the Kinect sensor – it is tilted down in order to see the surface in front of the robot. Therefore, PTAM in our application relies mostly on photometric features found on the ground. In the corridor the floor is very dark and few natural photometric features can be found there, which makes PTAM quite unreliable in this environment. Therefore we put some small stones on the floor to increase the number of features for PTAM. Table 3 provides results for the initial NARF-based transformation (Fig. 8c), and the full NARF+ICP version of the proposed method (Fig. 8d). In this case the NARF detection, description, and computation of the initial transformation took 1.756 s, because more RANSAC iterations were needed to find an acceptable initial estimate of the egomotion. This, in turn, was caused by the small number of NARFs found on the floor. The whole self-localization procedure including ICP required 2.315 s. However, these results are still acceptable for our application, as the
Fig. 8. Results for the Messor robot moving in a corridor: a view from the Kinect camera (a), point clouds with NARFs prior to matching (b), NARF-based initial alignment (c), matching result with NARF+ICP (d)
walking robot needed four seconds to cover the distance of about 20 cm, using a tripod gait and making steps of the length of 5 cm. Thus, our system was able to complete all self-localization steps between the two consecutive data acquisition events.

Displacement [m]       ∆xr      ∆yr     ∆zr
PTAM results          -0.027    0.217   0.016
NARF-based alignment  -0.468    0.246   0.021
NARF+ICP estimate     -0.042    0.244   0.018

Tab. 3. Estimated egomotion for the corridor experiment
6.3. Outdoor tests on the Messor robot
In order to further verify the applicability of Kinect to walking robot self-localization, experiments were also carried out in an outdoor environment (Fig. 9a). When the sensor was exposed to direct sunlight, it was not possible to get any usable range measurements. However, it was possible to use Kinect for self-localization in the shade of a building wall. Because of the influence of sunlight (even in a shaded place) the obtained point clouds had only about 30% of the number of points typically obtained indoors (Fig. 9b). During the experiment the robot moved forward, roughly along its y axis, on slightly uneven ground. The reduced number of points turned out to be sufficient for the proposed algorithm. This is due to the fact that the visible (measured) areas are almost the same for both point clouds (Fig. 9c). Thus, the algorithm found the correct displacement in spite of the environment conditions being difficult for the Kinect sensor (Fig. 9d). The time
required to find the NARFs and build the descriptors was only 0.15 s, obviously due to the small number of points. The RANSAC procedure took 0.28 s – this value is quite stable across all our experiments. Finally, the ICP procedure required only a few iterations to converge from a very good initial guess, thus the time was only 0.06 s.
Fig. 9. Outdoor experimental setup (a), outdoor scene as seen by Kinect (b), two point clouds prior to matching with NARF correspondences (c), results of matching with NARF+ICP (d)
7. Conclusions
This paper presented a two-step range data registration method, which is suitable for the inexpensive Kinect 3D range sensor mounted on a walking robot. The experiments have shown that the use of NARF descriptor matching brings the point clouds close to the correct position even for relatively large translations and rotations between the viewpoints. This procedure provides a good initial estimate of the egomotion, which is within the convergence basin of the ICP algorithm. In turn, the ICP algorithm allows precise alignment of the two point clouds, using all the range information. The combination of these two steps yields a robust estimate of the robot’s egomotion in various environments, even under moderately varying lighting conditions. Further studies will be carried out to accelerate the ICP implementation and to fully exploit the combined RGB-D data. Using the range and photometric information the robot should be capable of self-localizing both in featureless areas (e.g. long corridors) and in poorly illuminated areas. Implementation of a full scan-matching-based visual odometry system for the walking robot is also a matter of further research.
Notes
1 Data downloaded from: http://vision.in.tum.de/data/datasets/rgbd-dataset
AUTHORS Michał Nowicki – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail: michalsminowicki@gmail.com. Piotr Skrzypczyński∗ – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail: piotr.skrzypczynski@put.poznan.pl. ∗
Corresponding author
REFERENCES
[1] K. S. Arun, T. S. Huang, S. Blostein, ”Least-squares fitting of two 3-D point sets”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 9, no. 5, 1987, pp. 698–700.
[2] D. Belter, P. Skrzypczyński, ”Rough terrain mapping and classification for foothold selection in a walking robot”, Journal of Field Robotics, vol. 28(4), 2011, pp. 497–528.
[3] D. Belter, P. Skrzypczyński, ”Precise self-localization of a walking robot on rough terrain using PTAM”. In: Adaptive Mobile Robotics (N. Cowan et al., eds.), Singapore, World Scientific, 2012, pp. 89–96. [4] P.J. Besl, N.D. McKay, ”A method for registration of 3-D shapes”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, 1992, pp. 239–256. [5] S. Choi, T. Kim, W. Yu, ”Performance evaluation of RANSAC family”. In: Proc. British Machine Vision Conference, London, 2009. [6] A. Davison, I. Reid, N. Molton, O. Stasse, ”MonoSLAM: Real-time single camera SLAM”,
IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, 2007, pp. 1052–1067.
[7] D.W. Eggert, A. Lorusso, R.B. Fisher, ”Estimating 3-D rigid body transformations: a comparison of four major algorithms”, Machine Vision and Applications, vol. 9, no. 5–6, 1997, pp. 272-290.
[8] F. Endres, J. Hess, N. Engelhard, J. Sturm, D. Cremers, W. Burgard, ”An evaluation of the RGB-D SLAM system”, in: Proc. IEEE Int. Conf. on Robot. and Automat., St. Paul, 2012, pp. 1691–1696. [9] B. Gaßmann, J. M. Zöllner, R. Dillmann, ”Navigation of walking robots: localisation by odometry”. In: Climbing and Walking Robots VIII, Berlin, Springer 2005, pp. 953–960.
[10] P. Henry, M. Krainin, E. Herbst, X. Ren, D. Fox, ”RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments”, Int. Journal of Robotics Research, vol. 31, no.5, 2012, pp. 647–663.
[11] S. Izadi et al., ”KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera”. In: ACM Symp. on User Interface Software and Technology, New York, 2011, pp. 559–568. [12] W. Kabsch, ”A solution for the best rotation to relate two sets of vectors”, Acta Crystallographica, vol. A32, 1976, p. 922. [13] G. Klein, D. Murray, ”Parallel tracking and mapping for small AR workspaces”. In: Proc. Int. Symp. on Mixed and Augmented Reality, Nara, 2007, pp. 225–234.
[14] D.G. Lowe, ”Distinctive image features from scale-invariant keypoints”, Int. Journal of Computer Vision, vol. 60, 2, 2004, pp. 91–110.
[15] P. Łabęcki, P. Skrzypczyński, ”Real-time visual perception for terrain mapping in a walking robot”. In: Adaptive Mobile Robotics (N. Cowan et al., eds.), Singapore, World Scientific, 2012, pp. 754–761. [16] M. Nowicki, P. Skrzypczyński, ”Experimental verification of a walking robot self-localization system with the Kinect sensor”, Prace Naukowe Politechniki Warszawskiej – Elektronika. Postępy robotyki, vol. 182, Warsaw, 2012, pp. 561–572 (in Polish).
[17] M. Nowicki, P. Skrzypczyński, Robust registration of Kinect range data for sensor motion estimation, in: Computer Recognition Systems 5 (M. Kurzynski, M. Wozniak, eds.), Berlin, SpringerVerlag 2013. (in print)
[18] A. Nüchter, K. Lingemann, J. Hertzberg, H. Surmann, ”6D SLAM – 3D mapping outdoor environments”, Journal of Field Robotics, vol. 24(8–9), 2007, pp. 699–722. [19] A. Nüchter, S. Feyzabadi, D. Qiu, S. May, ”SLAM à la carte – GPGPU for globally consistent scan
matching”. In: Proc. 5th European Conf. on Mobile Robots (ECMR), Örebro, 2011, pp. 271–276. [20] L. Paz, P. Piniés, J. D. Tardós, J. Neira, ”Large-scale 6-DOF SLAM with stereo in hand”, IEEE Trans. on Robotics, vol. 24, no. 5, 2008, pp. 946–957. [21] PrimeSense, 2010, http://www.primesense.com
[22] S. Rusinkiewicz, M. Levoy, ”Efficient variants of the ICP algorithm”. In: Proc. 3rd Int. Conf. on 3D Digital Imaging and Modeling, Quebec, 2001, pp. 145–152.
[23] R. B. Rusu, N. Blodow, Z. Marton, M. Beetz, ”Aligning point cloud views using persistent feature histograms”. In: Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Nice, 2008, pp. 3384–3391.
[24] R. B. Rusu, S. Cousins, ”3D is here: Point Cloud Library (PCL)”. In: Proc. IEEE Int. Conf. on Robot. and Automat., Shanghai, 2011, pp. 1–4. [25] A. Segal, D. Haehnel, S. Thrun, ”Generalized-ICP”. In: Proc. of Robotics: Science and Systems, Seattle, 2009 (on-line).
[26] P. Skrzypczyński, ”Simultaneous localization and mapping: a feature-based probabilistic approach”, Int. Journal of Applied Mathematics and Computer Science, 19(4), 2009, pp. 575–588. [27] P. Skrzypczyński, ”Laser scan matching for selflocalization of a walking robot in man-made environments”, Industrial Robot: An International Journal, 39(3), 2012, pp. 242–250.
[28] O. Stasse, A. Davison, R. Sellaouti, K. Yokoi, ”Real-time 3D SLAM for humanoid robot considering pattern generator information”. In: Proc. IEEE Int. Conf. on Intelligent Robots and Systems, Beijing, 2006, pp. 348–355. [29] B. Steder, R. B. Rusu, K. Konolige, W. Burgard, ”Point feature extraction on 3D range scans taking into account object boundaries”. In: Proc. IEEE Int. Conf. on Robot. and Automat., Shanghai, 2011, pp. 2601–2608.
[30] A. Stelzer, H. Hirschmüller, M. Görner, ”Stereovision-based navigation of a six-legged walking robot in unknown rough terrain”, Int. Journal of Robotics Research, vol. 31, no. 4, 2012, pp. 381–402. [31] T. Stoyanov, A. Louloudi, H. Andreasson, A. Lilienthal, Comparative evaluation of range sensor accuracy in indoor environments, in: Proc. 5th European Conf. on Mobile Robots (ECMR), Örebro, 2011, pp. 19–24.
[32] K. Walas, D. Belter, ”Messor – Versatile walking robot for search and rescue missions”, Journal of Automation, Mobile Robotics & Intelligent Systems, vol. 5, no. 2, 2011, pp. 28–34.
Extensive Feature Set Approach to Facial Expression Recognition on Static Images
Submitted: 1�th January 2013; accepted: 4th June 2013
Mateusz Żarkowski
DOI: 10.14313/JAMRIS_4-2013/53
Abstract: The article presents the preliminary concept of a facial emotion recognition system. The approach focuses on the feature extraction and selection process, which simplifies the stage of defining and adding new elements to the feature set. The evaluation of the system was performed with two discriminant analysis classifiers, a decision tree classifier and four variants of the k-nearest neighbors classifier. The system recognizes seven emotions. The verification step utilizes two databases of face images representing laboratory and natural conditions. Personal and interpersonal emotion recognition was evaluated. The best quality of classification for personal emotion recognition was achieved by the ��� classifier; the recognition rate was ��.�% for the laboratory conditions and ��.�% for the natural conditions. For interpersonal emotion recognition the rate was 82.5%.
Keywords: machine learning, image processing, social robotics
1. Introduction
Emotions are an inherent part of human nature and can be observed in the gestures of people, their way of movement or the tone of their voice; however, the most important tool for emotional expression is the face. The first scientific study of human facial expressions was done by Duchenne [1] and Darwin [2]. Modern psychologists define six basic facial expressions of emotions – anger, disgust, fear, happiness, sadness and surprise [3]. As a way to standardize the process of emotion recognition, specific groups of facial muscles are defined as facial Action Units (AU), which together form the Facial Action Coding System (FACS) [4]. This system defines rules of how to compose Facial Action Units into particular facial expressions, as described in [5]. The basic structure of automatic facial expression analysis (AFEA), as described in [6, 7], consists of the following steps: face acquisition, facial data extraction and representation, and facial expression recognition. The goal of face acquisition is detection and localisation of the face region in the input image or video sequence; most systems use Haar-based facial detection introduced by Viola and Jones [8]. In facial feature extraction for expression analysis, two main approaches can be distinguished: geometric feature-based methods [9, 10] and appearance-based methods [11]. The geometric facial features represent the position and shape of facial components (such as mouth, brows,
eyes, etc.). The appearance-based methods utilize image filters (e.g., Gabor wavelets) applied over selected regions of the face image to extract feature vectors. The facial expression recognition system can classify the facial changes as prototypic emotional expressions (anger, disgust, fear, happiness, sadness and surprise) or as facial action units (AU of FACS) mentioned earlier. Moreover, depending on the utilization of temporal information, frame-based and sequence-based approaches are defined. An additional review of studies in the field of facial expression recognition is given in [12] and [13].
This paper is an extended version of the article [14], which presents the preliminary results on the construction of an emotion recognition system based on single-frame facial image analysis for its application by a social robot during human-robot interaction. Due to the nature of such interaction, the accuracy and the speed of the system are crucial, in order for the robot not to lose its social credibility. Seven emotion classes are taken into account: six basic emotions and a neutral facial expression. System performance in one-person and multi-person situations is evaluated. To solve the face acquisition problem the position of the face is tracked by pattern fitting with the use of Active Shape Models and Active Appearance Models [15, 16], which results in a 3-D mesh model. The FaceTracker [17] application is utilized for this purpose, because of its ability to provide real-time face tracking along with a good mesh resolution. Also, the software is open-source, making it easy to incorporate into the architecture of a social robot. The resulting facial mesh model is then subject to feature extraction. In the usual approach, the number of extracted features is low, as each feature describes a specific geometric property of the face, such as the width of the lips or the angle of the brows. For example, in [18] only 24 features, 15 parameters of the upper face and 9 parameters of the lower face, are defined. This becomes a serious issue in all recognition systems: how do we know if the chosen feature set is sufficient? Would the addition of a missing feature result in an improvement of classification accuracy? How can such features be found? The approach for feature extraction proposed in this paper is similar to a brute-force attack in cryptography. A large number of features is calculated from the 3-D mesh model, then a feature selection algorithm should distinguish the features that best discriminate the given facial emotional expressions. The expression classification is performed with basic classifiers (discriminant analysis, k-nearest neighbours, decision trees),
which should emphasize the general nature of the proposed methodology. Support Vector Machines and Neural Networks are often used in facial expression recognition; however, in this case they could obscure the results of the feature extraction step. Moreover, due to the modular nature of the system, more sophisticated classification methods can be easily applied if needed. The results presented here should serve as a starting point for further system development.
2. System specification
The emotion recognition system analyses a static facial image in order to classify it as one of seven emotion classes. Before proper application, the system requires training. The architecture overview is presented in Fig. 1. The following subsections describe the components of the system.

2.1. Facial image parametrization
The first step of the image processing is face detection and tracking. This results in a set of parameters that describe the selected properties of an acquired face that are used for further analysis. For this purpose the FaceTracker [17] application, developed by J. Saragih, was utilized. It performs face detection and tracking by iterative fitting of deformable 3D models of the human face. More information about this application can be found in [19] and [20]. The yielded model consists of 66 points and the connections between them that envelop the important features of the face. The points and connections together form a three-dimensional grid – a mesh. This model adapts to the face that it is tracking, taking into account such parameters as the positioning of the eyebrows, mouth and jaw. An example of FaceTracker performance is shown in Fig. 2. This fitting results in a set of points in a three-dimensional space that describe the deformed model in its standardized orientation, the scale of the model and its orientation. The above parameters, along with the description of the mesh connections, are transferred to the next processing stage.

2.2. Feature extraction, selection and classification
The presented methodology is based on performing the training stage before the proper use of the system. The training data have a twofold task. First, based on this data it is possible to reduce the dimensionality of the feature vector by identifying the features that provide the best distinction between the recognized classes. At the system application stage, only the features that carry the most important information are calculated, which speeds up the feature extraction process significantly. Second, the reduced feature set is used to train the classifier. The next section presents how the features are calculated and the tools that are used for feature selection and classification. In the proposed approach, the identification system uses only the geometric data obtained through the FaceTracker application, without taking into account any information about the texture. Pre-processing and parametrization of the face produces a set P consisting of 66 vectors that describe the face.
These vectors represent the positions of the characteristic points of the three-dimensional model that has been subject to the subsequent deformations during the process of face detection. They are formed by projection onto the frontal plane, which, due to the standardized form of the model (both the model scale and its orientation are separate parameters), requires only the removal of the depth component. Let $p_i^x$ and $p_i^y$ be the corresponding components of the i-th vector, and $p_i^{euc}$ denote its norm:

$P = \{\vec{p}_i : \vec{p}_i \in \mathbb{R}^2;\ i = 1, \dots, 66\}$, $\vec{p}_i = (p_i^x, p_i^y)$, $p_i^{euc} = \sqrt{(p_i^x)^2 + (p_i^y)^2}$.  (1)
Also, we introduce a set of vectors A describing the average face, which is a special case of the set P. Calculation of the characteristic points $A_i$ of the set A is done by averaging all the faces in the training set. This procedure allows to determine the deviation $\Delta p_i$ between the face that is being processed and the average face of the whole training set:

$\Delta p_i^x = p_i^x - A_i^x$,  (2)
$\Delta p_i^y = p_i^y - A_i^y$,  (3)
$\Delta p_i^{euc} = \sqrt{(\Delta p_i^x)^2 + (\Delta p_i^y)^2}$.  (4)
Let D denote the set of distances between all of the characteristic points, with the added constraint of no redundancy of this data. This means that the cardinality (size) of the set D is $\binom{66}{2} = 2145$:

$D = \{d_{i,j} : i = 1, \dots, 65;\ j = i+1, \dots, 66\}$.  (5)
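To make the construction concrete, the set D can be computed directly from the 66 projected points. The following is a minimal numpy sketch; the random array is only a stand-in for actual FaceTracker landmarks:

import numpy as np

# Stand-in for the 66 standardized 2-D landmarks produced by FaceTracker.
P = np.random.rand(66, 2)

# All pairwise distances d_{i,j} with i < j; there are C(66, 2) = 2145 of them.
i, j = np.triu_indices(len(P), k=1)
D = np.linalg.norm(P[i] - P[j], axis=1)
assert D.shape == (2145,)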
The proper combination of the characteristic points from the set P is defined by the mesh of the model. Let the mesh consist of T (in this case T = 91) triangles whose vertices are labeled by the indices of the characteristic points. Let us define C as the set of triples of natural numbers which represent the indices of the vertices of each triangle. This allows to calculate the angles α inside each of these triangles:

$\alpha = \{\alpha_{(i,j,k)}, \alpha_{(j,k,i)}, \alpha_{(k,i,j)} : (i,j,k) \in C\}$,
$\alpha_{(i,j,k)} = \arccos\left(\frac{\vec{p}_{ji} \cdot \vec{p}_{jk}}{|\vec{p}_{ji}|\,|\vec{p}_{jk}|}\right)$,  (6)
$\vec{p}_{ij} = \vec{p}_j - \vec{p}_i$.  (7)
As in the case of positions, angles are also calculated for the average face A, which allows the calculation of relative changes in these angles. Denoting the angles of the average face by $\alpha^A_{(i,j,k)}$, these changes are described by the following equation:

$\Delta\alpha = \{\Delta\alpha_{(i,j,k)} = \alpha_{(i,j,k)} - \alpha^A_{(i,j,k)}\}$.  (8)
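The interior angles of eqs. (6)–(7) reduce to an arccosine of a normalized dot product per triangle vertex. A sketch under the same stand-in data, where the triple list C is assumed to come from the FaceTracker mesh:

import numpy as np

def triangle_angles(P, C):
    """Interior angles of each mesh triangle, following eqs. (6)-(7).

    P -- (N, 2) array of landmark positions,
    C -- iterable of (i, j, k) vertex index triples."""
    angles = []
    for i, j, k in C:
        tri = []
        # One angle per vertex: alpha_(i,j,k) sits at the middle index j,
        # between the edge vectors running from j to i and from j to k.
        for m, n, o in ((j, i, k), (k, j, i), (i, k, j)):
            u, v = P[n] - P[m], P[o] - P[m]
            cos_a = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
            tri.append(np.arccos(np.clip(cos_a, -1.0, 1.0)))
        angles.append(tri)
    return np.asarray(angles)  # shape (T, 3); here T = 91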
Fig. 1. Architecture of the proposed emotion recognition system
Fig. 2. Example of FaceTracker performance: top image – happiness, bottom image – fear

Finally, the elements described above form a feature vector f by concatenation. The resulting vector is of the following size: 66 points · 6 + 2145 distances + 91 triangles · 3 angles · 2 = 3087:

$f = [p^x\ p^y\ p^{euc}\ \Delta p^x\ \Delta p^y\ \Delta p^{euc}\ D\ \alpha\ \Delta\alpha]^T$.  (9)
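Assembling eq. (9) is then a single concatenation. A sketch reusing triangle_angles from above; A and mean_angles stand for the averaged training face:

import numpy as np

def feature_vector(P, A, tris, mean_angles):
    """Concatenate the components of eq. (9) into one 3087-element vector.

    P, A        -- (66, 2) landmarks of the current and the average face,
    tris        -- (91, 3) vertex index triples of the mesh,
    mean_angles -- (91, 3) interior angles of the average face."""
    dP = P - A                                       # eqs. (2)-(3)
    i, j = np.triu_indices(len(P), k=1)
    D = np.linalg.norm(P[i] - P[j], axis=1)          # eq. (5), 2145 values
    alpha = triangle_angles(P, tris)                 # eq. (6)
    f = np.concatenate([
        P[:, 0], P[:, 1], np.linalg.norm(P, axis=1),      # p^x, p^y, p^euc
        dP[:, 0], dP[:, 1], np.linalg.norm(dP, axis=1),   # deviations, eq. (4)
        D, alpha.ravel(), (alpha - mean_angles).ravel()   # eq. (8)
    ])
    assert f.size == 66 * 6 + 2145 + 91 * 3 * 2           # = 3087
    return f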
Due to the significant size of the vector f, a feature selection process is needed. As mentioned earlier, selection reduces the dimension of the feature vector, which increases the speed of calculations and simplifies the classification. Feature selection can be understood as a case of state space search. The state is defined as a certain subset of selected features, and the actions consist of adding new features to this subset or removing any of those present. The whole problem can then be decomposed into two tasks: evaluation of a subset of attributes and methods
for searching the state space. This heuristic approach is necessary because the large size of the feature vector prevents a brute-force search of the state space. Feature selection is a one-time process that occurs only during the training of the system. The designated set of features is used to build the classifier, while during the use of the system the knowledge of the indices of the selected features allows the calculation of only the chosen attributes. The task of selection was carried out using the software WEKA [21]. To evaluate a subset of attributes the correlation-based evaluation method CfsSubsetEval was used, which is described in [22]. As the method of state space search the BestFirst algorithm was selected. The choice of these methods was made on the basis of previous studies [23] and [24]. The classification problem here is to build a classifying function which is able to correctly assign a particular object to the corresponding class through the appropriate processing of the training data by supervised learning. All of the data is strictly numerical and labeled. In the case of the presented identification system, the recognition is done using elementary classifiers, such as linear (DALin) and quadratic (DAQuad) discriminant analysis classifiers, k-nearest neighbours (KNN) for k = 1, 3, 5, 7, as well as decision trees (DecTree). Due to the modular architecture of the system it is possible to employ more complex classifiers (e.g., neural networks, support vector machines); however, for the purpose of clarity only simple classifiers are used in this paper. The idea behind the correlation-based selection is sketched below.
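The selection step itself is performed by WEKA. Purely as an illustration of the idea behind CfsSubsetEval, the sketch below scores a subset by Hall's correlation-based merit [22] and grows it greedily. It uses plain Pearson correlation against numerically coded class labels, which simplifies WEKA's actual nominal-class measure, and plain forward search instead of BestFirst's backtracking:

import numpy as np

def cfs_merit(X, y, subset):
    """Hall's CFS merit: reward correlation with the class,
    penalise correlation between the chosen features."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for ai, a in enumerate(subset) for b in subset[ai + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(X, y):
    """Forward selection: add the feature that raises the merit most."""
    remaining, chosen, best = list(range(X.shape[1])), [], 0.0
    while remaining:
        merit, f = max((cfs_merit(X, y, chosen + [f]), f) for f in remaining)
        if merit <= best:
            break
        chosen.append(f)
        remaining.remove(f)
        best = merit
    return chosen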
3. Results
3.1. Data acquisition
In order to evaluate the performance of the system two sets of data were used: images gathered in a laboratory environment and in natural conditions. Solving the easier problem (laboratory conditions) should determine whether the proposed feature set is sufficient for correct classification. Then, if the system works satisfactorily, one should consider introducing additional factors. The MUG facial expression database [25] was used as the input data set for laboratory conditions. The database contains frontal face images of more than 50 people aged 20–35 years. Each person was asked to express six basic emotions and a neutral facial expression. The pictures were taken at 896x896 resolution, at a rate of 19 frames per second, in a professionally equipped photography studio under favorable lighting conditions. Due to the fact that the recorded sequences start with and return to a neutral facial expression, it was necessary to extract the appropriate emotion images from the middle of the sequence. The natural conditions were simulated by conducting photo sessions for three people at the age of about 25 years. The pictures were taken by a standard laptop camera, working at a resolution of 640x480 in normal room lighting conditions. People who took part in the experiment were instructed on how to generate specific facial expressions depicting a particular
emotion in accordance with the FACS system [4]. During the recording the person looked around the immediate vicinity of the camera, thus simulating the gentle head movements which occur during a standard conversation. For each person, six photo sessions took place, each consisting of seven emotion classes. For every emotion in each session ten pictures were taken. The total number of images for every person was 6 sessions · 7 emotion classes · 10 pictures = 420 face images. The interval between pictures was set at 800 ms, due to the performed movements of the head.

3.2. Performance evaluation
In order to estimate the practical accuracy of the designed emotion recognition system two experiments were conducted: personal and interpersonal recognition. The goal of the first experiment was to recognize the facial emotions of a single person using only the images of his/her own face. This results in a personal recognition system which should perform well for a given person, while omitting its usefulness for others. Such a system was constructed and evaluated for six people – three from the laboratory conditions database and three from the natural conditions database. Moreover, two additional evaluations were made – the data of the three people from laboratory conditions were merged into one dataset that was used to build a system designed for emotion recognition of these three people combined; the data of the three representatives of natural conditions were subject to the same process. The assessment of the results was done through 10-fold cross-validation – the input data set was divided into ten subsets, then the recognition process was performed ten times, each time one of the subsets was chosen as validation data while the others were used to train the system. It is vital to note that the feature selection process is an integral part of system training and was performed anew for each of the training sets (a sketch of this evaluation loop is given below). The success rate was evaluated as the ratio of correctly classified instances to the total number of samples. The results are shown in tab. 1. In each case, the k-nearest neighbour classifier performs best. A greater number of neighbours decreases the success rate of the classification but increases the robustness of the system if the training data are subject to noise. It is noteworthy that, while performing very well for laboratory conditions, the discriminant analysis classifiers struggle with natural conditions and merged datasets. This leads to two conclusions. Firstly, complex problems may require complex tools, which suggests employing more sophisticated classifiers. Secondly, each modification at the feature extraction and selection steps should be evaluated by the classifiers of lower success rate. Although this is seemingly counter-intuitive, it allows for a clearer observation of improvement. In addition, less complex classifiers usually run faster, so if their effectiveness is comparable to the more complex (but slower) classifiers (see tab. 1, laboratory conditions), one can obtain increased speed at the expense of the efficiency of the recognition system.
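A sketch of this evaluation loop, with the selection redone inside every fold; the select/train/predict callables are placeholders for, e.g., the greedy CFS above and any of the listed classifiers:

import numpy as np

def cross_validate(X, y, select, train, predict, folds=10, seed=0):
    """k-fold cross-validation in which feature selection is an integral
    part of training and is repeated on every training split."""
    order = np.random.default_rng(seed).permutation(len(y))
    correct = 0
    for part in np.array_split(order, folds):
        test = np.zeros(len(y), dtype=bool)
        test[part] = True
        feats = select(X[~test], y[~test])   # selection sees training data only
        model = train(X[~test][:, feats], y[~test])
        correct += np.sum(predict(model, X[test][:, feats]) == y[test])
    return correct / len(y)                  # success rate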
Tab. 1. Success rate (in %) for personal emotion recognition for LABoratory and NATural conditions. Numbers represent specific people, ALL represents a combined dataset consisting of all subjects

          LAB 1   LAB 2   LAB 3   LAB ALL   NAT 1   NAT 2   NAT 3   NAT ALL
DALin     99.67   99.79   99.71   92.63     86.19   88.81   94.52   80.95
DAQuad    99.67   99.58   100.00  98.23     89.29   94.52   97.38   84.05
DecTree   96.28   98.75   99.71   96.54     88.81   90.95   95.48   85.87
KNN1      100.00  100.00  100.00  99.91     95.71   99.05   99.52   97.30
KNN3      100.00  100.00  100.00  99.91     93.81   97.86   99.29   96.35
KNN5      100.00  100.00  99.71   99.91     94.05   97.62   98.81   95.63
KNN7      100.00  100.00  99.71   99.91     92.38   97.62   98.33   95.56
Tab. 2. Success rate (in %) for interpersonal emotion recognition for LABoratory conditions. Numbers represent specific people, AVG is the average performance of a classifier

          LAB 1   LAB 2   LAB 3   LAB 4   LAB 5   LAB 6   LAB 7   LAB 8   LAB 9   LAB 10  AVG
DALin     80.00   75.71   100.00  74.29   87.14   60.00   74.29   77.14   57.14   58.57   74.43
DAQuad    91.43   81.43   91.43   94.29   87.14   51.43   88.57   85.71   75.71   51.43   79.86
DecTree   75.71   87.14   75.71   65.71   85.71   55.71   65.71   98.57   68.57   70.00   74.86
KNN1      95.71   87.14   95.71   82.86   77.14   67.14   77.14   87.14   90.00   60.00   82.00
KNN3      91.43   87.14   97.14   82.86   77.14   67.14   82.86   94.29   75.71   70.00   82.57
KNN5      91.43   87.14   98.57   84.29   80.00   68.57   77.14   92.86   72.86   68.57   82.14
KNN7      91.43   85.71   95.71   84.29   82.86   74.29   80.00   90.00   74.29   68.57   82.71
Tab. 3. Comparison of success rate for personal emotion recognition between the full feature vector (NS – no selection) and a SELected feature vector

          NAT1 NS  NAT1 SEL  NAT2 NS  NAT2 SEL  NAT3 NS  NAT3 SEL  NS AVG  SEL AVG
DALin     73.33    86.19     85.00    88.81     89.05    94.52     82.46   89.84
DAQuad    79.52    89.29     87.86    94.52     97.14    97.38     88.17   93.73
DecTree   86.19    88.81     90.71    90.95     93.81    95.48     90.24   91.75
KNN1      95.24    95.71     98.33    99.05     99.29    99.52     97.62   98.09
KNN3      94.76    93.81     98.10    97.86     99.05    99.29     97.30   96.99
KNN5      93.10    94.05     97.14    97.62     98.33    98.81     96.19   96.83
KNN7      89.52    92.38     96.43    97.62     97.86    98.33     94.60   96.11
The second experiment focused on recognizing the facial emotions of a person that is completely new to the system, based on the portrayals of emotions submitted by other people. This interpersonal recognition system should be able to correctly assign emotions to various people; however, the recognition success rate is expected to drop significantly in comparison with the personalized system. The system was constructed using data from ten people, all from the laboratory conditions database. With respect to cross-validation, each person was considered a subset of their own – the recognition process was done ten times, each time the emotions of nine people were used as training data while the emotions of the remaining person were treated as validation data (see the sketch below). The results are presented in tab. 2.
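This protocol is a leave-one-subject-out variant of cross-validation. A sketch, where subject is an array of person identifiers and select/train/predict are as in the earlier sketch:

import numpy as np

def leave_one_subject_out(X, y, subject, select, train, predict):
    """Interpersonal evaluation: each person in turn becomes the
    validation set while the remaining people form the training set."""
    scores = {}
    for s in np.unique(subject):
        test = subject == s
        feats = select(X[~test], y[~test])
        model = train(X[~test][:, feats], y[~test])
        scores[s] = np.mean(predict(model, X[test][:, feats]) == y[test])
    return scores  # per-person success rates, as in tab. 2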
The drop in success rate is indeed significant. This is due to the personal differences in anatomy as well as in the execution of a certain emotion by a specific person. One can observe that the two highest recognition results – subjects 1 and 3 – can be expected to have very similar anatomy and manner of facial expressions. In fact, they both are women of delicate body constitution and their expressions are much alike. In the case of the lowest result – subject 10 – the confusion matrix (not included here) shows that two of his emotional expressions were classified as entirely wrong, whereas the other five were mostly correct. This is most probably due to the lack of an appropriate template in the training data. In conclusion, ten people are not enough to cover the whole feature space for emotion recognition – more samples are needed – however, the obtained success rate of above 80% is still satisfactory, but can be easily improved. The results are inconclusive as to which classifier has performed best. The achieved recognition rate is difficult to compare between different expression recognition systems, due to the different databases used for system evaluation. However, results obtained by other researchers can be found in [6].

3.3. Feature selection analysis
The results of classification in the case when the feature selection process has been omitted are presented in tab. 3. Feature selection improves the
recognition quality and at the same time significantly speeds up the system performance – a vector of more than 3000 features is reduced to about 100 elements, which drastically simplifies feature calculation as well as classifier training and use.
4. Summary
This paper presents a preliminary concept of a visual emotion recognition system for the use of a social robot. The proposed methodology focuses on feature extraction and selection, which simplifies defining new features and adding them to the feature set. Face tracking and parametrization is performed by a designated application, FaceTracker [17], while feature selection is done by the data mining software WEKA [21]. The evaluation of the system uses two discriminant analysis classifiers, a decision tree classifier and four variants of the k-nearest neighbours classifier. The system recognizes seven emotions. The verification step utilizes two databases of face images representing laboratory and natural conditions. Personal and interpersonal emotion recognition was evaluated. The best quality of classification for personal emotion recognition was achieved by the 1NN classifier; the recognition rate was 99.9% for the laboratory conditions and 97.3% for natural conditions. For interpersonal emotion recognition the rate was 82.5%. The obtained results are promising and suggest further development of the proposed methodology. There are different directions in which the project can be developed. One can easily add new features, such as texture information, in order to enhance the classification rate. More sophisticated classifiers can be used to increase the robustness of the recognition system. More training data can be presented for a better representation of the feature space. Moreover, adaptive learning algorithms are needed for practical application in a social robot working in a dynamic environment. The author strongly believes in a multimodal approach to the emotion recognition problem. Combining the visual cues with speech processing can provide more plausible emotional information. The immediate next step will focus on processing of the obtained information for contextual analysis. Some work has already been done in the direction of utilizing this software for people identification. Being able to tell people apart, the system can perform personal emotion recognition, and a social robot should provide an individual experience for each user.
Acknowledgements
This research was supported by Wroclaw University of Technology under a statutory research project.
AUTHOR
Mateusz Żarkowski* – Wroclaw University of Technology, Institute of Computer Engineering, Control and Robotics, e-mail: mateusz.zarkowski@pwr.wroc.pl.
* Corresponding author
REFERENCES
[1] G.B. Duchenne de Boulogne, Mécanisme de la Physionomie Humaine. Paris, Jules Renouard Libraire, 1862.
[2] C. Darwin, The Expression of the Emotions in Man and Animals. Anniversary edition, Harper Perennial, 1872/2009.
[3] P. Ekman, Emotion in the Human Face. Cambridge University Press, 1982.
[4] P. Ekman, W. Friesen, Facial Action Coding System: A technique for the measurement of facial movement. Palo Alto, Consulting Psychologists Press 1978.
[5] M. Pantic, L. Rothkrantz, ”Expert system for automatic analysis of facial expressions”, Image and Vision Computing Journal, 2000, vol. 18, no. 11, pp. 881–905.
[6] Y. Tian, T. Kanade, J. Cohn, "Facial Expression Recognition". In: Handbook of Face Recognition, 2nd ed., Springer, 2011.
[7] M. Pantic, M. Barlett, "Machine Analysis of Facial Expressions". In: Face Recognition. I-Tech Education and Publishing, 2007, pp. 377–416.
[8] P. Viola, M. Jones, ”Robust real-time face detection”, International Journal of Computer Vision, 2004, vol. 57, pp. 137–154.
[9] X. Li, Q. Ruan, Y. Ming, "3D Facial expression recognition based on basic geometric features". In: 2010 IEEE 10th International Conference on Signal Processing (ICSP), 2010.
[10] I. Kotsia, I. Pitas, "Facial expression recognition in image sequences using geometric deformation features and Support Vector Machines", IEEE Trans. Image Process., vol. 16, no. 1, 2007, pp. 172–187.
[11] C. Lien, L. Lin, C. Tu, "A New Appearance-Based Facial Expression Recognition System with Expression Transition Matrices". In: Innovative Computing Information and Control, 2008.
[12] M. Pantic, L. Rothkrantz, "Automatic analysis of facial expressions: The state of the art", IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, December 2000, pp. 1424–1445.
[13] B. Fasel, J. Luettin, "Automatic facial expression analysis: A survey", Pattern Recognition, vol. 36, no. 1, 2003, pp. 259–275.
[14] M. Żarkowski, "Facial emotion recognition in static image". In: 12th National Conference on Robotics, Świeradów-Zdrój, 2012, vol. 2, pp. 705–714 (in Polish).
[15] T. F. Cootes et al., "Active Shape Models – their training and application", Comput. Vis. Image Underst., vol. 61, no. 1, January 1995, pp. 38–59.
[16] T. Cootes, G. Edwards, C. Taylor, "Active Appearance Models", IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 6, June 2001, pp. 681–685.
[17] J. Saragih, FaceTracker, http://web.mac.com/jsaragih/FaceTracker/FaceTracker.html
[18] Y. Tian, T. Kanade, J. Cohn, "Recognizing action units for facial expression analysis", IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 2, 2001, pp. 1–19.
[19] J. Saragih, S. Lucey, J. Cohn, "Face alignment through subspace constrained mean-shifts". In: ICCV Proceedings, 2009, pp. 1034–1041.
[20] J. Saragih, S. Lucey, J. Cohn, "Deformable model fitting by regularized landmark mean-shift", International Journal of Computer Vision, vol. 91, no. 2, 2011, pp. 200–215.
[21] Weka 3: Data Mining Software in Java, http://www.cs.waikato.ac.nz/ml/weka/
[22] M.A. Hall, Correlation-based Feature Subset Selection for Machine Learning. PhD thesis, University of Waikato, Hamilton, New Zealand, 1998.
[23] Ł. Juszkiewicz, Speech emotion recognition for a social robot. Master thesis, Wrocław University of Technology, Wrocław, 2011 (in Polish).
[24] M. Żarkowski, Set of procedures helping through the learning process of using EMG signals for control. Master thesis, Wrocław University of Technology, Wrocław, 2011 (in Polish).
[25] N. Aifanti, C. Papachristou, A. Delopoulos, "The MUG facial expression database". In: 11th Int. Workshop on Image Analysis for Multimedia Interactive Services, Proceedings, Desenzano, Italy, April 12–14, 2010.
Speech Emotion Recognition System for Social Robots

Submitted: 15th January 2013; accepted: 4th June 2013
Łukasz Juszkiewicz

DOI: 10.14313/JAMRIS_4-2013/59

Abstract: The paper presents a speech emotion recognition system for social robots. Emotions are recognised using global acoustic features of the speech. The system implements speech parameter calculation, feature extraction, feature selection and classification. All these phases are described. The system was verified using two emotional speech databases: Polish and German. Perspectives for using such a system in social robots are presented.

Keywords: speech emotion recognition, prosody, machine learning, Emo-DB, intonation, social robot
1. Introduction
Recently, a new field of robotics has been developing: social robotics. A social robot is able to communicate with people in an interpersonal manner and achieve social and emotional goals [25]. Social robots are meant to collaborate with people and be their companions in applications such as education, entertainment, health care etc. Emotions play a significant role in interpersonal communication, so the ability to recognise other people's emotions from speech and adapt one's response is an important social skill. A person without it would have difficulties living in a society. Similarly, a social robot needs to be able to recognise expressions of emotions, so that it could communicate with man in a natural manner and could be his companion. Information on emotion is encoded in all aspects of language, in what is said and in how it is said or pronounced, and the "how" is even more important than the "what" [29]. Considering all levels of language, from pragmatics down to the acoustic level, the following can be said. Starting with pragmatics, the intention of the speaker is highly correlated with his emotional state [22]. The literal meaning of an utterance is the most obvious display of emotions, so statements or keywords such as "I am sad" can be treated as emotion indicators [19]. However, explicit expressions of emotions can be intended as ironic or not express the true emotions of the speaker. Another indicator of the emotions is the tone of voice, that is, the phonetic and acoustic properties of speech. For example, a cracking voice could be evidence of excitement. Voice quality and the prosody (pitch, intensity, speaking rate) have been best researched in the psychological studies. They also
intuitively seem to be most important in the expression of emotions. An often cited review of the literature on emotions in speech was written by Murray and Arnott [17]. They refer to a number of studies which seem to have identified an almost explicit correlation between emotions and acoustic parameters. However, in the studies of different authors conflicting results can be found. This is probably due to the large variability of the expression of emotions and the different variants of certain emotions, such as "hot" and "cold" anger [8]. The presented system has been developed to recognise emotions in recorded statements on the basis of acoustic features of speech. Section 2 discusses the structure of the system and its implementation. Section 3 presents the results of the system's verification. In the conclusion the results are evaluated and further development plans are presented.
2. Speech Emotion Recognition System
Commonly used pattern recognition algorithms are applicable to the speech emotion recognition problem. However, there are at least two different approaches. One is estimating short-time signal parameters and modelling their changes with Hidden Markov Models or similar [16, 18]. The other is extracting global features of the signal and applying statistical methods and various types of classifiers: SVM [5, 33], artificial neural networks [9, 30], decision trees [26] and other. The second approach was chosen — each utterance is analysed as a whole, so global features are extracted and then classified. There is no consensus about an optimal set of features for speech emotion recognition. Moreover, such a set would depend on the number and type of emotions being recognised, the language, the recording conditions etc. Therefore, a standard approach is to extract a very large feature vector (even about 4,000 features) and then reduce the number of features to obtain a subset that has better discriminative power in the particular task [23]. Commonly used feature selection algorithms are principal component analysis, linear discriminant analysis, information gain ratio and sequential forward floating search [24, 31, 32]. The speech emotion recognition system [28] has the form of a set of MATLAB scripts using external tools — the programs Praat and Weka. Praat is a free (Open Source) program for phonetic analysis of speech. It can compute several parameters of speech: pitch, formants, spectrum, MFCC and many others [20]. Weka 3 (Waikato Environment for Knowledge Analysis) is a popular suite of machine learning software
written in Java. It contains a collection of visualisation tools and algorithms for data preprocessing, filtering, clustering, classification, feature selection and regression modelling [10]. It is freely available under the GNU General Public License. This form of the system allows easy modifications and facilitates testing various algorithms. The structure of the system is illustrated in figure 1. There are two phases of the system's operation: the learning phase and the evaluation phase. In the off-line learning phase feature selection and supervised learning of the classifier are carried out. In the evaluation phase the prepared system is used for on-line speech emotion classification.
2.1. Speech signal acquisition
The first stage of the recognition process is speech signal acquisition — an acoustic signal has to be transformed into an electrical signal and then digitalised. The quality of the microphone and preamplifier used plays an important role — distortions may affect recognition accuracy. Regardless of the quality of the recording equipment there still is the problem of environmental noise and the noise generated by motors and other moving parts of the robot itself. It was assumed that in recordings used for classification the signal to noise ratio would be high enough and there would be no additional voices (one person speaking at a time).
2.2. Speech signal parametrisation
To determine the emotion of the speech, six parameters of the speech signal were used: intensity, spectrogram, pitch, mel-frequency cepstral coefficients, long-term average spectrum and harmonics to noise ratio. Some of them were computed using several different methods, because they gave slightly different effects.
Intensity is the instantaneous sound pressure value measured in dB SPL, i.e., dB with respect to the pressure of 2·10⁻⁵ Pa [20].
Spectrogram. The speech signal is split into 16 ms frames with a 10 ms step. Each frame is Fourier transformed to compute the magnitude of the frequency spectrum; the phase is neglected. Then the logarithm is taken of the magnitude spectrum, and the result is appended to a matrix. This matrix is called the spectrogram [12]; a minimal sketch is given below.
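A minimal sketch of such a spectrogram (the text does not mention a windowing function, so plain rectangular frames are assumed here; the actual computation is done in Praat):

import numpy as np

def log_spectrogram(x, fs, frame_ms=16, step_ms=10):
    """Log-magnitude spectrogram from 16 ms frames taken every 10 ms;
    the phase of the spectrum is discarded."""
    frame = int(fs * frame_ms / 1000)
    step = int(fs * step_ms / 1000)
    cols = [np.log(np.abs(np.fft.rfft(x[i:i + frame])) + 1e-10)  # avoid log(0)
            for i in range(0, len(x) - frame + 1, step)]
    return np.array(cols).T  # rows: frequency bins, columns: frames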
Pitch is the fundamental frequency of the speech. It is produced by the vocal folds (known commonly as vocal cords). Two algorithms were used to estimate the pitch: the autocorrelation method [1] and the cross-correlation method [21]. The pitch exists only for the voiced parts of the signal, so the resulting waveform is discontinuous. Built-in smoothing and interpolation functions are used to overcome this issue. The core of the autocorrelation approach is illustrated below.
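The essence of the autocorrelation method is to pick the lag at which a frame best matches a shifted copy of itself; a rough sketch (Praat's algorithm [1] additionally compensates for the analysis window and interpolates around the peak):

import numpy as np

def pitch_autocorr(frame, fs, fmin=75.0, fmax=500.0):
    """Pitch estimate of one voiced frame from the autocorrelation peak
    searched within a plausible speech range."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)   # lag range covering fmax..fmin
    lag = lo + np.argmax(ac[lo:hi + 1])
    return fs / lag

# E.g., a 110 Hz tone sampled at 16 kHz is recovered to within one lag step:
# pitch_autocorr(np.sin(2 * np.pi * 110 * np.arange(640) / 16000), 16000)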
Mel-frequency Cepstral Coefficients (MFCC) are commonly used for parametrisation of the speech [20, 27]. The spectrum of the windowed signal is analysed by a bank of 26 bandpass filters with central frequencies equally spaced on a mel-scale, reflecting the subjectively perceived pitch. Subsequently, a discrete cosine transform is used to turn the logarithmised spectrum into a cepstrum. Only the first 12 coefficients are used. Additionally, a 13th series is computed as the average of all 12 series of the coefficients.

Harmonics to noise ratio (HNR) is the energy of the harmonic parts of the signal related to the energy of the noise parts. HNR is expressed in dB and computed using the autocorrelation method and the cross-correlation method [1].

Long-Term Average Spectrum (LTAS) is the averaged logarithmic power spectral density of the voiced parts of the signal with the influence of the pitch corrected away [13].

2.3. Feature extraction
In order to extract more useful information from the obtained parameter vectors, additional vectors are derived from them:
- first and second order differences,
- values of local minima,
- values of local maxima,
- distances between adjacent extrema,
- values of differences of the adjacent extrema,
- slopes between the adjacent local extrema,
- absolute values of the two above.
The acoustic features are time series — their length depends on the duration of the analysed utterance. For classification purposes, it is necessary to convert the time series into a feature vector of fixed length. This is achieved by treating the time series as outcomes of random variables and computing their statistics:
- arithmetic mean,
- median,
- standard deviation,
- interquartile range.
Further these will be referred to as the basic statistics. The algorithm of extracting the features from the raw acoustic features is basically uniform (except for the spectrogram and LTAS). For each of them, treated as a time series, there are computed:
- basic statistics,
- global maximum,
- global minimum,
- first quartile,
- second quartile,
- range,
- linear regression coefficients,
- basic statistics of the derived series,
- linear regression coefficients of local maxima,
- linear regression coefficients of local minima.
A condensed sketch of this time-series-to-vector conversion is given below.
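As a condensed illustration of this conversion, the sketch below turns one variable-length parameter track into a fixed-length vector using a subset of the listed statistics (the full system computes many more, including statistics of extrema):

import numpy as np

def basic_statistics(ts):
    """The four 'basic statistics' named above."""
    return [np.mean(ts), np.median(ts), np.std(ts),
            np.percentile(ts, 75) - np.percentile(ts, 25)]

def series_features(ts):
    """Fixed-length description of one time series: basic statistics,
    extremes, quartiles, range, a linear trend, and the basic
    statistics of the first and second order differences."""
    feats = basic_statistics(ts)
    feats += [np.max(ts), np.min(ts),
              np.percentile(ts, 25), np.percentile(ts, 50), np.ptp(ts)]
    feats += list(np.polyfit(np.arange(len(ts)), ts, 1))  # slope, intercept
    for d in (np.diff(ts), np.diff(ts, n=2)):
        feats += basic_statistics(d)
    return np.array(feats)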
Fig. 1. Block diagram of the speech emotion recognition system.

For each of the spectra forming the spectrogram and for LTAS, there are computed:
- linear regression coefficients,
- centre of gravity, defined as $CG = \frac{\sum_i f_i E_i}{\sum_i E_i}$, where $f_i$ is the i-th frequency and $E_i$ the energy of $f_i$,
- the difference between the 9th and 1st decile,
- the slope between the global maximum and minimum.
For LTAS, the coefficients listed above are the final features. For the spectrogram, basic statistics of them are computed to obtain the final features. Using this method 1722 features are generated. The structure of the feature vector is illustrated in figure 2. The number of features is then reduced in the feature selection process. These per-spectrum coefficients are sketched below.
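A sketch of these per-spectrum coefficients (the deciles are taken over the energy values here, as the text does not specify the exact variant, and distinct maximum/minimum frequencies are assumed):

import numpy as np

def spectrum_features(freqs, energies):
    """Per-spectrum coefficients: linear regression of the spectrum,
    its centre of gravity, the 9th-1st decile difference and the
    slope between the global maximum and minimum."""
    slope, intercept = np.polyfit(freqs, energies, 1)
    cg = np.sum(freqs * energies) / np.sum(energies)   # centre of gravity
    decile_diff = np.percentile(energies, 90) - np.percentile(energies, 10)
    imax, imin = np.argmax(energies), np.argmin(energies)
    extreme_slope = (energies[imax] - energies[imin]) / (freqs[imax] - freqs[imin])
    return np.array([slope, intercept, cg, decile_diff, extreme_slope])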
Fig. 2. Structure of the feature vector.

2.4. Feature selection
Weka's function AttributeSelection was used to remove features that are redundant or not relevant for this particular task. This function requires subset evaluation and search functions. For evaluation, the correlation-based feature subset selection CfsSubsetEval was used — subsets of features that are highly correlated with the class while having low intercorrelation are preferred [11]. The search function was BestFirst, which searches the space of attribute subsets by greedy hill climbing augmented with a backtracking facility. Feature selection is done in the training phase; therefore the selected feature vector depends on the training set.
2.5. Classification
Several different classifiers, provided by the Weka package, were tested:
- multilayer perceptron,
- support vector machine with sequential minimal optimisation,
- radial basis function network,
- Bayes network.
Results obtained from each of those classifiers were very similar. The feature selection stage seems to be crucial — selecting a minimal vector of discriminative features makes the choice of the classifier a minor problem. In section 3 the average recognition ratios for every classifier are presented. However, detailed classification results are presented only for the BayesNet classifier — a Bayes network that can use various search algorithms and quality measures [2]. Here, SimpleEstimator (estimating probabilities directly from data) and K2 (implementing hill climbing) were used [7]. The advantage of the Bayes network classifier is its relatively low complexity while maintaining performance comparable with more complex classifiers. An additional advantage is the explicit representation of knowledge — the created network can be analysed or used for other purposes. Learning of a Bayesian classifier is prone to mutual correlation of features — redundant ones could be dominant [14]. However, the algorithm used for feature selection removes redundant features, so this problem is eliminated. A simplified stand-in is sketched below.
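As a simplified stand-in for the Weka BayesNet classifier, the sketch below implements a naive Bayes model with Gaussian likelihoods; a K2-learned network would additionally capture dependencies between features:

import numpy as np

class GaussianNaiveBayes:
    """Naive Bayes with Gaussian class-conditional densities: a special
    case of a Bayes network in which features are independent given
    the class."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.params = {c: (X[y == c].mean(axis=0),
                           X[y == c].var(axis=0) + 1e-9,   # avoid zero variance
                           np.log(np.mean(y == c)))        # log prior
                       for c in self.classes}
        return self

    def predict(self, X):
        def log_posterior(c):
            mu, var, log_prior = self.params[c]
            return log_prior - 0.5 * np.sum(
                np.log(2 * np.pi * var) + (X - mu) ** 2 / var, axis=1)
        scores = np.stack([log_posterior(c) for c in self.classes])
        return self.classes[np.argmax(scores, axis=0)]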
3. Results
The tests were carried out in two stages. In the first stage, two available databases of acted emotional speech were used: the Berlin Database of Emotional Speech and the Database of Polish Emotional Speech. They contain recordings registered in studio conditions, different from those that are expected in the social robot application. Those tests, however, can verify the chosen methodology. The second stage is verification for the intended use of the system — test recordings should match typical responses to a social robot and should not be recorded in such artificial conditions as the mentioned databases.
3.1. Speech corpora
Berlin Database of Emotional Speech (Emo-DB) was recorded at the Technical University of Berlin [4]. Five male and five female carefully chosen speakers pretended six different emotions (anger, joy, sadness, fear, disgust and boredom) and the neutral state in ten utterances — five consisting of one phrase and five consisting of two phrases. Every utterance is in German and has an emotionally neutral meaning. After the recordings, 20 people were asked to classify the emotion of each utterance and rate its naturalness. Utterances that were not classified correctly by more than four persons or considered unnatural by more than two were discarded. The resulting database consists of 493 recordings.
Database of Polish Emotional Speech consists of 240 recordings [15]. Four male and four female actor speakers were asked to enact in five phrases five different emotions (anger, joy, sadness, fear and boredom) as well as the emotionally neutral state. The phrases are uttered in Polish and their meaning is emotionally neutral. The number of recordings is the same for each emotion. These recordings were evaluated by 50 humans — each of them was asked to classify 60 randomly selected samples. The average recognition ratio was 72%.
3.2. Speaker independent
Tests were carried out for the full set of recordings from both databases excluding those marked as disgust. There are no recordings of this emotion in the Polish database, so it was ruled out to keep the same conditions for both tests. Tenfold cross-validation was used to evaluate classification accuracy.
Emo-DB. A subset of Emo-DB used for the experiments consisted of 489 utterances: 71 of joy, 62 of sadness, 69 of fear, 127 of anger, 81 of boredom and 79 neutral. The selected feature vector was 123 elements long. Table 1 summarises the numbers of features derived from each of the parameters as well as the maximum and average information gain ratio (IGR) of those features.
Tab. 1. Number of features derived from the individual parameters of speech signal for Emo-DB

Parameter     No. of feat.   Max. IGR   Avg. IGR
MFCC          45             0.446      0.258
Pitch         29             0.434      0.241
Intensity     22             0.478      0.245
HNR           13             0.189      0.133
Spectrogram   8              0.429      0.213
LTAS          6              0.440      0.268
Σ             123
Despite the fact that features derived from the MFCC and the pitch form a major part of the vector, LTAS features have the highest average IGR. All other acoustic parameters are represented among the selected features — each of the speech parameters is relevant and nonredundant. However, it should be noticed that HNR has a significantly lower contribution than the other parameters. The system properly classified 396 (81.98%) instances using the Bayes net classifier. Table 2 shows the confusion matrix and the accuracy of classification of each emotion. Results achieved using other classifiers were: 79.7% for perceptron, 80.1% for RBF network and 80.9% for SVM. In comparison, in the study [30] an accuracy of 79.47% was achieved. In the studies [26] and [29] the whole database (including disgust) was used and respectively 78.58% and 66.5% were achieved.
Tab. 2. Confusion matrix for Emo-DB

Actual    Classified as
class     N    J    S    F    A    B     Accuracy
N         68   0    1    3    0    7     86.1%
J         2    37   0    8    24   0     52.1%
S         3    0    58   0    0    1     93.5%
F         1    8    4    52   4    0     75.4%
A         0    11   0    4    112  0     88.2%
B         8    0    3    1    0    69    85.2%
Polish Database. In the experiment for the Polish language, all 240 recordings (40 of each emotion) were used. There were 86 selected features. Table 3 summarises the numbers of features derived from each of the parameters as well as the maximum and average information gain ratio (IGR) of those features. It can be noticed that the LTAS has a significantly higher average IGR than the other parameters.
Tab. 3. Number of features derived from the individual parameters of speech signal for Database of Polish Emotional Speech

Parameter   No. of feat.   Max. IGR   Avg. IGR
MFCC        41             0.467      0.287
Pitch       18             0.458      0.224
Intensity   12             0.390      0.278
HNR         6              0.320      0.223
LTAS        6              0.666      0.454
Spectrum    3              0.240      0.200
Σ           86
The system, using the Bayes net classifier, properly classified 177 (73.75%) instances (64.18% in [6]). Results achieved using other classifiers were: 71.7% for perceptron, 67.9% for RBF network and 70.8% for SVM. Table 4 shows the confusion matrix and the accuracy of classification of each emotion.
Tab. 4. Confusion matrix for Database of Polish Emotional Speech

Actual    Classified as
class     N    J    S    F    A    B     Accuracy
N         31   0    2    3    0    4     77.5%
J         2    32   0    0    6    0     80.0%
S         0    0    35   1    0    4     87.5%
F         1    0    8    21   8    2     52.5%
A         0    7    0    4    29   0     72.5%
B         3    0    7    1    0    29    72.5%
3.3. Speaker dependent
Experiments with speaker dependent recognition were carried out for the recordings from the Database of Polish Emotional Speech. The feature selection and classification process was done separately for each speaker's 30 utterances. The lengths of the feature vectors and the classification accuracy are summarised in table 5.

Tab. 5. Feature vector length and classification accuracy for speaker dependent tests

Speaker                    M1     M2    M3    M4    F1    F2     F3    F4
Length of feature vector   25     26    42    30    39    25     37    28
Classification accuracy    96.7%  100%  100%  100%  100%  96.7%  100%  100%
3.4. For social robots
In order to test the developed system for its usability for speech emotion recognition for social robots, a group of people was asked to utter phrases expressing four different emotional states: happiness, compassion, contempt and the neutral state. These statements were intended to match the most common people's reactions to a short-term contact social robot. The social robot Samuel (figure 3) was given as an example [3]. It was assumed that in such a task it is better to recognise fewer emotions with the (expected) better accuracy. In comparison to the discussed databases, the meaning of the phrases corresponds to the expressed emotion — it was concluded that in this case it is more likely. Example sentences are (English translation in brackets):
- Ale wyczesany robot! (What a cool robot!)
- Musi ci być smutno tak stać… (It must be sad to stand like this…)
- Weźcie to stąd! (Take this away!)
- Na podłodze leży dywan. (A carpet lies on the floor.)
Recordings were taken in a classroom in the presence of many people at the same time, so the conditions were far from the studio both in terms of acoustics and background noise. An electret microphone was used along with a battery powered preamp and an external USB sound card. The sampling rate was set at 22050 Hz, and the resolution was 16 b. During the experiment 160 sentences were recorded (40 for each emotion). One recording turned out to be corrupted, so it was ruled out. The test set contained 159 recordings (39 for happiness, and 40 for each of the other emotions). System performance was evaluated using the Bayes net classifier and tenfold cross-validation. The system correctly classified 110 instances (69%). Table 6 shows the confusion matrix and the accuracy of classification of each emotion. In the feature selection process 33 features were selected: 19 derived from MFCC, five from pitch, four from spectrogram, two from intensity, two from LTAS and one from HNR. Differently from the previous experiments, the highest average IGR have the features derived from the intensity (0.269), while those derived from LTAS have a much lower average IGR (0.201); the lowest average IGR have those derived from HNR (0.164).
Fig. 3. Social robot Samuel.
Tab. 6. Confusion matrix for the social robot experiment

Actual    Classified as
class     Neu.  Cont.  Hap.  Comp.   Correctly classified
N         30    7      1     2       75%
Ct        5     30     2     3       75%
H         4     9      25    1       64%
Cm        8     0      7     25      63%
4. Conclusions
This paper presents and discusses the speech emotion recognition system based on the acoustic features of speech, independently of its semantics. Verification of the system using the Berlin Database of Emotional Speech and the Database of Polish Emotional Speech confirms the effectiveness of the chosen feature extraction, selection and classification methods. It should be noted, however, that in the Polish language fear is clearly less recognised than other emotions, as is joy in the German language. For both languages, joy is often confused with anger — these emotions of opposite valence both have high arousal and are close to each other in the acoustic feature space. This fact is important, because if the system was used by a social robot this type of mistake could result in an incorrect response. One of the solutions could be weighting the errors of misclassifying different pairs of emotions. Verification for the target application showed that it is possible for the developed system to recognise emotions in recordings made in "non-sterile" conditions. The achieved recognition accuracy is promising for future usage. However, a social robot, designed to be a human's companion, should be able to recognise more emotional states than a short-term contact robot. Therefore, further experiments and research should be carried out, especially concerning noise robustness.

Acknowledgement
This research was supported by Wroclaw University of Technology under a statutory grant.

AUTHOR
Łukasz Juszkiewicz* – Wrocław University of Technology, Institute of Computer Engineering, Control and Robotics, 50-370 Wrocław, Wybrzeże Wyspiańskiego 27, e-mail: lukasz.juszkiewicz@pwr.wroc.pl.
* Corresponding author
REFERENCES
[1] P. Boersma, "Accurate Short-Term Analysis of the Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound", Institute of Phonetic Sciences, University of Amsterdam, Proceedings, vol. 17, 1993, pp. 97–110.
[2] R. R. Bouckaert, "Bayesian network classifiers in weka for version 3-5-7", Network, 2008.
[3] R. Budziński, J. Kędzierski, and B. Weselak, "Head of social robot Samuel – construction (in Polish)". In: Prace Naukowe Politechniki Warszawskiej, Elektronika, pp. 185–194, z. 175, t. I. Oficyna Wydawnicza Politechniki Warszawskiej, 2010.
[4] F. Burkhardt, A. Paeschke, M. Rolfes, W. F. Sendlmeier, and B. Weiss, A database of german emotional speech, vol. 2005, pp. 3–6. Citeseer, 2005.
[5] S. Casale, A. Russo, G. Scebba, and S. Serrano, "Speech emotion classification using machine learning algorithms". In: Proceedings of the 2008 IEEE International Conference on Semantic Computing, Washington, DC, USA, 2008, pp. 158–165.
[6] J. Cichosz and K. Ślot, "Emotion recognition in speech signal using emotion-extracting binary decision trees". In: Proceedings of the 2nd International Conference on Affective Computing and Intelligent Interaction (ACII): Doctoral Consortium, 2007.
[7] G. F. Cooper and T. Dietterich, "A bayesian method for the induction of probabilistic networks from data". In: Machine Learning, 1992, pp. 309–347.
[8] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and J. G. Taylor,
"Emotion recognition in human-computer interaction", IEEE Signal Processing Magazine, vol. 18, no. 1, 2001, pp. 32–80.
[9] K. Dautenhahn, Socially intelligent agents: creating relationships with computers and robots, chapter Creating emotion recognition agents for speech signal. Multiagent systems, artificial societies, and simulated organizations. Kluwer Academic Publishers, 2002.
[10] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The weka data mining software: an update", SIGKDD Explor. Newsl., vol. 11, 2009, pp. 10–18.
[11] M. A. Hall, Correlation-based Feature Subset Selection for Machine Learning. PhD thesis, Department of Computer Science, University of Waikato, Hamilton, New Zealand, April 1999.
[12] S. Haykin, Advances in spectrum analysis and array processing, Prentice Hall advanced reference series: Engineering, Prentice Hall, 1995.
[13] E. Keller, "The analysis of voice quality in speech processing". In: Lecture Notes in Computer Science, 2005, pp. 54–73.
[14] P. Langley and S. Sage, "Induction of selective bayesian classifiers". In: Conference on uncertainty in artificial intelligence, 1994, pp. 399–406.
[15] Lodz University of Technology, Medical Electronics Division, "Database of Polish Emotional Speech". http://www.eletel.p.lodz.pl/bronakowski/med_catalog/docs/licence.txt
[16] X. Mao, B. Zhang, and Y. Luo, "Speech emotion recognition based on a hybrid of HMM/ANN". In: Proceedings of the 7th WSEAS International Conference on Applied Informatics and Communications – Volume 7, Stevens Point, Wisconsin, USA, 2007, pp. 367–370.
[17] I. Murray and J. Arnott, "Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion", Journal of the Acoustic Society of America, vol. 93, no. 2, 1993, pp. 1097–1108.
[18] T. L. Nwe, S. W. Foo, and L. C. D. Silva, "Speech emotion recognition using hidden markov models", Speech Communication, vol. 41, 2003, pp. 603–623.
[19] A. Osherenko and E. André, "Differentiated semantic analysis in lexical affect sensing", 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, 2009, pp. 1–6.
[20] P. Boersma and D. Weenink, "Praat: doing phonetics by computer (version 5.2.05)", 2010.
[21] S. Samad, A. Hussain, and L. K. Fah, "Pitch detection of speech signals using the cross-correlation technique". In: TENCON 2000. Proceedings, vol. 1, 2000, pp. 283–286.
[22] S. Schnall, "The pragmatics of emotion language", Psychological Inquiry, vol. 16, no. 1, 2005, pp. 28–31.
[23] B. Schuller, A. Batliner, D. Seppi, S. Steidl, T. Vogt, J. Wagner, L. Devillers, L. Vidrascu, N. Amir, L. Kossous, and V. Aharonson, "The relevance of feature type for the automatic classification of emotional user states: Low level descriptors and functionals". In: Proceedings of Interspeech, Antwerp, Belgium, 2007.
[24] B. Schuller, S. Reiter, R. Muller, M. Al-Hames, M. Lang, and G. Rigoll, "Speaker independent speech emotion recognition by ensemble classification". In: Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on, 2005, pp. 864–867.
[25] B. Siciliano and O. Khatib, eds., Springer Handbook of Robotics, Springer, 2008.
[26] J. Sidorova, "Speech emotion recognition with TGI+.2 classifier". In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, 2009, pp. 54–60.
[27] S. S. Stevens, J. Volkmann, and E. B. Newman, "A scale for the measurement of the psychological magnitude pitch", Journal of the Acoustical Society of America, vol. 8, no. 3, 1937, pp. 185–190.
[28] Ł. Juszkiewicz, Postępy robotyki, chapter Speech emotion recognition system for social robots (in Polish), pp. 695–704. Oficyna Wydawnicza PW, 2012.
[29] T. Vogt, Real-time automatic emotion recognition from speech. PhD thesis, Technischen Fakultät der Universität Bielefeld, 2010.
[30] Z. Xiao, E. Dellandréa, W. Dou, and L. Chen, "Hierarchical Classification of Emotional Speech". Technical Report RR-LIRIS-2007-006, LIRIS UMR 5205 CNRS/INSA de Lyon/Université Claude Bernard Lyon 1/Université Lumière Lyon 2/École Centrale de Lyon, March 2007.
[31] M. You, C. Chen, J. Bu, J. Liu, and J. Tao, "A hierarchical framework for speech emotion recognition". In: Industrial Electronics, 2006 IEEE International Symposium on, vol. 1, 2006, pp. 515–519.
[32] S. Zhang and Z. Zhao, "Feature selection filtering methods for emotion recognition in chinese speech signal". In: Signal Processing, 2008. ICSP 2008. 9th International Conference on, 2008, pp. 1699–1702.
[33] J. Zhou, G. Wang, Y. Yang, and P. Chen, "Speech emotion recognition based on rough set and svm". In: Y. Yao, Z. Shi, Y. Wang, and W. Kinsner, eds., IEEE ICCI, 2006, pp. 53–61.