pISSN 1897-8649 (PRINT) / eISSN 2080-2145 (ONLINE)
VOLUME 8
N° 3
2014
www.jamris.org
Journal of Automation, Mobile Robotics & Intelligent Systems
Editor-in-Chief Janusz Kacprzyk (Systems Research Institute, Polish Academy of Sciences; PIAP, Poland)
Associate Editors: Jacek Salach (Warsaw University of Technology, Poland) Maciej Trojnacki (Warsaw University of Technology and PIAP, Poland)
Co-Editors: Oscar Castillo (Tijuana Institute of Technology, Mexico) Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA) Kaoru Hirota (Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Japan) Witold Pedrycz (ECERF, University of Alberta, Canada) Roman Szewczyk (PIAP, Warsaw University of Technology, Poland)
Statistical Editor: Małgorzata Kaliczynska (PIAP, Poland)
Executive Editor: Anna Ladan aladan@piap.pl
The reference version of the journal is the e-version.
Editorial Board:
Chairman: Janusz Kacprzyk (Polish Academy of Sciences; PIAP, Poland)
Mariusz Andrzejczak (BUMAR, Poland)
Plamen Angelov (Lancaster University, UK)
Zenn Bien (Korea Advanced Institute of Science and Technology, Korea)
Adam Borkowski (Polish Academy of Sciences, Poland)
Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany)
Chin Chen Chang (Feng Chia University, Taiwan)
Jorge Manuel Miranda Dias (University of Coimbra, Portugal)
Bogdan Gabrys (Bournemouth University, UK)
Jan Jablkowski (PIAP, Poland)
Stanisław Kaczanowski (PIAP, Poland)
Tadeusz Kaczorek (Warsaw University of Technology, Poland)
Marian P. Kazmierkowski (Warsaw University of Technology, Poland)
Józef Korbicz (University of Zielona Góra, Poland)
Krzysztof Kozłowski (Poznan University of Technology, Poland)
Eckart Kramer (Fachhochschule Eberswalde, Germany)
Piotr Kulczycki (Cracow University of Technology, Poland)
Andrew Kusiak (University of Iowa, USA)
Mark Last (Ben–Gurion University of the Negev, Israel)
Patricia Melin (Tijuana Institute of Technology, Mexico)
Tadeusz Missala (PIAP, Poland)
Fazel Naghdy (University of Wollongong, Australia)
Zbigniew Nahorski (Polish Academy of Science, Poland)
Antoni Niederlinski (Silesian University of Technology, Poland)
Witold Pedrycz (University of Alberta, Canada)
Duc Truong Pham (Cardiff University, UK)
Lech Polkowski (Polish-Japanese Institute of Information Technology)
Alain Pruski (University of Metz, France)
Leszek Rutkowski (Czestochowa University of Technology, Poland)
Klaus Schilling (Julius-Maximilians-University Würzburg, Germany)
Ryszard Tadeusiewicz (AGH Univ. of Science and Technology in Cracow, Poland)
Stanisław Tarasiewicz (University of Laval, Canada)
Piotr Tatjewski (Warsaw University of Technology, Poland)
Władysław Torbicz (Polish Academy of Sciences, Poland)
Leszek Trybus (Rzeszów University of Technology, Poland)
René Wamkeue (University of Québec, Canada)
Janusz Zalewski (Florida Gulf Coast University, USA)
Marek Zaremba (University of Québec, Canada)
Teresa Zielinska (Warsaw University of Technology, Poland)
Editorial Office: Industrial Research Institute for Automation and Measurements PIAP, Al. Jerozolimskie 202, 02-486 Warsaw, POLAND, Tel. +48-22-8740109, office@jamris.org
Copyright and reprint permissions: Executive Editor
Publisher: Industrial Research Institute for Automation and Measurements PIAP
If in doubt about the proper edition of contributions, please contact the Executive Editor. Articles are reviewed, excluding advertisements and descriptions of products. All rights reserved ©
JOURNAL of AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS VOLUME 8, N° 3, 2014 DOI: 10.14313/JAMRIS_3-2014
CONTENTS
Interactive Programming of a Humanoid Robot
Mikołaj Wasielica, Marek Wąsik, Andrzej Kasiński DOI: 10.14313/JAMRIS_3-2014/21
WiFi-Guided Visual Loop Closure for Indoor Navigation Using Mobile Devices
Michał Nowicki DOI: 10.14313/JAMRIS_3-2014/22
Application of the One-factor CIR Interest Rate Model to Catastrophe Bond Pricing under Uncertainty
Piotr Nowak, Maciej Romaniuk DOI: 10.14313/JAMRIS_3-2014/23
Boosting Support Vector Machines for RGB-D Based Terrain Classification
Jan Wietrzykowski, Dominik Belter DOI: 10.14313/JAMRIS_3-2014/24
Power System State Estimation using Dispersed Particle Filter
Piotr Kozierski, Marcin Lis, Dariusz Horla DOI: 10.14313/JAMRIS_3-2014/25
Face Detection in Color Images using Skin Segmentation
Mohammadreza Hajiarbabi, Arvin Agah DOI: 10.14313/JAMRIS_3-2014/26
Computing with Words, Protoforms and Linguistic Data Summaries: Towards a Novel Natural Language Based Data Mining and Knowledge Discovery Tools
Janusz Kacprzyk, Sławomir Zadrożny DOI: 10.14313/JAMRIS_3-2014/27
The Bi-partial Version of the p-median/p-center Facility Location Problem and Some Algorithmic Considerations
Jan W. Owsinski DOI: 10.14313/JAMRIS_3-2014/28
A Novel Generalized Net Model of the Executive Compensation Design
Krassimir T. Atanassov, Aleksander Kacprzyk, Evdokia Sotirova DOI: 10.14313/JAMRIS_3-2014/29
Interactive Programming of a Humanoid Robot
Submitted: 16th April 2014; accepted: 29th May 2014
Mikołaj Wasielica, Marek Wąsik, Andrzej Kasiński
DOI: 10.14313/JAMRIS_3-2014/21
Abstract: This paper presents a control system for a humanoid robot based on human body movement tracking. The system uses the popular Kinect sensor to capture the motion of the operator and allows a small, low-cost, proprietary robot to mimic full-body motion in real time. The tracking controller is based on optimization-free algorithms and uses the full set of data provided by the Kinect SDK in order to make movements feasible for the humanoid robot, whose kinematics differ considerably from those of the human body. To maintain robot stability, we implemented a balancing algorithm based on a simple geometrical model, which adjusts only the configuration of the robot's foot joints and leaves the imitated posture unchanged. Experimental results demonstrate that the system allows the robot to mimic captured human motion sequences.
Keywords: bipeds, control of robotic systems, humanoid robots, legged robots, teleoperation
1. Introduction
Programming humanoid robot movements is a difficult task. The movements are usually programmed manually or with numerical optimization techniques, which requires considerable knowledge of the kinematics and dynamics of the robot. Since humanoid robot movements should be natural and human-like, human motion capture systems appear to be the preferred solution. However, differences between human and robot kinematics and dynamics, as well as high computational cost, make a straightforward solution of this problem difficult. In our previous work [13] we presented the small-size humanoid robot M-bot (Fig. 1), designed from scratch as an alternative to robots built from commercially available kits [1] [2] [3]. As far as construction is concerned, the main assumptions were low cost and a relatively high number (23) of Degrees of Freedom (DOF). In our recent work [12] we also presented a manual programming method for the robot. In this paper, we propose a simple, efficient and low-cost control system for our humanoid robot. The input device is the Microsoft Kinect sensor [5], which is relatively cheap compared to professional motion capture (MoCap) systems and has the advantage that the operator does not need to wear a sophisticated MoCap suit. The Kinect sensor and the accompanying Microsoft Kinect Software Development Kit (SDK) provide the 3D Cartesian positions of the joints of the observed person. We use all available joints simultaneously to provide the closest possible imitation of motion together with static stability of the robot. Our mimicking strategy is based on optimization-free solutions and our balance controller uses a simple geometrical model. It should be noted that the servomechanisms in M-bot have significant backlash, no feedback, and limited resolution and control rate. Also, the construction of the robot was not precisely calibrated. Experimental results demonstrate that the controller can track human motion capture data while maintaining the balance of the robot. After introducing related work in the next section, we briefly summarize the system components in Section 3. Section 4 provides a detailed description of the imitation strategy, while a simple stability controller is presented in Section 5. Experimental results are summarized in Section 6. The paper concludes with Section 7.
Fig. 1. The robot (poses obtained from our mimicking system)
2. Related Work
Most of the available control frameworks for humanoid robots require professional motion capture systems, precise commercial robots, and sophisticated algorithms. For example, Yamane et al. [14] developed a system that allows a real force-controlled humanoid robot to maintain balance while mimicking a captured motion sequence. This system employs a simplified model of the robot dynamics to control the balance, a controller for joint feedback, and an optimization procedure yielding joint torques that ensure balancing and tracking simultaneously. However, this system does not allow tracking of motion sequences that include contact state changes, such as stepping. In [15] this approach is extended, but only on a simulated robot, by adding a procedure that keeps the center of pressure inside the support polygon, which varies as the robot moves its feet. Motion-capture-based approaches to controlling a humanoid robot mostly use pre-recorded motion sequences. Only a few systems described in the literature can interact with the human operator in real time. One such system, described in [11], uses the Kinect sensor and the small Nao humanoid. It implements balance control and self-collision avoidance by means of an inverse kinematics model and posture optimization. The captured human motion is split into critical frames and represented by a list of optimized joint angles. The similarity of motion is evaluated by the configuration of the corresponding links on the actor and the imitator in the torso coordinate system. This work is the most similar to our approach, but it was demonstrated on a more complicated robot that, unlike the M-bot, has reliable position feedback in the joints. Moreover, the solution presented in [11] requires numerical optimization, while our approach yields a feasible robot configuration in a single step, using only geometric computations, and is thus computationally efficient.
3. System Components
The scheme of our control system is presented in Fig. 2. The input device of the system is the Kinect sensor, which obtains a depth map of the observed scene in which the human operator is located. Microsoft Kinect SDK beta 2 processes the point cloud and returns the Cartesian positions of 20 skeletal joints [6]. This data is the input to the trajectory forming algorithm, which converts the 3D positions of the human joints into the angular configuration of the robot's servomechanisms. The configuration is then modified to maintain static stability. Finally, the corrected joint angles are transferred to the robot.
Fig. 2. Kinect-based programming system overview
3.1. Motion Capture System
Kinect is equipped with an RGB camera (1280×1024 pixels for 63×50 degrees FOV, 2.9 mm focal length, 2.8 μm pixel size), an IR camera (1280×1024 pixels for 57×45 degrees FOV, 6.1 mm focal length, 5.2 μm pixel size) [8], and an IR pattern projector. The IR camera and the IR projector are used together to triangulate the positions of points in space. The minimal depth sensor range is 800 mm and the maximal range is 4000 mm. However, the Kinect for Windows hardware can be switched to Near Mode, which provides a range of 500 mm to 3000 mm. The depth map has 11-bit resolution, which provides 2,048 levels of sensitivity [7]. The highest possible frame rate is 30 fps, which is available for the depth and color streams at 640×480 pixels resolution [4].
Fig. 3. Microsoft Kinect SDK Motion Capture process [10]
Microsoft Kinect SDK is able to track users, providing detailed information about twenty joints of the user's body in the camera field of view. Fig. 3 shows an overview of this process. First, human silhouettes are extracted from a single input depth image (because a depth map is used, it is no longer necessary to wear a special MoCap suit). Then a per-pixel body part distribution is inferred; in Fig. 3, each color indicates the most likely part label at each pixel. Finally, local modes of each part distribution are estimated to give confidence-weighted proposals for the 3D locations of body joints [10].
3.2. The Robot
Our robot is a proprietary construction. It was made of scale-model servomechanisms and hand-bent 1 mm aluminium sheet (Fig. 1). The robot weighs 1.8 kg and is 42 cm tall. The total number of DOFs is 23. It has 7 servos in each leg. Usually a robot leg has 6 DOFs, but we added bending toes to improve walking capabilities. Three DOFs are located in each arm, one in the trunk, and two in the head. Inside the robot there is a custom-made printed circuit board equipped with an ATXMega microcontroller, a 3-axis accelerometer and a 3-axis gyroscope. All 23 servomechanisms are directly connected to this board. We added a Bluetooth module to enable wireless communication with the host computer. This link is used to supplement the computational capabilities of the microcontroller and to allow the addition of external sensors such as Kinect.
4. Motion Imitation Problem
The Kinect sensor provides MoCap data as 3D Cartesian coordinates, but our robot's actuators are controlled by angular position. Since we operate in two different coordinate systems, we have to define what it means for the robot to reproduce a human pose adequately. We considered scaling the operator's joint positions to compare them with the robot's, but our robot has different proportions from a human, and the operator will not always be the same person. As a result, we would have to adjust the scale for each joint and for each change of operator, and even this would not guarantee pose adequacy. We observed that humans learn new choreography by imitating the angular configuration of a pose rather than the Cartesian positions of joints, e.g. children learning new poses from adults. Therefore we decided to represent human posture as a configuration of rotational joints and then transfer it directly to the robot's servomechanisms. Operating in the joint-angle space guarantees the important configuration-based scale invariance.
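The scale invariance claimed above can be checked numerically. The following is a minimal sketch, not the authors' implementation: the joint coordinates and the scale factor are made up, and `joint_angle` is a hypothetical helper that measures the angle at a joint from three 3D positions such as those returned by a skeleton tracker.

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (radians) between the segments b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.sqrt(sum(ui * ui for ui in u)) * math.sqrt(sum(vi * vi for vi in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def scale(points, s):
    """Uniformly scale a list of 3D points (a smaller or larger operator)."""
    return [[s * x for x in p] for p in points]

# Hypothetical shoulder, elbow and wrist positions in metres.
adult = [[0.0, 1.4, 2.0], [0.3, 1.4, 2.0], [0.3, 1.7, 2.0]]
child = scale(adult, 0.6)  # same pose, different body size

adult_elbow = joint_angle(*adult)
child_elbow = joint_angle(*child)
```

Both calls return the same elbow angle although every Cartesian coordinate differs, which is why no per-operator scale has to be tuned when the pose is expressed in joint angles.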
Fig. 4. Kinematic structure of the robot
We also wanted to avoid singular poses and computationally costly iterative calculations while still achieving accurate pose imitation. To this end we assumed that the number of human DOFs and the relative orientation of their rotation axes are the same as the robot's (Fig. 4). The distance between DOFs is unrestricted and depends on human proportions. As the origin of the Coordinate System (CS) we chose the intersection of the spine and shoulder axes. Our skeleton is treated as 5 independent kinematic chains (legs, arms and head). Then we performed a structural reduction of DOFs. Namely, we consider a kinematic chain divided into several sections with at most three DOFs each. Knowing the 3D positions of the 20 human joints from Kinect, we are able to obtain inverse kinematics (IK) equations for each chain section. These IK equations are simple algebraic calculations, so we avoid much slower numerical computations. Because the position of each body part given by Kinect is used, we obtain an exact representation of the human body configuration while avoiding singular poses. Moreover, this solution is optimization-free, because the obtained robot configuration is already close to the operator's.
4.1. Arms
According to the above assumption, we consider the upper limb as made of two sections: the shoulder section, which starts at the origin of the CS and has 2 DOFs, and the arm section. We calculated the configuration of each section using the Euler rotation matrix representation. However, the arm of our robot has 3 DOFs, while a human arm has 4 DOFs, excluding the wrist. We therefore had to compensate for this limitation to obtain a visually acceptable configuration. An advantage of the robot's construction is that the elbow can bend beyond the straight angle, which lets us minimize dead zones (Fig. 5). Also, when a servo reaches its angular limit, the configuration of this joint changes with hysteresis.
Fig. 5. Approximating human arm configurations with the limited number of DOFs in the robot
4.2. Head and Torso
Microsoft Kinect SDK beta 2 does not provide the orientation of the human head, therefore we implemented a controller that directs the robot's "eyes" to look at the operator's head. Knowing the position of the robot's head relative to the Kinect sensor and the relative position of the operator's head, we were able to triangulate the configuration of the 2 DOFs of the robot's neck. The single DOF of the robot's torso enables it to tilt. Its angular position is defined as the roll angle between the origin of the CS and the hip CS of the operator.
4.3. Legs
We divided the leg kinematic chain into two sections: hip and knee. We do not consider the foot orientation, because the information extracted from the Kinect data is usually highly inaccurate (Section 5 explains how the ankle joint configuration is obtained). To define the hip joint configuration, we took into account the orientation of the hip-to-heel vector instead of the thigh orientation. This results from the assumption that the operator's leg length is expressed as the proportional distance between hip and heel, which varies from 0 to 100% of the maximal leg length. The knee angular position is then calculated with the law of cosines. All of this is necessary to improve the balancing algorithm of the robot. Since the hip joint has 3 DOFs, one more vector, perpendicular to the hip-to-heel vector, is needed to obtain the three Euler angles. For this we defined a vector that is the cross product of the thigh and calf orientations. We used it, rather than the foot orientation, because the Kinect data for the calf and thigh area is much more accurate.
4.4. Configuration Correction
As the result of the algorithm's calculations we obtain the configuration of all joints, a set of 23 elements. Sometimes outliers caused by Kinect reading errors occur in this data. To smooth the robot's movements, we average the last 6 measurements for each joint angle. The number of samples was adjusted manually (the greater the number, the less dynamic the movement). We also provided a simple self-collision avoidance method. The algorithm limits the range of the angle of each servomechanism according to eq. 1:
$$\hat{\Theta}_i^t = \begin{cases} C_i^{\min} & \text{if } \Theta_i^t \le C_i^{\min}, \\ \Theta_i^t & \text{if } C_i^{\min} < \Theta_i^t < C_i^{\max}, \\ C_i^{\max} & \text{if } \Theta_i^t \ge C_i^{\max}, \end{cases} \tag{1}$$
where $C_i^{\min}$ and $C_i^{\max}$ are the angular limits for the extreme positions of each servomechanism.
4.5. Operating Modes
Additionally, we implemented two operating modes: show and face-to-face. In the first mode the robot and the operator face the same direction, i.e. towards the audience. In this case the configuration of the operator can be transferred directly to the robot. In the second mode, where the robot faces the operator, the configuration has to be mirrored: we swap the entries of the left and right sides of the robot in our data set and, where necessary, change the sign of the angle.
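A few of the per-joint computations described in Sections 4.3 to 4.5 can be sketched as follows. This is an illustrative sketch only, under stated assumptions: the class and function names, the joint names, the joint limits and the link lengths are all hypothetical, and the knee angle is taken as the interior angle of the hip-knee-heel triangle.

```python
import math
from collections import deque

WINDOW = 6  # number of averaged samples, as stated in Section 4.4

class JointFilter:
    """Moving average over the last WINDOW angles, followed by the clamp of eq. 1."""

    def __init__(self, c_min, c_max, window=WINDOW):
        self.c_min = c_min
        self.c_max = c_max
        self.samples = deque(maxlen=window)  # old samples drop out automatically

    def update(self, theta):
        self.samples.append(theta)
        mean = sum(self.samples) / len(self.samples)
        return min(max(mean, self.c_min), self.c_max)

def knee_angle(extension, thigh, calf):
    """Interior knee angle (radians) from the law of cosines.

    extension is the hip-to-heel distance as a fraction (0..1) of the
    maximal leg length thigh + calf (assumed robot link lengths).
    """
    d = extension * (thigh + calf)
    cos_knee = (thigh ** 2 + calf ** 2 - d ** 2) / (2.0 * thigh * calf)
    return math.acos(max(-1.0, min(1.0, cos_knee)))

def mirror(config, pairs, sign_flip=()):
    """Swap left/right joint entries for the face-to-face mode."""
    out = dict(config)
    for left, right in pairs:
        out[left], out[right] = config[right], config[left]
    for name in sign_flip:
        out[name] = -out[name]
    return out
```

With a window of 6, a single outlier of 130 degrees following five readings of 10 degrees is damped to 30 degrees, which illustrates the trade-off noted in Section 4.4: a longer window suppresses outliers more strongly but makes the motion less dynamic.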
5. Stability Maintenance
The robot configuration described in Section 4 does not account for the difference between human and robot mass distribution and inertia. Because of Kinect's significant measurement error for the position of the human feet, the obtained ankle configurations, which are essential for pose stabilization, are burdened with errors. We therefore need a separate balance algorithm in the robot. It is based on a static model and controls the actual COM position relative to the robot's supporting feet. When a subsequent pose indicates a loss of the robot's static stability, the control system adjusts the COM position using only the robot's foot joints.
5.1. Center of Mass
To maintain the stability of the robot, we first considered using a stabilizer based on the Zero Moment Point (ZMP) [9]. However, the servomechanisms in our robot do not provide any feedback information, not even angular position, and the robot has no pressure sensors in its feet, so an implementation of ZMP was not possible. To maintain static stability we finally decided to implement a controller based on the position of the projection of the Center of Mass (COM) onto the ground plane. We obtained the mass and the relative COM position of each robot segment from CAD data. We calculated the absolute COM position from the configuration data and the relative COM position of each segment, with respect to the robot Coordinate System located in the torso. Assuming that the robot consists of N rigid-body links in three dimensions, the absolute COM position is given as
$$COM_{x,y,z} = \frac{\sum_{i=1}^{N} m_i \, [x_i, y_i, z_i]'}{\sum_{i=1}^{N} m_i}, \tag{2}$$
where $m_i$ is the mass and $[x_i, y_i, z_i]'$ is the position of each robot segment. Because the servomechanisms used have significant backlash and a positioning error of about ±1.5°, we do not consider the COM as a point, but as a ball. We assume that the ball represents the point COM with the measurement uncertainty embedded as its radius. This fundamental assumption implies that, to maintain static stability, the projection of the COM onto the ground plane is not sought anywhere within the foot support area, but on the line segment connecting the centers of the feet. The space of solutions is thereby reduced from two dimensions to one.
5.2. Simple Stability Model
The main task of our system is to imitate the operator accurately, therefore modifications of the obtained robot configuration are undesirable. Also, the readings of the operator's foot orientation are of poor quality, while the robot's ankle joints are essential for balance. Combining these arguments, we propose a simple solution. The position of the robot's COM is calculated before executing each posture, 50 times per second. To adjust the COM position we use only the orientation of the feet; the rest of the body joints are fixed. Changing the orientation of the feet has a negligibly small influence on the mass distribution, therefore we can assume that the position of the robot's COM is constant with respect to the robot's CS in the current frame. Thanks to that, the robot's pose is the same as the operator's. This solution is also optimization-free, because no iterations are needed to estimate the COM position for each configuration change. The only problem to solve is the calculation of the orientation of the robot's foot with respect to its CS. To do this we propose a simple geometrical model, presented in Fig. 6. First we assumed that the ground on which the robot stands is flat, so the normal vector to the surface is parallel to the gravity vector everywhere. Then we can introduce the vector $\vec{V}$, which has to be parallel to the normal of the support surface to maintain static stability. Since the feet are the robot's support surface, with their centers at the points $P_P$ and $P_L$, $\vec{V}$ should start on the line segment $P_P P_L$ and end in the COM. Also, observing that $P_P$ and $P_L$ should lie on the same surface, $\vec{V}$ has to be perpendicular to $P_P P_L$. Combining these assumptions, we obtained equation 3 to calculate the orientation of $\vec{V}$:
$$\vec{V} = \overrightarrow{P_L P_P} \times \left( \overrightarrow{P_L COM} \times \overrightarrow{P_L P_P} \right). \tag{3}$$
Knowing the orientation of $\vec{V}$ with respect to the robot's CS, we can substitute it into the inverse kinematics equations. Because the ankle joint has two DOFs, only a single 3D vector is needed to define its orientation, and we can easily calculate its configuration using explicit algebraic transformations.
5.3. Side Fall Prevention
The introduced ankle strategy guarantees stability only when the projection of the COM onto the line through the foot centers falls inside the segment $P_P P_L$. In other words,
we have to detect the possibility of the COM crossing the outer boundary of the foot and adjust the configuration to prevent the robot from falling to one side. We detect the possibility of falling by measuring the angle between two vectors (eq. 4):
$$\angle(\vec{x}, \vec{y}) = \arccos \frac{\vec{x} \cdot \vec{y}}{|\vec{x}| \cdot |\vec{y}|}. \tag{4}$$
The angles for each pair of vectors are specified in eq. 5:
$$\alpha = \angle(\overrightarrow{P_P COM}, \vec{V}), \qquad \beta = \angle(\overrightarrow{P_L COM}, \vec{V}), \qquad \gamma = \angle(\overrightarrow{P_P COM}, \overrightarrow{P_L COM}). \tag{5}$$
To find out whether the next pose is stable, we evaluate eq. 6:
$$Pose = \begin{cases} \text{Stable} & \text{if } \alpha + \beta = \gamma \wedge \alpha \neq 0 \wedge \beta \neq 0, \\ \text{Stability limit} & \text{if } \alpha = 0 \vee \beta = 0, \\ \text{Unstable} & \text{if } \alpha + \beta > \gamma. \end{cases} \tag{6}$$
We simply calculate whether the vector $\vec{V}$ is inside our modelled triangle (Fig. 6(A)). If it is inside, the pose will be stable and no adjustments are needed, so the configuration can be transmitted directly to the robot. If it is on the stability boundary, the configuration can also be transmitted. In the third case we have to modify the existing configuration to guarantee pose stability. In doing so, we should modify the pose as little as possible to maintain its similarity to the operator's pose. Fig. 7 presents the scheme for adjusting the pose. The aim is to transform the unstable configuration (A) into a stable one (D). First (A) we calculate the proper angle using eq. 5. This angle tells us the deviation from the vertical. Rotating our model by this angle (B), we can calculate the intersection point of the ground and the edge of the triangle. Knowing the position of this point and the position of the vertex, we are able to calculate the distance between them (C). We can then simply shorten the appropriate leg, using the leg length adjustment described in Section 4.3. Finally we obtain a stable pose (D), which can be transferred to the robot after updating the foot orientation.
Fig. 6. Stability maintenance strategy: schematics (A) and visualization on the robot (B)
Fig. 7. Illustration of the side fall prevention strategy (from left to right)
5.4. Single Leg Standing and Correction from IMU
Our balance algorithm can also provide stabilization on a single leg. It requires defining the vector $\vec{V}$ as $\vec{V} = \overrightarrow{P_P COM}$ or $\vec{V} = \overrightarrow{P_L COM}$. The balancing algorithm then adjusts the orientation of the robot's ankle so that the COM is always over the center of the supporting foot. By adding the inclination angles obtained from the IMU to the orientation of the vector $\vec{V}$, we can also compensate to some degree for ground tilt. These additional features have not been implemented, but theoretical considerations suggest that the idea is sound.
6. Experimental Results
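Before turning to the implementation details, the balance test of Section 5 (eqs. 2 to 6) can be illustrated numerically. The sketch below is one possible reading of those equations rather than the authors' code: the function names, the tolerance `tol` and all coordinates are assumptions, and the degenerate case of a COM lying exactly on the line through the foot centers (where V vanishes) is not handled.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def center_of_mass(masses, positions):
    """Mass-weighted mean of the segment positions (eq. 2)."""
    total = sum(masses)
    return tuple(sum(m * p[k] for m, p in zip(masses, positions)) / total
                 for k in range(3))

def angle(x, y):
    """Angle between two vectors in radians (eq. 4)."""
    c = dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))
    return math.acos(max(-1.0, min(1.0, c)))

def support_vector(p_l, p_p, com):
    """Vector V of eq. 3: perpendicular to P_L P_P, pointing towards the COM."""
    a = sub(p_p, p_l)
    return cross(a, cross(sub(com, p_l), a))

def classify_pose(p_l, p_p, com, tol=1e-6):
    """Return 'Stable', 'Stability limit' or 'Unstable' following eqs. 5 and 6."""
    v = support_vector(p_l, p_p, com)
    alpha = angle(sub(com, p_p), v)
    beta = angle(sub(com, p_l), v)
    gamma = angle(sub(com, p_p), sub(com, p_l))
    if alpha < tol or beta < tol:
        return "Stability limit"
    if alpha + beta > gamma + tol:
        return "Unstable"
    return "Stable"

# Hypothetical foot centers 10 cm apart and two body segments (kg, metres).
P_L, P_P = (-0.05, 0.0, 0.0), (0.05, 0.0, 0.0)
com = center_of_mass([1.0, 0.8], [(0.0, 0.0, 0.25), (0.02, 0.0, 0.10)])
state = classify_pose(P_L, P_P, com)
```

The exact comparisons of eq. 6 are replaced by comparisons within a small tolerance, because floating-point angles will almost never satisfy α + β = γ exactly.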
6.1. Controller Implementation
We implemented the control framework on a standard PC, which is connected to the robot via Bluetooth. Each robot actuator is controlled by its internal position feedback loop running at 50 Hz. Therefore our implementation of the posture controller runs at this rate. Our control system uses each configuration extracted from Kinect data twice, because Kinect provides skeletal data at about 25 Hz. It should also be noted that the position of each actuator is not visible to our main controller. This results from the lack of any feedback from the low-cost servomechanisms used. The robot's mass distribution is obtained from a CAD model. Similarly, the angular position of the gears was set by eye, so the commanded configuration only approximates the actual one, judged by the appearance of the robot. Also, the support platform is not levelled. All these factors cause inconsistency between the model and the real data. To compensate for this, while the robot is in its initial pose, we manually adjust the foot joints until the robot is able to stand upright on its own. Then we add these constant offsets to the reference joint angles in order to match the initial reference robot pose with the actual initial pose.
6.2. Control of the System
The control framework is designed to be user-friendly. Because the system was presented with audience participation, we implemented an operator selection algorithm. Among the people in the Kinect field of view, the person who raises their hand takes control of the robot (Fig. 8 A). From this moment the attention of the system is fixed on that person. Controlling the robot uses the operator's whole body, so the operator cannot manage the control application in the traditional way with a mouse or keyboard. Therefore we implemented voice commands in the robot's control system. With them it is possible to establish a connection, to stop, or to turn off the robot. This mode is mainly used to switch the controlling person.
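The rate matching and offset calibration described in Section 6.1 amount to a sample-and-hold with a constant correction. A minimal sketch, assuming configurations are plain lists of joint angles; the function name and all numbers are hypothetical:

```python
def control_cycle_targets(kinect_frames, offsets, repeats=2):
    """Yield one joint configuration per control cycle.

    Each configuration from the (about 25 Hz) skeleton stream is corrected
    by the constant calibration offsets and held for `repeats` cycles of
    the 50 Hz posture controller.
    """
    for frame in kinect_frames:
        corrected = [angle + off for angle, off in zip(frame, offsets)]
        for _ in range(repeats):
            yield corrected

frames = [[0.0, 10.0], [5.0, 20.0]]   # two made-up configurations (degrees)
offsets = [1.5, -2.0]                 # made-up offsets found during manual calibration
targets = list(control_cycle_targets(frames, offsets))
```

Two input configurations produce four control targets, each input being sent twice with the offsets applied.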
6.3. Tracking Human Motion Data
In our experiment the operator presents both simple poses and the hardest poses for the robot, involving both the upper body and the legs. Snapshots from the trial (see Note 1) are shown in Fig. 8. Each pose is presented in frontal view to minimize occlusion of body parts, which results in Kinect readout errors. If an error appears as a step in joint position that is significantly faster than people are able to perform, the robot trying to follow such a movement can in some cases fall down because of the high dynamic forces.
Fig. 8. Snapshots from the experimental validation: taking the control (A), various postures achieved by the robot (B, C, D)
Analysing the captured video frame by frame, we observed an insignificant lag when mimicking motion using Kinect. This lag is mainly caused by the Kinect SDK calculation time. However, the robot's movements stay as dynamic as the operator's. Both the feet and the hands of the robot have the same maximum speed, which does not significantly affect the robot's stability, thanks to the robust stability controller. This is an original property of the robot compared to other solutions [15] [11]. Sometimes, when a robot joint reaches its angular limit, it performs additional corrective movements that are not performed by the operator. Such movements help the robot to avoid singular configurations and improve the robot-human pose similarity within the joint limits. Finally, the overall motion sequence is very similar to the reference motion performed by the operator.
7. Conclusion
In this paper we have proposed a straightforward method to build a real-time full-body human imitation system on a proprietary humanoid robot. First, we suggest relying on the angular representation of a human pose, as it is independent of human size, uses the full set of available joint data, and needs no optimization. We also propose a straightforward balance control strategy based on ankle joint control and have proved it to be efficient. Experimental results demonstrate that our implementation successfully controls a humanoid robot so that it tracks human motion capture data while maintaining balance.
ACKNOWLEDGEMENTS
This research was funded by the Faculty of Electrical Engineering, Poznan University of Technology grant, according to the decision 04/45/DSMK/0126.
Notes
1 A supplemental video is available at: http://lrm.cie.put.poznan.pl/ROBHUM.mp4
AUTHORS
Mikołaj Wasielica∗ – Poznan University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, e-mail: mikolajwasielica@gmail.com, www: http://www.cie.put.poznan.pl/.
Marek Wąsik – Poznan University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, e-mail: wasik.m@gmail.com, www: http://www.cie.put.poznan.pl/.
Andrzej Kasiński – Poznan University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, e-mail: Andrzej.Kasinski@put.poznan.pl, www: http://www.cie.put.poznan.pl/.
∗ Corresponding author
REFERENCES [1] Aldebaran Robotics, 2012 (online product speci ication, link: www.aldebaran-robotics.com). [2] I. Ha, Y. Tamura and H. Asama, “Development of open humanoid platform DARwin-OP”, Advanced Robotics, vol. 27, 2013, no. 3, pages 223–232, DOI:10.1080/01691864.2012.754079. [3] Kondo Kagaku Co., Ltd, 2012 (online product speci ication, link: www.kondo-robot.com). [4] Microsoft, “Kinect for Windows Sensor Components and Speci ications”, 2012 (online documentation, link: http://msdn.microsoft.com/ en-us/library/jj131033.aspx). [5] Microsoft, “Kinect for X-BOX 360”, 2010 (online product speci ication, link: http://www.xbox. com/en-US/kinect). [6] Microsoft Kinect SDK, “Getting Started with the Kinect for Windows SDK Beta from Microsoft Research”, 2011, pages 19–20, (online document, http://www.microsoft.com/en-us/ link: kinectforwindowsdev/Downloads.aspx). [7] Microsoft, “Kinect Sensor”, 2012 (online documentation, link: http://msdn.microsoft.com/ en-us/library/hh438998.aspx).
[8] OpenKinect, “Protocol Documentation”, 2012 (online document, link: http://openkinect.org/wiki/Protocol/_Documentation/#Control/_Commands;a=summary).
[9] P. Sardain and G. Bessonnet, “Forces acting on a biped robot. Center of Pressure – Zero Moment Point”, IEEE Trans. Systems, Man, and Cybernetics, Part A, vol. 34, 2004, no. 5, pages 630–637.
[10] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman and A. Blake, “Real-time human pose recognition in parts from single depth images”, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Providence, USA (2011), pages 1297–1304, DOI:10.1109/CVPR.2011.5995316.
[11] F. Wang, C. Tang, Y. Ou and Y. Xu, “A realtime human imitation system”, Proc. 10th World Congress on Intelligent Control and Automation, Beijing, China (2012), pages 3692–3697, DOI:10.1109/WCICA.2012.6359088.
[12] M. Wasielica, M. Wąsik, A. Kasiński and P. Skrzypczyński, “Interactive Programming of a Mechatronic System: A Small Humanoid Robot Example”, IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Wollongong, Australia (2013), pages 459–464, DOI:10.1109/AIM.2013.6584134.
[13] M. Wasielica, M. Wąsik and P. Skrzypczyński, “Design and applications of a miniature anthropomorphic robot”, Pomiary Automatyka Robotyka, vol. 2, 2013, pages 294–299.
[14] K. Yamane, S. Anderson and J. Hodgins, “Controlling humanoid robots with human motion data: Experimental validation”, Proc. IEEE/RSJ Int. Conf. on Humanoid Robots, Nashville, USA (2010), pages 504–510, DOI:10.1109/ICHR.2010.5686312.
[15] K. Yamane and J. Hodgins, “Control-aware mapping of human motion data with stepping for humanoid robots”, Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Taipei, China (2010), pages 726–733, DOI:10.1109/IROS.2010.5652781.
WiFi-guided Visual Loop Closure for Indoor Localization using Mobile Devices
Submitted: 3rd June 2014; accepted: 11th June 2014
Michał Nowicki
DOI: 10.14313/JAMRIS_3-2014/23
Abstract: Mobile personal devices are getting more capable every year. Equipped with advanced sensors, they can serve as a viable platform to implement and test more complex algorithms. This paper presents an energy-efficient person localization system that detects already visited places. The presented approach combines two independent information sources: a wireless WiFi adapter and a camera. The resulting system achieves higher recognition rates than either of the separate approaches used alone. The evaluation of the presented system is performed on three datasets recorded in buildings of different structure using a modern Android device.
Keywords: mobile devices, localization, sensor fusion
1. Introduction
Mobile devices, like tablets or smartphones, are nowadays equipped with more sensors than a few years ago. Those sensors, combined with increasing processing capabilities, allow the development of more complex, real-time algorithms that can be used for personal navigation or detection of potentially dangerous situations. Those algorithms have not only academic but also commercial significance due to the popularity of personal mobile devices in the modern world.

One of the sensors available in every recent Android device is a WiFi adapter. Most users use this adapter to connect to wireless Access Points (APs), but it can also be used as a sensor that measures the strength of surrounding wireless networks. The researched approaches utilizing WiFi scans can be divided into two groups: WiFi triangulation and WiFi fingerprinting. WiFi triangulation uses three or more APs that are visible in line-of-sight and triangulates the user position based on the measured signal strength of each network [1]. This approach is effective if the localization is performed in open-space areas. In a typical building with a cluttered environment that is rich in corridors and additional rooms, WiFi triangulation is still applicable, but the number of APs needed to perform successful localization is higher. Therefore, if there exists an additional prerequisite to use only the already existing AP infrastructure, WiFi triangulation can provide misleading localization, as signal reflections negatively impact the measured signal strength.

In a structured environment, the WiFi information can be used to determine the measured position based on the list of available wireless networks in a single scan. This technique, called WiFi fingerprinting, determines the similarity of the current scan to previous scans or to the entries in a recorded database of WiFi scans. Efficient, working solutions utilizing WiFi fingerprinting were presented in [3], [4] and [12]. Other studies focus on using sensors that are equivalent to the equipment present in typical mobile devices, but do not perform the experiments on actual mobile devices [3], [19]. This information might be used to provide an estimate of the user’s location, but the precision of the signal measurement depends greatly on the orientation of the measurement with respect to the APs. Holding the mobile device in a different way or shadowing the signal with the person’s body affects the obtained results and can have a negative impact on the repeatability of the measurements. Therefore, to alleviate this influence, it is beneficial to incorporate information from additional sensors, e.g., an inertial sensor. Modern mobile devices are in most cases equipped with a 3-axis accelerometer, a 3-axis gyroscope, and a 3-axis magnetometer. The information from these sensors can be used to create a system estimating the orientation of a smartphone [10]. The orientation estimate can later be used to enhance the WiFi measurement.

Another sensor that is standard in mobile devices is a camera. Sight plays a significant role in the localization strategy of human beings, and therefore image processing is researched in the robotics and computer vision communities. Methods estimating the total motion based on consecutive image-to-image estimates are called Visual Odometry and are especially important for mobile robots [14]. Typically, those methods find a sparse set of features that are matched or tracked in consecutive images. The positions of features in the compared images are used to estimate the transformation. Due to the frame-to-frame estimation, those methods suffer from an estimation drift arising from error summation over time.
This approach provides a continuous estimate of motion, but is also computationally demanding and thus energy-consuming. Energy efficiency is especially important for small, portable devices; from the user’s point of view, localization should not have a significant negative impact on battery lifetime. The WiFi-based and vision-based approaches to indoor localization are usually researched separately, neglecting the possible synergies of both information sources and the gains due to data fusion. The known works approaching the problem of multi-sensor fusion for indoor localization on mobile devices are
dominated by the continuous data fusion paradigm, employing a filter-based framework [8]. The presented results are often achieved with custom experimental setups [19], not actual mobile devices. Thus, these works avoid confronting the problems of limited computing power and energy. Some other approaches focus on enhancing WiFi-based localization with data from inertial sensors, but do not use cameras [11], [18].

This paper presents a prototype system that determines on demand the position of a person inside a building using data from the WiFi adapter and camera of a mobile device (smartphone). The acquired WiFi scan is used to determine the best fingerprint match to the WiFi scans recorded previously and stored in a database of known locations. Then, the WiFi-based position estimate is confirmed and refined by matching a compact representation of the location’s visual appearance to the image-based description of the known locations, also stored in a database. Thus, the proposed system combines data from both sources of localization information available in a typical mobile device, achieves higher recognition rates than either of the subsystems, and is less prone to failures caused by the peculiarities of a particular environment. Moreover, the system is energy-efficient, as the loop closure detection procedure is triggered only when needed, as a discrete event. To the best knowledge of the author, a similar idea has not yet been presented in the literature.

Section 2 presents the structure of the proposed system, as well as the details of the WiFi-based and image-based subsystems. Section 3 focuses on the experimental evaluation of each subsystem and the integrated solution, and describes three datasets recorded in different environments and used for evaluation. Section 4 concludes the paper and mentions future work.
2. System Structure
2.1. WiFi Fingerprinting
The WiFi fingerprinting approach was first described in [1]. As WiFi fingerprinting allows localization only in a known environment, a system based on WiFi fingerprints operates in two stages:
- data acquisition stage,
- localization stage.
In the data acquisition stage, certain positions are chosen as references, where available WiFi signals are scanned and stored in a database. These positions can be chosen randomly, uniformly, or based on the structure of the building. Due to energy considerations, the proposed system scans only the positions that are important for user navigation, e.g., doors that have to be crossed, the beginning of a long corridor, or the entrance to a new part of the building. Due to these limitations, it is assumed that the user is capable of performing local navigation, whereas the system provides global position information that the user can apply to plan his/her movement. The WiFi fingerprint approach assumes that each position can be uniquely defined by the combination of access points’ MAC addresses and RSSI signal strength values. An exemplary situation is presented in Fig. 1, where the user movement is represented by dashed lines, whereas the discrete events at which WiFi scanning is performed are drawn using circles. Each WiFi network found in a single position is marked using a line connecting the AP and the user position. The list of WiFi networks available in each position is the list of lines pointing towards the user’s position.

Assuming that the WiFi database of a floor has been created, it is essential to efficiently compare the list of scanned WiFi networks X to the WiFi scans stored in the database D. The comparison has to be performed using a function that evaluates the difference between two scans: the new scan X and one of the scans Y in the database D. Typically, the WiFi scans are compared using the Euclidean norm [1]:

$$d(X, Y) = \frac{1}{N} \sqrt{\sum_{i=1}^{N} (X_i - Y_i)^2}, \qquad (1)$$

where X_i and Y_i represent the strengths of the i-th network found in both scans X and Y, and N is the number of networks found in both scans. Finding the best correspondence in the database can be written as finding the record whose distance to the current scan is minimal:

$$Y_{\min} = \operatorname*{argmin}_{Y \in D} d(X, Y). \qquad (2)$$
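The matching defined by Eqs. (1) and (2) can be illustrated with a minimal Python sketch. It assumes that a scan is stored as a dictionary mapping an AP's MAC address to its RSSI; the function names, the sample data and the handling of scans without common networks are illustrative assumptions, not part of the described system.

```python
import math

def euclidean_scan_distance(x, y):
    """Distance of Eq. (1): (1/N) * sqrt(sum of squared RSSI differences)
    over the N networks present in both scans. Returns None when the scans
    share no network (an assumption of this sketch, not of the paper)."""
    common = x.keys() & y.keys()
    if not common:
        return None
    squared = sum((x[mac] - y[mac]) ** 2 for mac in common)
    return math.sqrt(squared) / len(common)

def best_match(scan, database):
    """Eq. (2): return the (position_id, distance) pair with minimal distance."""
    best = None
    for position_id, stored in database.items():
        d = euclidean_scan_distance(scan, stored)
        if d is not None and (best is None or d < best[1]):
            best = (position_id, d)
    return best

# Hypothetical two-entry database and a new scan.
database = {
    "door_A": {"aa:01": -50.0, "aa:02": -70.0},
    "corridor_B": {"aa:02": -60.0, "aa:03": -80.0},
}
new_scan = {"aa:01": -53.0, "aa:02": -74.0}
print(best_match(new_scan, database))
```

Because the sum runs only over common networks, two scans that share a single network with similar RSSI can obtain a very small distance, which is one reason why the set of detected networks deserves attention alongside the RSSI values.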
The Euclidean distance is usually applied as it allows the user to be positioned precisely based on the measured RSSI values. However, in the case of a sparse set of positions, it is more important to rely on the unique set of found networks than on the strength of these networks. Therefore, an evaluation of various distance/similarity functions is performed in section 3. Moreover, as the system operates, it gathers new data that might be stored either as scans that have been correctly matched to some WiFi fingerprint from the database, or as unclassified cases. This way the system might gather new information, which can be used to detect when the user revisits a position previously added to the database. The information about new positions can also be used to provide the user with a database containing positions important for that particular user, which due to their personal importance might be revisited in the future.

Fig. 1. In WiFi fingerprinting approach, user’s position is recognized based on the combination of scanned WiFi networks

2.2. Visual Loop Closure
Visual loop closure is a technique that tries to determine, based on the captured images, whether the currently observed scene has been previously encountered. Computer vision algorithms usually try to process only a subset of the available image information in order to reduce the processing time. This observation is also valid for visual loop closure, for which the detection of a sparse set of salient features is performed. In most cases the SURF [2] detector and descriptor is used. Another possible approach may utilize the HoG [7] information. Each feature is then described by a set of values representing its local neighbourhood. Descriptors for all salient features are then compressed into a single image descriptor called a word. This approach is known as the Bag-of-Words approach [6]. To compress the information into an image’s word, the Bag-of-Words approach first determines k clusters of descriptor types using the k-means algorithm and then labels each image descriptor with the number of the cluster it has been assigned to. The number of descriptors assigned to each cluster is used to create a histogram representing the image in further computations. The process reduces the representation of a single image to one vector of floating point values. The processing flow of Bag-of-Words is presented in Fig. 2.

Fig. 2. The processing steps of the Visual Bag of Words approach

In practical applications, the vision-based loop closure is hard to detect robustly. Even a small difference in the observation’s orientation can influence the observed feature set and therefore prevent the system from correctly recognizing that the place was previously visited. What is more, the database of image words takes a lot of memory and may grow with the system’s running time; therefore, the corresponding images are not stored.

2.3. WiFi-guided Visual Loop Closure
The main contribution of this paper is the combination of already known algorithms into a robust, data-integrating system. The idea behind the proposed algorithm is simple: match the WiFi information to obtain a global estimate that can serve as a good initial estimate for further confirmation by the vision-based loop closure subsystem. The system starts by gathering WiFi and image information into a database during the preparation task to allow further loop closures. Due to the WiFi mechanism, a WiFi scan takes one to five seconds, depending on the WiFi adapter drivers used. These relatively long scanning times make WiFi fingerprinting useless in the case of dynamic motion, e.g., a person running through a building. Importantly, dynamic motion also negatively impacts the vision-based loop closure, as the images would contain a significant amount of motion blur. Therefore, in the proposed system, dynamic motion is detected using the combination of gyroscope and accelerometer, and in that situation new information is not inserted into the database. Assuming that the motion speed is below the chosen threshold, the WiFi scan, the image, and the orientation from the Android-based orientation estimation system are stored. Between the start and end of the WiFi scan, 20–40 images can be captured. From those images, the images with the most distinct orientations are chosen to best represent different points of view. For each scan, the image taken approximately halfway through the WiFi scan is chosen and will be referred to as the mid-scan image. If there are images with an orientation significantly different from that of the mid-scan image, those images are considered for visual loop closure. At most, the mid-scan image and two additional images with the highest orientation difference are processed per scan in the visual loop closure approach. For each image, its salient features are detected and described using descriptors. The descriptors are then used to form the image’s word using the Bag-of-Words approach. The created word is a shorter representation of the image and allows efficient comparison between images.

The processing in the localization mode of the proposed system is presented in Fig. 3. The system gathers the WiFi and image information. From the image, the Bag-of-Words technique creates a word representing the observed location. Then the WiFi scan is compared to the database entries and, in the case of a successful WiFi fingerprint match, the comparison of words representing the images is performed. If the WiFi match is confirmed by the image match, the mobile device is believed to have been successfully localized. If the position is not recognized, the image and the WiFi scan are stored in the database as a new position used in the recognition process.
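The localization mode described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: scans are dictionaries of RSSI values, image words are equal-length lists, the image word is compared only against the best WiFi candidate, and both thresholds (`min_wifi_similarity`, `max_word_distance`) as well as all function names are hypothetical.

```python
import math

def word_distance(a, b):
    """Euclidean distance between two image words (equal-length histograms)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def localize(scan, word, database, wifi_similarity,
             min_wifi_similarity, max_word_distance):
    """Return the id of the recognized position, or None if unrecognized.

    database maps position_id -> (stored_scan, stored_word).
    Both thresholds are hypothetical tuning parameters of this sketch.
    """
    # Step 1: the WiFi fingerprint match gives the global candidate.
    best_id, best_sim = None, float("-inf")
    for position_id, (stored_scan, _) in database.items():
        sim = wifi_similarity(scan, stored_scan)
        if sim > best_sim:
            best_id, best_sim = position_id, sim
    if best_id is None or best_sim < min_wifi_similarity:
        return None
    # Step 2: the candidate is confirmed by comparing image words.
    _, stored_word = database[best_id]
    if word_distance(word, stored_word) <= max_word_distance:
        return best_id
    return None

def localize_or_store(scan, word, database, new_id, **kwargs):
    """Unrecognized observations become new database entries."""
    position_id = localize(scan, word, database, **kwargs)
    if position_id is None:
        database[new_id] = (scan, word)
    return position_id

def shared_networks(x, y):
    """Placeholder similarity: number of networks seen in both scans."""
    return len(x.keys() & y.keys())

db = {"lab": ({"a": -50, "b": -60}, [0.5, 0.5])}
print(localize({"a": -52, "b": -61}, [0.6, 0.4], db, shared_networks, 2, 0.2))
```

The image word is computed before the WiFi comparison here only to keep the function signature short; on a device, the costly feature extraction could be deferred until a WiFi candidate exists.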
Fig. 3. Processing steps of the proposed loop closure approach

2.4. Implementation Remarks
The proposed approach is application-oriented; therefore, it has been tested on the Samsung Galaxy Note 3, which uses Android 4.4 as its operating system. The information about the WiFi signal strength was captured using the functionality available in the Android Java API. The time of a single WiFi scan depends on the wireless adapter driver installed on the mobile device; on the Samsung Galaxy Note 3 it takes approx. 4 seconds. The image processing was done using the commonly used OpenCV library (2.4.8) [5], which is available for x86/x64 and Android platforms [16]. The proposed application consists of a Java part used for the GUI and less demanding computations, and C++ NDK libraries for more demanding tasks, e.g., image processing. This structure proved to be a good tradeoff between programming complexity and processing time and has already been shown to work well in other Android-based experiments [15]. A similar project structure is also used in [13].

3. Experimental Evaluation
3.1. Recorded Datasets
The experiments were performed on datasets recorded in two buildings of the Poznan University of Technology (the building of Mechatronics, Biomechanics and Nanoengineering (PUT CM) and the Lecture Center (PUT CW)) and a shopping mall located in Poznań (SM). The user, equipped with a smartphone, was moving around the buildings gathering WiFi scans and corresponding images in places that seemed important for user localization due to the building structure, e.g., a short corridor connecting two parts of the building or unique objects in sight. The dataset PUT CM contains 14 places of possible loop closures, whereas the dataset PUT CW contains 20 places of possible loop closures. The shopping mall dataset SM also contains 20 places of possible loop closures. For each place, several WiFi scans and several images were recorded. For each of those positions, one recording was inserted into the database created prior to localization. The remaining samples were used in the testing phase. More information about the datasets is presented in Table 1. Exemplary images from the datasets are presented in Fig. 4.

Tab. 1. Short description of recorded datasets
dataset name                      PUT CM      PUT CW      SM
num. of positions                 14          20          20
num. of records                   140         100         100
avg. num. of WiFis in the scan    14.21       39.21       20.22
avg. RSSI of 5 strongest WiFis    -75.045     -41.642     -32.143
building structure                corridors   openspace   shopping mall
3.2. Testing the Nature of WiFi Signal
The evaluation starts with assessing the repeatability of WiFi scans. In a perfect environment with APs in line-of-sight, the measurements should be identical. In a cluttered environment with possible multiple reflections and additional disturbances due to moving people, the scan information might be noisy. The probability distribution of the measurements is also essential for proposing a distance function that measures the similarity of two scans. This experiment consists of performing 1439 consecutive scans in a single spot using the Samsung Galaxy Note 3.
[Plot: histogram of measured RSSI values for WiFi id=1 and WiFi id=2; x-axis: RSSI, y-axis: percent of total]

Fig. 4. Exemplary images presenting different building structures for PUT CM (row a), PUT CW (row b) and SM (row c) datasets
In the experiment, the average RSSI and its standard deviation were measured for each network. The repeatability of detection was also measured to determine whether it clearly corresponds to the measured signal strength. The results of the experiment are presented in Tab. 2.

Tab. 2. Example of WiFi signal fluctuation for 1439 measurements taken in a single spot
WiFi id   avg(RSSI)   std(RSSI)   Network detection percent
1         -49.12      5.51        100.00%
2         -74.24      3.05        100.00%
3         -74.57      2.99        100.00%
4         -83.49      3.42        56.12%
5         -83.99      1.83        94.02%
6         -84.15      2.69        82.82%
7         -86.08      3.33        94.65%
8         -86.64      1.78        94.65%
9         -87.45      1.21        95.76%
10        -87.57      1.47        67.39%
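The three quantities reported in Tab. 2 can be computed from a series of repeated scans as in the following sketch. The data layout (one dictionary per scan, with undetected networks simply absent) and the use of the population standard deviation are assumptions made for illustration; the sample values are invented and do not come from the experiment.

```python
import statistics

def scan_statistics(scans):
    """Per-network mean RSSI, standard deviation and detection percentage.

    scans is a list of dicts mapping a network id to the RSSI measured in
    that scan; a network missing from a dict was not detected in that scan.
    """
    readings = {}
    for scan in scans:
        for network, rssi in scan.items():
            readings.setdefault(network, []).append(rssi)
    stats = {}
    for network, values in readings.items():
        stats[network] = {
            "avg": statistics.fmean(values),
            # Population standard deviation; whether the paper used the
            # population or sample form is not stated.
            "std": statistics.pstdev(values),
            "detected_percent": 100.0 * len(values) / len(scans),
        }
    return stats

# Invented example: ap1 is seen in every scan, ap2 only in half of them.
scans = [
    {"ap1": -48.0, "ap2": -84.0},
    {"ap1": -50.0},
    {"ap1": -52.0, "ap2": -86.0},
    {"ap1": -50.0},
]
result = scan_statistics(scans)
print(result["ap1"], result["ap2"])
```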
The presented results show that in most cases the stronger the signal, the higher the standard deviation of the measurements. Moreover, with the exception of network 4, the stronger networks are detected with a higher repeatability percentage, and thus they are a good indicator of whether the user is in the vicinity of a previously stored WiFi scan. Fig. 5 presents the histogram of values for the two WiFi networks with the greatest average signal strength. Due to the cluttered environment, the obtained probability distributions are not Gaussian in all cases (e.g., for the WiFi network with id=1). This observation indicates that, when possible, it is better to rely on the combination of detected networks than to trust the measured signal strength, which can differ by up to 20 dBm in a single spot.
Fig. 5. Experimental distribution of RSSI for series of measurements in a single spot

3.3. Testing the Distance Functions used for WiFi Scans Comparison
The WiFi fingerprinting in the proposed approach is used to localize within a discrete set of positions. Therefore, the WiFi fingerprinting returns either the most similar position stored in the database or information about an unsuccessful match. Under these assumptions, comparing WiFi scans using the standard Euclidean distance might not be the best choice, as the combination of detected WiFi networks is in most cases sufficient to determine in which position the scan was performed. To verify this statement, several definitions of distance/similarity functions are proposed and evaluated on the recorded datasets. For each position, one scan was treated as the database entry, whereas the other scans were compared to all available database entries. The results are presented in Tab. 3.

Tab. 3. Comparing WiFi fingerprinting distance functions on the recorded datasets
Used function            PUT CM    PUT CW   SM
Simple similarity        72.86%    67%      75%
Euclidean norm           61.43%    53%      94%
Euclidean norm II        61.43%    53%      94%
Gaussian with σ = 2      82.14%    99%      94%
Gaussian with σ = 3      84.29%    99%      96%
Gaussian with σ = 5      86.43%    100%     97%
Gaussian with σ = 10     85.71%    100%     95%
Gaussian with σ = 15     83.57%    98%      89%
Gaussian with σ = 20     83.57%    96%      84%
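The evaluation protocol used for Tab. 3 (one reference scan per position, all remaining scans matched against every database entry) can be sketched as follows. The scoring function is passed in as a parameter; the count of shared networks is used here only as a placeholder, and all names and sample values are illustrative assumptions.

```python
def recognition_rate(database, test_samples, similarity):
    """Percentage of test scans whose best database match is the true position.

    database maps position_id -> reference scan; test_samples is a list of
    (true_position_id, scan) pairs; similarity(scan_a, scan_b) returns a
    score where larger means more alike.
    """
    correct = 0
    for true_id, scan in test_samples:
        predicted = max(database, key=lambda pid: similarity(scan, database[pid]))
        if predicted == true_id:
            correct += 1
    return 100.0 * correct / len(test_samples)

def simple_similarity(x, y):
    """Placeholder score: number of networks detected in both scans (RSSI ignored)."""
    return len(x.keys() & y.keys())

# Invented data: the third test scan is deliberately misleading.
database = {
    "p1": {"a": -50, "b": -60, "c": -70},
    "p2": {"c": -55, "d": -65, "e": -75},
}
test_samples = [
    ("p1", {"a": -52, "b": -63}),
    ("p2", {"d": -60, "e": -70, "c": -58}),
    ("p1", {"c": -72, "d": -80, "e": -85}),
]
print(recognition_rate(database, test_samples, simple_similarity))
```

A distance function can be evaluated with the same helper by passing its negation, so that a smaller distance yields a larger score.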
The first tested function was a simple similarity function, which represents the number of WiFi networks detected in both scans. This function does not use the RSSI information, but was chosen as the baseline approach that can be a reference point for other approaches. This method achieved position recognition rates of 72.86%, 67% and 75% for the PUT CM, PUT CW and SM datasets, respectively. The high recognition rates of this simple approach are believed to be task-specific. In the performed tests, a sparse set of WiFi measurements is taken in locations that are separated by several meters. As the localization positions are not placed close to each other, in many cases the combination of network names is sufficient to correctly determine the user’s location.

The second tested function is the Euclidean distance defined as in the state-of-the-art works [1]. Surprisingly, the Euclidean norm results in a lower recognition rate for the PUT CM and PUT CW datasets, in contrast to a better recognition rate for SM. The author believes that those results are caused by the different structures of the buildings. In the case of the PUT buildings, the APs are usually placed inside rooms and, regardless of the corridor type, the WiFi signal that reaches the mobile device was probably deflected several times. In the case of the shopping mall, the open spaces result in a WiFi signal propagating directly to the user, thus resulting in a smaller number of deflections.

Another tested function was a Euclidean norm with an additional discount subtracted for each correctly matched network (called Euclidean norm II). This approach was based on the observation that WiFi scans with a higher number of matched networks are intuitively more likely to be the same. This modification did not have any significant impact and resulted in values similar to the Euclidean distance approach.

Due to the low recognition rate achieved with the Euclidean distance variants, the simple similarity idea was expanded to incorporate RSSI values. For simplicity, the RSSI values of the same network are assumed to have a Gaussian distribution. The similarity of networks found in two scans is then defined by Gaussian membership values. The similarity between two scans X and Y is measured as the sum of Gaussian membership values for all networks available in both scans.
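The Gaussian membership similarity just described can be sketched as below. The sketch assumes the standard Gaussian membership form exp(−(x − y)² / (2σ²)) with σ as the assumed standard deviation of an RSSI measurement; the dictionary layout, the function name and the sample values are illustrative.

```python
import math

def gaussian_similarity(x, y, sigma):
    """Sum of Gaussian membership values over networks found in both scans.

    x and y map a network id to its RSSI; sigma is the assumed standard
    deviation of an RSSI measurement. Larger values mean more similar scans.
    """
    common = x.keys() & y.keys()
    return sum(
        math.exp(-((x[net] - y[net]) ** 2) / (2.0 * sigma ** 2))
        for net in common
    )

# Invented scans: network "a" matches exactly, "b" differs by 5, "c" is missing.
reference = {"a": -50.0, "b": -70.0, "c": -80.0}
same_place = {"a": -50.0, "b": -75.0}
print(gaussian_similarity(reference, same_place, sigma=5.0))
```

Each common network contributes a value between 0 and 1, so identical readings on N shared networks score N; as σ grows, the measure approaches the simple count of shared networks.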
Formally, it can be written as:

$$S_{Gauss}(X, Y, \sigma) = \sum_{i=1}^{N} \exp\left\{ -\frac{(X_i - Y_i)^2}{2\sigma^2} \right\}, \qquad (3)$$

where N is the number of common networks found in both scans X and Y, and σ is the standard deviation of the measurement, which defines the shape of the Gaussian membership function. The choice of σ is arbitrary, but from the experiment measuring the WiFi scans in a single spot, it was assumed that the best results should be achieved for a value in the range of 2 to 7. To confirm this assumption, different σ values were tested. As expected, the best results were obtained for σ equal to 5. The results using the modified similarity values turned out to be better than those of the previous approaches. For PUT CM the recognition rate increased to 86.43%, for PUT CW to 100%, and for SM to 97%. In the case of PUT CW, the WiFi information is sufficient to precisely localize the mobile device. In the remaining cases, the image information may be useful for finding loop closures in scenarios where WiFi matching failed.

3.4. Testing the Vision-based Loop Closure
The next tests concern the recognition rate of the vision-based loop closure subsystem. Similarly to the WiFi evaluation, for each distinct position one image was chosen as a reference. The remaining images were then compared against all of the images in the database in order to find a positive match. The images taken with the Samsung Galaxy Note 3 have a maximum resolution of 1920 × 1080 pixels (Full HD). Due to the limited processing power of the mobile platform, a resolution of 640 × 480 pixels (VGA) is chosen, as the reduced image has about 7 times fewer pixels to process. This results in an obvious processing speed-up. A detailed comparison of VGA and Full HD images is presented in Tab. 4. The most time-consuming part of any system using the SURF detector/descriptor is the detection of keypoints, which takes almost 1 s on the Samsung Galaxy Note 3. An obvious reduction of the needed time can be achieved by lowering the number of keypoints used by the system and thus described by the descriptor. Unfortunately, the minimal number of keypoints needed to achieve a robust system is application-dependent; in the proposed tests, the 500 strongest keypoints were chosen. Another time reduction strategy is to use different detector/descriptor pairs [15], but tests concerning the choice of detector/descriptor pairs are not a part of the presented research.
Tab. 4. Processing time of the proposed visual loop closure subsystem
System part                                      S. Galaxy Note 3 VGA   S. Galaxy Note 3 Full HD   Nexus 7 VGA   Nexus 7 Full HD
Image resizing                                   18.86 ms               –                          34.0 ms       –
Keypoints detection                              460.53 ms              2376.32 ms                 443.57 ms     2516.21 ms
Keypoints description                            486.73 ms              720.86 ms                  431.21 ms     648.57 ms
Word creation (K=5)                              30.71 ms               30.14 ms                   34.00 ms      33.64 ms
Word creation (K=20)                             127.21 ms              126.78 ms                  116.79 ms     116.07 ms
Word creation (K=50)                             351.21 ms              349.43 ms                  374.00 ms     300.43 ms
Word creation (K=200)                            1417.36 ms             1378.71 ms                 1303.36 ms    1133.71 ms
Estimated total time per word creation (K=20)    1093 ms                3224 ms                    1026 ms       3281 ms

Fig. 6. Similar corridors with corresponding image words observed in two distant localization positions in PUT CM dataset

When the detection and description parts are finished, the creation of a k-dimensional word is initiated. The process starts with the classification of the descriptors of keypoints found in the image. Each descriptor is assigned to the cluster whose centroid yields the minimal error. The centroids are computed prior to the localization by running the k-means algorithm on a dataset consisting of every descriptor found in all reference images. After the classification, each image is described by a histogram of length k containing the number of descriptors classified into each cluster. The construction of the image word finishes with a normalization of the histogram. Exemplary words computed for images from the PUT CM dataset are presented in Fig. 6. Once the normalized k-dimensional word for an image has been computed, the subsystem determines the correct match to the reference frames stored in the database by comparing the current image to all entries in the database. In the proposed subsystem, the comparison of words is done using the Euclidean distance. If the smallest distance between matches is lower than a preset threshold, the match is considered to be correct.
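The word creation and matching steps described above can be illustrated with a small pure-Python sketch. It assumes that the centroids have already been computed (e.g., by k-means on the reference descriptors), that the histogram is L1-normalized, and that a match is accepted when the smallest Euclidean distance does not exceed a threshold; these choices, the tiny 2-D descriptors and all names are assumptions for illustration rather than the implementation used on the device.

```python
import math

def nearest_centroid(descriptor, centroids):
    """Index of the centroid with the smallest squared Euclidean error."""
    best_index, best_error = 0, float("inf")
    for index, centroid in enumerate(centroids):
        error = sum((d - c) ** 2 for d, c in zip(descriptor, centroid))
        if error < best_error:
            best_index, best_error = index, error
    return best_index

def image_word(descriptors, centroids):
    """Normalized histogram of length k counting descriptors per cluster.

    L1 normalization is an assumption; the paper does not name the norm.
    """
    histogram = [0.0] * len(centroids)
    for descriptor in descriptors:
        histogram[nearest_centroid(descriptor, centroids)] += 1.0
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram

def match_word(word, reference_words, max_distance):
    """Id of the closest reference word, or None if it is farther than max_distance."""
    best_id, best_distance = None, float("inf")
    for position_id, reference in reference_words.items():
        distance = math.dist(word, reference)
        if distance < best_distance:
            best_id, best_distance = position_id, distance
    return best_id if best_distance <= max_distance else None

# Toy example with k = 3 centroids and 2-D descriptors.
centroids = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [4.0, 5.5]]
word = image_word(descriptors, centroids)
print(word)
```

The nearest-centroid search costs one distance computation per descriptor and centroid, which illustrates why the word creation time grows with the number of clusters.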
To determine what number of classes k used by the k-means algorithm results in the highest recognition rate, different values of k were evaluated. The results are presented in Fig. 7. For the proposed datasets, a higher number of classes for the k-means algorithm results in a higher recognition rate. However, in the presented vision-based loop closure approach, each descriptor is assigned to one of k clusters, so the higher the value of k, the longer the total time needed to classify the descriptors. Therefore, it is necessary to find a k value that results in a high recognition rate within a reasonable time. Based on Fig. 7, k equal to 200 is chosen as the best choice and used in the described subsystem. The results obtained by the proposed approach are also presented in Tab. 5. The visual loop closure has the highest recognition rate of visited places in the case of the PUT CM dataset, equal to 97.86% for the chosen k equal to 200. The small number of distinct objects poses a great challenge for the visual system, as the detected features are in most cases similar for all of the positions in the sequence. The problem of similar places also arises for PUT CW, for which the lowest performance is achieved. The recognition rate of 92% for the SM dataset is in most cases a result of situations in which passing pedestrians are present in a significant part of the image and thus make images from the training and testing sets look different.

Fig. 7. The recognition rate and time taken for word creation for different number of centroids evaluated at PUT CW

Tab. 5. Accuracy of the proposed visual loop closure approach
            PUT CM    PUT CW   SM
k = 5       70.71%    48%      64%
k = 20      91.43%    72%      84%
k = 50      95%       78%      86%
k = 200     97.86%    88%      92%
k = 500     98.57%    90%      97%
k = 1500    98.57%    92%      98%

3.5. Results – Testing WiFi Guided Vision Loop Closure
The system that combines information from both subsystems is expected to outperform either of them. In Tab. 6 the best results obtained from both subsystems are presented. The comparison shows that WiFi fingerprinting provides a more reliable estimate for the PUT CW and SM datasets, whereas visual loop closure works better for the PUT CM dataset. In the case of an application tailored for a specific building, the system designer may decide to use only one source of information. If the single-source solution is inefficient, there is a need for a system integrating data from both subsystems. In the case of an unknown building structure, it is essential to correctly weight information from both subsystems. In the presented research, three methods are proposed and tested: 1) method I – rank-based, 2) method II – normalize and sum, 3) method III – normalize and multiply. Method I, called rank-based, assigns to each evaluated position ranks based on the similarity of WiFi scans and on the distance functions for vision-based loop closure to the positions stored in the database. For
each position to classify, the most probable estimates from both subsystems are provided. Then, separately for each subsystem, the most probable estimate is assigned rank 1, the second most probable is assigned rank 2, and so forth. At this point of processing, each position to process is associated with two ranks representing the estimates from both subsystems. Then, for each reference position, the assigned ranks are summed. The position in the database with the lowest sum of ranks is chosen as the combined system estimate. Method II, normalize and sum, also incorporates information about the distances between positions in the estimates of the separate subsystems. To include this information in the proposed system, the estimates of the subsystems must have a similar range of values. Therefore, for each position a vector of distances to all database entries is created and then normalized in the L2 norm. The normalized vector of estimates for WiFi fingerprinting is denoted by w. The equivalent normalized vector for vision loop closure is denoted by v. As the WiFi subsystem operates using similarities between classes, whereas vision-based loop closure uses distances, the final estimate is computed as the difference of estimates (w − v). The inferred position is the index of the maximal element of the vector w − v:
\[ \mathrm{estimatedID}_{II} = \arg\max_{i} \{ w(i) - v(i) \} \tag{4} \]
Method III, normalize and multiply, uses a strategy similar to the previously presented method II. In this case, the distances from vision loop closure are converted to similarities by replacing each value x in the vector v with 1 − x. The resulting vector is again normalized. The best position estimated by the integrated system corresponds to the index of the maximal value after elementwise multiplication of the vectors v and w:
\[ \mathrm{estimatedID}_{III} = \arg\max_{i} \{ (1 - v(i)) \cdot w(i) \} \tag{5} \]
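The three combination rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the WiFi input is taken to be a vector of similarities and the vision input a vector of distances to the same database entries, and ties are resolved by taking the first index.

```python
import math


def l2_normalize(values):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in values))
    return [x / norm for x in values] if norm > 0 else list(values)


def ranks(scores, higher_is_better):
    """Rank 1 goes to the most probable entry, rank 2 to the next, and so on."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def method_rank(wifi_similarity, vision_distance):
    """Method I: pick the entry with the lowest sum of ranks."""
    sums = [a + b for a, b in zip(ranks(wifi_similarity, True),
                                  ranks(vision_distance, False))]
    return min(range(len(sums)), key=sums.__getitem__)


def method_sum(wifi_similarity, vision_distance):
    """Method II: argmax of w(i) - v(i) on L2-normalized vectors."""
    w = l2_normalize(wifi_similarity)
    v = l2_normalize(vision_distance)
    diff = [a - b for a, b in zip(w, v)]
    return max(range(len(diff)), key=diff.__getitem__)


def method_product(wifi_similarity, vision_distance):
    """Method III: argmax of (1 - v(i)) * w(i), renormalizing 1 - v."""
    w = l2_normalize(wifi_similarity)
    v = l2_normalize(vision_distance)
    s = l2_normalize([1.0 - x for x in v])
    prod = [a * b for a, b in zip(s, w)]
    return max(range(len(prod)), key=prod.__getitem__)
```

For example, with WiFi similarities `[0.2, 0.9, 0.4]` and vision distances `[0.8, 0.1, 0.5]`, both subsystems favour the second database entry and all three functions return index 1.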
The proposed system is evaluated in the same way as the subsystems. The results are presented in Tab. 6. For the PUT CM building, all of the tested methods of combining WiFi data with vision performed worse than vision-based loop closure alone. In the other cases, the proposed system performs the same as or better than either of the subsystems. The best results are obtained for method II, and it is the recommended method if the structure of the building is unknown or if the system must operate in changing conditions regarding the uniqueness of images and the number of available WiFi networks. In the case of a system created for a specific building it is recommended to record a training and a testing set and to perform experiments to correctly weight the input of each subsystem. In some cases it might also be necessary to detect some positions based solely on WiFi, whereas in others to rely completely on the gathered images. These remarks are application-specific and cannot be applied to a universal system. For the proposed system operating in three different buildings, the recognition rate was equal to or greater than 90% in each tested case without the use of additional subsystem weights.
Tab. 6. Localization recognition rate of the subsystems and of the different approaches to combining information from the WiFi and Vision subsystems
Dataset   WiFi fingerprinting   Visual loop closure   Method I (rank-based)   Method II (sum)   Method III (product)
PUT CM    86.43%                97.86%                92.14%                  90%               88.57%
PUT CW    100%                  88%                   97%                     100%              100%
SM        97%                   92%                   97%                     98%               98%
4. Conclusion
The proposed event-based, WiFi-guided visual loop closure approach presents a new way of integrating sensor information on mobile platforms and results in a system that outperforms each individual approach. The information from the camera usually helps in localization in areas with a small number of WiFi networks, e.g., corridors or staircases. Surprisingly, the system performed well in the case of corridors that looked alike. The system works with a lower recognition rate in the case of a shopping mall, where sudden occlusions by pedestrians negatively affect the visual localization. Moreover, the achieved results suggest that WiFi and vision information complement each other and provide the data needed to create a more robust localization system. In contrast to the proposed event-based localization, further work will focus on providing a continuous position estimate to the user by estimating the motion through vision-based monocular visual odometry with additional incorporation of WiFi information.
ACKNOWLEDGEMENTS
The author would like to thank Piotr Skrzypczyński for numerous discussions regarding the presented localization system. This work is financed by the Polish Ministry of Science and Higher Education in years 2013-2015 under the grant DI2012 004142.
AUTHOR
Michał Nowicki∗ – Institute of Control and Information Engineering, Poznań University of Technology, Poznań, Poland, e-mail: michal.nowicki@cie.put.poznan.pl.
∗ Corresponding author
REFERENCES
[1] P. Bahl, V. N. Padmanabhan, “RADAR: An In-Building RF-Based User Location and Tracking System.” In: 19th Annual Joint Conf. of the IEEE Computer and Communications Societies (INFOCOM), 2000, pp. 775–784. DOI: http://dx.doi.org/10.1109/INFCOM.2000.832252.
[2] H. Bay, A. Ess, T. Tuytelaars, L. Van Gool, “SURF: Speeded up robust features”, Comp. Vis. and Image Underst., vol. 110, no. 3, 2008, pp. 346–359. DOI: http://dx.doi.org/10.1016/j.cviu.2007.09.014.
[3] J. Biswas, M. Veloso, “WiFi localization and navigation for autonomous indoor mobile robots.” In: 2010 IEEE Int. Conf. on Robotics and Automation (ICRA), 2010, pp. 4379–4384. DOI: http://dx.doi.org/10.1109/ROBOT.2010.5509842.
[4] S. Boonsriwai, A. Apavatjrut, “Indoor WIFI localization on mobile devices.” In: 2013 10th Int. Conf. on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), 2013, pp. 1–5. DOI: http://dx.doi.org/10.1109/ECTICon.2013.6559592.
[5] G. Bradski, “The OpenCV library”, Dr. Dobb’s Journal of Software Tools, opencv.org, 2000.
[6] G. Csurka et al., “Visual categorization with bags of keypoints.” In: Workshop on Statistical Learning in Computer Vision, ECCV, 2004, pp. 1–22.
[7] N. Dalal, B. Triggs, “Histograms of oriented gradients for human detection.” In: IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR), 2005, vol. 1, pp. 886–893. DOI: 10.1109/CVPR.2005.177.
[8] T. Gallagher et al., “Indoor positioning system based on sensor fusion for the Blind and Visually Impaired.” In: Int. Conf. Indoor Positioning and Indoor Navigation (IPIN), 2012, pp. 1–9. DOI: 10.1109/IPIN.2012.6418882.
[9] A. Glover et al., “OpenFABMAP: An Open Source Toolbox for Appearance-based Loop Closure Detection.” In: IEEE Int. Conf. on Robotics and Automation, St Paul, Minnesota, 2011.
[10] J. Gośliński, M. Nowicki, “Performance Comparison of EKF-based Algorithms for Orientation Estimation on Android Platform.”
[11] M. Holĉik, “Indoor Navigation for Android,” M.S. thesis, Faculty of Informatics, Masaryk Univ., Brno, 2012.
[12] H. Liu et al., “Accurate WiFi Based Localization for Smartphones Using Peer Assistance,” IEEE Transactions on Mobile Computing, vol. PP, no. 99, 2013, pp. 1. DOI: 10.1109/TMC.2013.140.
[13] K. Muzzammil bin Saipullah, A. Anuar, N. A. binti Ismail, Y. Soo, “Real-time video processing using native programming on Android platform.” In: Proc. IEEE 8th Int. Col. on Signal Proc. and its App., 2012, pp. 276–281.
[14] M. Nowicki, P. Skrzypczyński, “Combining photometric and depth data for lightweight and robust visual odometry.” In: European Conference on Mobile Robots (ECMR), 2013, pp. 125–130. DOI: 10.1109/ECMR.2013.6698831.
[15] M. Nowicki, P. Skrzypczyński, “Performance Comparison of Point Feature Detectors and Descriptors for Visual Navigation on Android Platform,” Int. Wireless Communications and Mobile Computing Conference (IWCMC), 2014.
[16] K. Pulli et al., “Real-time Computer Vision with OpenCV,” Commun. ACM, 2012, vol. 55, no. 6, pp. 61–69. DOI: 10.1145/2184319.2184337.
[17] N. Ravi, P. Shankar, A. Frankel, A. Elgammal, L. Iftode, “Indoor localization using camera phones,” Proc. 7th IEEE Work. on Mobile Comp. Sys. and App., 2006, pp. 1–7.
[18] U. Shala, A. Rodriguez, “Indoor Positioning using Sensor-fusion in Android Devices,” M.S. thesis, Dept. Computer Science, Kristianstad Univ., Kristianstad, 2011. http://hkr.diva-portal.org/smash/record.jsf?pid=diva2:475619
[19] M. Quigley, D. Stavens, A. Coates, S. Thrun, “Sub-meter indoor localization in unmodified environments with inexpensive sensors.” Proc. IEEE/RSJ Int. Conf. on IROS, Taipei, 2010, pp. 2039–2046. DOI: 10.1109/IROS.2010.5651783.
Application of the One-Factor CIR Interest Rate Model to Catastrophe Bond Pricing under Uncertainty
Submitted: 10th June 2014; accepted: 27th June 2014
Piotr Nowak, Maciej Romaniuk
DOI: 10.14313/JAMRIS_4-2013/23
Abstract: The number and amount of losses caused by natural catastrophes are important problems for the insurance industry. New financial instruments were introduced to transfer risks from the insurance market to the financial market. In this paper we consider the problem of pricing such instruments, called catastrophe bonds (CAT bonds). We derive valuation formulas using stochastic analysis and fuzzy sets theory. As a model of the short interest rate we apply the one-factor Cox–Ingersoll–Ross (CIR) model. We treat the volatility of the interest rate as a fuzzy number to describe the uncertainty of the market. We also apply the Monte Carlo approach to analyze the obtained cat bond fuzzy prices.
Keywords: asset pricing, catastrophe bonds, CIR model, stochastic analysis, Monte Carlo simulations, fuzzy numbers
1. Introduction
Nowadays, natural catastrophes are an important source of serious problems for insurers and reinsurers. Even a single catastrophic event can result in damages worth billions of dollars – e.g. the losses from Hurricane Katrina in 2005 are estimated at $40–60 billion (see [26]). The insurance industry is not prepared for such extreme damages. The classical insurance approach is based on the assumption of independent and small (in comparison to the value of the whole insurance portfolio) losses (see, e.g. [3]). This assumption is not adequate in the case of outcomes of natural catastrophes, like hurricanes, floods, earthquakes etc. Therefore, after such a catastrophic event, there are bankruptcies of insurers, problems with the liquidity of their reserves or increases of reinsurance premiums. For example, after Hurricane Andrew more than 60 insurance companies fell into insolvency (see [26]). Consequently, a new kind of financial instrument was introduced. The main aim of such financial derivatives is to transfer risks from insurance markets into financial markets, which is known as securitization (see, e.g., [10,15,28]). The catastrophe bond, also known as cat bond or Act-of-God bond (see, e.g., [8, 14, 17, 31, 33, 38, 40]), is an example of this new approach. The payment function of the catastrophe bond is connected with an additional random variable, i.e. the triggering point. This triggering point (indemnity trigger, parametric trigger or index trigger) depends on the occurrence of a specified catastrophe (like a hurricane) in a given region and fixed time interval, or it is connected with the value of the issuer’s actual losses from a catastrophic event (like a flood), losses modeled by special software based on the real parameters of a catastrophe, or other parameters of a catastrophe or the value of catastrophic losses (see, e.g. [17,40,41]). Usually, if the triggering point occurs, the payments for the bondholder are lowered or even set to zero. Otherwise, the bondholder receives the full payment from the cat bond.
The cat bond pricing literature is not very rich. An interesting approach applying discrete time stochastic processes within the framework of representative agent equilibrium was proposed in [9]. In [2] the authors applied compound Poisson processes to incorporate various characteristics of the catastrophe process. The authors of [5] improved and extended the method from [2]. In [1] the doubly stochastic compound Poisson process was used to model the claim index, and QMC algorithms were applied. In [13] structured cat bonds were valued with application of the indifference pricing method. Vaugirard in [40] used the arbitrage method for pricing catastrophe bonds. In his approach a catastrophe bondholder was deemed to have a short position on an option based upon a risk index. A similar approach was proposed in [25], where the Markov-modulated Poisson process was used for description of the arrival rate of natural catastrophes.
In this paper we continue our earlier research concerning the pricing of cat bonds (see [33]). We apply stochastic analysis and fuzzy arithmetic to obtain the catastrophe bond valuation expression. In our approach the risk-free spot interest rate r is described by the Cox–Ingersoll–Ross model. For description of natural catastrophe losses we use a compound Poisson process with a deterministic intensity function. We also consider a complex form of the catastrophe bond payoff function, which is piecewise linear.
The main assumptions in our approach are: (i) the absence of arbitrage on the financial market, (ii) a neutral attitude of investors to catastrophe risk. Similar assumptions were made by other authors (see, e.g. [40]). Applying fuzzy arithmetic, we take into account different sources of uncertainty, not only the stochastic one. In particular, the volatility parameter of the spot interest rate is determined by the fluctuating financial market and in many situations its uncertainty is not of a stochastic type. Therefore, in order to obtain the cat bond valuation formula we apply a fuzzy volatility parameter of the stochastic process r. As a result, the price obtained by us has the form of a fuzzy number. For a given α (e.g. α = 0.9) its α-level set can be used for investment decision-making as the interval of the cat bond prices with an acceptable membership degree. A similar approach was applied to option pricing in [42] and [30, 34, 35], where Jacod-Grigelionis characteristics of stochastic processes (see, e.g. [29, 39]) were additionally used. In a more general setting, so-called soft approaches are applied in many other fields, see, e.g. [19–23].
This paper is organized as follows. Section 2 contains preliminaries on fuzzy and interval arithmetic. In Section 3 the catastrophe bond pricing formula in the crisp case for the Cox–Ingersoll–Ross risk-free interest rate model is derived. Section 4 is devoted to catastrophe bond pricing with a fuzzy volatility parameter. Since the pricing formula is considered for an arbitrary time moment before maturity, fuzzy random variables are additionally introduced. Apart from the fuzzy valuation formula, the expressions describing the forms of α-level sets of the cat bond price are obtained. In Section 5 the introduced formulas are used to obtain the fuzzy prices of catastrophe bonds. Based on fuzzy arithmetic and the Monte Carlo approach, the behavior of prices is analyzed for various settings close to real-life cases. Special attention is paid to the influence of selected parameters of the model of catastrophic events on the evaluated fuzzy prices. Finally, Section 6 contains conclusions.
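2. Fuzzy Sets Preliminaries
In this section we present basic definitions and facts concerning fuzzy and interval arithmetic, which will be used in the further part of the paper.
For a fuzzy subset $\tilde{A}$ of the set of real numbers $\mathbb{R}$ we denote by $\mu_{\tilde{A}}$ its membership function $\mu_{\tilde{A}} : \mathbb{R} \to [0, 1]$ and by $\tilde{A}_\alpha = \{x : \mu_{\tilde{A}}(x) \ge \alpha\}$ the α-level set of $\tilde{A}$ for $\alpha \in (0, 1]$. Moreover, by $\tilde{A}_0$ we denote the closure of the set $\{x : \mu_{\tilde{A}}(x) > 0\}$. A fuzzy number $\tilde{a}$ is a fuzzy subset of $\mathbb{R}$ for which $\mu_{\tilde{a}}$ is a normal, upper-semicontinuous, fuzzy convex function with a compact support. If $\tilde{a}$ is a fuzzy number, then for each $\alpha \in [0, 1]$ the α-level set $\tilde{a}_\alpha$ is a closed interval of the form $\tilde{a}_\alpha = [\tilde{a}^L_\alpha, \tilde{a}^U_\alpha]$, where $\tilde{a}^L_\alpha, \tilde{a}^U_\alpha \in \mathbb{R}$ and $\tilde{a}^L_\alpha \le \tilde{a}^U_\alpha$. We denote the set of fuzzy numbers by $F(\mathbb{R})$.
Let us assume that ⊙ is a fuzzy-number binary operator ⊕, ⊖, ⊗ or ⊘, corresponding to its real-number counterpart ◦: +, −, × or /, according to the Extension Principle. Let $\odot_{int}$ be a binary operator $\oplus_{int}$, $\ominus_{int}$, $\otimes_{int}$ or $\oslash_{int}$ between two closed intervals [a, b] and [c, d]. Then the following equality holds:
\[ [a, b] \odot_{int} [c, d] = \{ z \in \mathbb{R} : z = x \circ y,\ x \in [a, b],\ y \in [c, d] \}, \]
where ◦ is the corresponding real-number binary operator +, −, × or /, under the assumption that $0 \notin [c, d]$ in the last case. Thus, if $\tilde{a}$, $\tilde{b}$ are fuzzy numbers, then $\tilde{a} \odot \tilde{b}$ is also a fuzzy number and the following equalities are fulfilled:
\[ (\tilde{a} \oplus \tilde{b})_\alpha = \tilde{a}_\alpha \oplus_{int} \tilde{b}_\alpha = [\tilde{a}^L_\alpha + \tilde{b}^L_\alpha,\ \tilde{a}^U_\alpha + \tilde{b}^U_\alpha], \]
\[ (\tilde{a} \ominus \tilde{b})_\alpha = \tilde{a}_\alpha \ominus_{int} \tilde{b}_\alpha = [\tilde{a}^L_\alpha - \tilde{b}^U_\alpha,\ \tilde{a}^U_\alpha - \tilde{b}^L_\alpha], \]
\[ (\tilde{a} \otimes \tilde{b})_\alpha = \tilde{a}_\alpha \otimes_{int} \tilde{b}_\alpha = [\min\{\tilde{a}^L_\alpha \tilde{b}^L_\alpha, \tilde{a}^L_\alpha \tilde{b}^U_\alpha, \tilde{a}^U_\alpha \tilde{b}^L_\alpha, \tilde{a}^U_\alpha \tilde{b}^U_\alpha\},\ \max\{\tilde{a}^L_\alpha \tilde{b}^L_\alpha, \tilde{a}^L_\alpha \tilde{b}^U_\alpha, \tilde{a}^U_\alpha \tilde{b}^L_\alpha, \tilde{a}^U_\alpha \tilde{b}^U_\alpha\}], \]
\[ (\tilde{a} \oslash \tilde{b})_\alpha = \tilde{a}_\alpha \oslash_{int} \tilde{b}_\alpha = [\min\{\tilde{a}^L_\alpha / \tilde{b}^L_\alpha, \tilde{a}^L_\alpha / \tilde{b}^U_\alpha, \tilde{a}^U_\alpha / \tilde{b}^L_\alpha, \tilde{a}^U_\alpha / \tilde{b}^U_\alpha\},\ \max\{\tilde{a}^L_\alpha / \tilde{b}^L_\alpha, \tilde{a}^L_\alpha / \tilde{b}^U_\alpha, \tilde{a}^U_\alpha / \tilde{b}^L_\alpha, \tilde{a}^U_\alpha / \tilde{b}^U_\alpha\}], \]
if the α-level set $\tilde{b}_\alpha$ does not contain zero for all $\alpha \in [0, 1]$ in the case of ⊘.
A fuzzy number $\tilde{a}$ is called positive ($\tilde{a} \ge 0$) if $\mu_{\tilde{a}}(x) = 0$ for x < 0 and it is called strictly positive ($\tilde{a} > 0$) if $\mu_{\tilde{a}}(x) = 0$ for x ≤ 0.
A triangular fuzzy number $\tilde{a} = (a_1, a_2, a_3)$ is a fuzzy number with the membership function of the form
\[ \mu_{\tilde{a}}(x) = \begin{cases} \dfrac{x - a_1}{a_2 - a_1} & \text{if } a_1 \le x \le a_2, \\ \dfrac{x - a_3}{a_2 - a_3} & \text{if } a_2 \le x \le a_3, \\ 0 & \text{otherwise.} \end{cases} \]
In our further considerations we will use the following proposition, proved in [42].
Proposition 1. Let $f : \mathbb{R} \to \mathbb{R}$ be a function such that $f^{-1}(\{y\})$ is a compact set for each $y \in \mathbb{R}$. Then f induces a fuzzy-valued function $\tilde{f} : F(\mathbb{R}) \to F(\mathbb{R})$ via the Extension Principle and for each $\tilde{\Lambda} \in F(\mathbb{R})$ the α-level set of $\tilde{f}(\tilde{\Lambda})$ has the form $\tilde{f}(\tilde{\Lambda})_\alpha = \{ f(x) : x \in \tilde{\Lambda}_\alpha \}$.
We recall the notions of weighted interval-valued and crisp possibilistic mean values of fuzzy numbers. For details we refer the reader to [16].
Let $\tilde{a} \in F(\mathbb{R})$. A non-negative, monotone increasing function $f : [0, 1] \to \mathbb{R}$ such that $\int_0^1 f(\alpha)\,d\alpha = 1$ is said to be a weighting function. The lower and upper weighted possibilistic mean values $M_*(\tilde{a})$ and $M^*(\tilde{a})$ of $\tilde{a}$ are defined by the integrals:
\[ M_*(\tilde{a}) = \int_0^1 \tilde{a}^L_\alpha f(\alpha)\,d\alpha, \qquad M^*(\tilde{a}) = \int_0^1 \tilde{a}^U_\alpha f(\alpha)\,d\alpha. \]
The weighted interval-valued possibilistic mean $M(\tilde{a})$ and the crisp weighted possibilistic mean $\bar{M}(\tilde{a})$ of the fuzzy number $\tilde{a}$ have the following form:
\[ M(\tilde{a}) = [M_*(\tilde{a}), M^*(\tilde{a})], \qquad \bar{M}(\tilde{a}) = \frac{M_*(\tilde{a}) + M^*(\tilde{a})}{2}. \]
Let $\mathcal{B}(\mathbb{R})$ be the Borel σ-field of subsets of $\mathbb{R}$ and $(\Omega, \mathcal{F})$ be a measurable space. A fuzzy-number-valued map $\tilde{X} : \Omega \to F(\mathbb{R})$ is called a fuzzy random variable if
\[ \{ (\omega, x) : \tilde{X}(\omega)(x) \ge \alpha \} \in \mathcal{F} \times \mathcal{B}(\mathbb{R}) \]
for every $\alpha \in [0, 1]$ (see, e.g. [36]).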
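The definitions of this section can be illustrated with a short Python sketch for triangular fuzzy numbers. The function names are illustrative assumptions, intervals are represented as plain tuples, and the integrals defining the possibilistic means are approximated with a midpoint rule; the default weighting function f(α) = 2α is only one admissible choice.

```python
def alpha_cut(tri, alpha):
    """alpha-level set [a^L_alpha, a^U_alpha] of a triangular number (a1, a2, a3)."""
    a1, a2, a3 = tri
    return (a1 + alpha * (a2 - a1), a3 - alpha * (a3 - a2))


def interval_add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def interval_sub(x, y):
    return (x[0] - y[1], x[1] - y[0])


def interval_mul(x, y):
    products = (x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1])
    return (min(products), max(products))


def interval_div(x, y):
    if y[0] <= 0.0 <= y[1]:
        raise ZeroDivisionError("divisor interval contains zero")
    quotients = (x[0] / y[0], x[0] / y[1], x[1] / y[0], x[1] / y[1])
    return (min(quotients), max(quotients))


def possibilistic_mean(tri, weight=lambda alpha: 2.0 * alpha, steps=10000):
    """Return (M_*, M^*, crisp mean) using a midpoint-rule approximation."""
    h = 1.0 / steps
    lower = upper = 0.0
    for k in range(steps):
        alpha = (k + 0.5) * h
        low, high = alpha_cut(tri, alpha)
        lower += low * weight(alpha) * h
        upper += high * weight(alpha) * h
    return lower, upper, 0.5 * (lower + upper)
```

For a triangular number (a1, a2, a3) and f(α) = 2α the integrals can be evaluated by hand, giving M_* = (a1 + 2a2)/3 and M^* = (2a2 + a3)/3, which is a convenient check of the numerical result.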
3. Catastrophe Bond Pricing in Crisp Case
As it was previously noted, the triggering point changes the structure of the payment function of the cat bond. Usually cat bonds are issued by insurers or reinsurers (see, e.g., [37]) via a special tailor-made fund, called a special purpose vehicle (SPV) or special purpose company (SPC) (see, e.g., [24, 40]). The hedger (e.g. an insurer or reinsurer) pays an insurance premium in exchange for coverage in case the triggering point occurs (see Figure 1). The investors purchase the catastrophe bonds for cash. The premium and cash flows are directed to the SPV, which purchases safe securities and issues the catastrophe bonds. Investors hold these assets, whose payments depend on the occurrence of the triggering point. If the pre-specified event occurs during the fixed period (e.g. there is a specified kind of natural catastrophe), the SPV compensates the insurer and the cash flows for investors are changed. Usually these flows are lowered, i.e. there is full or partial forgiveness of the payment. However, if the triggering point does not occur, the investors usually receive the full payment (i.e. the face value of the bond).
Fig. 1. Payments related to issuing and terminating of the cat bond
In the further part of this section we derive and present the pricing formula for catastrophe bonds in the crisp case, assuming no arbitrage opportunity on the market. At the beginning we introduce all necessary definitions and assumptions. We use stochastic processes with continuous time to describe the dynamics of the spot risk-free interest rate and the cumulative catastrophe losses. The time horizon has the form $[0, T']$, where $T' > 0$. The date of maturity of catastrophe bonds T is not later than $T'$, i.e. $T \le T'$. We consider two probability measures, P and Q, and denote the expected values with respect to them by the symbols $E^P$ and $E^Q$. We introduce a standard Brownian motion $(W_t)_{t \in [0,T']}$ and a Poisson process $(N_t)_{t \in [0,T']}$ with a deterministic intensity function $\rho(t)$, $t \in [0, T']$. The Brownian motion will be used for description of the risk-free interest rate. We introduce a sequence $(U_i)_{i=1}^{\infty}$ of independent, identically distributed random variables with finite second moment. For each i the random variable $U_i$ will describe the value of losses during the i-th catastrophic event. We define the compound Poisson process by the formula
\[ \tilde{N}_t = \sum_{i=1}^{N_t} U_i, \quad t \in [0, T'], \]
for modeling the cumulative catastrophic losses till moment t. All the processes and random variables introduced above are defined on a probability space $(\Omega, \mathcal{F}, P)$. We introduce the following filtrations: $(\mathcal{F}^0_t)_{t \in [0,T']}$, $(\mathcal{F}^1_t)_{t \in [0,T']}$ and $(\mathcal{F}_t)_{t \in [0,T']}$. $(\mathcal{F}^0_t)_{t \in [0,T']}$ is generated by W, $(\mathcal{F}^1_t)_{t \in [0,T']}$ by $\tilde{N}$ and $(\mathcal{F}_t)_{t \in [0,T']}$ by W and $\tilde{N}$. Moreover, they are augmented to encompass P-null sets from $\mathcal{F}^0_{T'}$, $\mathcal{F}^1_{T'}$ and $\mathcal{F} = \mathcal{F}_{T'}$, respectively. $(W_t)_{t \in [0,T']}$, $(N_t)_{t \in [0,T']}$ and $(U_i)_{i=1}^{\infty}$ are independent and the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in [0,T']}, P)$ satisfies the usual assumptions: the σ-algebra $\mathcal{F}$ is P-complete, the filtration $(\mathcal{F}_t)_{t \in [0,T']}$ is right continuous and each $\mathcal{F}_t$ contains all the P-null sets from $\mathcal{F}$.
Let $r = (r_t)_{t \in [0,T']}$ be the risk-free spot interest rate, i.e. the short-term rate for risk-free borrowing or lending at time t over the infinitesimal time interval [t, t + dt]. We assume that r is a one-factor affine model. For more details concerning affine interest rate models we refer the reader to [11] and [12]. The Cox–Ingersoll–Ross model, considered in this paper, is of this type. The risk-free spot interest rate $(r_t)_{t \in [0,T']}$, belonging to the class of one-factor affine models, is a diffusion process of the form
\[ dr_t = \alpha(r_t)\,dt + \sigma(r_t)\,dW_t, \tag{1} \]
where $\alpha(r) = \varphi - \kappa r$ and $\sigma^2(r) = \delta_1 + \delta_2 r$ for constants $\varphi, \kappa, \delta_1, \delta_2$ (see, e.g. [27]). We denote by S the set of all the values which r can take with strictly positive probability. We require that $\delta_1 + \delta_2 r \ge 0$ for all values $r \in S$. We assume that zero-coupon bonds are traded on the market, investors have a neutral attitude to catastrophe risk and interest rate changes are replicable by other financial instruments. Moreover, we assume that there is no arbitrage opportunity on the market. Then the family of zero-coupon bond prices is arbitrage-free with respect to r for the probability measure Q equivalent to P, given by the Radon–Nikodym derivative
\[ \frac{dQ}{dP} = \exp\left( -\int_0^T \bar{\lambda}_t\,dW_t - \frac{1}{2} \int_0^T \bar{\lambda}_t^2\,dt \right), \quad P\text{-a.s.}, \tag{2} \]
where $\bar{\lambda}_t = \lambda_0 \sigma(r_t)$ is the market price of risk process, $\lambda_0 \in \mathbb{R}$. Under Q the process r is described by
\[ dr_t = \hat{\alpha}(r_t)\,dt + \sigma(r_t)\,dW^Q_t, \tag{3} \]
where
\[ \hat{\alpha}(r) = \hat{\varphi} - \hat{\kappa} r, \quad \hat{\varphi} = \varphi - \lambda_0 \delta_1, \quad \hat{\kappa} = \kappa + \lambda_0 \delta_2 \tag{4} \]
and $W^Q_t$ is a Q-Brownian motion.
We fix $n \ge 1$, $T \in [0, T']$ and $Fv > 0$. Let $K = (K_0, K_1, ..., K_n)$ be levels of catastrophic losses, where $0 \le K_0 < K_1 < ... < K_n$. Let $w = (w_1, w_2, ..., w_n)$ be a sequence of nonnegative numbers such that their sum is not greater than 1, i.e. $\sum_{i=1}^{n} w_i \le 1$.
Definition 1. By the symbol $IB(T, Fv)$ we denote the catastrophe bond with the face value Fv, the date of maturity and payoff T and the payoff function of the form
\[ \nu_{T,Fv} = Fv \left( 1 - \sum_{j=0}^{n-1} \frac{\tilde{N}_T \wedge K_{j+1} - \tilde{N}_T \wedge K_j}{K_{j+1} - K_j}\, w_{j+1} \right). \]
For the considered type of catastrophe bond the payoff function is a piecewise linear function of $\tilde{N}_T$. If the catastrophe does not occur (i.e. $\tilde{N}_T < K_0$), the bondholder receives the payoff equal to its face value Fv. If $\tilde{N}_T \ge K_n$, the payoff is equal to $Fv \left( 1 - \sum_{i=1}^{n} w_i \right)$. If $K_j \le \tilde{N}_T \le K_{j+1}$ for $j = 0, 1, ..., n-1$, the bondholder is paid
\[ Fv \left( 1 - \sum_{0 \le i < j} w_{i+1} - \frac{\tilde{N}_T \wedge K_{j+1} - \tilde{N}_T \wedge K_j}{K_{j+1} - K_j}\, w_{j+1} \right) \]
and in the interval $[K_j, K_{j+1}]$ the payoff decreases linearly from $Fv \left( 1 - \sum_{0 \le i < j} w_{i+1} \right)$ to $Fv \left( 1 - \sum_{0 \le i \le j} w_{i+1} \right)$ as a function of $\tilde{N}_T$.
We will use the following general theorem concerning catastrophe bond pricing, proved by us in [32].
Theorem 1. Let $(r_t)_{t \in [0,T']}$ be a risk-free spot interest rate given by the diffusion process (1) and such that, after the change of probability measure described by the Radon–Nikodym derivative (2), it has the form (3) with the coefficients given by equalities (4). Let $IB_{T,Fv}(t)$ be the price at time t, $0 \le t \le T$, of the catastrophe bond $IB(T, Fv)$. Then
\[ IB_{T,Fv}(t) = \eta(t, T, r_t, Fv), \quad 0 \le t \le T, \tag{5} \]
where
(i)
\[ \eta(t, T, r, Fv) = \exp\left( -a(T - t) - b(T - t)\, r \right) E^Q\left( \nu_{T,Fv} \mid \mathcal{F}^1_t \right); \tag{6} \]
(ii) functions $a(\tau)$ and $b(\tau)$ satisfy the following system of differential equations:
\[ \begin{aligned} \tfrac{1}{2} \delta_2 b^2(\tau) + \hat{\kappa} b(\tau) + b'(\tau) - 1 &= 0, \quad \tau > 0, \\ a'(\tau) - \hat{\varphi} b(\tau) + \tfrac{1}{2} \delta_1 b^2(\tau) &= 0, \quad \tau > 0, \end{aligned} \tag{7} \]
with $a(0) = b(0) = 0$.
In particular,
\[ IB_{T,Fv}(0) = \eta(0, T, r_0, Fv) = \exp\left( -a(T) - b(T)\, r_0 \right) E^P \nu_{T,Fv}. \tag{8} \]
The interest rate process r, applied in this paper, is the Cox–Ingersoll–Ross model described by the following stochastic equation
\[ dr_t = \kappa (\theta - r_t)\,dt + \Gamma \sqrt{r_t}\,dW_t \tag{9} \]
for positive constants κ, θ and Γ. The CIR model is affine with parameters $\varphi = \kappa\theta$, $\delta_1 = 0$ and $\delta_2 = \Gamma^2$. For the considered model the interest rate cannot become negative (i.e., $S = [0, \infty)$), which is a major advantage relative to other models. Moreover, if its parameters satisfy the inequality $2\varphi \ge \Gamma^2$, then $S = (0, \infty)$. The CIR model has the property of mean reversion around the long-term level θ. The parameter κ controls the size of the expected adjustment towards θ and is called the speed of adjustment. The volatility is the product $\Gamma \sqrt{r_t}$ and therefore the interest rate is less volatile for low values than for high values of the process $r_t$. The following theorem is a special case of Theorem 1 for the spot interest rate dynamics described by the Cox–Ingersoll–Ross model.
Theorem 2. Let the risk-free spot interest rate $(r_t)_{t \in [0,T']}$ be described by the CIR model. Assume that $IB_{T,Fv}(t)$ is the price of the bond $IB(T, Fv)$ at moment $t \in [0, T]$. Then
\[ IB_{T,Fv}(t) = e^{a(T - t) - b(T - t)\, r_t}\, E^Q\left( \nu_{T,Fv} \mid \mathcal{F}^1_t \right), \tag{10} \]
where
\[ b(\tau) = \frac{e^{\gamma\tau} - 1}{\frac{\hat{\kappa} + \gamma}{2} \left( e^{\gamma\tau} - 1 \right) + \gamma}, \tag{11} \]
\[ a(\tau) = \frac{2\varphi}{\Gamma^2} \left[ \ln\left( \frac{\gamma}{\frac{\hat{\kappa} + \gamma}{2} \left( e^{\gamma\tau} - 1 \right) + \gamma} \right) + \frac{(\hat{\kappa} + \gamma)\,\tau}{2} \right], \tag{12} \]
\[ \hat{\kappa} = \kappa + \lambda\Gamma, \quad \gamma = \sqrt{\hat{\kappa}^2 + 2\Gamma^2}. \]
In Theorem 2 the constant λ is the product $\lambda = \lambda_0 \Gamma$. Since all the model parameters should be positive after the change of probability measure, we assume that $\hat{\kappa} > 0$. The equalities (11) and (12) are obtained as the solution of the system of equations (7); note that the function a in (12) enters the exponent of (10) with a plus sign, i.e. it is the negative of the function a appearing in (6) and (7). One can also find these formulas in the financial literature (see, e.g. [27]), since they are used in the zero-coupon bond pricing formula.
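The crisp pricing formula at t = 0 can be sketched numerically as follows. The code evaluates a(τ) and b(τ) from (11) and (12), the piecewise linear payoff of Definition 1, and a Monte Carlo estimate of the expected payoff. Several elements are assumptions made only for illustration: the function names, a homogeneous Poisson process with constant intensity (the paper allows a deterministic intensity function ρ(t)), lognormally distributed losses with hypothetical parameters `mu` and `sigma`, and the use of the same loss distribution under P and Q.

```python
import math
import random


def cir_a_b(tau, kappa, theta, vol, lam):
    """Functions a(tau) and b(tau) of (11)-(12); `vol` plays the role of Gamma."""
    phi = kappa * theta
    kappa_hat = kappa + lam * vol
    gamma = math.sqrt(kappa_hat ** 2 + 2.0 * vol ** 2)
    growth = math.exp(gamma * tau) - 1.0
    denom = 0.5 * (kappa_hat + gamma) * growth + gamma
    b = growth / denom
    a = (2.0 * phi / vol ** 2) * (
        math.log(gamma / denom) + 0.5 * (kappa_hat + gamma) * tau)
    return a, b


def payoff(loss, face_value, levels, weights):
    """Piecewise linear payoff; levels = (K_0..K_n), weights = (w_1..w_n)."""
    reduction = 0.0
    for j, weight in enumerate(weights):
        low, high = levels[j], levels[j + 1]
        reduction += weight * (min(loss, high) - min(loss, low)) / (high - low)
    return face_value * (1.0 - reduction)


def simulate_loss(maturity, intensity, mu, sigma, rng):
    """Cumulative loss at maturity; constant intensity is an assumption."""
    total = 0.0
    t = rng.expovariate(intensity)
    while t <= maturity:
        total += rng.lognormvariate(mu, sigma)
        t += rng.expovariate(intensity)
    return total


def price_at_zero(r0, maturity, face_value, levels, weights, kappa, theta,
                  vol, lam, intensity, mu, sigma, runs=10000, seed=0):
    """Formula (10) at t = 0 with a Monte Carlo estimate of the mean payoff."""
    rng = random.Random(seed)
    mean_payoff = sum(
        payoff(simulate_loss(maturity, intensity, mu, sigma, rng),
               face_value, levels, weights)
        for _ in range(runs)) / runs
    a, b = cir_a_b(maturity, kappa, theta, vol, lam)
    return math.exp(a - b * r0) * mean_payoff
```

With levels (100, 200, 400) and weights (0.4, 0.6), a cumulative loss of 150 lies halfway through the first segment, so the payoff of a bond with unit face value is 1 − 0.4 · 0.5 = 0.8.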
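4. Catastrophe Bond Pricing in Fuzzy Case
Usually some parameters of the financial market are not precisely known. In particular, the volatility parameter of the spot interest rate is determined by the fluctuating financial market and very often its uncertainty does not have a stochastic character. Therefore it is unreasonable to choose fixed values of parameters, obtained from historical data, for later use in the pricing model, since they can fluctuate in the future (see, e.g. [42]). To estimate values of uncertain parameters one can use the knowledge of experts, asking them for forecasts of a parameter. The forecasts can be transformed into triangular fuzzy numbers. Their average can be computed and used for estimation of the parameter. Such an estimation method was proposed in [4] and [18] for financial applications. In the remainder of this paper we assume more generally that the volatility parameter is a strictly positive fuzzy number, which is not necessarily triangular. We denote the fuzzy volatility parameter by $\tilde{\Gamma}$. In the following theorem we present the catastrophe bond pricing formula for the one-factor Cox–Ingersoll–Ross interest rate model.
Theorem 3. Assume that $IB_{T,Fv}(t)$ is the price of the bond $IB(T, Fv)$ at moment $t \in [0, T]$ for a strictly positive fuzzy volatility parameter $\tilde{\Gamma}$. Then
\[ IB_{T,Fv}(t) = e^{\tilde{a}(T - t) \ominus \tilde{b}(T - t) \otimes \tilde{r}_t} \otimes E^Q\left( \nu_{T,Fv} \mid \mathcal{F}^1_t \right), \tag{13} \]
where
\[ \tilde{a}(\tau) = \tilde{\phi} \otimes \tilde{\delta}(\tau), \quad \tilde{b}(\tau) = \tilde{\alpha}(\tau) \oslash \tilde{\beta}(\tau), \]
\[ \tilde{\phi} = (2\varphi) \oslash \left( \tilde{\Gamma} \otimes \tilde{\Gamma} \right), \quad \tilde{\kappa} = \kappa \oplus \lambda \otimes \tilde{\Gamma} > 0, \quad \tilde{\gamma} = \sqrt{\tilde{\kappa} \otimes \tilde{\kappa} \oplus 2 \otimes \tilde{\Gamma} \otimes \tilde{\Gamma}}, \]
\[ \tilde{\alpha}(\tau) = e^{\tilde{\gamma} \otimes \tau} \ominus 1, \quad \tilde{\beta}(\tau) = \tfrac{1}{2} \otimes \tilde{\alpha}(\tau) \otimes (\tilde{\kappa} \oplus \tilde{\gamma}) \oplus \tilde{\gamma} \]
and
\[ \tilde{\delta}(\tau) = \ln\left( \tilde{\gamma} \oslash \tilde{\beta}(\tau) \right) \oplus \tfrac{\tau}{2} \otimes (\tilde{\kappa} \oplus \tilde{\gamma}). \]
Moreover, for $\alpha \in [0, 1]$,
\[ \left( IB_{T,Fv}(t) \right)_\alpha = \left[ e^{(\tilde{a}(T-t))^L_\alpha - (\tilde{b}(T-t))^U_\alpha (r_t)^U_\alpha}\, E^Q\left( \nu_{T,Fv} \mid \mathcal{F}^1_t \right),\ e^{(\tilde{a}(T-t))^U_\alpha - (\tilde{b}(T-t))^L_\alpha (r_t)^L_\alpha}\, E^Q\left( \nu_{T,Fv} \mid \mathcal{F}^1_t \right) \right], \tag{14} \]
where
\[ \tilde{\kappa}_\alpha = \begin{cases} \left[ \lambda \tilde{\Gamma}^L_\alpha + \kappa,\ \lambda \tilde{\Gamma}^U_\alpha + \kappa \right] & \text{for } \lambda > 0, \\ \left[ \lambda \tilde{\Gamma}^U_\alpha + \kappa,\ \lambda \tilde{\Gamma}^L_\alpha + \kappa \right] & \text{for } \lambda < 0, \\ \kappa & \text{for } \lambda = 0, \end{cases} \tag{15} \]
\[ (\tilde{\kappa} \otimes \tilde{\kappa})_\alpha = \begin{cases} \left[ \left( \lambda \tilde{\Gamma}^L_\alpha + \kappa \right)^2,\ \left( \lambda \tilde{\Gamma}^U_\alpha + \kappa \right)^2 \right] & \text{for } \lambda > 0, \\ \left[ \left( \lambda \tilde{\Gamma}^U_\alpha + \kappa \right)^2,\ \left( \lambda \tilde{\Gamma}^L_\alpha + \kappa \right)^2 \right] & \text{for } \lambda < 0, \\ \kappa^2 & \text{for } \lambda = 0, \end{cases} \tag{16} \]
\[ \tilde{\gamma}_\alpha = \left[ \sqrt{(\tilde{\kappa} \otimes \tilde{\kappa})^L_\alpha + 2 \left( \tilde{\Gamma}^L_\alpha \right)^2},\ \sqrt{(\tilde{\kappa} \otimes \tilde{\kappa})^U_\alpha + 2 \left( \tilde{\Gamma}^U_\alpha \right)^2} \right], \tag{17} \]
\[ \tilde{\phi}_\alpha = \left[ \frac{2\varphi}{\left( \tilde{\Gamma}^U_\alpha \right)^2},\ \frac{2\varphi}{\left( \tilde{\Gamma}^L_\alpha \right)^2} \right], \tag{18} \]
\[ (\tilde{\alpha}(\tau))_\alpha = \left[ e^{\tilde{\gamma}^L_\alpha \tau} - 1,\ e^{\tilde{\gamma}^U_\alpha \tau} - 1 \right], \]
\[ \left( \tilde{\delta}(\tau) \right)_\alpha = \left[ \ln\left( \frac{\tilde{\gamma}^L_\alpha}{\frac{1}{2} (\tilde{\alpha}(\tau))^U_\alpha \left( \tilde{\kappa}^U_\alpha + \tilde{\gamma}^U_\alpha \right) + \tilde{\gamma}^U_\alpha} \right) + \frac{\tau}{2} \left( \tilde{\kappa}^L_\alpha + \tilde{\gamma}^L_\alpha \right),\ \ln\left( \frac{\tilde{\gamma}^U_\alpha}{\frac{1}{2} (\tilde{\alpha}(\tau))^L_\alpha \left( \tilde{\kappa}^L_\alpha + \tilde{\gamma}^L_\alpha \right) + \tilde{\gamma}^L_\alpha} \right) + \frac{\tau}{2} \left( \tilde{\kappa}^U_\alpha + \tilde{\gamma}^U_\alpha \right) \right], \tag{19} \]
\[ \left( \tilde{b}(\tau) \right)_\alpha = \left[ \frac{(\tilde{\alpha}(\tau))^L_\alpha}{\frac{1}{2} (\tilde{\alpha}(\tau))^U_\alpha \left( \tilde{\kappa}^U_\alpha + \tilde{\gamma}^U_\alpha \right) + \tilde{\gamma}^U_\alpha},\ \frac{(\tilde{\alpha}(\tau))^U_\alpha}{\frac{1}{2} (\tilde{\alpha}(\tau))^L_\alpha \left( \tilde{\kappa}^L_\alpha + \tilde{\gamma}^L_\alpha \right) + \tilde{\gamma}^L_\alpha} \right] \tag{20} \]
and
\[ (\tilde{a}(\tau))_\alpha = \left[ \left( \tilde{\phi}_\alpha \otimes_{int} \left( \tilde{\delta}(\tau) \right)_\alpha \right)^L,\ \left( \tilde{\phi}_\alpha \otimes_{int} \left( \tilde{\delta}(\tau) \right)_\alpha \right)^U \right]. \tag{21} \]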
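The α-level formulas (14)–(21) can be evaluated with a short Python sketch for the discount factor, i.e. the exponential term of (14) without the expected payoff. This is only an illustration under stated assumptions: the function names are hypothetical, the α-level set of the fuzzy volatility is passed directly as an interval `vol_cut`, and the interest rate at time t is taken as a crisp non-negative number r, so that its lower and upper bounds coincide.

```python
import math


def interval_mul(x, y):
    """Interval product, used for (21)."""
    products = (x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1])
    return (min(products), max(products))


def discount_alpha_cut(tau, r, kappa, theta, lam, vol_cut):
    """Bounds of exp(a(tau) - b(tau) * r) for one alpha-level set of Gamma."""
    g_low, g_high = vol_cut
    if not 0.0 < g_low <= g_high:
        raise ValueError("volatility cut must satisfy 0 < low <= high")
    phi = kappa * theta
    # (15): bounds of kappa tilde, ordered for any sign of lambda
    k1, k2 = lam * g_low + kappa, lam * g_high + kappa
    kap = (min(k1, k2), max(k1, k2))
    if kap[0] <= 0.0:
        raise ValueError("kappa tilde must be strictly positive")
    # (16) and (17)
    kap_sq = (kap[0] ** 2, kap[1] ** 2)
    gam = (math.sqrt(kap_sq[0] + 2.0 * g_low ** 2),
           math.sqrt(kap_sq[1] + 2.0 * g_high ** 2))
    # (18)
    phi_cut = (2.0 * phi / g_high ** 2, 2.0 * phi / g_low ** 2)
    # alpha tilde and beta tilde
    alp = (math.exp(gam[0] * tau) - 1.0, math.exp(gam[1] * tau) - 1.0)
    beta_low = 0.5 * alp[0] * (kap[0] + gam[0]) + gam[0]
    beta_high = 0.5 * alp[1] * (kap[1] + gam[1]) + gam[1]
    # (19) and (20)
    delta = (math.log(gam[0] / beta_high) + 0.5 * tau * (kap[0] + gam[0]),
             math.log(gam[1] / beta_low) + 0.5 * tau * (kap[1] + gam[1]))
    b = (alp[0] / beta_high, alp[1] / beta_low)
    # (21)
    a = interval_mul(phi_cut, delta)
    # exponential part of (14) with a crisp rate r >= 0
    return (math.exp(a[0] - b[1] * r), math.exp(a[1] - b[0] * r))
```

When the interval collapses to a single volatility value, both bounds coincide with the crisp discount factor of Theorem 2. For wider intervals the bounds can be noticeably conservative, because interval arithmetic treats repeated occurrences of the same uncertain quantity as independent.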
Proof. We replace the crisp volatility parameter Γ ˜ and arithmetic operators by its fuzzy counterpart Γ +, −, . by ⊕, ⊖, ⊗ in (10). As result we obtain the formula (13). Let α ∈ [0, 1] and τ ≥ 0. For a given fuzzy number F˜ we denote by F˜αL and F˜αU the lower and upper bound of its α-level set. ˜ > 0, the number κ Since φ, κ > 0 and Γ ˜ ⊗˜ κ ⊕ ˜ ˜ 2⊗Γ⊗Γ is also strictly positive. From direct calculations it follows that (15) and (16) hold. √ Function exp (x) for x ∈ R and functions x and ln (x) for x > 0 satisfy the assumptions of Proposition 1 and they are increasing. Thus, γ˜ > 0, [√ ] ( )L √( )U ˜ Γ ˜ , ˜ Γ ˜ κ ˜ ⊗˜ κ ⊕ 2⊗Γ⊗ κ ˜ ⊗˜ κ ⊕ 2⊗Γ⊗ γ˜α = α
for λ > 0,
2014
2φ 2φ ϕ˜α = ( )2 , ( )2 , (18) ˜U ˜L Γ Γ α α
for λ = 0,
[( )2 ( )2 ] L U ˜ ˜ λ Γ + κ , λ Γ + κ α α [ ( )2 ( )2 ] (˜ κ⊗˜ κ)α = U L ˜ ˜ λ Γ + κ , λ Γ + κ α α 2 κ
N◦ 3
VOLUME 8,
α
and (17) is satisfied. From direct calculations it follows that $\tilde{\kappa} \oplus \tilde{\gamma} > 0$, $\tilde{\alpha}(\tau) \geq 0$, $\tilde{\beta}(\tau) > 0$, $\tilde{b}(\tau) \geq 0$ and (20) is fulfilled. Proposition 1 implies the equality

$$\left(e^{\tilde{a}(\tau) \ominus \tilde{b}(\tau) \otimes \tilde{r}_t}\right)_\alpha = \left[ e^{\left(\tilde{a}(\tau) \ominus \tilde{b}(\tau) \otimes \tilde{r}_t\right)^L_\alpha},\; e^{\left(\tilde{a}(\tau) \ominus \tilde{b}(\tau) \otimes \tilde{r}_t\right)^U_\alpha} \right]. \tag{22}$$

From properties of the Cox–Ingersoll–Ross interest rate model it follows that the fuzzy random variable $\tilde{r}_t$ is positive for $t \in [0, T]$ and, since $\tilde{b}(T-t) \geq 0$, that (14) holds. Applying Proposition 1 also gives the equality

$$\left(\ln\left(\tilde{\gamma} \oslash \tilde{\beta}(\tau)\right)\right)_\alpha = \left[ \ln\frac{\tilde{\gamma}^L_\alpha}{\frac{1}{2}(\tilde{\alpha}(\tau))^U_\alpha\left(\tilde{\kappa}^U_\alpha + \tilde{\gamma}^U_\alpha\right) + \tilde{\gamma}^U_\alpha},\; \ln\frac{\tilde{\gamma}^U_\alpha}{\frac{1}{2}(\tilde{\alpha}(\tau))^L_\alpha\left(\tilde{\kappa}^L_\alpha + \tilde{\gamma}^L_\alpha\right) + \tilde{\gamma}^L_\alpha} \right].$$

Finally, the standard interval calculations give the forms of $\tilde{\varphi}_\alpha$, $\left(\tilde{\delta}(\tau)\right)_\alpha$ and $(\tilde{a}(\tau))_\alpha = \left(\tilde{\varphi} \otimes \tilde{\delta}(\tau)\right)_\alpha$ described by (18), (19) and (21).

Applying the equality

$$\mu_{\widetilde{IB}_{T,Fv}(t)}(c) = \sup_{0 \leq \alpha \leq 1} \alpha\, I_{\left(\widetilde{IB}_{T,Fv}(t)\right)_\alpha}(c)$$

one can obtain the membership function of $\widetilde{IB}_{T,Fv}(t)$. For a sufficiently high value of α (e.g. α = 0.95) the α-level set of $\widetilde{IB}_{T,Fv}(t)$ can be used for investment decision-making. A financial analyst can choose any value from the α-level set as the catastrophe bond price with an acceptable membership degree.

5. Monte Carlo Approach

The calculations required to find the price of the cat bond via the formulas introduced in Section 4 can be very complex, especially if the payment function or the model of losses is not straightforward. Then, instead of directly finding an analytical formula for the price, other approaches may be used. In this paper we focus on Monte Carlo simulations and the application of fuzzy arithmetic for α-cuts.

To model the complex nature of practical cases, parameters similar to the ones based on real-life data are applied. In [6] the parameters of the CIR model are estimated using a Kalman filter for monthly data of the Treasury bond market. These values, namely φ, κ, Γ, r0, are crisp (compare with Table 1). Because the cat bond pricing approach for the CIR model with the fuzzy number $\tilde{\Gamma}$ was established in Section 4, instead of the crisp value Γ = 0.0754 (as estimated in [6]) the triangular fuzzy number $\tilde{\Gamma}$ is applied (see Table 1). This fuzzy value is similar to the crisp parameter obtained in [6], but the introduced fuzzy volatility models the future uncertainty of the financial markets.

The other applied model, i.e. the process of losses, is also based on real-life data. In [7] the information on catastrophe losses in the United States provided by the Property Claim Services (PCS) of the ISO (Insurance Service Office Inc.) and the relevant estimation procedure for this data are considered. For each catastrophe, the PCS loss estimate represents anticipated industrywide insurance payments for different property lines of insurance covering. An event is noted as a catastrophe when claims are expected to reach a certain dollar threshold. We focus on the lognormal distribution of the value of the single loss and the NHPP (non-homogeneous Poisson process) as the process of the quantity of catastrophic events (see Table 1), but other random distributions and other processes could be directly applied using the approach introduced in this paper. As noted in [7], because of the annual seasonality of occurrence of catastrophic events, the intensity function of losses for the NHPP is given by

$$\rho_{NHPP}(t) = a + 2\pi b \sin\left(2\pi(t - c)\right). \tag{23}$$

The triggering points in our considerations are related to quantiles given by $Q_{NHPP\text{-}LN}(x)$, i.e. the x-th quantile of the cumulated value of losses for the NHPP process (quantity of losses) and the lognormal distribution (value of each loss). After conducting N = 1000000 Monte Carlo simulations, the fuzzy value of the cat bond price was obtained using fuzzy arithmetic (see Figure 2). This fuzzy price is nearly symmetric in the case of the parameters from Table 1. Based on this fuzzy number, the relevant intervals of prices for various α may also be found. For practical purposes the analyst may also be interested in a crisp value of the cat bond price; then, e.g., α = 1 can be set or the crisp possibilistic mean can be calculated (see [34, 35] for a related approach in the analysis of European option pricing). The results obtained in the considered case are listed in Table 2. For the crisp possibilistic mean the intuitive function f(α) = 2α is applied. The difference between the two obtained crisp values is about 0.091%.
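The crisp possibilistic mean with the weighting function f(α) = 2α can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual weighted form, i.e. the integral over α in [0, 1] of f(α) times the midpoint of the α-level set, and it is applied to the triangular volatility from Table 1 because the α-level sets of the simulated price are not reproduced here. The function names are hypothetical.

```python
def alpha_cut_triangular(l, m, u, alpha):
    """Alpha-level set [lower, upper] of the triangular fuzzy number (l, m, u)."""
    return l + alpha * (m - l), u - alpha * (u - m)


def crisp_possibilistic_mean(alpha_cut, steps=10_000):
    """Trapezoidal approximation of the integral of f(alpha) * midpoint(alpha)
    over [0, 1], with the weighting function f(alpha) = 2 * alpha (assumed form)."""
    h = 1.0 / steps
    total = 0.0
    previous = 0.0  # the integrand vanishes at alpha = 0 because f(0) = 0
    for i in range(1, steps + 1):
        alpha = i * h
        lower, upper = alpha_cut(alpha)
        current = 2.0 * alpha * 0.5 * (lower + upper)
        total += 0.5 * (previous + current) * h
        previous = current
    return total


# Fuzzy volatility from Table 1, used here only as a convenient example.
volatility_mean = crisp_possibilistic_mean(
    lambda alpha: alpha_cut_triangular(0.07, 0.075, 0.08, alpha)
)
```

For a symmetric triangular number the result coincides with the core value, so α = 1 and the possibilistic mean differ only when the fuzzy number is asymmetric, which is in line with the small difference reported for the nearly symmetric price.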
Fig. 2. Fuzzy price of the cat bond (parameters of the model from Table 1). Axes: Price (horizontal), alpha (vertical).

Tab. 1. Parameters of Monte Carlo simulations
                                Parameters
CIR model (crisp)               φ = 0.00270068, κ = 0.07223, r0 = 0.02
CIR model (fuzzy)               $\tilde{\Gamma}$ = (0.07, 0.075, 0.08)
Intensity of NHPP               a = 30.875, b = 1.684, c = 0.3396
Lognormal distribution          µLN = 17.357, σLN = 1.7643
Triggering points               K1 = QNHPP-LN(0.75), K2 = QNHPP-LN(0.85), K3 = QNHPP-LN(0.95)
Values of losses coefficients   w1 = 0.4, w2 = 0.6

Tab. 2. Crisp prices for the cat bond (parameters of the model from Table 1)
Method                      Price
α = 1                       0.851857
Crisp possibilistic mean    0.852631

Fig. 3. Fuzzy price of the cat bond for various fuzzy values of $\tilde{\Gamma}$ ((0.072, 0.075, 0.078) – dotted line, (0.07, 0.075, 0.08) – dashed line, (0.068, 0.075, 0.082) – solid line). Axes: Prices (horizontal), alpha (vertical).

Fig. 4. Fuzzy price of the cat bond for various fuzzy values of $\tilde{\Gamma}$ ((0.07, 0.075, 0.077) – dotted line, (0.073, 0.075, 0.08) – dashed line). Axes: Prices (horizontal), alpha (vertical).

Fig. 5. Fuzzy price of the cat bond for various values of µLN (curves for mu = 17.2, mu = 17.3, mu = 17.4). Axes: Prices (horizontal), alpha (vertical).

Tab. 3. Crisp prices for various values of µLN
µLN                         17.2       17.3      17.4
Price for α = 1             0.893286   0.86935   0.83707
Crisp possibilistic mean    0.894098   0.87014   0.837831

Fig. 6. Fuzzy price of the cat bond for various values of σLN (curves for sigma = 1.74, sigma = 1.75, sigma = 1.76). Axes: Prices (horizontal), alpha (vertical).

For other symmetric fuzzy values of the volatility $\tilde{\Gamma}$ considered in our analysis, the calculated fuzzy cat bond prices have similar shapes (see Figure 3). The membership function can also be evaluated for asymmetric triangular fuzzy values of $\tilde{\Gamma}$ (see Figure 4).

The model of catastrophic events is usually based on historical data, as in the case discussed in [7]. Therefore the estimators calculated from such data may not be completely adequate for future natural catastrophes. The behavior of cat bond prices can then be analyzed when some of the important parameters of the model are changed. For example, if the parameter µLN of the lognormal distribution of the single loss becomes higher and the other parameters are the same as in Table 1, then the relevant fuzzy prices of the cat bonds are shifted to the left (see Figure 5) and the crisp prices are lower (see Table 3). The same applies to various values of the parameter σLN (see Figure 6 and Table 4). As may be seen from Figure 5 and Figure 6, these parameters have an important impact on the obtained cat bond prices.
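The loss model behind these sensitivity results can be sketched in a few lines. The following is a rough illustration under stated assumptions, not the simulation code used in the paper: event times of the NHPP with intensity (23) are generated by thinning, single losses are lognormal, and the quantiles of the aggregated loss are estimated empirically. The one-year horizon, the sample size of 2000 paths and all function names are assumptions made for the example.

```python
import math
import random


def nhpp_event_times(a, b, c, horizon, rng):
    """Event times on [0, horizon] of an NHPP with intensity
    a + 2*pi*b*sin(2*pi*(t - c)), generated by thinning."""
    rate_max = a + 2.0 * math.pi * abs(b)  # upper bound of the intensity
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)  # candidate from a homogeneous process
        if t > horizon:
            return times
        rate = a + 2.0 * math.pi * b * math.sin(2.0 * math.pi * (t - c))
        if rng.random() * rate_max <= rate:  # accept with probability rate / rate_max
            times.append(t)


def aggregated_loss(a, b, c, mu, sigma, horizon, rng):
    """Sum of lognormal losses over all events of one simulated path."""
    events = nhpp_event_times(a, b, c, horizon, rng)
    return sum(rng.lognormvariate(mu, sigma) for _ in events)


def empirical_quantile(sample, x):
    """Simple order-statistic estimate of the x-th quantile."""
    ordered = sorted(sample)
    return ordered[min(len(ordered) - 1, int(x * len(ordered)))]


rng = random.Random(2014)
# Parameters from Table 1; the horizon of 1.0 (one year) is an assumption.
losses = [
    aggregated_loss(30.875, 1.684, 0.3396, 17.357, 1.7643, 1.0, rng)
    for _ in range(2000)
]
k1, k2, k3 = (empirical_quantile(losses, x) for x in (0.75, 0.85, 0.95))
```

Rerunning the sketch with a larger `mu` raises the aggregated losses relative to fixed triggering points, which is the mechanism behind the lower prices discussed above.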
Tab. 4. Crisp prices for various values of σLN
σLN                         1.74       1.75       1.76
Price for α = 1             0.865891   0.858843   0.854912
Crisp possibilistic mean    0.866678   0.859623   0.855689

6. Conclusions

In this paper the catastrophe bond pricing formula in the crisp case for the Cox–Ingersoll–Ross risk-free interest rate model is derived. Then, on the basis of this formula, the catastrophe bond valuation expression for a fuzzy volatility parameter is obtained. Since the pricing formula is considered for an arbitrary time moment before maturity, fuzzy random variables are introduced. Besides the fuzzy valuation formula, the forms of the α-level sets of the cat bond price are obtained. Therefore this approach can be applied to general forms of fuzzy numbers. Monte Carlo simulations are also conducted in order to directly analyze the fuzzy cat bond prices. We apply fuzzy arithmetic and introduce a triangular fuzzy number for the value of the volatility in the CIR model, but using other fuzzy numbers (e.g. L-R numbers) is also possible in our setting. Then the influence of the shape of the fuzzy numbers and of other parameters of the model, like the distribution of the single loss, on the final cat bond price is considered.
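As an illustration of the α-level computations summarized here, the interval formulas (15)–(21) for a triangular volatility can be evaluated as in the sketch below. It is only a hedged example: the market price of risk λ is not listed in Table 1, so the value `lam = -0.1` is hypothetical, and the function names are not taken from the paper.

```python
import math


def triangular_alpha_cut(l, m, u, alpha):
    """Alpha-level set [lower, upper] of the triangular fuzzy number (l, m, u)."""
    return l + alpha * (m - l), u - alpha * (u - m)


def cir_coefficient_intervals(phi, kappa, lam, vol_lo, vol_up, tau):
    """Return the alpha-level sets of a(tau) and b(tau) following (15)-(21)."""
    # (15): bounds of the fuzzy kappa, depending on the sign of lambda
    if lam > 0:
        k_lo, k_up = lam * vol_lo + kappa, lam * vol_up + kappa
    elif lam < 0:
        k_lo, k_up = lam * vol_up + kappa, lam * vol_lo + kappa
    else:
        k_lo = k_up = kappa
    if k_lo <= 0:
        raise ValueError("the fuzzy kappa is assumed to be strictly positive")
    # (16)-(17): gamma bounds; squares are monotone because kappa bounds are positive
    g_lo = math.sqrt(k_lo ** 2 + 2.0 * vol_lo ** 2)
    g_up = math.sqrt(k_up ** 2 + 2.0 * vol_up ** 2)
    # (18): phi bounds (division reverses the order of the volatility bounds)
    phi_lo, phi_up = 2.0 * phi / vol_up ** 2, 2.0 * phi / vol_lo ** 2
    # alpha(tau) and beta(tau) bounds
    a_lo, a_up = math.exp(g_lo * tau) - 1.0, math.exp(g_up * tau) - 1.0
    beta_lo = 0.5 * a_lo * (k_lo + g_lo) + g_lo
    beta_up = 0.5 * a_up * (k_up + g_up) + g_up
    # (19): delta(tau) bounds
    d_lo = math.log(g_lo / beta_up) + 0.5 * tau * (k_lo + g_lo)
    d_up = math.log(g_up / beta_lo) + 0.5 * tau * (k_up + g_up)
    # (20): b(tau) bounds
    b_lo, b_up = a_lo / beta_up, a_up / beta_lo
    # (21): interval product of phi and delta (delta may change sign)
    products = [p * d for p in (phi_lo, phi_up) for d in (d_lo, d_up)]
    return (min(products), max(products)), (b_lo, b_up)


# phi and kappa from Table 1; lam = -0.1 is a hypothetical value for illustration.
vol_lo, vol_up = triangular_alpha_cut(0.07, 0.075, 0.08, 0.95)
a_cut, b_cut = cir_coefficient_intervals(0.00270068, 0.07223, -0.1, vol_lo, vol_up, 1.0)
```

Evaluating the function on a grid of α values gives nested intervals, from which the α-level sets of the price in (14) follow once the expectation term is estimated.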
AUTHORS
∗
Piotr Nowak – Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01–447 Warszawa, Poland, e-mail: pnowak@ibspan.waw.pl.
Maciej Romaniuk∗ – Systems Research Institute, Polish Academy of Sciences; The John Paul II Catholic University of Lublin, ul. Newelska 6, 01–447 Warszawa, Poland, e-mail: mroman@ibspan.waw.pl.
∗ Corresponding author
REFERENCES
[1] Albrecher H., Hartinger J., Tichy R.F., "QMC techniques for CAT bond pricing", Monte Carlo Methods Appl., vol. 10, no. 3–4, 2004, 197–211.
[2] Baryshnikov Y., Mayo A., Taylor D.R., Pricing CAT Bonds. Working paper, 1998.
[3] Borch K., The Mathematical Theory of Insurance, Lexington Books, Lexington 1974.
[4] Buckley J.J., Eslami E., "Pricing Stock Options Using Fuzzy Sets", Iranian Journal of Fuzzy Systems, vol. 4, no. 2, 2007, 1–14.
[5] Burnecki K., Kukla G., "Pricing of Zero-Coupon and Coupon CAT Bond", Applied Mathematics, vol. 30, 2003, 315–324.
[6] Chen R.-R., Scott L., "Multi-factor Cox-Ingersoll-Ross Models of the Term Structure: Estimates and Tests from a Kalman Filter Model", Journal of Real Estate Finance and Economics, vol. 27, no. 2, 2003, 143–172.
[7] Chernobai A., Burnecki K., Rachev S., Trueck S., Weron R., "Modeling catastrophe claims with left-truncated severity distributions", Computational Statistics, vol. 21, issue 3–4, 2006, 537–555. DOI: http://dx.doi.org/10.1007/s00180-006-0011-2.
[8] Cox S.H., Fairchild J.R., Pedersen H.W., "Economic Aspects of Securitization of Risk", ASTIN Bulletin, vol. 30, no. 1, 2000, 157–193.
[9] Cox S.H., Pedersen H.W., "Catastrophe Risk Bonds", North American Actuarial Journal, vol. 4, issue 4, 2000, 56–82. DOI: http://dx.doi.org/10.1080/10920277.2000.10595938.
[10] Cummins J.D., Doherty N., Lo A., "Can insurers pay for the 'big one'? Measuring the capacity of insurance market to respond to catastrophic losses", Journal of Banking and Finance, vol. 26, no. 2–3, 2002, p. 557. DOI: http://dx.doi.org/10.1016/S0378-4266(01)00234-5.
[11] Dai Q., Singleton K.J., "Specification analysis of affine term structure models", Journal of Finance, vol. 55, no. 5, 2000, 1943–1978. DOI: http://dx.doi.org/10.1111/0022-1082.00278.
[12] Duffie D., Kan R., "A yield-factor model of interest rates", Mathematical Finance, vol. 6, no. 4, 1996, 379–406. DOI: http://dx.doi.org/10.1111/j.1467-9965.1996.tb00123.x.
[13] Egami M., Young V.R., "Indifference prices of structured catastrophe (CAT) bonds", Insurance: Mathematics and Economics, vol. 42, no. 2, 2008, 771–778. DOI: http://dx.doi.org/10.1016/j.insmatheco.2007.08.004.
[14] Ermolieva T., Romaniuk M., Fischer G., Makowski M., "Integrated model-based decision support for management of weather-related agricultural losses". In: Environmental informatics and systems research. Vol. 1: Plenary and session papers - EnviroInfo 2007, ed. by Hryniewicz O., Studzinski J., Romaniuk M., Shaker Verlag 2007.
[15] Freeman P.K., Kunreuther H., Managing Environmental Risk Through Insurance, Kluwer Academic Press, Boston 1997. DOI: http://dx.doi.org/10.1007/978-94-011-5360-7.
[16] Fuller R., Majlender P., "On weighted possibilistic mean and variance of fuzzy numbers", Fuzzy Sets and Systems, vol. 136, no. 3, 2003, 363–374. DOI: http://dx.doi.org/10.1016/S0165-0114(02)00216-6.
[17] George J.B., "Alternative reinsurance: Using catastrophe bonds and insurance derivatives as a mechanism for increasing capacity in the insurance markets", CPCU Journal, vol. 52, no. 1, 1999, p. 50.
[18] Gil-Lafuente A.M., Fuzzy logic in financial analysis, Springer, Berlin 2005.
[19] Kowalski P.A., Łukasik S., Charytanowicz M., Kulczycki P., "Data-Driven Fuzzy Modeling and Control with Kernel Density Based Clustering Technique", Polish Journal of Environmental Studies, vol. 17, no. 4C, 2008, 83–87.
[20] Kulczycki P., Charytanowicz M., "Bayes sharpening of imprecise information", International Journal of Applied Mathematics and Computer Science, vol. 15, no. 3, 2005, 393–404.
[21] Kulczycki P., Charytanowicz M., "Asymmetrical Conditional Bayes Parameter Identification for Control Engineering", Cybernetics and Systems, vol. 39, no. 3, 2008, 229–243.
[22] Kulczycki P., Charytanowicz M., "A Complete Gradient Clustering Algorithm Formed with Kernel Estimators", International Journal of Applied Mathematics and Computer Science, vol. 20, no. 1, 2010, 123–134. DOI: http://dx.doi.org/10.2478/v10006-010-0009-3.
[23] Kulczycki P., Charytanowicz M., "Conditional Parameter Identification with Different Losses of Under- and Overestimation", Applied Mathematical Modelling, vol. 37, no. 4, 2013, 2166–2177. DOI: http://dx.doi.org/10.1016/j.apm.2012.05.007.
[24] Lee J.P., Yu M.T., "Valuation of catastrophe reinsurance with catastrophe bonds", Insurance: Mathematics and Economics, vol. 41, no. 2, 2007, 264–278. DOI: http://dx.doi.org/10.1016/j.insmatheco.2006.11.003.
[25] Lin S.K., Shyu D., Chang C.C., "Pricing Catastrophe Insurance Products in Markov Jump Diffusion Models", Journal of Financial Studies, vol. 16, no. 2, 2008, 1–33.
[26] Muermann A., "Market Price of Insurance Risk Implied by Catastrophe Derivatives", North American Actuarial Journal, vol. 12, no. 3, 2008, 221–227. DOI: http://dx.doi.org/10.1080/10920277.2008.10597518.
[27] Munk C., Fixed Income Modelling, Oxford University Press 2011. DOI: http://dx.doi.org/10.1093/acprof:oso/9780199575084.001.0001.
[28] Nowak P., "Analysis of applications of some ex-ante instruments for the transfer of catastrophic risks". In: IIASA Interim Report IR-99-075, 1999.
[29] Nowak P., "On Jacod-Grigelionis characteristics for Hilbert space valued semimartingales", Stochastic Analysis and Applications, vol. 20, no. 5, 2002, 963–998. DOI: http://dx.doi.org/10.1081/SAP-120014551.
[30] Nowak P., Romaniuk M., "Computing option price for Levy process with fuzzy parameters", European Journal of Operational Research, vol. 201, no. 1, 2010, 206–210. DOI: http://dx.doi.org/10.1016/j.ejor.2009.02.009.
[31] Nowak P., Romaniuk M., Ermolieva T., "Evaluation of Portfolio of Financial and Insurance Instruments: Simulation of Uncertainty". In: Managing Safety of Heterogeneous Systems: Decisions under Uncertainties and Risks, ed. by Ermoliev Y., Makowski M., Marti K., Springer-Verlag Berlin Heidelberg 2012.
[32] Nowak P., Romaniuk M., Application of the one-factor affine interest rate models to catastrophe bonds pricing. Research Report, RB/1/2013, SRI PAS, Warsaw 2013.
[33] Nowak P., Romaniuk M., "Pricing and simulations of catastrophe bonds", Insurance: Mathematics and Economics, vol. 52, no. 1, 2013, 18–28. DOI: http://dx.doi.org/10.1016/j.insmatheco.2012.10.006.
[34] Nowak P., Romaniuk M., "A fuzzy approach to option pricing in a Levy process setting", Int. J. Appl. Math. Comput. Sci., vol. 23, no. 3, 2013, 613–622. DOI: http://dx.doi.org/10.2478/amcs-2013-0046.
[35] Nowak P., Romaniuk M., "Application of Levy processes and Esscher transformed martingale measures for option pricing in fuzzy framework", Journal of Computational and Applied Mathematics, vol. 263, 2014, 129–151. DOI: http://dx.doi.org/10.1016/j.cam.2013.11.031.
[36] Puri M.L., Ralescu D.A., "Fuzzy random variables", Journal of Mathematical Analysis and Applications, vol. 114, no. 2, 1986, 409–422. DOI: http://dx.doi.org/10.1016/0022-247X(86)90093-4.
[37] Ripples Into Waves: The Catastrophe Bond Market at Year-End 2006. Guy Carpenter & Company, Inc. and MMC Security Corporation 2007.
[38] Romaniuk M., Ermolieva T., "Application EDGE software and simulations for integrated catastrophe management", International Journal of Knowledge and Systems Sciences, vol. 2, no. 2, 2005, 1–9.
[39] Shiryaev A.N., Essentials of stochastic finance, Facts, Models, Theory, World Scientific 1999. DOI: http://dx.doi.org/10.1142/9789812385192.
[40] Vaugirard V.E., "Pricing catastrophe bonds by an arbitrage approach", The Quarterly Review of Economics and Finance, vol. 43, no. 1, 2003, 119–132. DOI: http://dx.doi.org/10.1016/S1062-9769(02)00158-8.
[41] Walker G., "Current Developments in Catastrophe Modelling". In: Financial Risks Management for Natural Catastrophes, ed. by Britton N.R., Olliver J., Brisbane, Griffith University, Australia 1997.
[42] Wu H.-Ch., "Pricing European options based on the fuzzy pattern of Black-Scholes formula", Computers & Operations Research, vol. 31, no. 7, 2004, 1069–1081. DOI: http://dx.doi.org/10.1016/S0305-0548(03)00065-0.
Boosting Support Vector Machines for RGB-D Based Terrain Classification

Submitted: 6th June 2014; accepted: 15th July 2014
Jan Wietrzykowski, Dominik Belter
DOI: 10.14313/JAMRIS_3-2014/24

Abstract: This paper deals with the terrain classification problem for an autonomous mobile robot. The robot is designed to operate in an outdoor environment. The classifier integrates data from an RGB camera and a 2D laser scanner. The camera provides information about the visual appearance of the objects in front of the robot. The laser scanner provides data about the distance to the objects and their ability to reflect an infrared beam. In this paper we present a method which creates terrain segments and classifies them using the joint application of Support Vector Machine (SVM) classifiers and the AdaBoost algorithm. The classification results of the experimental verification are provided in the paper.

Keywords: terrain classification, mobile robot, RGB-D

1. Introduction

Autonomous navigation in an urban environment is a challenge for mobile robots. A robot which operates in urban space should localize itself, find the path to the goal position and avoid obstacles. Moreover, it should obey rules which are designed for humans. It is obvious that an autonomous car-like robot should follow the road; access to the pavement is prohibited. A robot which is designed as a short-distance courier should use the pavement for locomotion and avoid the road as a possibly dangerous area. Access to the lawn should also be prohibited. In this case it isn't dangerous for the robot, but such behavior is against the principles of community life. To obey all rules the robot should recognize various terrain types.

Autonomous operation in an urban environment differs from operation in an off-road natural environment. The first difference is related to traversability assessment methods. Outdoor and off-road locomotion takes into account mainly the shape of the terrain. The terrain type does not play an important role: grass as well as asphalt is considered traversable. Such a situation is not acceptable in an urban environment. Moreover, in an off-road environment the borders between various regions are difficult to distinguish (e.g. grass can also be found on a field track). In a man-made environment most objects and terrain types have a standard size, color and location. On the other hand, a robot which operates in an urban-like environment has to distinguish between very similar areas like road and pavement.

1.1. Problem statement

Our goal is to create a robot which can navigate in an urban environment. The paper is focused on terrain classification, which is an important part of the navigation system of a mobile robot. We are interested in robotic competitions for delivery or search and rescue. The scenarios of such challenges include autonomous navigation on paved park roads (Robotour) or finding and fetching an object (e.g. a 1 kilo "bag of gold" in the Robots Intellect competition).

A robot which navigates in an urban environment can't use only depth sensors to create the environment model. Some areas, however flat, are not traversable (e.g. a lawn). Other places, like a pedestrian crossing, should be recognized so that a special strategy for traversing can be applied. This can be done with a visual camera. Using only a monocular RGB camera, the robot would have problems distinguishing between asphalt and a flat, vertical, gray wall. It is much easier to classify terrain using two complementary sensors.

In the paper we present a way to classify terrain using data from RGB-D sensors (in our case a laser scanner and a visual camera). We are interested in the segmentation of an image and in labeling the detected areas. To this end, we applied a classification strategy which utilizes SVM classification and AdaBoost. We present results from indoor and outdoor experiments. The obtained results are compared with other approaches to show the efficiency of the proposed method.
1.2. Related work and research contribution

Most of the existing terrain classification methods employ RGB cameras for feature extraction [5, 13]. In our work we increase the robustness of the classification procedure by incorporating information about the depth of the scene. Laible et al. showed that the classification accuracy can be increased by analyzing the whole scene and taking into account neighboring regions [15]. The decision about the terrain type can't be taken using local terrain properties only. Context, neighboring terrain types and the location of the considered image segment play an important role in the classification procedure. To join information from weak classifiers Laible et al. proposed the application of Conditional Random Fields [15].

The joint application of a 2D laser scanner and an RGB camera to terrain classification is not new. Dahlkamp et al. proposed to use data from a range finder to supervise the learning algorithm [6]. The surface model obtained from depth data is used to find a traversable area (road). Then, the visual data from a camera is used by a learning algorithm. The classification method uses a mixture of Gaussians in RGB space to classify the terrain. The model is updated whenever a new learning dataset is provided by the self-supervising procedure. The re-learning procedure allows the system to adapt even when the road changes from gray asphalt to green grass. In our case this situation is not desirable. The classifier should determine not only the traversability, but also the terrain type. Grass (however flat) should also be considered an obstacle for our robot. In contrast to the method presented by Dahlkamp et al., we use RGB and depth data during the classification stage.

A reliable terrain classification can also be obtained using visual features and SVM classification [9]. In the method proposed by Filitchkin and Byl, SURF features are used. To deal with various surfaces, which differ in the number of visual features, an optimization of the Hessian detection threshold is proposed. However, feature-based classification is sensitive to motion blur. Thus, we decided to use a few independent sources of information.

Most reliable classification methods suffer from high computational cost. Angelova et al. proposed a cascade of classifiers instead of a single, multidimensional classification to obtain high speed and preserve high classification accuracy [1]. They take advantage of the fact that some terrain types might be easily separated from the others. This observation can be used to create a decision tree. The classification starts from the fastest classification sub-procedure. The most computationally expensive procedures are performed at the end and only for regions which are difficult to distinguish.

Additional classification capabilities are available for legged robots. Such robots can use force/torque sensors as an additional source of information for the classification procedure [12, 19]. Wheeled robots can also use properties of the contact with the ground to support the classification procedure (e.g. vibrations which propagate through the suspension structure [11]).
2. Perception and data acquisition

The environment perception is based on two sensors: a generic USB camera (Microsoft LifeCam Studio) and a laser range finder. The robot acquires VGA (640×480) images. VGA resolution is sufficient for classification and keeps the computation time of the procedure low. The Hokuyo UTM-30LX laser range finder used in this research can operate outdoors. The range of the sensor is up to 30 m. The angular resolution is 0.25° and each scan takes 25 ms. A single scan contains information about the terrain profile. When the robot moves forward or rotates it acquires the 3D shape of the environment. Both sensors are tilted down to acquire terrain properties.

To create a map of the environment the robot has to determine the position of the sensors in the global coordinate system O_G at each scan of the range finder. The robot uses GPS, encoders and an Inertial Measurement Unit (IMU) to localize itself. Data from all sensors are integrated using a Kalman filter. The robot is equipped with an on-board CHR-UM6 IMU sensor. It allows the shape of the environment to be measured properly on rough terrain: the robot can take into account the inclination of the platform during integration of the measurements.

In our research we use two different mobile platforms. However, the presented classification method is platform independent. The configuration of the sensors is presented in Fig. 1. The coordinate systems O_K, O_I and O_L are attached to the camera, IMU and laser range finder, respectively.

Fig. 1. Configuration of the sensors attached to the robot's platform

To integrate data from all sensors, the correspondence between each pixel of the camera image and the points of the laser scan has to be known. To this end, the pose of each sensor has to be determined by a calibration procedure. The calibration procedure also determines the intrinsic parameters of the camera. We applied the Camera Calibration Toolbox for Matlab by Jean-Yves Bouguet [4] to find the focal length and the location of the principal point. To find the relation between the camera and the laser range finder a plane-to-line fitting method was applied [20]. We can use the checkerboard marker from the intrinsic calibration to determine a plane which represents the marker. From the laser scan we can find an equation of the line located on this plane. From the set of measurements we can compute the transformation between the camera and laser scanner coordinate systems. To present the calibration results between the camera and the laser scanner we draw a single scan on the camera image. The result is presented in Fig. 2.

Fig. 2. Calibration results of the camera and the laser range finder

Moreover, we should find the orientation between the exteroceptive sensors (camera and LRF) and the IMU unit. To this end, we used the method proposed by Lobo et al. [16]. In this case we use a checkerboard marker which is perpendicular to the gravity vector. Taking into account the orientation measured by the camera we can find the orientation offset of the IMU unit. Finally, to compute the pose of each measured point P_L in the global coordinate system O_G we apply (1):

$$P_G = {}^{G}A_I \cdot {}^{I}A_K \cdot {}^{K}A_L \cdot P_L, \tag{1}$$

where ${}^{K}A_L$ is a transformation from the camera coordinate system to the laser coordinate system, ${}^{I}A_K$ is a transformation from the IMU unit to the camera coordinate system and ${}^{G}A_I$ is the IMU unit pose in the global coordinate system obtained from the localization system.

3. Terrain classification

The input to our system is data from the RGB camera and the Hokuyo laser scanner. The architecture of the classification procedure is presented in Fig. 3. At the beginning the segmentation is performed using the RGB image. For each segment we compute k visual and depth features (f1, ..., fk). The set of features is then used for classification. We decided to use a combination of SVM weak classifiers and a boosting technique to join the results (Fig. 4). It was shown that this approach has better performance in training time [10]. We also show that the classification results are better. To use the boosting technique we should use weak classifiers (which perform better than a random classifier, e.g. a Decision Stump). Instead we can use strong classifiers, e.g. SVM or a Neural Network, which are appropriately weakened [7]. In our method n weak SVM classifiers (c1, ..., cn) are used. The output C from the classifier is the terrain category recognized by the system.

Fig. 3. Structure of the terrain classification procedure which uses RGB-D input data and returns labeled image (blocks: input image and LRF data, segmentation, cascade of classifiers, labeled image)

Fig. 4. Classification scheme with SVM weak classifiers and AdaBoost (blocks: RGB and depth features f1, ..., fk; SVM weak classifiers c1, ..., cn; AdaBoost; output C)

3.1. Segmentation
In our method we perform image segmentation and then classi ication for each RGB-D segment. We avoid classi ication for each pixel separately because we don’t always have corresponding depth for each pixel. Moreover, single pixel does not contain all information about the terrain properties like roughness obtained from depth data or variance of color. We also avoid dividing the image into regular mesh [14, 17]. Constant and rectangular region may contain two separate terrains and such a cell should be classi ied as two separate classes. Instead, we perform image segmentation and then we classify each region separately. We use a method proposed by Pedro Fenzenszwalb and Daniel Huttenlocher for image segmentation [8]. The segmentation method used in our system divides an image into components. The behavior of the segmentation method is speci ied only by two parameters. First parameter kc is responsible for preferred component size. Second parameter Smin represents the minimal size of components. Before segmentation the image is smoothed by Gaussian ilter with σ = 0.8. At the beginning of the segmentation the algorithm creates initial graph G (Fig. 5). Each edge E of the graph connects two vertices, representing single neighboring pixels of the image. The weight w of an edge is computed using difference in pixel color values [r,g,b] with an Euclidean distance. Then, edges of the graph are sorted in an ascending order with respect to weights w. A set of segments is initialized – each vertex of the graph represent separate segment. Next, for each edge which belongs to separate components weight wq is computed. The computed weight wq is compared with threshold
Journal of Automation, Mobile Robotics & Intelligent Systems
Start
Create Initial Init set of Graph G(V,E) components S0, q=0
Merge components smaller than Smin Finish
VOLUME 8,
a
N◦ 3
2014
b
q<size(E) ?
Compute weight wq and threshold wt
Fig. 7. Intensity values obtained from Hokuyo laser range finder – observed scene (a) and registered intensity values (b)
wq<wt ?
one-dimensional vector, merge two components connected by edge eq, q=q+1
Fig. 5. Segmenta on procedure used for image par oning a
b
Fig. 6. Segmenta on results – original image (a) and output components (b) weight wt : kc kc , IN T (vj )+ ), Size(vi ) Size(vj ) (2) where vi is the considered vertex, vj is the neighboring vertex, Size(v) is the size of component represented by vertex v and IN T (v) is the maximal weight between vertices which create the whole component. If wq is smaller or equal wt vertices are merged into single component S. In an opposite case, the con iguration of segments does not change. Finally, the algorithm removes components which are smaller than threshold Smin . Components which are too small are merged with neighboring components. The algorithm returns a set of components S. The results of the segmentation procedure are presented in Fig. 6. wt = min(IN T (vi )+
3.2. Classification
For classification we use the Support Vector Machine (SVM) supervised learning algorithm [3]. We decided to use SVM because it works well with multidimensional input vectors. The output of the classifier is the value of assignment to each terrain category. We created five weak classifiers. The input for each classifier is defined as follows:
1) two-dimensional histogram of values in the Hue-Saturation color space (4 × 4 bins) converted to a one-dimensional vector,
2) 8-bin histogram of Value in HSV space,
3) mean and covariance matrix for pixels in HSV space converted to a 1 × 12 vector,
4) mean and covariance matrix for values of depth and intensity from the Hokuyo laser scanner converted to a 1 × 6 vector,
5) 25-bin histogram of intensity values from the Hokuyo laser scanner.
The first three classifiers use color image features as input. The other two classifiers use data from the depth sensor. We use depth data directly as well as the intensity values provided by the Hokuyo driver. The intensity value depends on the color and texture of the surface. Thus, the intensity value provides important information about the observed surface [15]. Example intensity values for various terrain types are presented in Fig. 7. Using intensity values we can easily distinguish between various types of terrain without direct information about the color of the surface. For boosting we use an improved version of the AdaBoost algorithm [18] based on the MultiBoost implementation [2], which deals with multi-class weak learning.
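The color features listed above can be sketched as follows. This is an illustration only: the function names are hypothetical, and normalising histograms to sum to 1 and using the population covariance (division by n) are assumptions that the text does not specify. The sketch shows where the vector lengths come from: 4 × 4 = 16 bins, 3 + 3 × 3 = 12 values for HSV statistics, and 2 + 2 × 2 = 6 values for depth and intensity statistics.

```python
import colorsys


def hs_histogram(pixels, bins=4):
    """Hue-Saturation histogram (bins x bins) flattened to a one-dimensional vector.

    pixels: non-empty list of (r, g, b) tuples with components in 0..255.
    """
    hist = [0.0] * (bins * bins)
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * bins), bins - 1)
        si = min(int(s * bins), bins - 1)
        hist[hi * bins + si] += 1.0
    return [c / len(pixels) for c in hist]  # normalisation is an assumption


def value_histogram(pixels, bins=8):
    """Histogram of the HSV Value channel (V = max(r, g, b))."""
    hist = [0.0] * bins
    for r, g, b in pixels:
        v = max(r, g, b) / 255.0
        hist[min(int(v * bins), bins - 1)] += 1.0
    return [c / len(pixels) for c in hist]


def mean_cov_vector(samples):
    """Mean followed by the row-major covariance matrix: d + d*d values."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / n
           for a in range(d) for b in range(d)]
    return mean + cov
```

Each vector would then feed one weak SVM classifier; the SVM training and the boosting stage are not shown here.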
4. Results
The first experiments were performed indoors. Our goal was to avoid problems with the uncertainty of the localization system, which introduces mapping error. We created a mockup with various terrain types such as artificial grass, elastic gum, timber and tile floor (Fig. 7a). For training the classifiers we used 33 manually marked scenes. The next 35 scenes were used for testing. For segmentation we set k = 50, which yields 3000 training samples. For testing we use k = 200, which yields segments that represent a bigger area. Thus, we avoid situations in which grass is divided into green patches representing grass and small black patches representing soil. We are interested in classification of the whole region with heterogeneous texture. Example classification results are presented in Fig. 8. Colors in Fig. 8b represent the terrain types: green – grass, brown – timber floor, blue – asphalt, yellow – rocky terrain. Only some small areas are classified improperly. A component of the timber floor is classified as rocky terrain. Also, a small region between the grass and the floor is classified improperly as asphalt (mainly because of the black color of this part).

Tab. 1. Confusion matrix for indoor experiment
terrain type    grass   timber   rocks   asphalt
grass           94%     5%       1%      0%
timber          0%      90%      10%     0%
rocks           0%      3%       96%     1%
asphalt         0%      1%       2%      96%

We also performed a statistical analysis for the whole testing set. The results are presented in Tab. 1. Each row in the table represents the terrain type marked by the expert. Each column represents the output of the proposed classification method. The classification results p_{c1,c2} presented in Tab. 1 are computed as follows:

$$p_{c_1,c_2} = \frac{N_{c_1}}{N_{c_2}} \cdot 100\%, \tag{3}$$

where N_{c1} is the number of pixels classified as class c1 (counted among the pixels marked by the expert as class c2) and N_{c2} is the number of pixels marked by the expert as class c2. It means that 94% of the pixels which belong to grass are classified properly as grass, 5% are classified as timber and 1% as rocks.

Tab. 2. Comparison between various configurations of the proposed classifier and input features
                classifier type
terrain type    Cc      Cl      MON     Cone    Ckl
grass           43%     95%     86%     95%     94%
timber          94%     16%     75%     71%     90%
rocks           39%     85%     98%     98%     96%
asphalt         64%     97%     96%     89%     96%
average         60%     75%     86%     89%     94%
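The percentages of Tab. 1, as defined by Eq. (3), can be computed from per-pixel labels as in the short sketch below. The function name `confusion_percentages` and the label lists are hypothetical; the sketch only assumes that an expert label and a classifier output are available for every evaluated pixel.

```python
def confusion_percentages(expert, predicted, classes):
    """Row-normalised confusion matrix in percent.

    expert, predicted: equally long sequences of class names, one entry per pixel.
    Rows are expert classes, columns are classifier outputs, as in Tab. 1.
    """
    counts = {c: {k: 0 for k in classes} for c in classes}
    for e, p in zip(expert, predicted):
        counts[e][p] += 1
    table = {}
    for c in classes:
        total = sum(counts[c].values())  # N_c2: pixels marked by the expert as c
        table[c] = {k: (100.0 * counts[c][k] / total if total else 0.0)
                    for k in classes}
    return table
```

With 100 pixels marked as grass, of which 94 are classified as grass, 5 as timber and 1 as rocks, the `grass` row of the result reproduces the first row of Tab. 1.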
We also compared various configurations of classifiers and input features. We compared five configurations:
1) Cc – SVM classification with AdaBoost and features computed using the RGB image only,
2) Cl – SVM classification with AdaBoost and features computed using depth data only,
3) MON – SVM classification without AdaBoost using a single vector of features computed using RGB and depth data,
4) Cone – SVM classification with AdaBoost and a single feature vector computed using RGB and depth data,
5) Ckl – SVM classification with AdaBoost and features computed using RGB and depth data (the solution proposed in the paper).
The results of the comparison experiment are presented in Tab. 2. The best performance is obtained by the classifier proposed in the paper. Its average classification accuracy is 94%, while the standard SVM classifier reaches 86%.

Fig. 8. Results – the example scene (a) and classification results (b)

Tab. 3. Computation time
task                  step             time [s]
segmentation          sorting          0.699
segmentation          segmentation     0.414
segmentation          merging          0.274
features extraction   pre-processing   0.122
features extraction   computation      0.022
classification        computation      0.260
total                                  1.791
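As a quick arithmetic check, the stage times of Tab. 3 can be summed per task and compared with the 2 s that the robot needs to acquire information about new terrain. The dictionary layout and the helper name `time_per_task` below are illustrative assumptions; the numbers are those reported in Tab. 3.

```python
# Stage times from Tab. 3, in seconds, keyed by (task, step)
STAGE_TIMES = {
    ("segmentation", "sorting"): 0.699,
    ("segmentation", "segmentation"): 0.414,
    ("segmentation", "merging"): 0.274,
    ("features extraction", "pre-processing"): 0.122,
    ("features extraction", "computation"): 0.022,
    ("classification", "computation"): 0.260,
}

ACQUISITION_TIME = 2.0  # seconds needed by the robot to observe new terrain


def time_per_task(stage_times):
    """Sum the step times belonging to each task."""
    totals = {}
    for (task, _step), seconds in stage_times.items():
        totals[task] = totals.get(task, 0.0) + seconds
    return totals


task_totals = time_per_task(STAGE_TIMES)
total_time = sum(STAGE_TIMES.values())
fits_budget = total_time < ACQUISITION_TIME
```

Segmentation accounts for about 1.39 s of the 1.791 s total, which is why it is the first candidate for speeding up.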
We also checked the computation time of each element of the proposed procedure. The results are presented in Tab. 3. The most time-consuming part is the segmentation procedure. Sorting of edges takes 0.7 s, segmentation 0.4 s and removing segments smaller than the threshold almost 0.3 s. Feature extraction is faster and takes only 0.144 s, including preparation of depth and color data and feature computation. The classification takes 0.26 s. The whole classification procedure takes 1.791 s. This is fast enough to implement the method on the real robot, because the robot needs at least 2 s to acquire information about new terrain.

4.1. Outdoor experiment
We also performed an outdoor experiment on the robot with the final setup of sensors. The robot classifies grass, asphalt and two types of pavement (pave1 and pave2 in Tab. 4). The color and the geometrical properties of the pavements and the asphalt are similar. Thus we added a new set of features which makes it possible to distinguish between similar terrain types. The new inputs of the classifier are related to the shape of the segmented regions. For each region we detect line segments using RANSAC. The line segments are used to compute additional input features. The input values are as follows:
1) regularity coefficient, computed as the sum of line segment lengths divided by the total number of pixels which belong to the border of the region,
2) mean of line segment lengths,
3) variance of line segment lengths,
4) number of line segments,
5) 10-bin histogram of line segment orientations.
The classification results are presented in Tab. 4. The average classification precision is 82%. It is significantly lower than in the experiments performed indoors. The outdoor experiments on the real robot are more challenging. The first difficulty is connected to the similarity between classified regions. The other difficulties are caused by the imprecise localization system (odometry and IMU). The robot moves in irregular terrain. Thus, imprecise measurements of the inclination of the robot's platform cause incorrect location of the 3D points obtained from range measurements. Example classification results are presented in Fig. 9. Colors in Fig. 9c represent the terrain types: green – grass, yellow – pavement 1, red – pavement 2, blue – asphalt. The classification results are accurate enough to use the proposed method on the real robot dedicated to a robotic competition.

Tab. 4. Confusion matrix for an outdoor experiment
terrain type    grass   pave1   pave2   asphalt
grass           99%     1%      1%      0%
pave1           18%     81%     2%      0%
pave2           4%      25%     68%     3%
asphalt         0%      3%      37%     60%

Fig. 9. Results of the outdoor experiment – the example scene (a), segmentation (b) and classification (c) results.

5. Conclusions and future work
In the paper we presented a terrain classification method for a mobile robot. We show that the performance of the classification can be increased by using a boosting technique to combine the outputs of weak SVM classifiers. The results for the combined SVM and AdaBoost classifier are better than for a single SVM classifier with a multidimensional feature vector. The SVM algorithm works efficiently with multi-dimensional problems. By using our method we reduce the dimensionality of the problem. The performance of the classification increases as a result. To show the performance and advantages of our method we performed experiments indoors on a terrain mockup and outdoors in a real environment. We carried out an analysis of the classification results. We compared various combinations of classification inputs and configurations of the classifier. We conclude that the classification results are better when depth data are used. The advantages of the method which uses the data from the LRF are mainly due to the intensity values. They provide information about the properties of the object's surface, which is well utilized by the classifier. We also show the computation time of each element of the procedure. The most expensive part is segmentation. It takes more than 1 s to divide the image into segments. From the application point of view we are going to replace the existing procedure with a faster one. On the other hand, our goal is to increase the performance of the segmentation procedure. To this end, we are going to use methods which take into account depth and color data simultaneously during segmentation. In the future we are going to add another layer to the classification method. Our goal is to take into account the classification results of neighboring segments as well as the depth and color of the considered segment. We believe that context-aware segmentation will improve the efficiency of the classification procedure.

AUTHORS
Jan Wietrzykowski – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail: Jan.Wietrzykowski@student.put.poznan.pl.
Dominik Belter∗ – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail: Dominik.Belter@put.poznan.pl.
∗ Corresponding author
REFERENCES
[1] A. Angelova, L. Matthies, D. Helmick, and P. Perona, “Fast terrain classification using variable-length representation for autonomous navigation”. In: Proceedings of the conference on Computer Vision and Pattern Recognition, Minneapolis, USA, 2007, pp. 1–8, doi: 10.1109/CVPR.2007.383024.
[2] D. Benbouzid, R. Busa-Fekete, N. Casagrande, F.-D. Collin, and B. Kegl, “Multiboost: a multipurpose boosting package”, Journal of Machine Learning Research, vol. 13, 2012, pp. 549–553.
[3] C. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[4] J.-Y. Bouguet. “Camera calibration toolbox for matlab”, 2014, www.vision.caltech.edu/bouguetj/calib_doc.
[5] J. Chetan, M. Krishna, and C. Jawahar, “Fast and spatially-smooth terrain classification using monocular camera”. In: Proceedings of 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 2010, pp. 4060–4063.
[6] H. Dahlkamp, A. Kaehler, D. Stavens, S. Thrun, and G. Bradski, “Self-supervised monocular road detection in desert terrain”. In: Proceedings of Robotics: Science and Systems, Philadelphia, USA, 2006.
[7] T. Dietterich, “An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization”, Machine Learning, vol. 40, no. 2, 2000, pp. 139–157.
[8] P. Felzenszwalb and D. Huttenlocher, “Efficient graph-based image segmentation”, International Journal of Computer Vision, vol. 59, no. 2, 2004, pp. 167–181, doi: 10.1023/B:VISI.0000022288.19776.77.
[9] P. Filitchkin and K. Byl, “Feature-based terrain classification for littledog”. In: Proceedings of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Vilamoura, Portugal, 2012, pp. 1387–1392, doi: 10.1109/IROS.2012.6386042.
[10] E. García and F. Lozano, “Boosting support vector machines”. In: Proceedings of 5th International Conference on Machine Learning and Data Mining in Pattern Recognition, Leipzig, Germany, 2007, pp. 153–167.
[11] I. Halatci, C. Brooks, and K. Iagnemma, “Terrain classification and classifier fusion for planetary exploration rovers”. In: Proceedings of 2007 IEEE Aerospace Conference, Big Sky, USA, 2007, pp. 1–11, doi: 10.1109/AERO.2007.352692.
[12] M. Hoepflinger, C. Remy, M. Hutter, L. Spinello, and R. Siegwart, “Haptic terrain classification for legged robots”. In: Proceedings of 2010 IEEE International Conference on Robotics and Automation (ICRA), Anchorage, USA, 2010, pp. 2828–2833, doi: 10.1109/ROBOT.2010.5509309.
[13] R. Karlsen and G. Witus, “Terrain understanding for robot navigation”. In: Proceedings of 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Diego, USA, 2007, pp. 895–900, doi: 10.1109/IROS.2007.4399223.
[14] Y. Khan, P. Komma, and A. Zell, “High resolution visual terrain classification for outdoor robots”. In: Proceedings of IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, 2011, pp. 1014–1021, doi: 10.1109/ICCVW.2011.6130362.
[15] S. Laible, Y. Khan, and A. Zell, “Terrain classification with conditional random fields on fused 3d lidar and camera data”. In: Proceedings of European Conference on Mobile Robots, Barcelona, Spain, 2013, pp. 172–177, doi: 10.1109/ECMR.2013.6698838.
[16] J. Lobo and J. Dias, “Relative pose calibration between visual and inertial sensors”, International Journal of Robotics Research, vol. 26, no. 6, 2004, pp. 561–575, doi: 10.1177/0278364907079276.
[17] D. Maier, C. Stachniss, and M. Bennewitz, “Vision-based humanoid navigation using self-supervised obstacle detection”, International Journal of Humanoid Robotics, vol. 10, no. 2, 2013, doi: 10.1142/S0219843613500163.
[18] R. Schapire and Y. Singer, “Improved boosting algorithms using confidence-rated predictions”, Machine Learning, vol. 37, 1999, pp. 297–336, doi: 10.1145/279943.279960.
[19] K. Walas, A. Schmidt, M. Kraft, and M. Fularz, “Hardware implementation of ground classification for a walking robot”. In: Proceedings of the 9th International Workshop on Robot Motion and Control, Wąsowo, Poland, 2013, pp. 110–115, doi: 10.1109/RoMoCo.2013.6614594.
[20] Q. Zhang and R. Pless, “Extrinsic calibration of a camera and laser range finder”. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, 2004, pp. 2301–2306, doi: 10.1016/j.proeng.2012.01.669.
Power System State Estimation Using Dispersed Particle Filter
Submitted: 30th June 2014; accepted: 10th July 2014
Piotr Kozierski, Marcin Lis, Dariusz Horla
DOI: 10.14313/JAMRIS_4-2013/25
Abstract: The paper presents a new approach to particle filtering, i.e. the Dispersed Particle Filter. This algorithm has been applied to the power system, but it can also be used in other transmission networks. In this approach, the whole network must be divided into smaller parts. As has been shown, the use of the Dispersed Particle Filter improves the quality of the state estimation compared to a simple particle filter. It has also been checked that communication between subsystems improves the obtained results. This has been verified by means of simulation based on a model created on the basis of knowledge about practically functioning power systems.
Keywords: particle filter, power system, state observer, state estimation, dispersed estimation
1. Introduction
The Power System State Estimation (PSSE) problem is over 40 years old: the first idea of state estimation in a power system was proposed by Fred Schweppe in 1970 [21]. PSSE is nevertheless still in use, also in more advanced calculations such as Optimal Power Flow (OPF). In this case, correct state vector estimation directly affects the final solution. PSSE is also important in terms of energy security, to prevent so-called “blackouts”, the highest degree of failure of the power system, in which many villages and towns can be left without access to electric power. To prevent such accidents, ongoing control of the results obtained in the PSSE calculations is needed. Many different algorithms have been created so far to solve the PSSE problem, such as Weighted Least Squares (WLS) and varieties of the Kalman Filter (Extended Kalman Filter (EKF) [11], Unscented Kalman Filter (UKF) [24]). In [23] the authors presented the use of a different estimator than the one typically used in the WLS method, and apart from the typical Newton method they suggested the use of the Primal-dual Interior Point Method (PDIPM). In [4], the authors proposed a variety of the Particle Filter (PF) as a state observer in a relatively small power system. In this article a new algorithm, the Dispersed Particle Filter (DPF), is proposed. It uses not one but several PF instances running in parallel for different parts of the power system (each instance can be carried out in a separate computational unit, which can be placed in another area of the power system). This approach is consistent with the need, indicated in [10], for PSSE based on data from different control centres. The second Section is devoted to the power system and is followed by a Section presenting basic information about the particle filter, while the fourth Section describes the simulations that have been carried out, presenting the results and the conclusions. Section 5 summarizes the entire article.

Fig. 1. Branch in network between i-th and j-th buses
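2. Power System
The power system has been selected as the object. The network is composed of B buses (nodes) and L branches (lines) that connect specified buses. The branch scheme is shown in Fig. 1, where \underline{y}'_{ij}/2 is half of the total line charging susceptance [25], whereas \underline{y}_{ij} is the line admittance, which can be expressed by the equation

$$\underline{y}_{ij} = \frac{1}{\underline{Z}_{ij}} = \frac{1}{R_{ij} + jX_{ij}}. \tag{1}$$

Based on \underline{y}_{ij} and \underline{y}'_{ij}/2, the admittance matrix \underline{Y} can be created according to the equations

$$\underline{Y}_{ij} = -\underline{y}_{ij}, \qquad i \neq j, \tag{2}$$

$$\underline{Y}_{ii} = \sum_{\substack{j=1 \\ j \neq i}}^{B} \left( \frac{\underline{y}'_{ij}}{2} + \underline{y}_{ij} \right). \tag{3}$$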
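Equations (1)–(3) translate directly into a few lines of code. The sketch below is illustrative only: the function name `admittance_matrix`, the 0-based bus indexing and the branch tuple layout are assumptions, and the half line-charging term is passed as a complex admittance so that no assumption about its real part is needed.

```python
def admittance_matrix(num_buses, branches):
    """Build the bus admittance matrix from a branch list.

    branches: iterable of (i, j, R, X, y_half) with 0-based bus indices,
    where y_half is the complex half line-charging admittance y'_ij / 2.
    """
    Y = [[0j] * num_buses for _ in range(num_buses)]
    for i, j, r, x, y_half in branches:
        y = 1 / complex(r, x)     # Eq. (1): line admittance
        Y[i][j] -= y              # Eq. (2): off-diagonal elements
        Y[j][i] -= y
        Y[i][i] += y_half + y     # Eq. (3): diagonal elements
        Y[j][j] += y_half + y
    return Y
```

For a single lossless branch with X = 0.1 and y'/2 = 0.01j, the off-diagonal element is 10j and both diagonal elements are -9.99j.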
The set of all voltages U and angles δ at the buses unambiguously describes the state of the power system

$$\mathbf{x} = [x_1 \; x_2 \; \ldots \; x_{2B-1}]^T = [U_1 \; \ldots \; U_B \; \delta_2 \; \ldots \; \delta_B]^T, \tag{4}$$

because based on them it is possible to calculate all the other values in the power system. There is a problem with the angles, however, because the only thing that can be calculated is the difference between them. Therefore, one reference node must be chosen, in which the angle is always equal to 0 (in expression (4) the first node has been chosen as the reference, hence δ1 is always equal to 0 and this angle is not included in the state vector). Accordingly, the state vector is limited to 2B − 1 variables. The measured values in the network are power injections in the nodes, power flows through the branches and voltage values in the buses. The last type of measurement (voltage magnitude) is special, because it is directly a state variable. In other cases, the measured values can be expressed by the equations
$$P_i(U, \delta) = P_i = \sum_{j=1}^{B} U_i U_j Y_{ij} \cos(\delta_i - \delta_j - \mu_{ij}), \tag{5}$$

$$Q_i(U, \delta) = Q_i = \sum_{j=1}^{B} U_i U_j Y_{ij} \sin(\delta_i - \delta_j - \mu_{ij}), \tag{6}$$

$$P_{ij}(U, \delta) = P_{ij} = U_i^2 Y_{ij} \cos(-\mu_{ij}) - U_i U_j Y_{ij} \cos(\delta_i - \delta_j - \mu_{ij}), \tag{7}$$

$$Q_{ij}(U, \delta) = Q_{ij} = U_i^2 Y_{ij} \sin(-\mu_{ij}) - U_i U_j Y_{ij} \sin(\delta_i - \delta_j - \mu_{ij}) + U_i^2 \frac{y'_{ij}}{2}, \tag{8}$$

where (5)–(6) are the power injections (active and reactive), while (7)–(8) are the power flows. It should be noted that P_{ij} and P_{ji} are two different power flows (as are Q_{ij} and Q_{ji}); the first index specifies the node where the measurement is made. The Y_{ij} and µ_{ij} values are obtained from the admittance matrix based on

$$\underline{Y}_{ij} = Y_{ij} \cdot \exp(j\mu_{ij}). \tag{9}$$

For more information about the power system, references [1, 18, 25] are recommended.
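The power injections of Eqs. (5)–(6) can be evaluated from a complex admittance matrix as sketched below. The function name `power_injections` and the argument layout are illustrative assumptions; the magnitude Y_ij and angle µ_ij are obtained from each complex element as in Eq. (9).

```python
import cmath
import math


def power_injections(U, delta, Y):
    """Active and reactive power injections for every bus.

    U: voltage magnitudes, delta: voltage angles in radians,
    Y: complex bus admittance matrix as a list of lists.
    """
    B = len(U)
    P = [0.0] * B
    Q = [0.0] * B
    for i in range(B):
        for j in range(B):
            mag, mu = cmath.polar(Y[i][j])          # Eq. (9)
            angle = delta[i] - delta[j] - mu
            P[i] += U[i] * U[j] * mag * math.cos(angle)  # Eq. (5)
            Q[i] += U[i] * U[j] * mag * math.sin(angle)  # Eq. (6)
    return P, Q
```

The same values follow from the complex power S_i = V_i · conj(Σ_j Y_ij V_j) with V_i = U_i · exp(jδ_i), which gives a convenient cross-check of an implementation.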
3. Particle Filter
The principle of operation is based on Bayesian filtering, and the PF is one of the possible implementations of the Bayes filter [3]

$$\underbrace{p\left(\mathbf{x}^{(k)} | \mathbf{Y}^{(k)}\right)}_{\text{posterior}} = \frac{\overbrace{p\left(\mathbf{y}^{(k)} | \mathbf{x}^{(k)}\right)}^{\text{likelihood}} \cdot \overbrace{p\left(\mathbf{x}^{(k)} | \mathbf{Y}^{(k-1)}\right)}^{\text{prior}}}{\underbrace{p\left(\mathbf{y}^{(k)} | \mathbf{Y}^{(k-1)}\right)}_{\text{evidence}}}, \tag{10}$$

where x(k) is the vector of state variables in the k-th time step, y(k) is the vector of measurements, whereas Y(k) is the set of output vectors from the beginning of the simulation to the k-th time step.
The posterior probability density function (PDF) is represented in the PF by a set of particles, where each particle is composed of the value x^i (vector of state variables) and the weight q^i. Therefore it can be written that the set of particles corresponds to the posterior PDF. The higher the number of particles N, the more accurate this approximation is

$$p\left(\mathbf{x}^{(k)} | \mathbf{Y}^{(k)}\right) \overset{N \to \infty}{=} \hat{p}\left(\mathbf{x}^{(k)} | \mathbf{Y}^{(k)}\right) = \sum_{i=1}^{N} q^{i,(k)} \delta\left(\mathbf{x}^{(k)} - \mathbf{x}^{i,(k)}\right), \tag{11}$$

where δ(·) is the Dirac delta. In [12] the authors point out that the prior should be chosen so that as many particles as possible are drawn in the area where the likelihood has significant values. The prior can be written as [2]

$$p\left(\mathbf{x}^{(k)} | \mathbf{Y}^{(k-1)}\right) = \int p\left(\mathbf{x}^{(k)} | \mathbf{x}^{(k-1)}\right) p\left(\mathbf{x}^{(k-1)} | \mathbf{Y}^{(k-1)}\right) d\mathbf{x}^{(k-1)}, \tag{12}$$

where p(x(k)|x(k−1)) is the transition model and p(x(k−1)|Y(k−1)) is the posterior from the previous time step.
Norbert Wiener was the first to suggest something that today can be considered a particle filter, as early as the first half of the twentieth century [22]. However, it was only a few decades later that the power of computers made it possible to implement these algorithms. The algorithm proposed in [9] by Gordon, Salmond and Smith, named by them the Bootstrap Filter (BF), is considered the first PF. The operation principle of the algorithm is presented below.

Algorithm 1 (Bootstrap Filter)
1) Initialization. Draw N initial particle values from the PDF p(x(0)). Set the iteration number k = 1.
2) Prediction. Draw N new particles based on the transition model: x^{i,(k)} ∼ p(x(k)|x^{i,(k−1)}).
3) Actualization (update). Calculate the weights of the particles based on the equation q̃^{i,(k)} = p(y(k)|x^{i,(k)}).
4) Normalization. Normalize the particle weights so that their sum is equal to 1

$$q^{i,(k)} = \frac{\tilde{q}^{i,(k)}}{\sum_{j=1}^{N} \tilde{q}^{j,(k)}}. \tag{13}$$

5) Resampling. Draw N new particles based on the posterior PDF obtained in steps 2–4.
6) End of iteration. Calculate the estimated vector value, increase the iteration number k = k + 1, go to step 2.

The BF algorithm belongs to the class of Sequential Importance Resampling (SIR) algorithms and is one of its simpler implementations. More complicated algorithms have also been proposed in the literature, for example the Auxiliary PF [20], the Rao-Blackwellised PF [8] or the Gaussian PF [13]. However, for the purpose of this article, the BF algorithm and its modifications were used, because the results obtained in previous studies [14, 16] are satisfactory. In addition, the simplicity of implementation for a problem as complex as PSSE led to the choice of the algorithm proposed in [9]. Resampling (step 5 in Algorithm 1) can also be made in several different ways. In the algorithm, stratified resampling has been used. To learn more about resampling, see [6, 17, 19]. For more information about the particle filter, references [2, 7, 15] are recommended.

3.1. Dispersed Particle Filter
A network composed of 16 buses has been proposed, as shown in Fig. 2, so there are 2B − 1 = 31 state variables. To use the DPF, the whole power system has been divided into 3 parts: PS1, PS2 and PS3. This division is one of many possible, as is the number of subsystems in this example. The main purpose of this article was to show how the division of the system into smaller parts affects the obtained results. For the DPF implementation the assumption has been made that only measurements inside the subnet are available. However, for good modeling, the border lines and the nodes at their ends are also required. For example, in the first subsystem there are 9 modeled nodes (1, 2, 3, 4, 5, 6 and additionally 7, 8 and 13) and the number of state variables in PS1 is 17 (in the other two subsystems there are 8 nodes and 15 state variables, since a reference node must exist in each subnet). As one can see, the number of state variables in each subsystem has been decreased, and this should have a positive influence on the estimation quality.
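The steps of Algorithm 1 with stratified resampling can be sketched on a toy problem. The sketch below is not the PSSE implementation: it assumes a scalar random-walk transition model and a Gaussian measurement model, and the names `stratified_resample`, `bootstrap_filter`, `sigma_v` and `sigma_e` are hypothetical. In the power system case, the likelihood in step 3 would instead be evaluated from the measurement equations (5)–(8).

```python
import math
import random


def stratified_resample(particles, weights, rng):
    """Draw len(particles) particles, one uniform draw per stratum [i/N, (i+1)/N)."""
    n = len(particles)
    out = []
    cumulative = weights[0]
    j = 0
    for i in range(n):
        u = (i + rng.random()) / n
        while u >= cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        out.append(particles[j])
    return out


def bootstrap_filter(measurements, n_particles, sigma_v, sigma_e, rng):
    """Toy scalar Bootstrap Filter; returns one state estimate per time step."""
    # 1) Initialization: draw from an assumed prior N(0, 1)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in measurements:
        # 2) Prediction: assumed random-walk transition model
        particles = [x + rng.gauss(0.0, sigma_v) for x in particles]
        # 3) Update: unnormalised weights from an assumed Gaussian likelihood
        raw = [math.exp(-0.5 * ((y - x) / sigma_e) ** 2) for x in particles]
        # 4) Normalization, Eq. (13); fall back to uniform weights on underflow
        total = sum(raw)
        if total > 0.0:
            weights = [w / total for w in raw]
        else:
            weights = [1.0 / n_particles] * n_particles
        # 5) Resampling
        particles = stratified_resample(particles, weights, rng)
        # 6) Estimate: mean of the equally weighted resampled particles
        estimates.append(sum(particles) / n_particles)
    return estimates
```

A call such as `bootstrap_filter([3.0] * 30, 500, 0.1, 0.5, random.Random(0))` returns estimates that move from the initial prior towards the measured value. In a dispersed setting, one such filter instance would run per subsystem on its reduced state vector.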
4. Simulations
A 16-node system has been proposed for simulations (see Fig. 2). The numbers in circles indicate the numbers of the nodes, and the values of R, X and y′/2 are, respectively, the line resistance, the line reactance and half of the total line charging susceptance. The double circles represent the locations of the generators, and single circles are the loads. The locations and types of measurements are also marked: the gray squares mark measurements of the power flows, while the gray circles mark the measurements of power injections and voltage values. One simulation, which consists of 100 time steps, has been prepared and used for all calculations. Simulation computations for all subsystems have been made not in parallel, but sequentially, one by one.

4.1. Simulation Results
The first simulations have been performed for the simple PF algorithm. The results are shown in Fig. 3. The simulation for each number of particles N has been repeated 100 times with different values of the seed of the random number generator. The value D, which is shown in the graph, was calculated based on the mean square errors (MSE) of each of the state variables (there are 31 state variables in the whole power system). This can be written as

$$D = 10^6 \cdot \sum_{i=1}^{31} (\mathrm{MSE}_i)^2. \tag{14}$$
Thanks to this, the estimation quality of the whole system can be represented by one coefficient. Next, another approach was examined, in which three different particle filters operate simultaneously, each in a different subsystem, thus creating the Dispersed Particle Filter. It was assumed that the individual subnets do not communicate with each other and do not exchange any information. Values of the state variables in each node were obtained from the estimated values in the subnets. The simulation results are shown in Fig. 4. As before, the graph shows the averaged results of D, based on 100 different simulations for each value of N. A significant improvement is visible, both in terms of the mean and the standard deviation. In the third approach the exchange of data between subnets was implemented. For each border branch, information about the power flows was passed to the other subsystem and was taken as additional measurements. For example, in PS3 one such border branch is the line (1,13). The information sent from PS1 to PS3 was the estimated values of P1,13 and Q1,13. Both of these values were treated in PS3 as additional measurements. Similar information was transferred from PS3 to PS1, this time the values P13,1 and Q13,1. The results are shown in Fig. 5. In the last approach the impact of one more measurement (transferred as in the previous case), the voltage magnitude in the bus, was checked. The results are shown in Fig. 6. All results (excluding the standard deviation) are shown in Fig. 7 for comparison. Based on the obtained results, one can see that the use of the DPF significantly improves the quality of estimation in comparison to the standard PF algorithm. Performance improves even for the DPF without any communication between subnets.
This can be explained by the fact that in the case of the standard PF the particles have to move in a 31-dimensional space (the number of state variables). In the case of the DPF the number of particles was smaller, but the number of dimensions of the state vector was also decreased, to 17 (for PS1) and 15 (for PS2 and PS3). Similar conclusions can be found in [5] (the case without communication). The results obtained for the DPF with additional measurements of power flows and voltage are very similar to those in which the voltage measurement is not passed. This is understandable, because the values of the power flows already contain information about the state variables in this node.
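The quality coefficient D of Eq. (14) can be computed from a simulated trajectory and its estimate as in the sketch below. The function names and the data layout (one state vector per time step) are illustrative assumptions; for the 16-bus system each state vector would have 31 elements.

```python
def mse_per_state(true_states, estimates):
    """Mean square error of every state variable over all time steps.

    true_states, estimates: lists (one entry per time step) of equally long vectors.
    """
    steps = len(true_states)
    dim = len(true_states[0])
    return [
        sum((est[i] - tru[i]) ** 2 for tru, est in zip(true_states, estimates)) / steps
        for i in range(dim)
    ]


def quality_coefficient(mse):
    """D = 1e6 * sum of squared MSE values, Eq. (14)."""
    return 1e6 * sum(m ** 2 for m in mse)
```

Because the MSE values are squared before summation, D is dominated by the worst-estimated state variables, so a single poorly tracked variable is clearly visible in the coefficient.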
5. Summary
The article presents a new approach to particle filtering in the problem of Power System State Estimation: the Dispersed Particle Filter. In each configuration of communication, the results obtained by the DPF were better than for the standard PF. This is because the division of the power system reduces the length of the state vector, and an improvement of estimation quality is observed as the order of the object decreases. The best results have been obtained for the cases with additional measurements. Further studies on the DPF are planned, including the impact of the number of subsystems on the simulation results. For a smaller number of state variables in a subsystem the simulation results should be better, hence the best division will probably be one subsystem for every bus. There are also plans for an FPGA implementation of the algorithm and for verification of the performance of the proposed algorithm in parallel computing on several computational units.

Fig. 2. Power system used in simulations composed of 16 buses, with marked measurements
Fig. 3. Mean D values for standard PF
Fig. 4. Mean D values for DPF without any communication between subsystems.
Fig. 5. Mean simulation results for DPF with additional measurements of power flows
Fig. 6. Mean simulation results for DPF with additional measurements of power flows and voltages
Fig. 7. Simulation results for different algorithms.
AUTHORS
Piotr Kozierski∗ – Poznan University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3a, 60-965 Poznan, Poland, e-mail: Piotr.Kozierski@gmail.com.
Marcin Lis – Poznan University of Technology, Institute of Electrical Engineering and Electronics, ul. Piotrowo 3a, 60-965 Poznan, Poland, e-mail: mail.dla.studenta@gmail.com.
Dariusz Horla – Poznan University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3a, 60-965 Poznan, Poland, e-mail: Dariusz.Horla@put.poznan.pl.
∗ Corresponding author
REFERENCES
[1] Abur A., Exposito A.G., “Power System State Estimation: Theory and Implementation”, Marcel Dekker, Inc., 2004, pp. 17–49. DOI: 10.1201/9780203913673.
[2] Arulampalam S., Maskell S., Gordon N., Clapp T., “A Tutorial on Particle Filters for On-line Nonlinear/Non-Gaussian Bayesian Tracking”, IEEE Proceedings on Signal Processing, vol. 50, no. 2, 2002, pp. 174–188.
[3] Candy J.V., “Bayesian Signal Processing”, WILEY, New Jersey, 2009, pp. 36–44. DOI: 10.1002/9780470430583.
[4] Chen H., Liu X., She C., Yao C., “Power System Dynamic State Estimation Based on a New Particle Filter”, Procedia Environmental Sciences, vol. 11, Part B, 2011, pp. 655–661.
[5] Djuric P. M., Lu T., Bugallo M. F., “Multiple particle filtering”, In: 32nd IEEE ICASSP, April 2007, III pp. 1181–1184. DOI: 10.1109/ICASSP.2007.367053.
[6] Douc R., Cappe O., Moulines E., “Comparison of Resampling Schemes for Particle Filtering”, In: Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, September 2005, pp. 64–69. DOI: 10.1109/ISPA.2005.195385.
[7] Doucet A., Freitas N., Gordon N., “Sequential Monte Carlo Methods in Practice”, Springer-Verlag, New York, pp. 225–246 (2001). DOI: 10.1007/978-1-4757-3437-9.
[8] Doucet A., Freitas N., Murphy K., Russell S., “Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks”, In: Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence, 2000, pp. 176–183.
[9] Gordon N.J., Salmond D.J., Smith A.F.M., “Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation”, IEE Proceedings-F, vol. 140, no. 2, 1993, pp. 107–113. DOI: 10.1049/ip-f-2.1993.0015.
[10] Horowitz S., Phadke A., Renz B., “The Future of Power Transmission”, IEEE Power and Energy Magazine, vol. 8, no. 2, 2010, pp. 34–40. DOI: 10.1109/MPE.2009.935554.
[11] Huang Z., Schneider K., Nieplocha J., “Feasibility Studies of Applying Kalman Filter Techniques to Power System Dynamic State Estimation”, In: Power Engineering Conference, IPEC 2007, December 2007, pp. 376–382.
[12] Imtiaz S.A., Roy K., Huang B., Shah S.L., Jampana P., “Estimation of States of Nonlinear Systems using a Particle Filter”, In: IEEE International Conference on Industrial Technology, ICIT 2006, December 2006, pp. 2432–2437. DOI: 10.1109/ICIT.2006.372687.
[13] Kotecha J.H., Djurić P.M., “Gaussian Particle Filtering”, IEEE Trans Signal Processing, vol. 51, no. 10, 2003, pp. 2592–2601. DOI: 10.1109/TSP.2003.816758.
[14] Kozierski P., Lis M., “Auxiliary and Rao-Blackwellised Particle Filters Comparison”, Poznan University of Technology Academic Journals: Electrical Engineering, Issue 76, 2013, pp. 79–88.
[15] Kozierski P., Lis M., “Filtr Czasteczkowy w Problemie Sledzenia – Wprowadzenie”, Studia z Automatyki i Informatyki, vol. 37, 2012, pp. 79–94.
[16] Kozierski P., Lis M., Krolikowski A., Gulczynski A., “Resampling – Essence of Particle Filter”, CREATIVETIME, Krakow, vol. 1, 2013, pp. 174–185.
[17] Kozierski P., Lis M., Zietkiewicz A., “Resampling in Particle Filtering – Comparison”, Studia z Automatyki i Informatyki, vol. 38, 2013, pp. 35–64.
[18] Kremens Z., Sobierajski M., “Analiza Systemow Elektroenergetycznych”, Wydawnictwa Naukowo-Techniczne, Warszawa, 1996, pp. 39–191.
[19] Murray L., Lee A., Jacob P., “Rethinking Resampling in the Particle Filter on Graphics Processing Units”, arXiv preprint, arXiv:1301.4019, 2013.
[20] Pitt M., Shephard N., “Filtering via Simulation: Auxiliary Particle Filters”, Journal of the American Statistical Association, vol. 94, no. 446, 1999, pp. 590–599. DOI: 10.1080/01621459.1999.10474153.
[21] Schweppe F.C., Rom D.B., “Power System Static-State Estimation, Part II: Approximate Model”, IEEE Transactions on Power Apparatus and Systems, vol. 89, no. 1, January 1970, pp. 125–130. DOI: 10.1109/TPAS.1970.292679.
[22] Simon D., “Optimal State Estimation”, WILEY–INTERSCIENCE, New Jersey, 2006, pp. 461–484. DOI: 10.1002/0470045345.
[23] Singh R., Pal B.C., Jabr R.A., “Choice of Estimator for Distribution System State Estimation”, IET Generation, Transmission & Distribution, vol. 3, Iss. 7, 2009, pp. 666–678. DOI: 10.1049/iet-gtd.2008.0485.
[24] Valverde G., Terzija V., “Unscented Kalman Filter for Power System Dynamic State Estimation”, IET Generation, Transmission & Distribution, vol. 5, Iss. 1, 2011, pp. 29–37. DOI: 10.1049/iet-gtd.2010.0210.
[25] Wood A.J., Wollenberg B., “Power Generation, Operation and Control”, John Wiley & Sons Inc., 1996, pp. 91–130, 453–513.
Journal of Automation, Mobile Robotics & Intelligent Systems
VOLUME 8,
N° 3
2014
Face Detection in Color Images Using Skin Segmentation
Submitted: 6th July 2014; accepted: 15th July 2014
Mohammadreza Hajiarbabi, Arvin Agah
DOI: 10.14313/JAMRIS_3-2014/26
Abstract: Face detection, which is a challenging problem in computer vision, can be used as a major step in face recognition. The challenges of face detection in color images include illumination differences, varying camera characteristics, different ethnicities, and other distinctions. In order to detect faces in color images, skin detection can be applied to the image. Numerous methods have been utilized for human skin color detection, including Gaussian models, rule-based methods, and artificial neural networks. In this paper, we present a novel neural network-based technique for skin detection, introducing a skin segmentation process for finding the faces in color images.
Keywords: skin detection, neural networks, face detection, skin segmentation, image processing
1. Introduction
Face recognition is an active area of research in image processing and computer vision. Face recognition in color images consists of three main phases. The first is skin detection, in which human skin is detected in the image. The second is face detection, in which the skin components found in the first phase are classified as belonging to a human face or not. The third phase is to recognize the detected faces. This paper focuses on the skin detection and face detection phases.
Face detection is an important step not only in face recognition systems, but also in many other computer vision systems, such as video surveillance, human-computer interaction (HCI), and face image retrieval systems. Face detection is the initial step in any of these systems. The main challenges in face detection are face pose and scale, face orientation, facial expression, ethnicity and skin color. Other challenges such as occlusion, complex backgrounds, inconsistent illumination conditions, and image quality further complicate face detection in images.
Skin color detection is also an important part of many computer vision applications such as gesture recognition, hand tracking, and others. Skin detection is itself challenging because of differing illumination between images, dissimilar camera and lens characteristics, and the range of human skin colors due to ethnicity. One important issue in this field is that some pixel color values are shared between human skin and other entities such as soil and other common items [2].
There are a number of color spaces that can be used for skin detection. The most common color spaces are RGB, YCbCr, and HSV. Each of these color spaces has its own characteristics. The RGB color space consists of red, green and blue components from which other colors can be generated. Although this model is simple, it is not suitable for all applications [16]. In the YCbCr color space, Y is the illumination (luma component), and Cb and Cr are the chroma components. In skin detection, the Y component can be discarded because illumination changes the apparent color of the skin. The equations for converting RGB to YCbCr are as follows:
Y = 0.299R + 0.587G + 0.114B
Cb = B – Y
Cr = R – Y
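As an illustration, the conversion above can be sketched in a few lines of Python. This is a minimal sketch of the simplified equations quoted in the text (unscaled chroma differences), not of any standardized YCbCr variant; the function names are hypothetical.

```python
def rgb_to_ycbcr_simple(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr) using the simplified equations in the text."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma (illumination)
    cb = b - y                             # blue-difference chroma
    cr = r - y                             # red-difference chroma
    return y, cb, cr


def chroma_features(r, g, b):
    """Return only (Cb, Cr); Y is discarded because it carries illumination."""
    _, cb, cr = rgb_to_ycbcr_simple(r, g, b)
    return cb, cr
```

For a gray pixel such as (128, 128, 128) both chroma values are approximately zero, which matches the intuition that Cb and Cr carry only color information.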
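The HSV color space has three components, namely, H (Hue), S (Saturation), and V (Value). Because V specifies the brightness information, it can be eliminated in skin detection. The equations for converting RGB to HSV are as follows:
\[ R' = \frac{R}{255}, \qquad G' = \frac{G}{255}, \qquad B' = \frac{B}{255} \]
\[ C_{max} = \max(R', G', B'), \qquad C_{min} = \min(R', G', B'), \qquad x = C_{max} - C_{min} \]
\[ H = \begin{cases} 60^\circ \times \left( \dfrac{G' - B'}{x} \bmod 6 \right), & C_{max} = R' \\ 60^\circ \times \left( \dfrac{B' - R'}{x} + 2 \right), & C_{max} = G' \\ 60^\circ \times \left( \dfrac{R' - G'}{x} + 4 \right), & C_{max} = B' \end{cases} \]
\[ S = \frac{x}{C_{max}} \]
for x not being zero; otherwise H = 0 and S = 0.
\[ V = C_{max} \]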
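The HSV conversion can be sketched as follows. The hue branches use the standard textbook definition of HSV, which is assumed here to be the one the equations refer to; the function name is hypothetical.

```python
def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB to (H in degrees, S, V) with the standard HSV definition."""
    rp, gp, bp = r / 255.0, g / 255.0, b / 255.0
    cmax = max(rp, gp, bp)
    cmin = min(rp, gp, bp)
    x = cmax - cmin
    if x == 0:
        # Gray pixel: hue is undefined and set to 0, saturation is 0
        h, s = 0.0, 0.0
    else:
        if cmax == rp:
            h = 60.0 * (((gp - bp) / x) % 6)
        elif cmax == gp:
            h = 60.0 * ((bp - rp) / x + 2)
        else:
            h = 60.0 * ((rp - gp) / x + 4)
        s = x / cmax
    v = cmax  # brightness; dropped when building skin features
    return h, s, v
```

In a skin detector only the first two components (H, S) would be kept, since V carries the brightness information.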
2. Skin Color Detection
Three families of skin detection methods are discussed in this section: Gaussian, rule-based, and neural network methods.
2.1. Gaussian Methods
This technique, which was proposed in [20], uses a Gaussian model to find human skin in an image. The RGB image is transformed to the YCbCr color space. The density function for the Gaussian variable X = (Cb Cr)^T ∈ R^2 is:
\[ f(X) = \frac{1}{2\pi |C|^{1/2}} \exp\left\{ -\frac{1}{2} (X - \mu)^T C^{-1} (X - \mu) \right\} \]
where µ is the mean vector and C is the covariance matrix of X.
The parameters are calculated using sets of training images. For each pixel, the density function is evaluated; only the (Cb Cr) value is used because the Y component carries illumination information, which is not related to skin color. A pixel whose probability value exceeds a specified threshold is considered skin. The final output is a binary image in which non-skin pixels are shown in black and human skin in white. It is worth mentioning that the values of the parameters µ and C are calculated from a specific set of training samples and will vary if other training samples are used.
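The Gaussian classifier described above can be sketched as follows. This is a minimal illustration, not the implementation of [20]: the function names are hypothetical, and the mean, covariance and threshold must be supplied by the caller from training data.

```python
import math


def skin_likelihood(cb, cr, mean, cov):
    """Bivariate Gaussian density of X = (Cb, Cr)^T for a given mean and 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of the 2x2 covariance matrix
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (cb - mean[0], cr - mean[1])
    # Quadratic form (X - mu)^T C^{-1} (X - mu)
    m = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * m) / (2.0 * math.pi * math.sqrt(det))


def skin_mask(cbcr_image, mean, cov, threshold):
    """Binary image: 1 (white) where the density exceeds the threshold, else 0 (black)."""
    return [[1 if skin_likelihood(cb, cr, mean, cov) > threshold else 0
             for (cb, cr) in row]
            for row in cbcr_image]
```

The density peaks at the mean, so pixels whose chroma is close to the trained skin mean receive the highest values and pass the threshold first.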
2.2. Rule-Based Methods
Skin detection based on rule-based methods has been used in several research efforts as the first step in face detection. Chen et al. analyzed the statistics of different colors [5]. They used 100 images for training, consisting of skin and non-skin in order to calculate the conditional probability density function of skin and non-skin colors. After applying Bayesian classification, they determined the rules and constraints for the human skin color segmentation. The rules are:
r(i) > α, β1 < (r(i) – g(i)) < β2, γ1 < (r(i) – b(i)) < γ2, σ1 < (g(i) – b(i)) < σ2
with α = 100, β1 = 10, β2 = 70, γ1 = 24, γ2 = 112, σ1 = 0, and σ2 = 70.
Although this method works perfectly on some images, the results are not reliable on images with complex backgrounds or uneven illumination. Kovac et al. introduced two sets of rules, for images taken indoors or outdoors [11]. These rules are in RGB space, where each pixel that belongs to human skin must satisfy certain relations. For indoor images:
R > 95, G > 40, B > 20, max{R, G, B} – min{R, G, B} > 15, |R – G| > 15, R > G, R > B
For images taken in daylight illumination:
R > 220, G > 210, B > 170, |R – G| ≤ 15, R > B, G > B
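The rule sets quoted in this section translate directly into per-pixel predicates. The sketch below encodes the thresholds exactly as given in the text; the function names are hypothetical, and R, G, B are assumed to be 8-bit values.

```python
def is_skin_chen(r, g, b):
    """Rule of Chen et al. with alpha=100, beta=(10, 70), gamma=(24, 112), sigma=(0, 70)."""
    return (r > 100
            and 10 < r - g < 70
            and 24 < r - b < 112
            and 0 < g - b < 70)


def is_skin_kovac_indoor(r, g, b):
    """Kovac et al. rule set given in the text for indoor images."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def is_skin_kovac_daylight(r, g, b):
    """Kovac et al. rule set given in the text for daylight illumination."""
    return (r > 220 and g > 210 and b > 170
            and abs(r - g) <= 15 and r > b and g > b)
```

Applying one of these predicates to every pixel yields a binary skin mask without any training, which is why rule-based methods are often used as a fast first step.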
Kong et al. presented rules that use information from both the HSV and normalized RGB color spaces [10]. They suggested that although the effect of intensity is reduced in normalized RGB, it is still sensitive to illumination. Therefore, they also use HSV for skin detection. Each pixel that satisfies the rules given in [10] is considered to be a human skin pixel.
2.3. Neural Network Methods
Neural networks have been used for skin color detection in a number of research projects. Doukim et al. use YCbCr as the color space with a Multi-Layer Perceptron (MLP) neural network [6]. They used two types of combination strategies, and several rules are applied. A coarse-to-fine search method was used to find the number of neurons in the hidden layer. The combination of Cb/Cr and Cr features produced the best result. Seow et al. use RGB as the color space with a three-layered neural network [14]. The skin regions are then extracted from the planes and interpolated in order to obtain an optimum decision boundary and the positive skin samples for the skin classifier. Yang et al. use the YCbCr color space with a back propagation neural network [21]. They take the luminance Y and sort it in ascending order, dividing the range of Y values into a number of intervals. Then the pixels whose luminance belongs to the same interval are collected. In the next step, the covariance and the mean of Cb and Cr are calculated and used to train the back propagation neural network. Another example of human skin color detection using neural networks can be found in [1].
3. Skin Color Detection
The novel approach presented in this paper is based on skin detection using neural networks with hybrid color spaces. A neural network is a strong learning tool, so we decided to use one to learn pixel colors in order to distinguish between pixels that are face skin and pixels that are not. We decided to use information from more than one color space instead of using the information from just one color space. We gathered around 100,000 face pixels and 200,000 non-face pixels from images chosen from the Web. Choosing images for the non-skin class is a rather difficult task, because it is an enormous category, i.e., everything which is not human skin is non-skin. We tried to choose images from different categories, especially those which are very similar to human skin color, such as sand, surfaces of some desks, etc. We used such items in training the neural network so that the network can distinguish them from human skin.
For the implementation, a multi-layer perceptron (MLP) neural network was used. Several entities can be used as the input to the neural network, namely, RGB, HSV (in this case V is not used because it carries the illumination information, which is not suitable for skin detection), and YCbCr (Y is also not used because it carries the illumination information). The number of outputs can be one or two. If there is just one output, then a threshold can be used. For example, an output greater than 0.5 indicates that the input pixel belongs to skin, and an output less than that indicates that it belongs to non-skin. For two outputs, one output belongs to skin and the other to non-skin. The larger of the two outputs identifies the class of the pixel.
Around half of the samples were used for training and the rest for testing/validation. Different numbers of neurons were examined in the hidden layer, ranging from two nodes to 24 nodes. The networks which produced better results were chosen for the test images. For most of the networks, having 16 or 20 nodes in the hidden layer produced better results than other numbers of neurons in the hidden layer. A combination of the different color spaces, CbCrRGBHS, was used as the input. Y and V were eliminated from YCbCr and HSV because they contain illumination information. We trained several different neural networks [7] and tested the results on the UCD database, using MATLAB (developed by MathWorks) for the implementation. The UCD database contains 94 images of people from different ethnicities. The images vary from one person in the image to multiple people. The UCD database also contains the images after cropping the face skin. A feed-forward neural network was used in all the experiments. We considered one node in the output.
If the value of the output node is greater than 0.5, then the pixel belongs to human skin; otherwise it does not. The experimental results are reported as precision, recall, specificity and accuracy.
Precision or positive predictive value (PPV):
PPV = TP / (TP + FP)
Sensitivity or true positive rate (TPR), equivalent to hit rate or recall:
TPR = TP / P = TP / (TP + FN)
Specificity (SPC) or true negative rate:
SPC = TN / N = TN / (FP + TN)
Accuracy (ACC):
ACC = (TP + TN) / (P + N)
In the skin detection experiments, P is the number of skin pixels and N is the number of non-skin pixels. TP is the number of skin pixels correctly classified as skin pixels. TN is the number of non-skin pixels correctly classified as non-skin pixels. FP is the number of non-skin pixels incorrectly classified as skin pixels. FN is the number of skin pixels incorrectly classified as non-skin pixels.
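The four measures can be computed from per-pixel predictions and ground truth as sketched below. The function names are hypothetical, and the sketch assumes that every denominator is nonzero (that is, both classes occur and at least one pixel is predicted as skin).

```python
def confusion_counts(predicted, actual):
    """Count TP, FP, TN, FN from two equal-length sequences of 0/1 skin labels."""
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1      # skin correctly classified as skin
        elif p and not a:
            fp += 1      # non-skin incorrectly classified as skin
        elif not p and a:
            fn += 1      # skin incorrectly classified as non-skin
        else:
            tn += 1      # non-skin correctly classified as non-skin
    return tp, fp, tn, fn


def skin_metrics(tp, fp, tn, fn):
    """Precision, recall, specificity and accuracy as defined in the text."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```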
4. Experimental Results
We generated an input vector from the color space components CbCrRGBHS; the results for the network with one output node are listed in Table 1. Another neural network was designed with the same input but a different number of nodes in the output. In this experiment two nodes were chosen for the output, one for skin and the other for non-skin (the higher value determines the class). The results for the CbCrRGBHS vector are listed in Table 2. The results show some improvement in recall and accuracy, but the precision has decreased. Table 3 shows the results of the other methods discussed, compared to our best results on the UCD database. Comparing the other methods with the result obtained from the CbCrRGBHS vector shows that our result is better in precision, specificity and accuracy. Our method [7] accepts fewer non-skin pixels as skin compared to the other methods. It should be noted that there is a tradeoff between precision and recall. If we want high recall (recognizing more skin pixels correctly), then it is highly likely that many non-skin pixels will be recognized as human skin, which will reduce the precision, and vice versa.
Figures 1 to 7 illustrate some of our experimental results. These were produced using the CbCrRGBHS vector and two outputs for the neural network. The second image is the output of the neural network and the third image is the result after applying morphological operations. We first filled the holes in the image. After that, we applied erosion, followed by dilation [7]. The structuring element used was 3×3. This size gave better results than other structuring elements.
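The erosion-then-dilation step with a 3×3 structuring element can be sketched in pure Python as follows. This is only an illustration under stated assumptions: the mask is a list of rows of 0/1 values, pixels outside the image are treated as 0, hole filling is not included, and the function names are hypothetical. The experiments in this paper were carried out in MATLAB, not with this code.

```python
def _apply_3x3(mask, combine):
    """Apply a 3x3 neighbourhood test to every pixel; outside pixels count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            ]
            out[y][x] = 1 if combine(window) else 0
    return out


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is skin."""
    return _apply_3x3(mask, all)


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is skin."""
    return _apply_3x3(mask, any)


def open_mask(mask):
    """Erosion followed by dilation, which removes small isolated skin blobs."""
    return dilate(erode(mask))
```

Erosion followed by dilation removes isolated false-positive pixels while restoring the extent of regions large enough to survive the erosion.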
Table 1. Accuracy results for CbCrRGBHS (all values in %)
             Precision   Recall   Specificity   Accuracy
CbCrRGBHS    77.73       41.35    95.92         81.93

Table 2. Accuracy results for CbCrRGBHS (all values in %)
             Precision   Recall   Specificity   Accuracy
CbCrRGBHS    71.30       50.25    93.43         82.36

Table 3. Accuracy results for other methods (all values in %)
             Precision   Recall   Specificity   Accuracy
Gaussian     54.96       66.82    81.12         77.46
Chen         62.51       69.09    85.71         81.45
Kovac        63.75       51.13    89.98         80.02
Kong         37.47       14.58    91.61         71.87
CbCrRGBHS    71.30       50.25    93.43         82.36

5. Methods for Face Detection
After the skin detection phase, the next step is to use a face detection algorithm to detect the faces in the image. Several methods have been used for face detection. In this section we discuss the common methods which have been used in this field, namely, those of Rowley et al. [13] and Viola and Jones [19].
5.1. Rowley Method for Face Detection
Rowley et al. used neural networks to detect upright frontal faces in gray-scale images [13]. One or more neural networks are applied to portions of an image, and the absence or presence of a face is decided. They used a bootstrap method for training, which means that images are added to the training set as the training progresses. Their approach has two stages. First, a set of neural network-based filters is applied to an image. These filters look for faces in the image at different scales. The filter input is a 20×20 pixel region of the image and the output is 1 or -1, where 1 indicates that the region contains a face and -1 indicates that the region contains no face. The filter moves pixel by pixel through the entire image. To detect faces larger than this size, the image is repeatedly subsampled by a factor of 1.2. They use a preprocessing method [17] in which the intensity values across the window are first equalized: a linear function is fitted to the intensity values in the window and then subtracted, which corrects some extreme lighting conditions. Then histogram equalization is applied in order to correct for the camera gain and to improve the contrast. An oval mask is also used to ignore the background pixels.
The window is then passed to a neural network. Three types of hidden units are used: four units which look at 10×10 subregions, 16 which look at 5×5 subregions, and six which look at overlapping 20×5 horizontal stripes. These regions are chosen in order to detect different local features of the face. For example, the horizontal stripes were chosen to detect eyes, eyebrows, etc. Around 1050 face images are used for training. The images (black and white) are chosen from the Web and some popular databases. For the non-face images another method is used, because the set of non-face images is much larger than the set of face images. The steps of their method are:
– An initial set consisting of 1000 random images is generated. The preprocessing method is applied to these images.
– A network is trained using the face and non-face images. The network is tested on an image that contains no face. The misclassified subimages (those which were wrongly considered to be faces) are collected.
– 250 of these subimages are chosen randomly, the preprocessing methods are applied to them, and these subimages are added to the training set as negative examples. The process continues from the second step.
Other new ideas were utilized in [13]. In the areas which contain a face, there are many detections because the window moves pixel by pixel over the image. Several detections also occur because different scales are used over the image. They counted the number of detections at each location. If the number of detections is above a specified threshold, then that location is considered a face; otherwise it is rejected. Other nearby detections that overlap a location classified as a face are considered errors, because two faces cannot overlap. This method is called overlap elimination.
Another method used to reduce the number of false positives is to use several networks and arbitrate between them to produce the final decision. Each network is trained with different initial weights, different random sets of non-face images, and different permutations of the images that are presented to the network. Although the networks had very close detection and error rates, their errors were different from each other. They combine the networks using AND, OR and voting methods.
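The arbitration between several networks described above can be sketched as a small helper that combines their per-window decisions. This is an illustration only, assuming each network has already produced a boolean face/non-face decision for the same window; the function name and mode strings are hypothetical.

```python
def arbitrate(detections, mode="vote"):
    """Combine per-network face decisions (booleans) for one image window."""
    if mode == "and":
        # A face is reported only if every network agrees
        return all(detections)
    if mode == "or":
        # A face is reported if at least one network fires
        return any(detections)
    if mode == "vote":
        # A face is reported if a strict majority of networks fire
        return sum(detections) > len(detections) / 2
    raise ValueError("mode must be 'and', 'or' or 'vote'")
```

AND arbitration lowers the false positive rate at the cost of missed faces, OR does the opposite, and voting sits between the two.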
5.2. Viola Method for Face Detection
Viola et al. trained a classifier using a large number of features [19]. A large set of weak classifiers is used, and each weak classifier implements a threshold function on a feature. They use three types of Haar features. In a two-rectangle feature, the value is the difference between the sums of the pixels within two regions. In a three-rectangle feature, the value is the sum within the two outside rectangles subtracted from the sum within the inside rectangle. In a four-rectangle feature, the value is the difference between the diagonal pairs of rectangles. They use the integral image, which allows the features to be computed very fast. Their AdaBoost learning algorithm is as follows:
– The first step is initializing the weights:
\[ w_{1,i} = \begin{cases} \dfrac{1}{2m}, & y_i = 1 \\ \dfrac{1}{2l}, & y_i = 0 \end{cases} \]
where m is the number of positive examples and l is the number of negative examples; y_i = 0 for negative examples and y_i = 1 for positive examples.
– For t = 1,…,T:
– Normalize the weights using
\[ w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{n} w_{t,j}} \]
where n = m + l is the total number of examples. Now the value of the weights will be between 0 and 1, and so w_t is a probability distribution.
– For each feature j a classifier h_j is trained which uses just a single feature. The error is:
\[ \varepsilon_j = \sum_{i} w_{t,i} \, | h_j(x_i) - y_i | \]
– The classifier h_t with the lowest error ε_t is chosen and the weights are updated using
\[ w_{t+1,i} = w_{t,i} \, \beta_t^{\,1 - e_i} \]
If example x_i is classified correctly then e_i = 0, otherwise e_i = 1, and
\[ \beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t} \]
– The final classifier is:
\[ h(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \dfrac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases} \qquad \alpha_t = \log \frac{1}{\beta_t} \]
Viola et al. made another contribution, which was constructing a cascade of classifiers designed to reduce the time needed to find faces in an image [19]. The first stages of the cascade reject most of the image windows; windows which pass the first stage go to the second one, and this process continues to the last stage of the cascade. Similar to the Rowley method, the Viola method uses a window that moves over the image and decides whether that window contains a face. However, Viola showed that their method is faster than that of Rowley [19].
5.3. Other Methods
There are some other methods which have been used for face detection. The principal component analysis (PCA) method, which generates eigenfaces, has been used in some approaches for detecting faces [3]. Other types of neural networks have also been used in [12] and [22]. Shavers et al. used Support Vector Machines (SVM) for face detection [15]. Jeng et al. used geometrical facial features for face detection [9] and showed that their method works for detecting faces in different poses. Hjelmas and Low surveyed face detection methods, from simple edge-based algorithms to high-level approaches [8]. Yang et al. published another survey of face detection in which numerous techniques are described [23].
Figure 1. Experimental results
Figure 2. Experimental results
Figure 3. Experimental results
Figure 4. Experimental results
Figure 5. Experimental results, two faces with complex background
Figure 6. Experimental results, when two faces are close to each other
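Returning to the Viola method of Section 5.2, the integral image that makes rectangle features cheap to evaluate can be sketched as follows. This is a minimal pure-Python illustration, not the authors' or Viola's implementation; the function names are hypothetical and the image is assumed to be a list of rows of gray values.

```python
def integral_image(img):
    """ii[y][x] holds the sum of img over all rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]  # extra zero row and column
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Two-rectangle Haar feature: left half minus right half (width assumed even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

Once the integral image is built in a single pass, every rectangle sum costs four array lookups regardless of the rectangle size, which is what allows a very large number of features to be evaluated per window.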
6. Face Detection Research Approach
The Rowley and Viola methods both search all areas of the image in order to find faces; in our approach, however, we first divide the image into two parts: the parts that contain human skin and the other parts. After this step, the search for human faces is restricted to the areas that contain human skin. Therefore face detection using color images can be faster than other approaches. As mentioned in [8], due to the lack of a standardized test there is no comprehensive comparative evaluation of the different methods, and for color images the problem is more severe because there are not many databases with this characteristic. Because of this problem it is not easy to compare methods such as the Viola and Rowley methods with color-based methods. However, color-based face detection is faster than other methods because, unlike the Viola and Rowley methods, it does not need a window to be moved pixel by pixel over the whole image. Other papers such as [4] have also mentioned that color-based methods are faster than other methods.
Figure 7. Experimental results, when two faces are close to each other
In color images, we use the idea that we can separate the skin pixels from the rest of the image and, by using some additional information, distinguish the face from other parts of the body. The first step in face detection is region labeling. In this case the binary image, instead of having values 0 or 1, has the value 0 for the non-skin part and values 1, 2, … for the skin segments which were found in the previous step [4].
The next step that can be used is the Euler test [4]. Some parts of the face, such as the eyes and eyebrows, have colors that differ from the skin, so they appear as holes in a skin component. By using the Euler test one can distinguish face components from other components such as the hands and arms. The Euler test counts the number of holes in each component. One main problem with the Euler test is that there may be some face components which have no holes in them, and also some components belonging to hands or other parts of the body with holes in them. The Euler test is therefore not a reliable method and we did not use it.
In the next step, the cross correlation between a template face and the grayscale version of the original image is calculated. The height, width, orientation and centroid of each component are calculated. The template face is resized and rotated accordingly. The center of the template is then placed on the center of the component. The cross correlation between these two regions is then calculated. If it is above a specified threshold, the component is considered a face region; otherwise it is rejected [4].
We have modified this algorithm. First, we discard the components whose height-to-width ratio is larger than a threshold, except for some of these components which will be discussed later. In this way we ensure that no arms or legs are considered as faces. Second, the lower quarter of the image is less likely to contain faces, so we set a higher threshold for that part, as that part of the image most likely belongs to feet and hands. Increasing the threshold for the lower quarter decreased the number of false positives in that part of the images. Third, for the components which are rejected we use a window consisting of the upper part of the face. We move the window across each part of the component and calculate the correlation between the window and the area of the same size under the window. If the correlation is above a certain threshold, that part is considered to be a face. To cover different sizes, we downsample the image by a factor of 0.9 seven times. In this case, there may be some overlapping rectangles. Rectangles around the face which have more than a specified area in common with each other are deleted and just one of them is kept. This novel method is useful for those components where the skin detection step has not correctly distinguished between the skin pixels and the other pixels. For example, in some images some pixels from the background are also considered skin pixels, and these components will fail the template correlation test. Although this method increases the detection time and is not guaranteed to always work, it can be useful in images where the background has a color similar to human skin.
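The template test and the stricter threshold for the lower quarter of the image can be sketched as follows. The text only says "cross correlation"; the zero-mean normalized form used here is an assumption, as are the function names and the example thresholds. The sketch also assumes the template has already been resized and rotated to match the region.

```python
import math


def normalized_cross_correlation(region, template):
    """Zero-mean normalized cross correlation of two equal-size grayscale patches."""
    a = [v for row in region for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b):
        raise ValueError("region and template must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # flat patches carry no structure


def accept_component(score, centroid_row, image_height, threshold, lower_threshold):
    """Apply the stricter threshold when the centroid lies in the lower quarter."""
    in_lower_quarter = centroid_row >= 0.75 * image_height
    limit = lower_threshold if in_lower_quarter else threshold
    return score > limit
```

For example, with hypothetical thresholds of 0.6 for the upper three quarters and 0.8 for the lower quarter, a component scoring 0.7 would be accepted near the top of the image but rejected near the bottom.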
This method is similar to the method that was used by Rowley [13]; however, Rowley applied it to the whole image and used a neural network to check whether the component belongs to a face or not. The images in the figures illustrate our method on several images from the UCD database [18]. The first image is the original image. The second image is produced after applying the skin detection algorithm. The third image is the result after filling the holes and applying the morphological operations. The fourth image shows the black and white image, in which the background is changed to black and the skin pixels to white. The fifth image shows each component in a different color. The sixth image shows the template placed on the component which has been considered a face. The last (seventh) image is the final output of the proposed and implemented detection process.
In some images there may be two, three, or more faces which are so close to each other that they become one component. The method that we have introduced to detect faces in these cases is as follows:
1. Compute the height-to-width ratio of the component.
2. If the ratio is smaller than a threshold, the component may belong to more than one face. Depending on the value of the ratio, the component can consist of three or more faces; however, it is not probable that a component consists of more than three faces.
3. Find the border between the two faces which are stuck together. To find the border we count the number of pixels along the horizontal axis. The two maxima (if the threshold suggests there are two faces) lie in the first and last thirds of the component, and the minimum in the middle third. The minimum is found, and the two new components are then tested for faces. If the correlations of these two (or more) new components are higher than the correlation of the single component before splitting, then there were two (or more) faces in this component. This check lets us differentiate between two faces which are stuck together and a face which is rotated by 90 degrees. In the latter case the correlation of the whole component is higher than the correlations of the two new components, so the component is not separated.
The face approximately obeys the golden ratio, so the height-to-width ratio of the face is around 1.618. When the neck is included, which is mostly visible in images, this ratio can increase to 2.4. A height-to-width ratio between 0.809 and 1.2 therefore shows that the component may belong to two faces. A value less than 0.809 shows that the component may consist of more than two faces. These values are calculated along the image orientation. Figure 6 illustrates this situation. This image is not part of the UCD database.
The same approach can be applied when the height-to-width ratio is higher than a certain threshold. In this case there may be two or more faces which are above each other. The same algorithm can be applied with some modification. For two faces this ratio will be between 3.2 and 4. Figure 7 shows this case. This image is not part of the UCD database.
Finding the border as described in step 3 is faster and more accurate than using erosion (another method that can be used to separate two or more faces), because erosion may need to be applied several times before the two faces become separated, and while the two faces are being separated, other parts of the image are eroded as well.
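The ratio test and the search for the border between two side-by-side faces can be sketched as follows. The thresholds 0.809 and 1.2 come from the text; the function names, the use of 3 as a stand-in for "more than two", and the representation of a component as a list of rows of 0/1 values are assumptions of this sketch.

```python
def faces_side_by_side(height, width):
    """Estimate the number of side-by-side faces from the height-to-width ratio."""
    ratio = height / width
    if ratio < 0.809:
        return 3  # stands for "more than two" faces
    if ratio <= 1.2:
        return 2
    return 1


def column_counts(component):
    """Number of skin pixels in each column of a binary component."""
    return [sum(col) for col in zip(*component)]


def split_column(component):
    """Column with the fewest skin pixels within the middle third of the component."""
    counts = column_counts(component)
    w = len(counts)
    lo, hi = w // 3, (2 * w) // 3
    middle = range(lo, max(hi, lo + 1))
    return min(middle, key=lambda x: counts[x])
```

After splitting at the returned column, each half would still have to pass the template correlation test, and the split is kept only if the halves correlate better than the unsplit component, as described in step 3.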
7. Conclusion
In this paper we have presented a novel methodology to detect faces in color images, with certain modifications in order to improve the performance of the algorithm. For images in which the faces are so close to each other that they cannot be separated after skin detection, we introduced a method for separating the face components. For the skin detection phase we used neural networks to recognize human skin in color images. For future work, a face recognition phase can be added, in which the faces which are detected and cropped are recognized. Unfortunately, there is no database for this combined task. There are databases for face detection and databases for face recognition, but no database that covers both. A database should therefore be generated for this purpose.
AUTHORS
Mohammadreza Hajiarbabi – Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, USA. E-mail: mehrdad.hajiarbabi@ku.edu
Arvin Agah* – Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, USA. E-mail: agah@ku.edu
*Corresponding author
References
[1] Al-Mohair H., Saleh J., Suandi S., “Human skin color detection: A review on neural network perspective”, International Journal of Innovative Computing, Information and Control, vol. 8, no. 12, 2012, pp. 8115–8131.
[2] Alshehri S., “Neural Networks Performance for Skin Detection”, Journal of Emerging Trends in Computing and Information Sciences, vol. 3, no. 12, 2012, pp. 1582–1585.
[3] Bakry E., Zhao H. M., Zhao Q., “Fast Neural implementation of PCA for face detection”. In: International Joint Conf. on Neural Networks, 2006, pp. 806–811.
[4] Chandrappa D. N., Ravishankar M., RameshBabu D. R., “Face detection in color images using skin color model algorithm based on skin color information”. In: 2011 3rd International Conference on Electronics Computer Technology (ICECT), 2011, pp. 254–258.
[5] Chen H., Huang C., Fu C., “Hybrid-boost learning for multi-pose face detection and facial expression recognition”, Pattern Recognition, Elsevier, vol. 41, no. 3, 2008, pp. 1173–1185. DOI: http://dx.doi.org/10.1016/j.patcog.2007.08.010.
[6] Doukim C. A., Dargham J. A., Chekima A., Omatu S., “Combining neural networks for skin detection”, Signal & Image Processing: An International Journal (SIPIJ), vol. 1, no. 2, 2010, pp. 1–11.
[7] Hajiarbabi M., Agah A., “Human Skin Color Detection using Neural Networks with Boosting”, Journal of Intelligent Systems, under review, 2014.
[8] Hjelmas E., Kee Low B., “Face Detection: A Survey”, Computer Vision and Image Understanding (CVIU), vol. 83, no. 3, 2001, pp. 236–274. DOI: http://dx.doi.org/10.1006/cviu.2001.0921.
[9] Jeng S., Liao H., Liu Y., Chern M., “An efficient approach for facial feature detection using Geometrical Face Model”, Pattern Recognition, vol. 31, no. 3, 1998, pp. 273. DOI: http://dx.doi.org/10.1016/S0031-3203(97)00048-4.
[10] Kong W., Zhe S., “Multi-face detection based on down sampling and modified subtractive clustering for color images”, Journal of Zhejiang University, vol. 8, no. 1, 2007, pp. 72–78.
[11] Kovac J., Peer P., Solina F., “Human Skin Color Clustering for Face Detection”, EUROCON 2003. Computer as a Tool. The IEEE Region, vol. 8, no. 2, 2003, pp. 144–148. DOI: http://dx.doi.org/10.1109/EURCON.2003.1248169.
[12] Lin S. H., Kung S. Y., Lin L. J., “Face recognition/detection by probabilistic decision-based neural network”, IEEE Trans. on Neural Networks, vol. 8, 1997, pp. 114–132.
[13] Rowley H., Baluja S., Kanade T., “Neural network-based face detection”, IEEE Pattern Analysis and Machine Intelligence, vol. 20, no. 1, 1998, pp. 22–38. DOI: http://dx.doi.org/10.1109/34.655647.
[14] Seow M., Valaparla D., Asari V., “Neural network based skin color model for face detection”. In: Proceedings of the 32nd Applied Imagery Pattern Recognition Workshop (AIPR’03), 2003, pp. 141–145.
[15] Shavers C., Li R., Lebby G., “An SVM-based approach to face detection”. In: 2006 Proc. of 38th Southeastern Symposium on System Theory, 2006, pp. 362–366. DOI: http://dx.doi.org/10.1109/SSST.2006.1619082.
[16] Singh S., Chauhan D. S., Vatsa M., Singh R., “A Robust Skin Color Based Face Detection Algorithm”, Tamkang Journal of Science and Engineering, vol. 6, no. 4, 2003, pp. 227–234.
[17] Sung K., Learning and example selection for object and pattern detection, PhD Thesis, MIT AI Lab, 1996.
[18] UCD database: http://ee.ucd.ie/~prag/
[19] Viola P., Jones M. J., “Robust real-time object detection”. In: Proc. of IEEE Workshop on Statistical and Computational Theories of Vision, 2001.
[20] Wu Y., Ai X., “Face detection in color images using Adaboost algorithm based on skin color information”. In: 2008 Workshop on Knowledge Discovery and Data Mining, 2008, pp. 339–342.
[21] Yang G., Li H., Zhang L., Cao Y., “Research on a skin color detection algorithm based on self-adaptive skin color model”. In: International Conference on Communications and Intelligence Information Security, 2010, pp. 266–270.
[22] Yang K., Zhu H., Pan Y. J., “Human Face detection based on SOFM neural network”. In: 2006 IEEE Proc. of International Conf. on Information Acquisition, 2006, pp. 1253–1257.
[23] Yang M. H., Kriegman D. J., Ahuja N., “Detecting Faces in Images: A Survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, 2002, pp. 34–58.
Journal of Automation, Mobile Robotics & Intelligent Systems
Computing with Words, Protoforms and Linguistic Data Summaries: Towards a Novel Natural Language Based Data Mining and Knowledge Discovery Tools
Submitted: 15th May 2014; accepted: 24th June 2014
Janusz Kacprzyk, Sławomir Zadrożny
DOI: 10.14313/JAMRIS_4-2013/27
Abstract: We show how Zadeh’s idea of computing with words and perceptions, based on his concept of a precisiated natural language (PNL), can lead to a new direction in the use of natural language in data mining: linguistic data(base) summaries. We emphasize the relevance of another idea of Zadeh’s, that of a protoform, and show that various types of Yager-type linguistic data summaries may be viewed as items in a hierarchy of protoforms of summaries. We briefly present an implementation for a sales database of a computer retailer as a convincing example that these tools and techniques are implementable and functional. These summaries involve both data from an internal database of the company and data downloaded from external databases via the Internet.
Keywords: computing with words, linguistic summaries, protoform
1. Introduction
The purpose of this article is to briefly present our opinion on what might be considered the most influential and far-reaching idea conceived by Zadeh, i.e. computing with words (CWW), and, on a more technical level, protoforms. We do not discuss here his “grand inventions” like fuzzy sets and possibility theory, or the foundations of the state space approach in systems modeling, which have probably been more relevant, in a general sense, for various fields of science. To follow the spirit of this volume, our exposition will be concise and comprehensible. This article is an extended version of a short research note by Kacprzyk and Zadrożny [34].

Computing with words (and perceptions), introduced by Zadeh in the mid-1990s, and best and most comprehensively presented in Zadeh and Kacprzyk’s books [48], may be viewed as a new “technology” for the representation, processing and solving of various real-life problems in which a human being is a crucial element, one that makes it possible to use natural language, with its inherent imprecision, in an effective and efficient way.

To formally represent elements and expressions of natural language, Zadeh proposed to use the so-called PNL (precisiated natural language), in which statements about values, relations, etc. between variables are represented by constraints. In PNL, statements, written “x isr R”, may be of different types, and correspond to numeric values, intervals, possibility distributions, verity distributions, probability distributions, usuality qualified statements, rough set representations, fuzzy relations, etc. For our purposes, and in most of our works, the usuality qualified representation has been of special relevance. Basically, it says “x is usually R”, which is meant as “in most cases, x is R”. PNL may play various roles, among which the crucial ones are: description of perceptions, definition of sophisticated concepts, a language for perception-based reasoning, etc. Notice that usuality is an example of a modality in natural language.

Clearly, the above tools are meant for the representation and processing of perceptions. Another concept that Zadeh subsequently introduced is that of a protoform. In general, most perceptions are summaries, exemplified by “most Swedes are tall”, which is clearly a summary of the Swedes with respect to height. It can be represented in Zadeh’s notation as “most As are Bs”. This can be employed for reasoning under various assumptions. One can go a step further and define a protoform as an abstracted summary. In our case, this would be “Q As are Bs”. Notice that we now have a more general, deinstantiated form of our point of departure (most Swedes are tall), and also of “most As are Bs”. Needless to say, most human reasoning is protoform based, and the availability of such a more general representation is very valuable and provides tools that can be used in many cases.

Basically, the essence of our work over the years has boiled down to showing that the concept of a precisiated natural language, and in particular of a protoform, viewed from the perspective of CWW, can be of use in attempts at a more effective and efficient use of vast information resources, notably through linguistic data(base) summaries, which are very characteristic of human needs and comprehension abilities.
We will briefly discuss an approach based on the concept of a linguistic data(base) summary that was originally proposed by Yager [43, 44] and further developed mainly by Kacprzyk and Yager [19], and Kacprzyk, Yager and Zadrożny [20]. The essence of such linguistic data summaries is that a set of data, e.g., concerning employees, with (numeric) data on their age, sex, salaries, seniority, etc., can be summarized linguistically with respect to a selected attribute or attributes, say age and salaries, by linguistically quantified propositions, e.g., “almost all employees are well qualified”, “most young employees are well paid”, etc., which are simple, extremely human consistent and intuitive, and summarize in a concise yet very informative form what we may be interested in. This will be done from the perspective of Zadeh’s CWW paradigm (cf. Zadeh and Kacprzyk [48]), and we will in particular indicate the use of Zadeh’s concept of a protoform of a fuzzy linguistic summary (cf. Zadeh [47], Kacprzyk and Zadrożny [23]), which can provide easy generalization, portability and scalability. We will mention the classic static linguistic summaries, notably showing that a class of summaries of interest can be mined via Kacprzyk and Zadrożny’s [22, 25] FQUERY for Access, and that, by relating various types of linguistic summaries to fuzzy queries, with various known and sought elements, we can arrive at a hierarchy of protoforms of linguistic data summaries. Moreover, we will also briefly mention new protoforms of linguistic summaries of time series as proposed by Kacprzyk, Wilbik and Zadrożny [17, 18].
2. Linguistic Data Summaries via Fuzzy Logic with Linguistic Quantifiers
The linguistic summary is meant as a sentence [in a (quasi)natural language] that subsumes the very essence (from a certain point of view) of a set of data. Here this set is assumed to be numeric, large and not comprehensible in its original form by a human being. In Yager’s approach (cf. Yager [43], Kacprzyk and Yager [19], and Kacprzyk, Yager and Zadrożny [20]) we have:
- $Y = \{y_1, \ldots, y_n\}$ is a set of objects (records) in a database, e.g., the set of workers;
- $A = \{A_1, \ldots, A_m\}$ is a set of attributes characterizing objects from $Y$, e.g., salary, age, etc. in a database of workers, and $A_j(y_i)$ denotes the value of attribute $A_j$ for object $y_i$.

A linguistic summary of a data set $D$ consists of:
- a summarizer $S$, i.e. an attribute together with a linguistic value (fuzzy predicate) defined on the domain of attribute $A_j$ (e.g. “low salary” for attribute “salary”);
- a quantity in agreement $Q$, i.e. a linguistic quantifier (e.g. most);
- truth (validity) $T$ of the summary, i.e. a number from the interval [0, 1] assessing the truth (validity) of the summary (e.g. 0.7); usually, only summaries with a high value of $T$ are interesting;
- optionally, a qualifier $R$, i.e. another attribute together with a linguistic value (fuzzy predicate) defined on the domain of attribute $A_k$, determining a (fuzzy) subset of $Y$ (e.g. “young” for attribute “age”).

Thus, the linguistic summary may be exemplified by

$$T(\text{most of employees earn low salary}) = 0.7 \tag{1}$$

A richer form of the summary may include a qualifier as in, e.g.,

$$T(\text{most of young employees earn low salary}) = 0.7 \tag{2}$$

The core of a linguistic summary is a linguistically quantified proposition in the sense of Zadeh [46], the one corresponding to (1) written as

$$Q\,y\text{'s are } S \tag{3}$$

and the one corresponding to (2) written as

$$Q\,R\,y\text{'s are } S \tag{4}$$

The truth value T of (3) or (4) may be calculated by using either Zadeh’s original calculus of linguistically quantified statements (cf. [46]), or other interpretations of linguistic quantifiers (cf. Liu and Kerre [38]), including Yager’s OWA operators [45] and the OWmin operators of Dubois et al. [6], or via generalized quantifiers, cf. Hájek and Holeňa [13] or Glöckner [12].

Recently, Zadeh [47] introduced the relevant concept of a protoform, which is defined as a more or less abstract prototype (template) of a linguistically quantified proposition. The most abstract protoforms correspond to (3) and (4), while (1) and (2) are examples of fully instantiated protoforms. Thus, evidently, protoforms form a hierarchy, where higher/lower levels correspond to more/less abstract protoforms. Going down this hierarchy one has to instantiate particular components of (3) and (4), i.e., the quantifier Q and the fuzzy predicates S and R. The instantiation of the former boils down to the selection of a quantifier. The instantiation of the fuzzy predicates requires the choice of attributes together with linguistic values (atomic predicates) and of a structure they form when combined using logical connectives. This leads to a theoretically infinite number of potential protoforms. However, for the purposes of mining linguistic summaries, there are obviously some limits on a reasonable size of the set of summaries that should be taken into account. These result from the limited capability of the user to interpret summaries as well as from computational considerations.

The concept of a protoform may provide a guiding paradigm for the design of a user interface supporting the mining of linguistic summaries. It may be assumed that the user specifies a protoform of the linguistic summaries sought. Basically, the more abstract the protoform, the less is assumed about the summaries sought, i.e., the wider the range of summaries expected by the user.
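As a minimal sketch of how the truth value T of (3) and (4) can be obtained with Zadeh's calculus of linguistically quantified propositions, the Python fragment below takes the proportion (relative sigma-count) of objects satisfying S, optionally restricted to those satisfying R, and passes it through the membership function of the quantifier. The membership functions for “most”, “low salary” and “young”, as well as the employee data, are assumptions made only for illustration; they do not come from the article or from any particular implementation.

```python
from typing import Callable, Sequence, Tuple

Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership function: 0 outside [a, d], 1 on [b, c]."""
    def mu(x: float) -> float:
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:                      # rising edge (only reached when a < b)
            return (x - a) / (b - a)
        return (d - x) / (d - c)       # falling edge (only reached when c < d)
    return mu


def most(r: float) -> float:
    """Assumed piecewise-linear quantifier 'most' on a proportion r in [0, 1]."""
    if r >= 0.8:
        return 1.0
    if r <= 0.3:
        return 0.0
    return 2.0 * r - 0.6


def truth_simple(q: Membership, mu_s: Membership,
                 values: Sequence[float]) -> float:
    """T(Q y's are S) = mu_Q( (1/n) * sum_i mu_S(y_i) )."""
    return q(sum(mu_s(v) for v in values) / len(values))


def truth_qualified(q: Membership, mu_r: Membership, mu_s: Membership,
                    pairs: Sequence[Tuple[float, float]]) -> float:
    """T(Q R y's are S) = mu_Q( sum_i min(mu_R, mu_S) / sum_i mu_R )."""
    den = sum(mu_r(r_val) for r_val, _ in pairs)
    if den == 0.0:
        return 0.0   # convention chosen here: no object satisfies R at all
    num = sum(min(mu_r(r_val), mu_s(s_val)) for r_val, s_val in pairs)
    return q(num / den)


# Hypothetical data: (age, monthly salary) of five employees.
employees = [(22, 1800), (28, 2500), (45, 5200), (31, 2100), (52, 1900)]
low_salary = trapezoid(0, 0, 2000, 3000)
young = trapezoid(0, 0, 25, 35)

t1 = truth_simple(most, low_salary, [s for _, s in employees])
t2 = truth_qualified(most, young, low_salary, employees)
print(f"T(most employees earn low salary)       = {t1:.2f}")
print(f"T(most young employees earn low salary) = {t2:.2f}")
```

The minimum is used here as the conjunction of R and S; other t-norms, or the OWA-based interpretations mentioned above, would change only the two `truth_*` functions.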
There are two limit cases, where:
- a totally abstract protoform is specified, i.e., (4),
- all elements of a protoform are totally specified as given linguistic terms.

In the former case the system has to construct all possible summaries (with all possible linguistic components and their combinations) for the context of a given database (table) and present to the user those whose validity exceeds some threshold. In the latter case, the whole summary is specified by the user and the system only has to verify its validity. Thus, the former case is usually more interesting from the point of view of the user but at the same time more complex from the computational point of view. There are a number of intermediate cases that may be more practical. In Table 1, basic types of protoforms/linguistic summaries are shown, corresponding to protoforms of a more and more abstract form.

Basically, each of the fuzzy predicates S and R may be defined by listing its atomic fuzzy predicates (i.e., pairs of “attribute/linguistic value”) and structure, i.e., how these atomic predicates are combined. In Table 1, S (or R) corresponds to the full description of both the atomic fuzzy predicates (referred to as linguistic values, for short) and the structure. For example, “Q young employees earn a high salary” is a protoform of Type 2, while “Most employees earn a “?” salary” is a protoform of Type 3. In the first case the system has to select a linguistic quantifier (usually from a predefined dictionary) that, when put in place of Q, makes the resulting linguistically quantified proposition valid to the highest degree; in the second case, the linguistic quantifier as well as the structure of summarizer S are given, and the system has to choose a linguistic value to replace the question mark (“?”), yielding a linguistically quantified proposition that is as valid as possible.

Thus, the use of protoforms makes it possible to devise a uniform procedure to handle a wide class of linguistic data summaries, so that the system can be easily adapted to a variety of situations, users’ interests and preferences, scales of the project, etc. Usually, the most interesting are linguistic summaries required by a protoform of Type 5. They may be interpreted as fuzzy IF-THEN rules, for which many interpretations have been proposed (cf., e.g., Dubois and Prade [8]), and some of them were later directly discussed in the context of linguistic summaries. There are many views on the idea of a linguistic summary, for instance a fuzzy functional dependency, a gradual rule, or even a typical value. Though they do reflect the essence of a human perception of what a linguistic summary should be, they are beyond the scope of this paper, which focuses on a different approach.
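The instantiation of a partially specified protoform can be illustrated with a short Python sketch. It covers the two simplest searches described above: picking the quantifier when the summarizer is given (Type 1), and picking the linguistic value when the quantifier and the structure of the summarizer are given (Type 3). The dictionaries of quantifiers and salary terms, their membership functions, and the data are hypothetical, and the exhaustive search over a small dictionary is only one possible strategy.

```python
from typing import Callable, Dict, Sequence, Tuple

Membership = Callable[[float], float]


def ramp(lo: float, hi: float) -> Membership:
    """Non-decreasing linear ramp: 0 up to lo, 1 from hi onwards."""
    return lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo)))


# Hypothetical dictionary of linguistic quantifiers on a proportion in [0, 1].
QUANTIFIERS: Dict[str, Membership] = {
    "few": lambda r: 1.0 - ramp(0.1, 0.4)(r),
    "about half": lambda r: max(0.0, 1.0 - abs(r - 0.5) / 0.25),
    "most": ramp(0.3, 0.8),
    "almost all": ramp(0.7, 0.95),
}

# Hypothetical dictionary of linguistic values of the attribute "salary".
SALARY_VALUES: Dict[str, Membership] = {
    "low": lambda x: 1.0 - ramp(2000, 3000)(x),
    "medium": lambda x: min(ramp(2000, 3000)(x), 1.0 - ramp(4000, 5000)(x)),
    "high": ramp(4000, 5000),
}


def proportion(mu_s: Membership, values: Sequence[float]) -> float:
    """Fraction of objects satisfying S, counted with membership degrees."""
    return sum(mu_s(v) for v in values) / len(values)


def best_quantifier(mu_s: Membership,
                    values: Sequence[float]) -> Tuple[str, float]:
    """Type 1 protoform: S is given, the quantifier Q is sought."""
    r = proportion(mu_s, values)
    return max(((name, q(r)) for name, q in QUANTIFIERS.items()),
               key=lambda pair: pair[1])


def best_value(q: Membership, values: Sequence[float]) -> Tuple[str, float]:
    """Type 3 protoform: Q is given, the linguistic value in S is sought."""
    return max(((name, q(proportion(mu, values)))
                for name, mu in SALARY_VALUES.items()),
               key=lambda pair: pair[1])


salaries = [1800, 2500, 5200, 2100, 1900]   # hypothetical data

# "Q employees earn a low salary": which Q fits best?
print(best_quantifier(SALARY_VALUES["low"], salaries))
# "Most employees earn a ? salary": which linguistic value fits best?
print(best_value(QUANTIFIERS["most"], salaries))
```

For more abstract protoforms the same idea applies, but the search space grows with every unspecified component, which is why the text above notes the computational cost of the fully abstract case.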
3. Mining of Linguistic Data Summaries
In the process of mining linguistic summaries, at one extreme, the system may be responsible for both the construction and verification of summaries (which corresponds to Type 5 protoforms/summaries given in Table 1). At the other extreme, the user proposes a summary and the system only verifies its validity (which corresponds to Type 0 protoforms/summaries in Table 1). The former approach seems to be more attractive and in the spirit of data mining meant as the discovery of interesting, unknown regularities in data. On the other hand, the latter approach obviously secures a better interpretability of the results. Thus, we will now discuss the possibility of employing a flexible querying interface for the purposes of linguistic summarization of data, and indicate the implementability of a more automatic approach.
3.1. A fuzzy querying add-on for formulating linguistic summaries
In Kacprzyk and Zadrożny’s [24, 29] approach, interactivity, i.e. user assistance, in the mining of linguistic summaries is a key point, and it concerns the definition of summarizers (indication of attributes and their combinations). This proceeds via the user interface of a fuzzy querying add-on. In Kacprzyk and Zadrożny [22, 25, 30], a conventional database management system is used with a fuzzy querying tool, FQUERY for Access. An important component of this tool is a dictionary of linguistic terms to be used in queries. They include fuzzy linguistic values and relations as well as fuzzy linguistic quantifiers. There is a set of built-in linguistic terms, but the user is free to add his or her own. Thus, such a dictionary evolves in a natural way over time as the user interacts with the system. For example, an SQL query searching for troublesome orders may have the following WHERE clause:

    WHERE Most of the conditions are met out of
        PRICE*ORDERED-AMOUNT IS Low
        DISCOUNT IS High
        ORDERED-AMOUNT IS Much Greater Than ON-STOCK

Obviously, the condition of such a fuzzy query directly corresponds to summarizer S in a linguistic summary. Moreover, the elements of a dictionary are perfect building blocks of such a summary. Thus, the derivation of a linguistic summary of type (3) may proceed in an interactive (user-assisted) way as follows:
- the user formulates a set of linguistic summaries of interest (relevance) using the fuzzy querying add-on,
- the system retrieves records from the database and calculates the validity of each summary adopted, and
- the most appropriate linguistic summary is chosen.

Referring to Table 1, we can observe that Type 0 as well as Type 1 linguistic summaries may be easily produced by a simple extension of FQUERY for Access.
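To make the fuzzy WHERE clause above more concrete, the following Python sketch computes the degree to which a single order matches the condition “most of the conditions are met”. It is not the FQUERY for Access implementation: the `Order` record, the membership functions for “Low”, “High” and “Much Greater Than”, and the use of the arithmetic mean of the condition degrees as the argument of “most” are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable

Membership = Callable[[float], float]


@dataclass
class Order:
    """Hypothetical order record with the fields used in the WHERE clause."""
    price: float
    ordered_amount: float
    discount: float      # in percent
    on_stock: float


def ramp_up(lo: float, hi: float) -> Membership:
    """0 up to lo, 1 from hi onwards, linear in between."""
    return lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo)))


def ramp_down(lo: float, hi: float) -> Membership:
    """1 up to lo, 0 from hi onwards, linear in between."""
    up = ramp_up(lo, hi)
    return lambda x: 1.0 - up(x)


# Assumed dictionary entries (thresholds are illustrative only).
low_value = ramp_down(500.0, 2000.0)        # PRICE*ORDERED-AMOUNT IS Low
high_discount = ramp_up(10.0, 30.0)         # DISCOUNT IS High
much_greater_ratio = ramp_up(1.5, 3.0)      # applied to ORDERED-AMOUNT / ON-STOCK


def most(r: float) -> float:
    """Assumed quantifier 'most' on the proportion of satisfied conditions."""
    if r >= 0.8:
        return 1.0
    if r <= 0.3:
        return 0.0
    return 2.0 * r - 0.6


def matching_degree(order: Order) -> float:
    """Degree to which 'most of the conditions are met' for one order."""
    ratio = order.ordered_amount / max(order.on_stock, 1e-9)  # avoid /0
    degrees = [
        low_value(order.price * order.ordered_amount),
        high_discount(order.discount),
        much_greater_ratio(ratio),
    ]
    return most(sum(degrees) / len(degrees))


troublesome = Order(price=10, ordered_amount=40, discount=25, on_stock=10)
ordinary = Order(price=100, ordered_amount=50, discount=5, on_stock=200)
print(matching_degree(troublesome), matching_degree(ordinary))
```

Averaging these per-record matching degrees over the whole table gives the fraction of rows matching the query, which is the quantity a Type 0 or Type 1 summary then evaluates or describes with a quantifier.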
Basically, the user has to construct a query, a candidate summary, and it is to be determined which fraction of rows matches that query (and which linguistic quantifier best denotes this fraction, in the case of Type 1). For Type 3 summaries, a query/summarizer S consists of only one simple condition built of the attribute whose typical (exceptional) value is sought. For example, using Q = “most” and S = “age = ?”, we look for a typical value of “age”. From the computational point of view, Type 5 summaries represent the most general form considered: fuzzy rules describing dependencies between specific values of particular attributes. The summaries of Type 1 and 3 have been implemented as an extension to Kacprzyk and Zadrożny’s [26–28] FQUERY for Access.

Tab. 1. Classification of protoforms/linguistic summaries
Type | Protoform | Given | Sought
0 | QRy’s are S | All | validity T
1 | Qy’s are S | S | Q
2 | QRy’s are S | S and R | Q
3 | Qy’s are S | Q and structure of S | linguistic values in S
4 | QRy’s are S | Q, R and structure of S | linguistic values in S
5 | QRy’s are S | Nothing | S, R and Q

The discovery of general, Type 5 rules is difficult, and some simplifications concerning the structure of the fuzzy predicates and/or the quantifier are needed, for instance to obtain association rules, which were initially defined for binary valued attributes as (cf. Agrawal and Srikant [1]):

$$A_1 \wedge A_2 \wedge \ldots \wedge A_n \longrightarrow A_{n+1} \tag{5}$$

(note that much earlier origins of that concept are mentioned in the work by Hájek and Holeňa [13]). The use of fuzzy association rules for mining linguistic summaries in the framework of a flexible querying interface was advocated by Kacprzyk and Zadrożny [26–28, 31]. In particular, fuzzy association rules of the following form may be considered:

$$A_1 \text{ IS } R_1 \wedge A_2 \text{ IS } R_2 \wedge \ldots \wedge A_n \text{ IS } R_n \longrightarrow A_{n+1} \text{ IS } S \tag{6}$$

where $R_i$ is a linguistic term defined in the domain of the attribute $A_i$, i.e. a qualifier fuzzy predicate in terms of linguistic summaries (cf. Section 2), and $S$ is another linguistic term corresponding to the summarizer. The confidence of the rule may be interpreted in terms of the linguistic quantifiers employed in the definition of a linguistic summary. Thus, a fuzzy association rule may be treated as a special case of a linguistic summary of the type defined by (4). The structure of the fuzzy predicates $R_i$ and $S$ is to some extent fixed, but thanks to that, efficient algorithms for rule generation may be employed. These algorithms are easily adapted to fuzzy association rules. Usually, the first step is a preprocessing of the original, crisp data. Values of all attributes considered are replaced with the linguistic terms best matching them. Additionally, a degree of this matching may optionally be recorded and later taken into account. Then, each combination of an attribute and a linguistic term may be considered as a Boolean attribute, and original algorithms, such as Apriori [1], may be applied. They basically boil down to an efficient counting of support for all conjunctions of Boolean attributes, i.e., so-called itemsets (in fact, the essence of these algorithms is to count support for as small a subset of itemsets as possible).

In the case of fuzzy association rules, attributes may be treated strictly as Boolean attributes (they may appear or not in particular tuples) or interpreted in terms of fuzzy logic as in linguistic summaries. In the latter case they appear in a tuple to a degree, and the support counting should take that into account. In our context we basically employ the approach of Lee and Lee-Kwang [37], Au and Chan [2], and Hu et al. [14], who simplify the fuzzy association rules sought by assuming a single specific attribute (class) in the consequent. Kacprzyk, Yager and Zadrożny [20, 26–28, 31, 36] advocated the use of fuzzy association rules for mining linguistic summaries in the framework of a flexible querying interface. Chen et al. [5] investigated the issue of generalized fuzzy rules where a fuzzy taxonomy of linguistic terms is taken into account. Kacprzyk and Zadrożny [32] proposed to use more flexible aggregation operators instead of conjunction, but still in the context of fuzzy association rules. More information on fuzzy association rules, from various perspectives, may be found later in this volume.

As to other approaches to the derivation of fuzzy linguistic summaries, we can mention the following ones. George and Srikanth [10], [11] use a genetic algorithm to mine linguistic summaries in which the summarizer is a conjunction of atomic fuzzy predicates. Then, they search for two linguistic summaries: the most specific generalization and the most general specification, assuming a dictionary of linguistic quantifiers and linguistic values over the domains of all attributes. Kacprzyk and Strykowski [15, 16] have also implemented the mining of linguistic summaries using genetic algorithms. In their approach, the fitness function is a combination of a wide array of indices: a degree of imprecision (fuzziness), a degree of covering, a degree of appropriateness, a length of a summary, etc. (cf. also Kacprzyk and Yager [19]). Rasmussen and Yager [41, 42] propose an extension of SQL, SummarySQL, to cover linguistic summaries. Actually, they do not address the mining of linguistic summaries but merely their verification. SummarySQL may also be used to verify a kind of fuzzy gradual rules (cf. Dubois and Prade [7]) and fuzzy functional dependencies.
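Returning to the fuzzy association rules of form (6), the sketch below illustrates the preprocessing and support counting steps described above: each (attribute, linguistic term) pair is treated as an item that appears in a record to a degree, the support of an itemset is the average of the minimum of these degrees, and the confidence of a rule is a ratio of two such sums. This is one common formulation and not necessarily the exact variant used in the cited works; the terms, membership functions and records are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Item = Tuple[str, str]          # (attribute, linguistic term)
Record = Dict[str, float]


def falling(lo: float, hi: float) -> Callable[[float], float]:
    """1 up to lo, 0 from hi onwards, linear in between."""
    return lambda x: min(1.0, max(0.0, (hi - x) / (hi - lo)))


def rising(lo: float, hi: float) -> Callable[[float], float]:
    """0 up to lo, 1 from hi onwards, linear in between."""
    return lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo)))


# Hypothetical dictionary of linguistic terms per attribute.
TERMS: Dict[Item, Callable[[float], float]] = {
    ("age", "young"): falling(25, 35),
    ("age", "old"): rising(40, 50),
    ("salary", "low"): falling(2000, 3000),
    ("salary", "high"): rising(4000, 5000),
}

records: List[Record] = [
    {"age": 22, "salary": 1800},
    {"age": 28, "salary": 2500},
    {"age": 45, "salary": 5200},
    {"age": 31, "salary": 2100},
    {"age": 52, "salary": 1900},
]


def best_term(attribute: str, value: float) -> Tuple[str, float]:
    """Preprocessing: the best matching linguistic term and its degree."""
    candidates = [(term, mu(value))
                  for (attr, term), mu in TERMS.items() if attr == attribute]
    return max(candidates, key=lambda pair: pair[1])


def degree(record: Record, items: List[Item]) -> float:
    """Degree to which all items appear in the record (min as conjunction)."""
    return min(TERMS[item](record[item[0]]) for item in items)


def support(items: List[Item], data: List[Record]) -> float:
    """Fuzzy support: average degree of the itemset over all records."""
    return sum(degree(r, items) for r in data) / len(data)


def confidence(antecedent: List[Item], consequent: List[Item],
               data: List[Record]) -> float:
    """Fuzzy confidence of the rule antecedent -> consequent."""
    den = sum(degree(r, antecedent) for r in data)
    if den == 0.0:
        return 0.0   # convention chosen here for an empty antecedent
    return sum(degree(r, antecedent + consequent) for r in data) / den


# Rule: age IS young -> salary IS low
ante, cons = [("age", "young")], [("salary", "low")]
print(best_term("age", 22))
print(support(ante + cons, records), confidence(ante, cons, records))
```

The confidence computed here is the same ratio that appears inside the quantifier in the truth value of (4), which is how a rule's confidence can be read as a linguistic quantifier such as “most”.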
Raschia and Mouaddib [40] deal with the mining of hierarchies of summaries, and their understanding of summaries is slightly different from the one adopted here because they consider them as a conjunction of atomic fuzzy predicates (each referring to just one attribute). However, these predicates are not defined by just one linguistic value but possibly by fuzzy sets of linguistic values (i.e., fuzzy sets of higher levels are considered). The mining of summaries (a whole hierarchy of summaries) is based on a concept formation (conceptual clustering) process.

An interesting extension of the concept of a linguistic summary to the linguistic summarization of time series data was shown in a series of works by Kacprzyk, Wilbik and Zadrożny [17, 18]. In this case the array of possible protoforms is much larger as it reflects various perspectives, intentions, etc. of the user. Just to give some examples, the protoforms used in those works may be exemplified by: “Among all y’s, Q are P”, e.g. “among all segments (of the time series) most are slowly increasing”, and “Among all R segments, Q are P”, e.g. “among all short segments almost all are quickly decreasing”, as well as more sophisticated protoforms, for instance temporal ones like: “$E_T$ among all y’s Q are P”, e.g. “Recently, among all segments, most are slowly increasing”, and “$E_T$ among all Ry’s Q are P”, e.g. “Initially, among all short segments, most are quickly decreasing”; both go beyond Zadeh’s classic protoforms.

It is easy to notice that the mining of linguistic summaries may be viewed as closely related to natural language generation (NLG), and this path was suggested in Kacprzyk and Zadrożny [33]. This may be a promising direction as NLG is a well-developed area and software is available. A very relevant issue, the comprehensiveness of linguistic data summaries in Michalski’s sense, which is related to how well they can be understood by an average user, is considered in a recent paper by Kacprzyk and Zadrożny [35].
4. Concluding Remarks
We have shown how Zadeh’s idea of computing with words, often called computing with words and perceptions, based on his concepts of a precisiated natural language (PNL) and linguistically quantified propositions, can lead to a new direction in the use of natural language in data mining and knowledge discovery, namely a linguistic data(base) summary. We have in particular focused our attention on the relevance of another idea of Zadeh’s, that of a protoform, and shown that various types of linguistic data summaries may be viewed as items in a hierarchy of protoforms of linguistic data summaries. We have briefly presented an implementation of linguistic data summaries for a sales database of a computer retailer as a convincing example that these tools and techniques are implementable and practically functional. These summaries can involve both data from a company database and data downloaded from external databases via the Internet.
ACKNOWLEDGEMENTS
This research has been partially supported by the National Centre of Science under Grant No. UMO-2012/05/B/ST6/03068.
AUTHORS
Janusz Kacprzyk∗ – Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01–447 Warszawa, Poland, e-mail: kacprzyk@ibspan.waw.pl.
Sławomir Zadrożny – Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01–447 Warszawa, Poland, e-mail: zadrozny@ibspan.waw.pl.
∗ Corresponding author
REFERENCES
[1] Agrawal R., Srikant R., “Fast algorithms for mining association rules”. In: Proceedings of the 20th International Conference on Very Large Databases, Santiago de Chile, 1994.
[2] Au W.-H., Chan K.C.C., “FARM: A data mining system for discovering fuzzy association rules”. In: Proceedings of the 8th IEEE International Conference on Fuzzy Systems, Seoul, Korea, 1999, 1217–1222.
[3] Berzal F., Cubero J.C., Marín N., Vila M.A., Kacprzyk J., Zadrożny S., “A General framework for computing with words in object-oriented programming”, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 15, 2007, 111–131. DOI: http://dx.doi.org/10.1142/S0218488507004480.
[4] Bosc P., Dubois D., Pivert O., Prade H., de Calmes M., “Fuzzy summarization of data using fuzzy cardinalities”. In: Proceedings of IPMU 2002, Annecy, France, 2002, 1553–1559.
[5] Chen G., Wei Q., Kerre E., “Fuzzy data mining: discovery of fuzzy generalized association rules”. In: G. Bordogna and G. Pasi (Eds.): Recent Issues on Fuzzy Databases. Springer-Verlag, Heidelberg and New York, 2000, 45–66. DOI: http://dx.doi.org/10.1007/978-3-7908-1845-1.
[6] Dubois D., Fargier H., Prade H., “Beyond min aggregation in multicriteria decision: (ordered) weighted min, discri-min, leximin”. In: R.R. Yager and J. Kacprzyk (Eds.): The Ordered Weighted Averaging Operators. Theory and Applications, Kluwer, Boston, 1997, 181–192.
[7] Dubois D., Prade H., “Gradual rules in approximate reasoning”, Information Sciences, vol. 61, 1992, 103–122.
[8] Dubois D., Prade H., “Fuzzy sets in approximate reasoning, Part 1: Inference with possibility distributions”, Fuzzy Sets and Systems, vol. 40, 1991, 143–202. DOI: http://dx.doi.org/10.1016/0165-0114(91)90050-Z.
[9] Bosc P., Dubois D., Prade H., “Fuzzy functional dependencies – an overview and a critical discussion”. In: Proceedings of 3rd IEEE International Conference on Fuzzy Systems, Orlando, USA, 1994, 325–330. DOI: http://dx.doi.org/10.1109/FUZZY.1994.343753.
[10] George R., Srikanth R., “Data summarization using genetic algorithms and fuzzy logic”. In: F. Herrera, J.L. Verdegay (Eds.): Genetic Algorithms and Soft Computing, Springer-Verlag, Heidelberg, 1996, 599–611.
[11] George R., Srikanth R., “A soft computing approach to intensional answering in databases”, Information Sciences, vol. 92, no. 1–4, 1996, 313–328. DOI: http://dx.doi.org/10.1016/0020-0255(96)00049-7.
[12] Glöckner I., “Fuzzy quantifiers, multiple variable binding, and branching quantification”. In: T. Bilgiç et al.: IFSA 2003. LNAI 2715, Springer-Verlag, Berlin and Heidelberg, 2003, 135–142.
[13] Hájek P., Holeňa M., “Formal logics of discovery and hypothesis formation by machine”, Theoretical Computer Science, vol. 292, 2003, 345–357. DOI: http://dx.doi.org/10.1007/3-540-49292-5_26.
[14] Hu Y.-Ch., Chen R.-Sh., Tzeng G.-H., “Mining fuzzy association rules for classification problems”, Computers and Industrial Engineering, vol. 43, no. 4, 2002, 735–750. DOI: http://dx.doi.org/10.1016/S0360-8352(02)00136-5.
[15] Kacprzyk J., Strykowski P., “Linguistic data summaries for intelligent decision support”. In: R. Felix (Ed.): Fuzzy Decision Analysis and Recognition Technology for Management, Planning and Optimization - Proceedings of EFDAN’99, Dortmund, Germany, 1999, 3–12.
[16] Kacprzyk J., Strykowski P., “Linguistic summaries of sales data at a computer retailer: a case study”. In: Proceedings of IFSA’99, vol. 1, 1999, Taipei, Taiwan R.O.C, 29–33.
[17] Kacprzyk J., Wilbik A., Zadrożny S., “Linguistic summarization of time series using a fuzzy quantifier driven aggregation”, Fuzzy Sets and Systems, vol. 159, no. 12, 2008, 1485–1499. DOI: http://dx.doi.org/10.1016/j.fss.2008.01.025.
[18] Kacprzyk J., Wilbik A., Zadrożny S., “An approach to the linguistic summarization of time series using a fuzzy quantifier driven aggregation”, International Journal of Intelligent Systems, vol. 25, no. 5, 2010, 411–439. DOI: http://dx.doi.org/10.1002/int.20405.
[19] Kacprzyk J., Yager R.R., “Linguistic summaries of data using fuzzy logic”, International Journal of General Systems, vol. 30, no. 2, 2001, 33–154. DOI: http://dx.doi.org/10.1080/03081070108960702.
[20] Kacprzyk J., Yager R.R., Zadrożny S., “A fuzzy logic based approach to linguistic summaries of databases”, International Journal of Applied Mathematics and Computer Science, 10, 2000, 813–834.
[21] Kacprzyk J., Yager R.R., Zadrożny S., “Fuzzy linguistic summaries of databases for an efficient business data analysis and decision support”. In: W. Abramowicz and J. Zurada (Eds.): Knowledge Discovery for Business Information Systems, pp. 129–152, Kluwer, Boston, 2001.
[22] Kacprzyk J., Yager R.R., Zadrożny S., “FQUERY for Access: fuzzy querying for a Windows-based DBMS”. In: P. Bosc and J. Kacprzyk (Eds.): Fuzziness in Database Management Systems, Springer-Verlag, Heidelberg, 1995, 415–433. DOI: http://dx.doi.org/10.1007/978-3-7908-1897-0_18.
[23] Kacprzyk J., Zadrożny S., “Protoforms of linguistic data summaries: towards more general natural-language-based data mining tools”. In: A. Abraham, J. Ruiz-del-Solar, M. Koeppen (Eds.): Soft Computing Systems, pp. 417–425, IOS Press, Amsterdam, 2002.
[24] Kacprzyk J., Zadrożny S., “Data Mining via Linguistic Summaries of Data: An Interactive Approach”. In: T. Yamakawa and G. Matsumoto (Eds.): Methodologies for the Conception, Design and Application of Soft Computing. Proc. of IIZUKA’98, Iizuka, Japan, 1998, 667–668.
[25] Kacprzyk J., Zadrożny S., “The paradigm of computing with words in intelligent database querying”. In: L.A. Zadeh and J. Kacprzyk (Eds.): Computing with Words in Information/Intelligent Systems. Part 2. Foundations, Springer-Verlag, Heidelberg and New York, 1999, 382–398. DOI: http://dx.doi.org/10.1007/978-3-7908-1872-7.
[26] Kacprzyk J., Zadrożny S., “Computing with words: towards a new generation of linguistic querying and summarization of databases”. In: P. Sinčak and J. Vaščak (Eds.): Quo Vadis Computational Intelligence?, Springer-Verlag, Heidelberg and New York, 2000, 144–175.
[27] Kacprzyk J., Zadrożny S., “On a fuzzy querying and data mining interface”, Kybernetika, vol. 36, 2000, 657–670.
[28] Kacprzyk J., Zadrożny S., “On combining intelligent querying and data mining using fuzzy logic concepts”. In: G. Bordogna and G. Pasi (Eds.): Recent Research Issues on the Management of Fuzziness in Databases, Springer-Verlag, Heidelberg and New York, 2000, 67–81.
[29] Kacprzyk J., Zadrożny S., “Data mining via linguistic summaries of databases: an interactive approach”. In: L. Ding (Ed.): A New Paradigm of Knowledge Engineering by Soft Computing, World Scientific, Singapore, 2001, 325–345. DOI: http://dx.doi.org/10.1142/4606.
[30] Kacprzyk J., Zadrożny S., “Computing with words in intelligent database querying: standalone and Internet-based applications”, Information Sciences, vol. 134, no. 1–4, 2001, 71–109. DOI: http://dx.doi.org/10.1016/S0020-0255(01)00093-7.
[31] Kacprzyk J., Zadrożny S., “On linguistic approaches in flexible querying and mining of association rules”. In: H.L. Larsen, J. Kacprzyk, S. Zadrożny, T. Andreasen and H.
Christiansen (Eds.): Flexible Query Answering Systems. Recent Advances, Springer-Verlag, Heidelberg and New York, 2001, 475–484. [32] Kacprzyk J., Zadroż ny S., ”Linguistic summarization of data sets using association rules”. In: Pro57
Journal of Automation, Mobile Robotics & Intelligent Systems
ceedings of The IEEE International Conference on Fuzzy Systems, St. Louis, USA, 2003, 702–707. [33] Kacprzyk J., Zadroż ny S., ”Computing with words is an implementable paradigm: fuzzy queries, linguistic data summaries, and natural language generation”, IEEE Transactions on Fuzzy Systems, vol. 18, no. 3, 2010, 461–472. DOI: http://dx. doi.org/10.1109/TFUZZ.2010.2040480. [34] Kacprzyk J., Zadroż ny S., ”Computing with words and protoforms: powerful and far reaching ideas”. In: Rudolf Seising, Enric Trillas, Claudio Moraga, and Settimo Termini (Eds.): On Fuzziness. Springer-Verlag, Berlin Heidelberg 2013, 255–270. DOI: http://dx.doi.org/10.1007/ 978-3-642-35641-4. [35] Kacprzyk J., Zadroż ny S., ”Comprehensiveness of Linguistic Data Summaries: A Crucial Role of Protoforms”. In: Ch. Moewes and A. Nü rnberger (Eds.): Computational Intelligence in Intelligent Data Analysis. Springer-Verlag, Berli, Heidelberg 2013, 207–221. DOI: http://dx.doi.org/10. 1007/978-3-642-32378-2. [36] Kacprzyk J., Zadroż ny S., ”Derivation of Linguistic Summaries is Inherently Dif icult: Can Association Rule Mining Help?” In: Borgelt Ch., Gil M. A., Sousa J. M. C., Verleysen M. (Eds.): Towards Advanced Data Analysis by Combining Soft Computing and Statistics, Springer-Verlag, 2013, 291–303. DOI: http://dx.doi.org/10.1007/ 978-3-642-30278-7. [37] Lee J.-H., Lee-Kwang H., ”An extension of association rules using fuzzy sets”. In: Proceedings of the Seventh IFSA World Congress, Prague, Czech Republic, 1997, 399–402. [38] Liu Y., Kerre E.E., ”An overview of fuzzy quantiiers. (I). Interpretations”, Fuzzy Sets and Systems, vol. 95, 1998, 1–21. [39] Mannila H., Toivonen H., Verkamo A.I., ”Ef icient algorithms for discovering association rules”. In: U.M. Fayyad and R. Uthurusamy (Eds.): Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, Seattle, USA, 1994, 181–192. 
[40] Raschia G., Mouaddib N., ”SAINTETIQ: a fuzzy set-based approach to database summarization”, Fuzzy Sets and Systems, vol. 129, no. 2, 2002, pp. 137–162. DOI: http://dx.doi.org/10.1016/ S0165-0114(01)00197-X. [41] Rasmussen D., Yager R.R, ”Fuzzy query language for hypothesis evaluation”. In: Andreasen T., H. Christiansen and H. L. Larsen (Eds.): Flexible Query Answering Systems, pp. 23 - 43, Kluwer, Boston, 1997. DOI: http://dx.doi.org/10. 1007/978-1-4615-6075-3_2. [42] Rasmussen D., Yager R.R, ”Finding fuzzy and gradual functional dependencies with SummarySQL”, Fuzzy Sets and Systems, vol. 106, no. 2, 1999, 131–142. DOI: http://dx.doi.org/10. 1016/S0165-0114(97)00268-6. 58
VOLUME 8,
N◦ 3
2014
[43] Yager R.R, ”A new approach to the summarization of data”, Information Sciences, vol. 28, no. 1, 1982, 69–86. DOI: http://dx.doi.org/10. 1016/0020-0255(82)90033-0. [44] Yager R.R, ”On linguistic summaries of data”. In: W. Frawley and G. Piatetsky-Shapiro (Eds.): Knowledge Discovery in Databases. AAAI/MIT Press, 1991, 347–363. [45] Yager R.R, Kacprzyk J. (Eds.), The Ordered Weighted Averaging Operators: Theory and Applications, Kluwer, Boston, 1997. [46] Zadeh L.A., ”A computational approach to fuzzy quanti iers in natural languages”, Computers and Mathematics with Applications, vol. 9, no. 1, 1983, 149–184. DOI: http://dx.doi.org/10.1016/ 0898-1221(83)90013-5. [47] Zadeh L.A., A prototype-centered approach to adding deduction capabilities to search engines -– the concept of a protoform, BISC Seminar, 2002, University of California, Berkeley, 2002. DOI: http://dx.doi.org/10.1109/IS. 2002.1044219. [48] Zadeh L.A., Kacprzyk J. (Eds.), Computing with Words in Information/Intelligent Systems. Part 1. Foundations. Part 2. Applications, Springer – Verlag, Heidelberg and New York, 1999. DOI: http: //dx.doi.org/10.1007/978-3-7908-1873-4 (part 1) and DOI: http://dx.doi.org/10. 1007/978-3-7908-1872-7. [49] Zadroż ny S., Kacprzyk J., ”Computing with words for text processing: an approach to the text categorization”, Information Sciences, vol. 176, no. 4, 006, 415–437. DOI: http://dx.doi.org/10. 1016/j.ins.2005.07.017. [50] Zadroż ny S., Kacprzyk J., ”Issues in the practical use of the OWA operators in fuzzy querying”, Journal of Intelligent Information Systems, vol. 33, no. 3, 2009, 307–325. DOI: http://dx. doi.org/10.1007/s10844-008-0068-1.
Journal of Automation, Mobile Robotics & Intelligent Systems
VOLUME 8,
N° 3
2014
The Bi-partial Version of the p-median / p-center Facility Location Problem and Some Algorithmic Considerations
Submitted: 15th April 2014; accepted: 20th May 2014
Jan W. Owsinski
DOI: 10.14313/JAMRIS_3-2014/28
Abstract: The paper introduces the bi-partial version of the well-known p-median or p-center facility location problem. The bi-partial approach, developed by the author primarily to deal with clustering problems, is shown here to work for a problem that does not possess some of the essential properties inherent to the bi-partial formulations. It is demonstrated that the classical objective function of the problem can be correctly interpreted in terms of the bi-partial approach, that it possesses the essential properties that are at the core of the bi-partial approach, and, finally, that the general algorithmic precepts of the bi-partial approach can also be applied to this problem. It is proposed that the use of the bi-partial approach for similar problems can be beneficial from the point of view of flexibility and interpretation.
Keywords: facility location, p-median, p-center, clustering, bi-partial approach
1. Introducing the Bi-partial Approach
The bi-partial approach was developed by the present author at the beginning of the 1980s (see [5], [6]), primarily as a way of dealing with the problems of cluster analysis. Its strongest point is the capacity to provide the solution to the clustering problem, including the optimum number of clusters, without the need to refer to any external (usually statistical) criteria. The approach has recently been described in a formal manner in Owsiński [7], [8], and its application to some special tasks in data analysis was provided in Owsiński [9]. Dvoenko [1] applied the approach to the well-known k-means-type procedure.
The approach is based on the use of the bi-partial objective function, which is composed, according to the name, of two terms. For clustering, these can be subsumed, in a very general way, as representing the inner cohesion of the clusters and the outer separation of the clusters (see Note 1). If cohesion within clusters is measured by some function of distances between the objects (or measurements, or samples) inside individual clusters, denoted QD(P), where P is a partition of the set of n objects, indexed i = 1,…,n, into clusters Aq, q = 1,…,p, and subscript D means that we consider distances inside clusters, then we take as the measure of separation of different clusters QS(P), meaning a function of similarities of objects in different clusters, and the sum of the two, QDS(P), is minimised (distances inside clusters as small as possible and similarities among clusters as small as possible). This function, QDS(P), has a natural dual, namely QSD(P), in which the two components represent, respectively, cohesion within clusters, measured with similarities (proximities) inside the particular clusters, QS(P), and separation of different clusters, measured with distances between objects belonging to different clusters, QD(P). The function QSD(P) is, of course, maximised.
Even though this concept, at its general level, may appear to be close to trivial, there exist concrete implementations of the two dual objective functions, which form novel and interesting approaches, especially regarding cluster analysis. Moreover, if the components of the objective functions are endowed with definite, quite plausible properties, the approach leads to effective solution algorithms.
2. Problem Formulation
The problem we address here differs from the majority of problems taken as instances of application of the bi-partial approach. Namely, it is a classical question in operations research, related to location analysis. Not only is the interpretation of the problem quite specific, but its very form is also, in a way, not appropriate for treatment through the bi-partial formalism as introduced here. We deal, namely, in a very simplistic, but also very general manner, with the following problem:
$$\min \sum_{q} \Big( \sum_{i \in A_q} d(x_i, x_q) + c(q) \Big) \tag{1}$$
with minimisation being performed over the choice of the set of p points (objects) that are selected as the central or median points xq, q = 1,…,p. For our further considerations it is of no importance whether these points belong to the set X of objects (medians) or not, i.e. whether they are only required to be elements of the space EX (centers), to which all the objects, either actually observed or potentially existing, belong. It is, however, highly important that the second component of the objective function, namely Σq c(q), does not involve any notion of distance or proximity. While d(.,.) is some distance, as in the general formulation of the bi-partial approach, where it enters QD(P) in either its within-cluster or its between-cluster form, c(q) is a non-negative value, interpreted as some cost related to a facility q.
The problem regarding (1) is to find a set of p (q = 1,…,p) locations of facilities such that the overall cost is minimised. This cost is composed of the sum of distances between the points assigned to the individual facilities and these facilities, and the sum of the costs related to these facilities. It is, of course, assumed that the costs c(q) and distances d(.,.) are appropriately scaled, so that the whole preserves interpretative sense.
The costs may be given in a variety of manners: as equal constants c for each arbitrary point from X or from EX, so that the cost component in (1) is simply equal to pc; or (more realistically) as values determined for each point separately, i.e. c(i); or as a function composed of the setup component (say, c1, if this setup cost is equal for all locations) and the component that is proportional to the number of locations assigned to the facility q, with the proportionality coefficient equal to c2 (i.e. the cost for a facility is then c1 + c2 card(Aq)). Of course, more complex, nonlinear cost functions, also with c1 replaced by c1(i), can be (and sometimes are) considered as well.
This problem has a very rich literature, with special numerical interest in its “pure” form, without the cost component, mainly devoted to mathematical and geometric properties and the respective (approximation) algorithms and their effectiveness. Notwithstanding this abundant tradition, the issues raised and the results obtained, we shall consider here the form of (1) in one of its basic variants.
3. Some Hints at Cluster Analysis
Any reader with some knowledge of cluster analysis will immediately recognise the first component of (1) as corresponding to the vast family of the so-called “k-means” algorithms, where such a form is taken as the minimised objective function. Indeed, this fact is the source of numerous studies linking facility location problems with clustering approaches. One can cite in this context, for instance, the work of Pierre Hansen (e.g. [2]), but most to the point here is the recent proposal from Liao and Guo [3], which explicitly links k-means with facility location, similarly to what was done several decades ago by Mulvey and Beck [4]. The proposal by Liao and Guo [3] is interesting insofar as the ease of implementation of the basic k-means algorithm allows for a relatively straightforward accommodation of additional features of the facility location problem (e.g. definite constraints on facilities and their sets).
Thus, while the first component of the function (1) could be treated with some clustering approaches, e.g. those based on the k-means type of procedure, the issue is in the way the entire function (1) is to be minimised.
4. An Example
For the sake of illustration, we shall consider the problem (1) in the following more concrete, even though very simple, form:
$$\min_P \sum_{q} \Big( \sum_{i \in A_q} d(x_i, x_q) + c_1 + c_2\,\mathrm{card}(A_q) \Big) \tag{2}$$
where c1 is the (constant) “facility setup cost”, while c2 is the (constant) unit cost, associated with the servicing of each object i ∈ Aq, except for the “first one”, this cost being included in the setup cost. Such a formulation, even if still quite stylised, seems to be fully plausible as an approximation. It can, of course, be transformed to
$$\min \Big( \sum_{q} \sum_{i \in A_q} d(x_i, x_q) + p c_1 + c_2 n \Big), \tag{2a}$$
where it is obvious that we could do away with the component associated with the unit cost c2. We shall keep it, though, for illustrative purposes, since the part related to unit costs may, and usually does, take more intricate, nonlinear forms. The problem (2) can be, quite formally, and with all the obvious reservations mentioned before, moulded into the general bi-partial scheme, i.e.
$$\min_P Q_{SD}(P) = Q_D(P) + Q_S(P), \tag{3}$$
where the partition P encompasses, in this case, both the composition of Aq, q = 1,…,p, taken together with the number p of facilities, and the location of these facilities, i.e. the choice of locations from (say) X as the places for facilities q.
Consider the simple case shown in Fig. 1, with d(.,.) defined as the Manhattan distance, the cost component of (2) being based on the parameter values c1 = 3, c2 = 1. Again, these numbers, if appropriately interpreted, can be considered plausible (e.g. distance corresponding to annual transport cost, and c1 corresponding to annual write-off value).
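As a minimal sketch of how an objective with the cost form c1 + c2 card(Aq) can be evaluated for a given partition, consider the snippet below. It assumes the Manhattan distance and assumes that each facility is placed at the cluster member with the smallest within-cluster distance sum (a medoid). The function names are hypothetical, and the sample coordinates are those of the small twelve-location example used in this paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Manhattan (city-block) distance between two planar points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def cluster_distance(cluster: Sequence[Point]) -> int:
    """Smallest sum of distances from the cluster members to a facility
    placed at one of the members (assumption: facilities are medoids)."""
    return min(sum(manhattan(x, f) for x in cluster) for f in cluster)


def objective(partition: List[Sequence[Point]], c1: float, c2: float):
    """Return (distance part, cost part, total) for the cost c1 + c2*card(A_q)."""
    q_d = sum(cluster_distance(a) for a in partition)
    q_s = sum(c1 + c2 * len(a) for a in partition)
    return q_d, q_s, q_d + q_s


# Four clusters of three locations each (illustrative partition).
four_clusters = [
    [(0, 0), (1, 0), (1, 2)],
    [(2, 3), (3, 3), (3, 4)],
    [(5, 7), (6, 8), (7, 7)],
    [(1, 8), (2, 7), (2, 9)],
]
print(objective(four_clusters, c1=3, c2=1))  # (13, 24, 37)
```

Placing the facility at a medoid is only one possible reading; if facilities may lie anywhere in the space EX, `cluster_distance` would have to be replaced accordingly.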
Figure 1. A simple academic example for the facility location problem [scatter plot of twelve locations in the X–Y plane: (0,0), (1,0), (1,2), (2,3), (3,3), (3,4), (5,7), (6,8), (7,7), (1,8), (2,7), (2,9)]
Table 1 shows the exemplary values of QSD(P) = QD(P) + QS(P), according to (2), for a series of partitions P. This is a nested set of partitions, i.e. in each consecutive partition in the series one of the subsets of objects, a cluster Aq, is the union of some of the clusters from the preceding partition, with all the other clusters being preserved. Such a nested sequence of partitions is characteristic for a very broad family of cluster algorithms: the progressive merger or progressive split algorithms.
Table 1. Values of QSD(P) = QD(P) + QS(P) for a series of partitions, according to (2)
| Partitions | p | QD(P) | QS(P) – calculation | QS(P) value | QSD(P) |
|---|---|---|---|---|---|
| All locations are facility locations | 12 | 0 | 12*3+12*1 | 48 | 48 |
| Merger of (0,0) and (1,0) | 11 | 1 | 11*3+10*1+1*2 | 45 | 46 |
| Merger of (2,3) and (3,3) | 10 | 2 | 10*3+8*1+2*2 | 42 | 44 |
| Addition of (3,4) to (2,3) and (3,3) | 9 | 3 | 9*3+7*1+2+3 | 39 | 42 |
| {(0,0), (1,0), (1,2)}, {(2,3), (3,3), (3,4)}, {(5,7), (6,8), (7,7)}, {(1,8), (2,7), (2,9)} | 4 | 13 | 4*3+4*3 | 24 | 37 |
| {(0,0), (1,0), (1,2), (2,3), (3,3), (3,4)}, {(5,7), (6,8), (7,7)}, {(1,8), (2,7), (2,9)} | 3 | 22 | 3*3+6+3+3 | 21 | 43 |
| {(0,0), (1,0), (1,2), (2,3), (3,3), (3,4), (5,7), (6,8), (7,7), (1,8), (2,7), (2,9)} | 1 | 55 | 1*3+12 | 15 | 70 |
The character of the results from Table 1, even if close to trivial, is quite telling, and indeed repeats the observations made for other cases in which the bi-partial approach has been applied. Note that the values of QD(P) increase along the series of partitions, while the values of QS(P) decrease, and QSD(P) has a minimum, which, for this simple case, corresponds, indeed, to the solution to the problem.
5. Some Algorithmic Considerations: the Use of the k-means Procedure
As indicated before, the problem lends itself to the k-means-like procedure, which, in general and quite rough terms, takes the following course:
0° Generate p points (see Note 2) as initial (facility location) seeds (in this case, the case of p-centers, the points generated belong to X), usually p << n
1° Assign to the facility location points all the n points from the set X, based on minimum distance, establishing thereby clusters Aq, q = 1,…, p
2° If the stop condition is not fulfilled, determine the representatives (facility locations) for the clusters Aq, otherwise STOP
3° Go to 1°.
This procedure, as we know, converges very quickly, although it can get stuck in a local minimum. Yet, owing to its positive numerical features, it can be restarted from various initial sets of p points many times over, and the minimum value of the objective function obtained indicates the proper solution.
In the facility location problem analysed here, since such problems are rarely really large in the standard sense of data analysis problems, it is quite feasible to run the k-means procedure, as outlined above, for consecutive values of p in order to check whether a minimum over p can be found for a definite formulation of the facility-location-related QSD(P). Although we shall not be demonstrating this here, in view of the opposite monotonicity of both components of QSD(P) along p, the minimum found over p is a global minimum (although, of course, it is not necessarily the solution to the problem considered, since we deal here only with an approximation of the actual objective function). This procedure can be simplified so as to encompass only a part of the sequence of values of p, starting, say, from p = 2 upwards, until a minimum is encountered.
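The steps 0° to 3° and the scan over consecutive values of p can be sketched as follows. This is only an illustration under stated assumptions: Manhattan distance, facilities restricted to points of X (so that the representative of a cluster is its medoid), a stop condition of "the set of facilities no longer changes", and random restarts. The names `k_medoids_like` and `scan_over_p`, the number of restarts and the seed are hypothetical choices, not taken from the paper.

```python
import random
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def k_medoids_like(points: Sequence[Point], seeds: Sequence[Point], max_iter: int = 100):
    """Run steps 1-3 from the given seeds (assumed to be distinct members of
    `points`); return (sum of distances, clusters keyed by facility)."""
    facilities = list(seeds)
    clusters: Dict[Point, List[Point]] = {}
    for _ in range(max_iter):
        # Step 1: assign every point to its nearest facility.
        clusters = {f: [] for f in facilities}
        for x in points:
            nearest = min(facilities, key=lambda f: manhattan(x, f))
            clusters[nearest].append(x)
        # Step 2: the new representative of a cluster is the member with the
        # smallest sum of distances to the cluster members (a medoid).
        updated = [
            min(members, key=lambda f: sum(manhattan(x, f) for x in members))
            for members in clusters.values()
        ]
        if set(updated) == set(facilities):  # stop condition: nothing changed
            break
        facilities = updated                 # Step 3: go back to step 1
    distance = sum(manhattan(x, f) for f, members in clusters.items() for x in members)
    return distance, clusters


def scan_over_p(points: Sequence[Point], c1: float, c2: float,
                restarts: int = 50, seed: int = 0):
    """Step 0 with random restarts, repeated for p = 1..n.
    Returns (p with the lowest total found, that total), using form (2a)."""
    rng = random.Random(seed)
    n = len(points)
    best_p, best_total = None, float("inf")
    for p in range(1, n + 1):
        q_d = min(k_medoids_like(points, rng.sample(list(points), p))[0]
                  for _ in range(restarts))
        total = q_d + p * c1 + c2 * n
        if total < best_total:
            best_p, best_total = p, total
    return best_p, best_total


LOCATIONS = [(0, 0), (1, 0), (1, 2), (2, 3), (3, 3), (3, 4),
             (5, 7), (6, 8), (7, 7), (1, 8), (2, 7), (2, 9)]

distance, clusters = k_medoids_like(LOCATIONS, [(1, 0), (3, 3), (6, 8), (1, 8)])
print(distance)  # 13
print(scan_over_p(LOCATIONS, c1=3, c2=1))
```

The first call starts from one seed in each of the four visible groups of locations and reproduces the distance component of the four-cluster row of Table 1. The second call depends on the random restarts, so its result is a heuristic estimate rather than a guaranteed minimum.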
6. Algorithmic Considerations Based on the Bi-partial Approach
We shall now present the algorithmic approach that is based on the basic precepts of the bi-partial approach. Assuming, namely, the property that we have observed for the case of the concrete objective function (2), that is, the opposite monotonicity of the two components of the objective function, we can reformulate it, obtaining, in the general case, the following parametric problem:
$$\min_P Q_{SD}(P, r) = r\,Q_D(P) + (1 - r)\,Q_S(P), \tag{4}$$
where the parameter r ∈ [0,1] corresponds to the weights we may attach to the two components of the objective function. Actually, it is used only for algorithmic purposes, and not to express any sort of weight, and we assume that we weigh the two components equally (r = ½). Here, we make no a priori assumptions as to the value of p, in distinction from the approach outlined above, based on the k-means procedure. The form (4) enables the construction of a suboptimisation algorithm, provided the two components of the objective function are endowed with certain properties. We shall outline the construction of this algorithm for the case of the objective function (2).
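Thus, the above general form is equivalent, for (2), to the following one:
$$\min_P \Big( r \sum_{q} \sum_{i \in A_q} d(x_i, x_q) + (1 - r) \sum_{q} \big( c_1 + c_2\,\mathrm{card}(A_q) \big) \Big). \tag{5}$$
Now, take the iteration step index, t, starting with t = 0. Consider (5) for r0 = 1. We obtain
$$\min_P \Big( 1 \cdot \sum_{q} \sum_{i \in A_q} d(x_i, x_q) + 0 \cdot \sum_{q} \big( c_1 + c_2\,\mathrm{card}(A_q) \big) \Big) = \min_P \sum_{q} \sum_{i \in A_q} d(x_i, x_q). \tag{6}$$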
Since we did not make any assumptions concerning the value of p, we can easily see that the global minimum for (6) is obtained for p = n, i.e. when each object (location) contains a facility (each location constitutes a separate cluster). Denote this particular, extreme partition by P0. The situation described is illustrated in the first line of Table 1. The value of the original objective function is, therefore, equal to n(c1 + c2), since the first component disappears, we deal with n facilities, and all card(Aq) = card(Ai) are equal to 1.
Then, we decrease the value of r from r0 = 1 downwards. At some point, for r1, the value of the parameter is low enough to make the second component of the objective function, (1-r)Σq(c1 + c2 card(Aq)), weigh sufficiently to warrant the aggregation of two locations into one cluster, with one facility serving the two locations. This happens when the following equality holds:
$$Q_{SD}(P_0, r_1) = Q_{SD}(P_1, r_1), \tag{7}$$
where P1 is the partition which corresponds to the aggregation operation mentioned. In the case considered here, the equality (7) is equivalent to
$$r_1 \cdot 0 + (1 - r_1)\, n (c_1 + c_2) = r_1\, d(i^*, j^*) + (1 - r_1) \big( n (c_1 + c_2) - c_1 \big), \tag{8}$$
where i*, j* is the pair of locations for which the value of r1 is determined. This value, in conformity with (8), equals
$$r_1(i^*, j^*) = \frac{c_1}{d(i^*, j^*) + c_1}. \tag{9}$$
This relation is justified by the fact that for each passage from p to p-1, accompanying an aggregation, the value of the second component decreases by c1, while a value of distance, or a more complex function of distances, is added to the first component. As we look for the highest possible r1, which follows r0 = 1, it is obvious that the d(i*,j*) we look for must be the smallest one among those not yet contained inside the clusters (i.e., for this step, among all distances).
In the subsequent steps t we use the equation (7) in its more general form, i.e.
$$Q_{SD}(P_{t-1}, r_t) = Q_{SD}(P_t, r_t), \tag{10}$$
and derive from it the expression analogous to (9). In this particular case, which is, anyway, quite similar to several of the implementations of the bi-partial approach for clustering, the equation analogous to (9) is obtained from (10), meaning that at each step t the minimum of distance is being sought, exactly as in the classical progressive merger procedures, like single link, complete link etc.
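The passage from the general equality between consecutive partitions to the expression for the threshold value of r can be written out explicitly. The following is a sketch for the objective function (2); the symbol $\Delta D_t$ is introduced here only for illustration and denotes the increase of the distance component caused by the aggregation performed at step t.

```latex
\begin{align*}
r_t\,Q_D(P_{t-1}) + (1-r_t)\,Q_S(P_{t-1}) &= r_t\,Q_D(P_t) + (1-r_t)\,Q_S(P_t) \\
(1-r_t)\,\bigl(Q_S(P_{t-1}) - Q_S(P_t)\bigr) &= r_t\,\bigl(Q_D(P_t) - Q_D(P_{t-1})\bigr) \\
(1-r_t)\,c_1 &= r_t\,\Delta D_t \\
r_t &= \frac{c_1}{\Delta D_t + c_1}
\end{align*}
```

For t = 1 we have $\Delta D_1 = d(i^*, j^*)$, which gives (9). The value of $r_t$ falls below ½ exactly when $\Delta D_t$ exceeds $c_1$, i.e. when the added distance is larger than the setup cost saved by removing one facility.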
The procedure stops when, for the first time, a value of rt lower than ½ is obtained in the decreasing sequence r0, r1, r2,… (the sequence of rt, if realised until the aggregation of all locations into one cluster, will, of course, end at t = n-1). Falling below ½ means, namely, that “on the way” the partition Pt was obtained which the algorithm generates for r = ½, corresponding to equal weights of the two components of the objective function. Thus, we deal with a procedure that is entirely analogous to the simple progressive merger algorithms, but has an inherent capacity of indicating the “solution” to the problem, without any reference to an external criterion. We use the quotation marks when speaking of the “solution”, because the procedure does not guarantee in any way the actual minimum of (2), since the operations performed at each step are limited to aggregation. The experience with other cases shows that a simple search in the neighbourhood of the suboptimal solution found suffices for finding the actual solution, if it differs from the suboptimal one.
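The aggregation procedure described in this section can be sketched as follows. The sketch makes several assumptions that are not spelled out in the text: the Manhattan distance, facilities placed at the cluster member with the smallest within-cluster distance sum, and the "distance" added at each step taken as the resulting increase of the first component of (2). The function names are hypothetical. The unit cost c2 does not appear, because its total, c2 n, is the same for every partition.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def cluster_distance(cluster: Sequence[Point]) -> int:
    """Distance sum to the best facility chosen among the cluster members."""
    return min(sum(manhattan(x, f) for x in cluster) for f in cluster)


def bipartial_merger(points: Sequence[Point], c1: float):
    """Progressive merger that stops at the first step with r_t below 1/2.
    Returns (clusters, history), history holding (r_t, number of clusters)."""
    clusters: List[List[Point]] = [[x] for x in points]  # P_0: p = n
    history = []
    while len(clusters) > 1:
        # Cheapest aggregation: smallest increase of the distance component
        # (ties are broken by the position of the clusters in the list).
        delta, i, j = min(
            (cluster_distance(a + b) - cluster_distance(a) - cluster_distance(b), i, j)
            for (i, a), (j, b) in combinations(enumerate(clusters), 2)
        )
        r_t = c1 / (delta + c1)      # threshold at which the merger pays off
        if r_t < 0.5:                # equal weights reached: stop before merging
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]              # j > i, so index i stays valid
        history.append((r_t, len(clusters)))
    return clusters, history


LOCATIONS = [(0, 0), (1, 0), (1, 2), (2, 3), (3, 3), (3, 4),
             (5, 7), (6, 8), (7, 7), (1, 8), (2, 7), (2, 9)]

clusters, history = bipartial_merger(LOCATIONS, c1=3)
print(len(clusters))              # 4
print([r for r, _ in history])    # [0.75, 0.75, 0.75, 0.6, 0.6, 0.6, 0.6, 0.6]
```

For the twelve locations of the example the sketch stops at four clusters, which is the partition with the lowest value in Table 1; the next cheapest aggregation would add 9 to the distance component and give r = 0.25.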
7. Some Comments and the Outlook
The illustration provided here, even though extremely simple, is sufficient to highlight the capacity of the bi-partial approach to deal with the p-median / p-center type of facility location problems. In fact, for (slightly) more complex formulations of the problem, like
$$\min_P \sum_{q} \Big( \sum_{i \in A_q} d(x_i, x_q) + c_1(q) + c_2 f(\mathrm{card}(A_q)) \Big), \tag{11}$$
i.e. where setup costs are calculated for each potential facility location separately, and f(.) is an increasing concave function, the relation analogous to (10) yields an only marginally more intricate procedure, analogous to that based on (9), where for each aggregation the minimum has to be found for the two locations or clusters aggregated. The issue worth investigating, which arises therefrom, is: what realistic class of facility location problems can be dealt with through the bi-partial approach?
Concerning the comparison of the procedure proposed here with that based on the classical k-means, the following points must be raised:
– k-means outperforms progressive merger procedures for data sets with numerous objects (locations), but not too many dimensions (here, by virtue of definition, either very few, or just two), when storing the distance matrix and operating on it is heavier than calculating np (much less than n²) distances at each iteration; in the cases envisaged n would not exceed thousands, and p is expected not to be higher than 100, so that the two types of procedures might be quite comparable;
– there exists a possibility of constructing a hybrid procedure, in which k-means would be performed for a sequence of values of p at the later stages of the bi-partial procedure, with the result of the aggregation performed by the bi-partial procedure being the starting point for the k-means algorithm;
– given the proposal by Dvoenko [1], there exists also a possibility of implementing directly the bi-partial version of k-means, with a specially designed form of the two components of the objective function; this, however, would require additional studies.
ACKNOWLEDGMENT This research has been partially supported by the National Centre of Science of the Republic of Poland under Grant No. UMO-2012/05/B/SsT6/03068.
Notes
1 In some other circumstances the two can be referred to as “precision” and “distinguishability”, which brings us quite close, indeed, to the standard oppositions known from various domains of data analysis, such as “fit” and “generalisation” or “precision” and “recall”.
2 We use the classical name of the k-means algorithm, although the number of clusters, referred to in this name as “k”, is denoted in the present paper, in conformity with the notation adopted in the bi-partial approach, by p.
AUTHOR
Jan W. Owsinski – Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01–447 Warszawa, Poland. E-mail: owsinski@ibspan.waw.pl.
REFERENCES
[1] Dvoenko S., “Meanless k-means as k-meanless clustering with the bi-partial approach”. In: Proceedings of PRIP 2014 Conference, Minsk, May 2014.
[2] Hansen P., Brimberg J., Urosević D., Mladenović N., “Solving large p-median clustering problems by primal-dual variable neighbourhood search”, Data Mining and Knowledge Discovery, vol. 19, 2009, 351–375.
[3] Liao K., Guo D., “A clustering-based approach to the capacitated facility location problem”, Transactions in GIS, vol. 12, no. 3, 2008, 323–339.
[4] Mulvey J.M., Beck M.P., “Solving capacitated clustering problems”, European Journal of Operational Research, vol. 18, no. 3, 1984, 339–348. DOI: http://dx.doi.org/10.1016/0377-2217(84)90155-3.
[5] Owsiński J.W., Regionalization revisited: an explicit optimization approach, CP-80-26. IIASA, Laxenburg 1980.
[6] Owsiński J.W., “Intuition vs. formalization: local and global criteria of grouping”, Control and Cybernetics, vol. 10, no. 1–2, 1981, 73–88.
[7] Owsiński J.W., “The bi-partial approach in clustering and ordering: the model and the algorithms”, Statistica & Applicazioni, 2011, Special Issue, 43–59.
[8] Owsiński J.W., “Clustering and ordering via the bi-partial approach: the rationale, the model and some algorithmic considerations”. In: J. Pociecha and R. Decker (Eds.): Data Analysis Methods and its Applications, Wydawnictwo C.H. Beck, Warszawa, 2012a, 109–124.
[9] Owsiński J.W., “On the optimal division of an empirical distribution (and some related problems)”, Przegląd Statystyczny, Special Issue 1, 2012b, 109–122.
A Novel Generalized Net Model of the Executive Compensation Design
Submitted: 3rd April 2014; accepted: 30th May 2014
Krassimir T. Atanassov, Aleksander Kacprzyk, Evdokia Sotirova
DOI: 10.14313/JAMRIS_3-2014/29
Abstract: In the paper we are concerned with a structured approach to the process of designing an executive compensation system in a company, which is one of the most relevant issues in corporate economics and can have a huge impact on a company with respect to finances, competitiveness, etc. More specifically, we present a novel application of Atanassov’s concept of a Generalized Net (GN), which is a powerful tool for the representation and handling of dynamic discrete event problems and systems. First, to present the problem specifics, a broader Total Reward system is discussed, together with the importance of proper structuring of the compensation system for executives to support the company’s goals, allowing it to attract, motivate and retain managers. The proposed compensation design model starts by incorporating a broad spectrum of benchmarks, expectations and constraints already in the early phase of the design of the executive compensation. In the design and testing phase a significant emphasis is placed on the flexibility and adjustability of the executive compensation package to external factors, by testing, dynamically adjusting and stress testing the proposed compensation package already in the design phase. Then, we apply some elements of the theory of Generalized Nets (GNs) to construct the model of executive compensation design using the proposed approach.
Keywords: rewards systems, compensation design, banking activities, corporate activities, Generalized Net, modelling, Discrete Event System Modeling
1. Introduction
The present paper is a continuation of our previous investigations on the use of some elements of the theory of Generalized Nets (GNs) proposed by Atanassov [2, 3] for the mathematical modeling of banking activities [8, 11], executive compensation [9], as well as some technological and business activities of a petrochemical company (cf. [5]). In this paper, by extending the ideas of [9], we focus on the development of a system for compensation design for bank executives. We start by discussing the role of compensation in the reward systems to identify the key objectives placed on the executive compensation as well as the key requirements of the compensation design process. Later we discuss the importance of updating procedures in the compensation design, which includes a continuous cycle of developing, implementing, using, evaluating and adjusting executive compensation. Based on those principles and objectives we propose a comprehensive approach to executive compensation design. We then show the application of some elements of Atanassov’s theory of Generalized Nets to construct the model of executive compensation design, which incorporates the proposed process for executive compensation design. Finally, we identify some promising areas for future research.
2. Reward Systems and Their Role in Attaining Company Goals
The primary goal of a reward system in a company, firm, corporation, etc. is commonly described in the literature as supporting business goals and attracting, motivating and retaining competent employees (cf. [13]). It is also referred to as a system that aligns the rewards to executives with what is critical for the company to succeed in both a short-term and a long-term perspective, and to accomplish its strategic plan (cf. [12]). Yet another approach is presented by Ellig (2007), who defines Reward Management as being concerned with the formulation and implementation of strategies and policies that aim at rewarding people fairly, equitably and consistently in accordance with their value to the organization (cf. [6]). Rewards strategy is a highly complex issue that ties the business strategy to medium and short term tactics and to day-to-day tasks and decisions. A proper rewards strategy therefore requires the incorporation of a large number of elements, listed below as proposed by Armstrong [1] and WorldatWork [13]:
• Rewards strategy philosophy – a statement about how the rewards strategy will support the business strategy and the needs of the company’s stakeholders,
• Goals of Rewards Strategy – their prioritization and success criteria for evaluation,
• Types of Rewards – a list of reward types including their description and relative importance,
• Relative importance of various rewards – setting the importance of rewards relative to other tools applied in influencing employees' behaviors,
• Selection of measures – selection of measures that should be used in the design of rewards, including a decision about the level in the organization at which the criteria will be measured (organization-wide, SBU, team, individual) and a decision about which elements of total rewards will be associated with those measures,
• Selection of competitive market reference points – selection of peers and competitors that should form the benchmarks and against which employees will benchmark their compensation in terms of its competitiveness,
• Competitiveness of rewards strategy – a decision on the desired competitive position versus the selected competitive market reference points, i.e. whether the company’s level of rewards is to be below, on par with or above the market,
• Updating of Rewards Strategy – defining the criteria and process for updating the Rewards Strategy, or a decision about which elements can be updated individually,
• Data and information management – selection of information sources, approach and methods of data processing, and tools used in decision support as well as in reporting,
• Guidelines for solving conflicts – methods for approaching conflicts and processes for resolving them,
• Communication Strategy – a decision about the intensity of communication of the rewards strategy to key stakeholders as well as the content of such communication.
The Total Rewards approach targets very closely the issues faced today by the majority of banks, which operate in a fast changing environment with increased scrutiny from shareholders, regulators and the public of their performance and, in particular, of the compensation of their executives. While banks are expected to be more modest in their compensation, they also operate in a highly complex and fast changing environment in which they need to attract, develop and retain top talent. Therefore the historical, simplified approach to rewards management needs to be expanded into a total rewards system. The Total Rewards approach proposed by WorldatWork [13] promises to address key concerns of today’s banks in managing their executive workforce with:
1. Increased Flexibility
2. Improved Recruitment and Retention
3. Reduced Labor Costs/Cost of Turnover
4. Heightened Visibility in a Tight Labor Market
5. Enhanced Profitability
Given our task of structuring and codifying the bank process that we started in our previous paper, and given the current significant visibility and public scrutiny of the compensation of executives in banks, we commence our analysis with compensation processes and systems.
3. Role of Compensation Systems in Motivating Executives
While executives are motivated by diverse elements, a compensation program, when properly structured and controlled, remains the most potent weapon in the arsenal of reward and punishment devices available to the CEO and the HR department. Compensation is highly effective at motivating individual executives to higher levels of performance, as described by Bruce R. Ellig [6]. This approach is consistent with agency theory, which suggests performance pay as a substitute for monitoring [7]. Compensation is by any measure the largest component of a rewards system and a major cost for the organization [13]. At the same time, while a Total Rewards Strategy as presented earlier is highly complex to design and implement, a well-designed compensation system can benefit even the smallest organizations and can become a centerpiece of human resource strategy when it comes to attracting and retaining top talent and good performers. The challenge of properly structuring and implementing a compensation system is further complicated in a professional organization such as a bank, where there is a significant number of professionals who not only have various targets set but also tend to work with different lines of responsibility, reporting to multiple superiors or operating in cross functional teams. In addition, given the various ownership changes, mergers and divestments, the typical career in “silos”, or within certain departments or parts of the organization, is no longer the rule. Today’s professionals tend to change assignments, specialties and levels of responsibility regularly; they also take advantage of horizontal promotions. In those cases, what seems natural from the organizational point of view, namely that a certain position has a compensation package attached to it, is not accepted by employees who are to be relocated from a department or position with a more attractive or just differently structured compensation package. Those challenges of today’s banks call for a highly dynamic and flexible compensation design process.
Another important external factor faced by banks in the US and in Western Europe, in particular by banks that required a state bailout or are struggling with a lack of growth, is the increased public scrutiny of compensation in banks. At the same time, banks in the emerging economies face a different set of challenges related to reduced access to liquidity, increased regulatory oversight, foreign ownership and the need for operational excellence [9]. The famous year-end bonuses enjoyed by many bank professionals for reaching sales or profit targets are questioned, as they promote taking sizeable risks that materialize only later and that do not affect the executives who took them. One more characteristic of banks today is the need to react quickly to changes in the marketplace and to changing bank objectives. This puts additional pressure on compensation systems, since recent research from Towers Perrin [12] points out that compensation and benefits can be easily copied by competitors, unlike other types of rewards, in particular intangible ones, that may be more difficult to imitate. Therefore compensation systems, while remaining the most important element of reward systems at banks, require support in the design process performed by HR professionals. We believe that a well-structured compensation design process that combines broad information sourcing with a high level of flexibility and adjustability, and that can be implemented in a decision support system, would be of significant help. It would improve decision making and help banks increase their results and efficiency while providing well balanced motivation to bank executives.
Fig. 1. Compensation design process steps and tasks
4. Proposed Approach to the Structuring Process of the Executive Compensation Design
In our approach to structuring the process of executive compensation design we have decided to focus first on internal company goals and initially to set aside external constituencies and considerations, which we will analyze in our future works. While setting the goals for structuring and codifying the executive compensation design process, based on the available literature and reported research results (in particular [13], [6] and [10]), we have identified three goals for the process and model considered:
1. To optimize executive compensation to maximize the value to a company (to fit its goals) and to an executive (to be able to attract and retain the best people).
2. To dynamically calculate the cost of executive compensation to the company and the benefits to an executive in order to respond to a fast changing and highly competitive environment.
3. To provide a tool for a compensation committee/CEO/HR department to evaluate alternatives and conditions of the executive pay package and their impact on the company in static and highly dynamic scenarios.
With these three goals in mind, we propose an approach that focuses on incorporating diverse sets of source data, but that also puts significant effort into the dynamic analysis of how those sets are incorporated, and into the evaluation and readjustments that can be performed throughout the compensation design process. The proposed process, presented in Fig. 1, is composed of five action steps:
1. Description of the current compensation model – an important goal of this step is to understand the current drivers, variables and constraints of the existing compensation model as well as employee expectations and past performance.
2. Benchmarks and constraints – this step allows for the introduction of various benchmarks and survey data as well as external and internal constraints.
3. Design phase – the most important element, which reshapes the standard blueprint for the compensation model with data inputs from the existing compensation system and external benchmarks, and with internal and external rules and constraints, in an iterative and dynamic process of designing, analyzing and testing.
4. Finalization – in this phase the proposed new compensation model is codified, alternatives are modeled and the model is stress tested for extreme cases. This phase ends with the implementation.
5. Assessment – in this step the new compensation model is used, its effectiveness is monitored and potential weaknesses are spotted, documented and evaluated.
Fig. 2. Description of current compensation model
4.1. Description of the Current Compensation Model
The first step in the proposed process, depicted in Fig. 2, focuses on the compilation of source information about the current salary levels for different positions and grades of the executives, together with benefits as well as short term (ST) and long term (LT) rewards such as target and result oriented bonuses. By compiling those sets of information, trends or inconsistencies of the existing model can first be spotted and properly marked for future analyses. This data set also makes it possible to verify the existing compensation model against the targets and budgets of the company in question, as well as its fit with company goals and strategy. The second element of this step is the compilation of employee expectations, both monetary and non-monetary, including those related to the structure of their compensation or the mechanics of pay for their performance, together with information on employee performance relative to the targets of the company. Based on this data, a first partial analysis can be performed to establish whether the current compensation model properly stimulates the performance of the individual executive, and to assess its efficiency and effectiveness. The key outputs of this step are tables with pay levels and pay grades together with rewards and benefits (primarily monetary), sets of rules for the calculation of benefits and their eligibility, and a list of condition rules for testing in the new model.
4.2. Benchmarks and Constraints
The second step, presented in Fig. 3, focuses on assembling sets of benchmarks as well as rules and constraints that describe the competitive environment and allow for dynamic modeling of the new compensation model. The comparable universe of benchmarks is to include data from internal benchmarking (between bank subsidiaries or countries of operation), industry benchmarking (primarily in the same country or a similar financial center) and position specific benchmarking, as well as information about company/bank sizes and compensation budgets for similar sized competitors. This set of data will allow the determination of sets of ranges, medians and distributions to be used later in the modeling process. Another group of data to be elicited and included is the external constraints that need to be considered in the design of the compensation model. In particular, this should include local legal and tax considerations as well as industry specific requirements. Those sets of rules and constraints will be used in the following step to adjust and test the proposed compensation model for compliance and efficiency.
Fig. 3. Benchmarks and constraints
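The determination of ranges, medians and distributions from benchmark data, and the check against minimum/maximum constraints, can be illustrated with a minimal Python sketch. The salary figures, the function names `benchmark_summary` and `position_vs_market`, and the choice of the interquartile range as the "market band" are assumptions made purely for illustration; they are not part of the proposed approach.

```python
from statistics import median, quantiles

def benchmark_summary(salaries):
    """Return range, median and quartiles for one benchmark group."""
    ordered = sorted(salaries)
    # Quartiles computed with the inclusive method (data treated as a full population).
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "median": median(ordered),
        "q1": q1,
        "q3": q3,
    }

def position_vs_market(pay, summary):
    """Classify a pay level against the interquartile band (an assumed convention)."""
    if pay < summary["q1"]:
        return "below"
    if pay > summary["q3"]:
        return "above"
    return "within"

# Hypothetical industry benchmark: annual base pay in thousands.
industry = [180, 200, 210, 240, 300]
summary = benchmark_summary(industry)
print(summary)
print(position_vs_market(190, summary))
```

The same summary could be computed separately for internal, industry and position specific benchmarks, so that each pay element is compared with the reference group that applies to it.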
4.3. Design Phase
The design phase, shown in Fig. 4, is the most important and the most complex element of the proposed approach, as it is the process in which the data inputs together with rules and constraints are used to develop the compensation model blueprint, which is transformed into a proposal of a new compensation model and finally into the new compensation model. The proposed approach starts with a compensation model template which includes all elements of the compensation system, such as base pay, base pay modifiers (such as pay grades or bands), target-related and results-related rewards, etc., but without any numerical data. The first phase of the design process is an iterative inclusion of the data inputs and rules/constraints that forms a blueprint of the new compensation model, highlighting the elements that consistently meet the criteria and the elements that are contradictory or fall outside the constraints imposed. The second phase includes an evaluation of preferences and trade-offs to eliminate the criteria that cannot be met and to finalize the core elements of the compensation model. This phase of the process involves iterative testing of the proposed model against the present goals and the present system, and against new targets and goals, to verify its applicability and efficiency (in particular through a cost-effect type analysis). The final product of this action step is a proposal of a new compensation model which consists of a core model and sets of variable elements together with performance criteria and rules/constraints.
Fig. 4. Design Phase
4.4. Finalization and Assessment
The final action steps in the design and implementation of the new compensation model, shown in Fig. 5, start with the finalization phase, in which the proposed model is stress tested to verify its flexibility and, if necessary, to correct any improper behavior for outliers and for various compensation alternatives. At the same time the compensation model is codified into procedures and manuals, and its practicality and cohesiveness are verified and corrected. The final step is the implementation and assessment, which comprises an implementation in the company or organization, starting with a pilot implementation followed by a staged rollout. At this action step the new compensation model is constantly monitored and fine-tuned by verifying executive performance against the company targets and the individual targets set, as well as against past performance and the simulated performance of the old model.
Fig. 5. Finalization and Assessment
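A stress test of a proposed package can be sketched as follows. This is only an illustration under stated assumptions: the `Package` fields, the pay formula (fixed base pay, an all-or-nothing target bonus, a share of profit above plan, and a cap on total variable pay), the scenario values and the budget threshold are all hypothetical and chosen to make the idea concrete.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Package:
    base_pay: float        # fixed annual pay
    target_bonus: float    # paid in full when targets are met (assumption)
    results_share: float   # fraction of profit above plan paid as bonus
    bonus_cap: float       # upper limit on total variable pay

def total_pay(pkg, target_met, profit_above_plan):
    """Total pay of one executive in one scenario under the assumed formula."""
    variable = pkg.target_bonus if target_met else 0.0
    variable += pkg.results_share * max(profit_above_plan, 0.0)
    return pkg.base_pay + min(variable, pkg.bonus_cap)

def stress_test(pkg, scenarios, budget):
    """Return the names of scenarios in which total pay exceeds the budget."""
    return [
        name
        for name, (target_met, profit) in scenarios.items()
        if total_pay(pkg, target_met, profit) > budget
    ]

# Hypothetical package and scenarios (amounts in thousands).
pkg = Package(base_pay=200.0, target_bonus=50.0, results_share=0.01, bonus_cap=150.0)
scenarios = {
    "base case": (True, 2_000.0),
    "downturn": (False, -5_000.0),
    "boom": (True, 20_000.0),
}
breaches = stress_test(pkg, scenarios, budget=320.0)
print(breaches)
```

A scenario that breaches the budget would send the proposal back for correction, which corresponds to the correction loop in the finalization phase described above.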
5. Application of Theory of Generalized Nets to the Proposed Approach to Executive Compensation Design
The Generalized Nets (GNs) have been introduced by Atanassov [2], [3] as a powerful, general and comprehensive tool to conceptualize, model, analyze and design all kinds of discrete event type processes and systems that evolve over time. They can effectively and efficiently model various aspects of processes whose behavior over time is triggered and influenced by some external and internal events. These characteristic features of the GNs clearly suggest that they can be a powerful, effective and efficient model for the executive compensation problem considered in this paper. We will show this in detail in the next section. However, let us first start with a brief description of the basic elements of the theory of GNs that will be of use for our further considerations. Some GNs may lack some of the components, thus giving rise to special classes of GNs called reduced GNs. For the needs of the present research we shall use (and describe) one of the reduced types of GNs.
Fig. 6. GN-transition
Formally, each transition of this reduced class of GNs is described by (cf. Fig. 6):

$$Z = \langle L', L'', r, \square \rangle, \qquad (1)$$

where:
(a) $L'$ and $L''$ are finite, non-empty sets of places (the transition’s input and output places, respectively). For the transition in Fig. 6 these are $L' = \{l'_1, l'_2, \ldots, l'_m\}$ and $L'' = \{l''_1, l''_2, \ldots, l''_n\}$;
(b) $r$ is the transition’s condition determining which tokens will pass (or transfer) from the transition’s inputs to its outputs; it has the form of an Index Matrix (IM); cf. Atanassov ([2], [4]):

$$r = \begin{array}{c|ccc} & l''_1 & \cdots & l''_n \\ \hline l'_1 & r_{1,1} & \cdots & r_{1,n} \\ \vdots & \vdots & \ddots & \vdots \\ l'_m & r_{m,1} & \cdots & r_{m,n} \end{array}$$

where $r_{i,j}$ is the predicate which corresponds to the $i$-th input and $j$-th output places. When its truth value is “true”, a token from the $i$-th input place can be transferred to the $j$-th output place; otherwise, this is not possible;
(c) $\square$ is the transition's type, a Boolean expression. It contains as variables the symbols which serve as labels for the transition’s input places, and it is an expression built up from variables and the Boolean connectives “conjunction” and “disjunction”. When the value of a type (calculated as a Boolean expression) is “true”, the transition can become active, otherwise it cannot.
The ordered four-tuple $E = \langle A, K, X, F \rangle$ is called the simplest reduced GN (briefly, we shall again use “GN”) if:
(a) $A$ is a set of transitions;
(b) $K$ is the set of the GN’s tokens;
(c) $X$ is the set of all initial characteristics the tokens can receive when they enter the net;
(d) $F$ is a characteristic function which assigns new characteristics to each token when it transfers from an input to an output place of a given transition.
Many types of operators are defined over the GNs. One of these types is the set of hierarchical operators. One of them replaces a given GN-place with a whole subnet, cf. Atanassov ([2], [3]). Below, having this operator in mind, we will use three places that represent three separate GNs, as shown in the authors' earlier works (cf. [9]).
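To make the definition of a reduced transition more tangible, the following Python sketch represents the input places, the output places, the index matrix of predicates and the transition type as plain data. It is a simplified illustration, not an implementation of the full GN formalism: tokens are represented by dictionaries of characteristics, tokens do not split or merge, a token moves to the first output place whose predicate is true, missing matrix entries are treated as “false”, and the characteristic function is omitted. The class name `Transition` and all field names are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Token = dict                              # a token is a dict of characteristics
Predicate = Callable[[Token], bool]       # one entry r_{i,j} of the index matrix

@dataclass
class Transition:
    inputs: List[str]                                  # L'
    outputs: List[str]                                 # L''
    r: Dict[Tuple[str, str], Predicate]                # (input, output) -> predicate
    type_: Callable[[Dict[str, bool]], bool]           # Boolean expression over inputs

    def fire(self, marking):
        """Move tokens according to r if the transition type evaluates to true."""
        occupied = {p: bool(marking.get(p)) for p in self.inputs}
        if not self.type_(occupied):
            return marking                             # transition cannot become active
        new = {p: list(ts) for p, ts in marking.items()}
        for i in self.inputs:
            for token in marking.get(i, []):
                for j in self.outputs:
                    # Missing entries of the index matrix count as "false".
                    if self.r.get((i, j), lambda t: False)(token):
                        new[i].remove(token)
                        new.setdefault(j, []).append(token)
                        break                          # first admissible output only
        return new

z = Transition(
    inputs=["l1", "l2"],
    outputs=["l3", "l4"],
    r={
        ("l1", "l3"): lambda t: t["ok"],
        ("l1", "l4"): lambda t: not t["ok"],
        ("l2", "l4"): lambda t: True,
    },
    type_=lambda occ: occ["l1"] or occ["l2"],          # disjunction of l1 and l2
)
marking = {"l1": [{"name": "a", "ok": True}], "l2": [{"name": "b", "ok": False}]}
after = z.fire(marking)
print(after)
```

A conjunctive type is obtained by replacing `or` with `and` in `type_`; nested expressions, such as a disjunction of conjunctions, are written the same way.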
6. A GN-model of the Design of an Executive Compensation Scheme
Now we will use the elements of the theory of GNs presented in Section 5 to develop a novel model of the design of an executive compensation scheme. The essence of this design process and the problems related to it have been extensively described in the preceding sections.
The GN model (Fig. 7) consists of nine transitions that represent, respectively:
– the process of Description of the current compensation model (transitions $Z_1$ and $Z_2$),
– the analysis of Benchmarks and Constraints (transitions $Z_3$ and $Z_4$),
– the Design phase (transitions $Z_5$, $Z_6$ and $Z_7$),
– the process of Finalization (transition $Z_8$),
– the process of Assessment (transition $Z_9$).
Initially, the tokens $a$ and $b$ stay in places $l_4$ and $l_8$. They remain in their own places during the whole time in which the GN functions. All tokens that enter transitions $Z_1$ and $Z_2$ unite with the corresponding original token ($a$ and $b$, respectively). While the $a$ and $b$ tokens may split into two or more tokens, the original token remains in its own place the whole time. The original tokens have the following initial and current characteristics:
– token $a$ in place $l_4$ with the characteristic “Current salary levels and benefits, List of benefits available and costs, ST rewards – bonuses (target related, results related, discretionary), LT rewards – bonuses (target related, company value related, discretionary)”,
– token $b$ in place $l_8$ with the characteristic “Benchmarks: Internal benchmarks, Industry benchmarks, Position specific benchmarks; Company size/compensation budget”.
Transition $Z_1$ has the form:

$$Z_1 = \langle \{l_1, l_4\}, \{l_2, l_3, l_4\}, r_1, \vee(l_1, l_4) \rangle,$$

where:

$$r_1 = \begin{array}{c|ccc} & l_2 & l_3 & l_4 \\ \hline l_1 & false & false & true \\ l_4 & W_{4,2} & W_{4,3} & W_{4,4} \end{array}$$

in which:
$W_{4,2}$ = “Tables with pay levels and pay grades, rewards and benefits are prepared”,
$W_{4,3}$ = “Sets of rules for calculation of benefits and eligibility are prepared”,
$W_{4,4} = W_{4,2}$ & $W_{4,3}$.
The $a_0$-token that enters place $l_4$ (from place $l_1$) does not obtain a new characteristic. It unites with the $a$-token in place $l_4$, which has the above mentioned characteristic. The $a$-token can split into three tokens. As mentioned above, the original $a$-token continues to stay in place $l_4$. The other tokens ($a_1$ and $a_2$) enter places $l_2$ and $l_3$ and obtain the following characteristics:
– Token $a_1$ enters place $l_2$ with the characteristic $x_1^{a}$ = “Tables with pay levels and pay grades, rewards and benefits”;
– Token $a_2$ enters place $l_3$ with the characteristic $x_2^{a}$ = “Sets of rules for calculation of benefits and eligibility”.
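The index matrix of the first transition can be written down directly as a nested dictionary, with the predicates replaced by their current truth values. In this small sketch the function names `build_r1` and `admissible_moves` and the two Boolean arguments are illustrative assumptions; the entries of the matrix follow the definition of the transition given above.

```python
def build_r1(tables_ready: bool, rules_ready: bool) -> dict:
    """Index matrix of the first transition with predicates evaluated."""
    w42 = tables_ready      # W4,2: tables with pay levels and grades are prepared
    w43 = rules_ready       # W4,3: rules for benefits and eligibility are prepared
    return {
        "l1": {"l2": False, "l3": False, "l4": True},
        "l4": {"l2": w42, "l3": w43, "l4": w42 and w43},   # W4,4 = W4,2 & W4,3
    }

def admissible_moves(r: dict) -> list:
    """List the (input place, output place) pairs whose predicate is true."""
    return [(i, j) for i, row in r.items() for j, value in row.items() if value]

moves = admissible_moves(build_r1(tables_ready=True, rules_ready=False))
print(moves)
```

With only the pay tables prepared, a token can move from l1 to l4 and from l4 to l2, but no token can yet reach l3.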
Fig. 7. A GN model of the design of the executive compensation design model
Transition $Z_2$ has the form:

$$Z_2 = \langle \{l_5, l_8\}, \{l_6, l_7, l_8\}, r_2, \vee(l_5, l_8) \rangle,$$

where:

$$r_2 = \begin{array}{c|ccc} & l_6 & l_7 & l_8 \\ \hline l_5 & false & false & true \\ l_8 & W_{8,6} & W_{8,7} & W_{8,8} \end{array}$$

in which:
$W_{8,6}$ = “Sets of ranges, medians, distributions are determined”,
$W_{8,7}$ = “Levels and rules for maximum/minimum constraints are determined”,
$W_{8,8} = W_{8,6}$ & $W_{8,7}$.
The $b_0$-token that enters place $l_8$ (from place $l_5$) does not obtain a new characteristic. It unites with the $b$-token in place $l_8$, which has the above mentioned characteristic.
The $b$-token can split into three tokens. As mentioned above, the original $b$-token continues to stay in place $l_8$, while the other tokens ($b_1$ and $b_2$) enter places $l_6$ and $l_7$ and obtain the following characteristics:
– Token $b_1$ enters place $l_6$ with the characteristic $x_1^{b}$ = “Sets of ranges, medians, distributions”;
– Token $b_2$ enters place $l_7$ with the characteristic $x_2^{b}$ = “Levels and rules for the maximum/minimum constraints”.
The $g_1$- and $g_2$-tokens enter the GN via places $l_9$ and $l_{10}$ with the following characteristics, respectively:
– Token $g_1$ in place $l_9$ with the characteristic $x_1^{g}$ = “Employee expectations”;
– Token $g_2$ in place $l_{10}$ with the characteristic $x_2^{g}$ = “Employee performance – past, expected future”.
Transition $Z_3$ has the form:

$$Z_3 = \langle \{l_9, l_{10}\}, \{l_{11}\}, r_3, \wedge(l_9, l_{10}) \rangle,$$

where:

$$r_3 = \begin{array}{c|c} & l_{11} \\ \hline l_9 & W_{9,11} \\ l_{10} & W_{10,11} \end{array}$$

in which $W_{9,11} = W_{10,11}$ = “The identification of strengths and weaknesses of the existing compensation model is performed”.
The $g_1$- and $g_2$-tokens unite with token $g$ in place $l_{11}$ with the characteristic $x^{g}$ = “Lists of conditions, rules for testing in new model”.
The $d_1$- and $d_2$-tokens enter the GN via places $l_{12}$ and $l_{13}$ with the following characteristics, respectively:
– Token $d_1$ in place $l_{12}$ with the characteristic $x_1^{d}$ = “Tax treatment of pay and benefits”;
– Token $d_2$ in place $l_{13}$ with the characteristic $x_2^{d}$ = “Legal/regulatory requirements”.
Transition $Z_4$ has the form:

$$Z_4 = \langle \{l_{12}, l_{13}\}, \{l_{14}\}, r_4, \wedge(l_{12}, l_{13}) \rangle,$$

where:

$$r_4 = \begin{array}{c|c} & l_{14} \\ \hline l_{12} & W_{12,14} \\ l_{13} & W_{13,14} \end{array}$$

in which $W_{12,14} = W_{13,14}$ = “The external constraints are given”.
The $d_1$- and $d_2$-tokens unite with token $d$ in place $l_{14}$ with the characteristic $x^{d}$ = “Sets of rules, constraints”.
Transition $Z_5$ has the form:

$$Z_5 = \langle \{l_2, l_3, l_6, l_7, l_{11}, l_{14}, l_{15}\}, \{l_{16}\}, r_5, \wedge(l_2, l_3, l_6, l_7, l_{11}, l_{14}, l_{15}) \rangle,$$

where $r_5$ is the index matrix of the transition's condition.
In place $l_{15}$ there is one $z_0$-token with the characteristic $x_0^{z}$ = “Compensation model template”.
Tokens $a_1$ and $a_2$ (from places $l_2$ and $l_3$), $b_1$ and $b_2$ (from places $l_6$ and $l_7$), $g$ (from place $l_{11}$), $d$ (from place $l_{14}$) and $z_0$ (from place $l_{15}$) merge into a $z$-token that enters place $l_{16}$ with the characteristic $x^{z}$ = “Compensation model blueprint”.
Transition $Z_6$ has the form:

$$Z_6 = \langle \{l_{16}, l_{17}, l_{21}, l_{24}\}, \{l_{18}\}, r_6, \vee(\wedge(l_{16}, l_{17}), \wedge(l_{16}, l_{21}), \wedge(l_{16}, l_{24})) \rangle,$$

where $r_6$ is the index matrix of the transition's condition.
From place $l_{17}$ the $h$-token enters the net with the characteristic $x^{h}$ = “Preferences and trade-offs”.
The $q$-token that enters place $l_{18}$ obtains the characteristic $x^{q}$ = “Compensation model proposal”.
Transition $Z_7$ has the form:

$$Z_7 = \langle \{l_{18}, l_{19}, l_{22}\}, \{l_{19}, l_{20}, l_{21}, l_{22}\}, r_7, \vee(l_{18}, l_{19}, l_{22}) \rangle,$$

where $r_7$ is the index matrix of the transition's condition, in which:
$W_{19,19}$ = “The new system is tested vs. today’s system (total compensation budget, changes per employee)”,
$W_{19,20}$ = “The result from testing the new system vs. today’s system is positive”,
$W_{19,21}$ = “The result from testing the new system vs. today’s system is negative”,
$W_{22,20}$ = “The result from testing the new system vs. next year/future’s is positive”,
$W_{22,21}$ = “The result from testing the new system vs. next year/future’s is negative”,
$W_{22,22}$ = “The new system is tested vs. next year/future’s (e.g., impact of pay progression, indexation)”.
The $q_1$ and $q_2$ tokens that enter places $l_{19}$ and $l_{22}$ obtain the following characteristics, respectively: $x_1^{q}$ = “Test new system vs. today’s” in place $l_{19}$, and $x_2^{q}$ = “Test new system vs. next year/future’s” in place $l_{22}$.
The $q$-token that enters place $l_{21}$ (from place $l_{19}$ or $l_{22}$) does not obtain a new characteristic.
When the predicates $W_{19,20}$ and $W_{22,20}$ are true, the $u$-token enters place $l_{20}$ with the characteristic $x^{u}$ = “New compensation model”.
Transition $Z_8$ has the form:

$$Z_8 = \langle \{l_{20}, l_{25}, l_{26}, l_{27}\}, \{l_{23}, l_{24}, l_{25}, l_{26}, l_{27}\}, r_8, \vee(l_{20}, l_{25}, l_{26}, l_{27}) \rangle,$$

where $r_8$ is the index matrix of the transition's condition, in which:
$W_{26,23}$ = “The alternatives are modeled”,
$W_{26,24} = W_{27,24}$ = “The new compensation model has to be corrected”,
$W_{26,26} = W_{26,23}$,
$W_{27,23}$ = “The stress testing of the new compensation model is ready”,
$W_{27,27} = W_{27,23}$.
The $u_1$, $u_2$ and $u_3$ tokens that enter places $l_{25}$, $l_{26}$ and $l_{27}$ obtain the following characteristics, respectively:
$x_1^{u}$ = “New compensation model, modeled alternatives” in place $l_{25}$,
$x_2^{u}$ = “New compensation model, evaluated impact on executive compensation of unlikely but probable developments” in place $l_{26}$, and
$x_3^{u}$ = “New compensation model, written summary of compensation rules and levels as well as description of targets to be achieved” in place $l_{27}$.
The $u$-token that enters place $l_{24}$ (from place $l_{26}$ or $l_{27}$) does not obtain a new characteristic. When the predicates $W_{26,23}$ and $W_{27,23}$ are true, the $k$-token enters place $l_{23}$ with the characteristic $x^{k}$ = “New compensation model for implementation”.
Transition $Z_9$ has the form:

$$Z_9 = \langle \{l_{23}, l_{28}, l_{29}, l_{30}\}, \{l_1, l_5, l_{15}, l_{28}, l_{29}, l_{30}\}, r_9, \vee(l_{23}, l_{28}, \wedge(l_{29}, l_{30})) \rangle,$$

where $r_9$ is the index matrix of the transition's condition.
The $k_1$, $k_2$ and $k_3$ tokens that enter places $l_{28}$, $l_{29}$ and $l_{30}$ obtain the following characteristics, respectively: $x_1^{k}$ = “Application of the new compensation model for implementation” in place $l_{28}$, $x_2^{k}$ = “New compensation model for implementation, assess results against targets” in place $l_{29}$, and $x_3^{k}$ = “New compensation model for implementation, identification of weaknesses, areas of misuse” in place $l_{30}$.
The $a_0$ and $b_0$ tokens that enter places $l_1$ and $l_5$ obtain the characteristic $x_0^{a} = x_0^{b}$ = “Current compensation model”.
The $e$ token that enters place $l_{15}$ obtains the characteristic $x^{e}$ = “Compensation model template”.
7. Concluding Remarks
In this paper we have presented a novel approach to structuring the design of executive compensation in companies, corporations, firms, etc., and have shown that it can be effectively and efficiently implemented by using a Generalized Net model. Our purpose has been to identify, organize and structure the key components required for the development, testing, implementation and assessment of the compensation model, and to show how they can be reflected using the concepts, tools and techniques of the GNs. In particular, we have identified the types of information input, the way of processing them and the types of outputs to be used in the subsequent phases of the design process. Due to the novelty of the presented approach, both as the first use of the GNs for the class of problems considered and as the first approach to the design of an executive compensation scheme that uses not only GN based analyses but, more generally, net analysis related models, we have concentrated on the representation of basic variables and relations. Other variables that are relevant for the problem considered, such as external stakeholders exemplified by shareholders, boards of directors, international and local regulators, or competition for talent, will be dealt with in subsequent papers and included in a comprehensive model to be developed. In our future research we plan, first of all, to focus on a deeper analysis and testing of each step of the proposed approach to compensation design, by incorporating findings and conclusions from earlier research as well as by testing the proposed approach on real data from companies and organizations of various kinds and sizes. We also plan to compile and test the benchmark tables and constraints tables from the fragmented source data available, and to work on improving their reliability and applicability with the help of mathematical modeling.
AUTHORS
Krassimir T. Atanassov – Department of Bioinformatics and Mathematical Modelling, Institute of Biophysics and Biomedical Engineering, Bulgarian Academy of Sciences, 105 Acad. G. Bonchev Str., 1113 Sofia, Bulgaria. E-mail: k.t.atanassov@gmail.com
Aleksander Kacprzyk* – Resource Partners, Zebra Tower, ul. Mokotowska 1, 00–640 Warsaw, Poland. E-mail: aleksander.kacprzyk@resourcepartners.eu
Evdokia Sotirova – Department of Computer and Information Technologies, Faculty of Technical Sciences, “Prof. Assen Zlatarov” University, 1 Prof. Yakimov Str., 8010 Bourgas, Bulgaria. E-mail: esotirova@btu.bg
*Corresponding author
REFERENCES
[1] Armstrong M., Handbook of Reward Management Practice: Improving Performance Through Reward, Kogan Page, 2012.
[2] Atanassov K.T., Generalized Nets, Singapore/New Jersey: World Scientific, 1991. DOI: http://dx.doi.org/10.1142/1357.
[3] Atanassov K.T., On Generalized Nets Theory, Sofia: Prof. M. Drinov Academic Publishing House, 2007.
[4] Atanassov K.T., Index Matrices: Towards an Augmented Matrix Calculus, Heidelberg and New York: Springer, 2015 (in press).
[5] Atanassov K.T., Kacprzyk A., Skenderov V., Kryuchukov A., “Principles of a generalized net model of the activity of a petrochemical combine”. In: Proceedings of the 8th International Workshop on Generalized Nets, Sofia, Bulgaria, June 26, 2007, pp. 38–41.
[6] Ellig B.R., The Complete Guide to Executive Compensation, McGraw Hill, 2007.
[7] Holmstrom B., “Moral hazard in teams”, Bell Journal of Economics, vol. 13, no. 2, 1982. DOI: http://dx.doi.org/10.2307/3003457.
[8] Kacprzyk A., Mihailov I., “Intuitionistic fuzzy estimation of the liquidity of the banks. A generalized net model”. In: Proceedings of the 13th International Workshop on Generalized Nets, London, UK, 2012, pp. 34–42.
[9] Kacprzyk A., Sotirova E., Atanassov K.T., “Modelling the executive compensation design model using a generalized net”. In: Proceedings of the 14th International Workshop on Generalized Nets, Burgas, Bulgaria, 29th–30th November, 2013, pp. 71–77.
[10] Lipman F.D., Hall S.E., Executive Compensation Best Practices, John Wiley & Sons, 2008.
[11] Mihailov I., “Generalized Net Model for Describing Some Banking Activities”. In: New Developments in Fuzzy Sets, Intuitionistic Fuzzy Sets, Generalized Nets and Related Topics, vol. II: Applications, Warsaw, Poland, 2013, pp. 115–122.
[12] Towers Perrin, “Compensation Strategies for an Uncertain Economy: The Evolution Continues”, Towers, Watson & Co., 2009.
[13] WorldatWork, The WorldatWork Handbook of Compensation, Benefits & Total Rewards: A Comprehensive Guide for HR Professionals, New York: John Wiley & Sons, 2007.