Journal of Automation, Mobile Robotics and Intelligent Systems
Journal of Automation, Mobile Robotics and Intelligent Systems WWW.JAMRIS.ORG • pISSN 1897-8649 (PRINT)/eISSN 2080-2145 (ONLINE) • VOLUME 16, Nº 3, 2022
Indexed in SCOPUS
A peer-reviewed quarterly focusing on new achievements in the following fields: • automation • systems and control • autonomous systems • multiagent systems • decision-making and decision support • robotics • mechatronics • data sciences • new computing paradigms

Editor-in-Chief
Janusz Kacprzyk (Polish Academy of Sciences, Łukasiewicz-PIAP, Poland)

Advisory Board
Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA)
Kaoru Hirota (Tokyo Institute of Technology, Japan)
Witold Pedrycz (ECERF, University of Alberta, Canada)

Co-Editors
Roman Szewczyk (Łukasiewicz-PIAP, Warsaw University of Technology, Poland)
Oscar Castillo (Tijuana Institute of Technology, Mexico)
Marek Zaremba (University of Quebec, Canada)

Executive Editor
Katarzyna Rzeplinska-Rykała, e-mail: office@jamris.org (Łukasiewicz-PIAP, Poland)

Associate Editor
Piotr Skrzypczyński (Poznań University of Technology, Poland)

Statistical Editor
Małgorzata Kaliczyńska (Łukasiewicz-PIAP, Poland)

Typesetting
SCIENDO, www.sciendo.com

Webmaster
TOMP, www.tomp.pl

Editorial Office
ŁUKASIEWICZ Research Network – Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland (www.jamris.org)
tel. +48-22-8740109, e-mail: office@jamris.org

The reference version of the journal is e-version. Printed in 100 copies. Articles are reviewed, excluding advertisements and descriptions of products. Papers published currently are available for non-commercial use under the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) license. Details are available at: https://www.jamris.org/index.php/JAMRIS/LicenseToPublish
Editorial Board: Chairman – Janusz Kacprzyk (Polish Academy of Sciences, Łukasiewicz-PIAP, Poland) Plamen Angelov (Lancaster University, UK) Adam Borkowski (Polish Academy of Sciences, Poland) Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany) Bice Cavallo (University of Naples Federico II, Italy) Chin Chen Chang (Feng Chia University, Taiwan) Jorge Manuel Miranda Dias (University of Coimbra, Portugal) Andries Engelbrecht ( University of Stellenbosch, Republic of South Africa) Pablo Estévez (University of Chile) Bogdan Gabrys (Bournemouth University, UK) Fernando Gomide (University of Campinas, Brazil) Aboul Ella Hassanien (Cairo University, Egypt) Joachim Hertzberg (Osnabrück University, Germany) Tadeusz Kaczorek (Białystok University of Technology, Poland) Nikola Kasabov (Auckland University of Technology, New Zealand) Marian P. Kaźmierkowski (Warsaw University of Technology, Poland) Laszlo T. Kóczy (Szechenyi Istvan University, Gyor and Budapest University of Technology and Economics, Hungary) Józef Korbicz (University of Zielona Góra, Poland) Eckart Kramer (Fachhochschule Eberswalde, Germany) Rudolf Kruse (Otto-von-Guericke-Universität, Germany) Ching-Teng Lin (National Chiao-Tung University, Taiwan) Piotr Kulczycki (AGH University of Science and Technology, Poland) Andrew Kusiak (University of Iowa, USA) Mark Last (Ben-Gurion University, Israel) Anthony Maciejewski (Colorado State University, USA)
Krzysztof Malinowski (Warsaw University of Technology, Poland) Andrzej Masłowski (Warsaw University of Technology, Poland) Patricia Melin (Tijuana Institute of Technology, Mexico) Fazel Naghdy (University of Wollongong, Australia) Zbigniew Nahorski (Polish Academy of Sciences, Poland) Nadia Nedjah (State University of Rio de Janeiro, Brazil) Dmitry A. Novikov (Institute of Control Sciences, Russian Academy of Sciences, Russia) Duc Truong Pham (Birmingham University, UK) Lech Polkowski (University of Warmia and Mazury, Poland) Alain Pruski (University of Metz, France) Rita Ribeiro (UNINOVA, Instituto de Desenvolvimento de Novas Tecnologias, Portugal) Imre Rudas (Óbuda University, Hungary) Leszek Rutkowski (Czestochowa University of Technology, Poland) Alessandro Saffiotti (Örebro University, Sweden) Klaus Schilling (Julius-Maximilians-University Wuerzburg, Germany) Vassil Sgurev (Bulgarian Academy of Sciences, Department of Intelligent Systems, Bulgaria) Helena Szczerbicka (Leibniz Universität, Germany) Ryszard Tadeusiewicz (AGH University of Science and Technology, Poland) Stanisław Tarasiewicz (University of Laval, Canada) Piotr Tatjewski (Warsaw University of Technology, Poland) Rene Wamkeue (University of Quebec, Canada) Janusz Zalewski (Florida Gulf Coast University, USA) Teresa Zielińska (Warsaw University of Technology, Poland)
VOLUME 16, N˚3, 2022 DOI: 10.14313/JAMRIS/3-2022
Contents

Optimal State Feedback Controller For Balancing Cube
Adam Kowalczyk, Robert Piotrowski
DOI: 10.14313/JAMRIS/3-2022/19

Neurocontrolled Car Speed System
Markiyan Nakonechnyi, Orest Ivakhiv, Dariusz Świsulski
DOI: 10.14313/JAMRIS/3-2022/20

Robust H∞ Fuzzy Approach Design via Takagi-Sugeno Descriptor Model. Application for 2-DOF Serial Manipulator Tracking Control
Van Anh Nguyen Thi, Duc Binh Pham, Danh Huy Nguyen, Tung Lam Nguyen
DOI: 10.14313/JAMRIS/3-2022/21

Integrated and Deep Learning–Based Social Surveillance System: a Novel Approach
Ratnesh Litoriya, Dev Ramchandani, Dhruvansh Moyal, Dhruv Bothra
DOI: 10.14313/JAMRIS/3-2022/22

Edge Artificial Intelligence-Based Facial Pain Recognition During Myocardial Infarction
Mohan H M, Shivaraj Kumara H C, Mallikarjun S H, Dr. Prasad A Y
DOI: 10.14313/JAMRIS/3-2022/23

Skin Lesion Detection Using Deep Learning
Rajit Chandra, Mohammadreza Hajiarbabi
DOI: 10.14313/JAMRIS/3-2022/24

Applicability of Augmented and Virtual Reality for Education in Robotics
Norbert Prokopiuk, Piotr Falkowski
DOI: 10.14313/JAMRIS/3-2022/25

Design of a Linear Quadratic Regulator Based on Genetic Model Reference Adaptive Control
Abdullah I. Abdullah, Ali Mahmood, Mohammad A. Thanoon
DOI: 10.14313/JAMRIS/3-2022/26
Optimal State Feedback Controller For Balancing Cube

Submitted: 11th April 2022, accepted: 23rd May 2022

Adam Kowalczyk, Robert Piotrowski

DOI: 10.14313/JAMRIS/3-2022/19

Abstract: In this paper, a nonlinear balancing cube system is considered, the concept for which is based on an inverted pendulum. The main purpose of this work was the modelling and construction of a balancing cube with the synthesis of the control system. The control objectives included swing-up and stabilization of the cube on its vertex at an unstable equilibrium. Execution of the intended purpose required, first, deriving a cognitive mathematical model. It was based on the Lagrange method. Next, a mathematical model for control purposes was derived. The project of the physical model of the balancing cube was presented. A stabilization system based on a linear quadratic regulator (LQR) was developed. Moreover, a swing-up mechanism was used to bring the cube close to the upper equilibrium point. The algorithm switching condition was important to enable the correct functioning of the system. The developed control system was verified in the Matlab environment. Finally, verifying experiments and comparisons among models (mathematical and physical) were performed.

Keywords: balancing cube, control systems, linear quadratic regulator, mathematical model, physical model
1. Introduction

A balancing cube is based on a system popular among control systems enthusiasts: the inverted pendulum. Construction-wise, the balancing cube resembles a three-dimensional pendulum. In this case, the vertex of the cube is the illusory pendulum arm. The movement of the cube is controlled by applying an external force to the cube's faces. The aim of the control system is to stabilize the cube at the upper equilibrium point. From a control point of view, it is a dynamic, non-linear system with two equilibrium points: stable and unstable. Such a system is a good reflection of real-life systems such as balancing robots [1], [2], [3], Segway vehicles [4] or rockets [5].

Balancing cube systems have a very long history and have been widely used as a benchmark for testing novel control algorithms. Different approaches to controlling the movement of the cube have been analyzed. In [6], the authors propose moving weights as control elements. The weights are attached to every face of the cube. A change in the position of a weight shifts the centre of mass of the construction and produces a controllable movement of the cube. Other described control elements are flywheels, which can be used in different ways. The velocity of the flywheels can be controlled [7], [8], which produces a specified amount of torque acting on the frame. This approach uses the principle of conservation of angular momentum. The second means of control uses the principle of conservation of energy: the flywheels are actively braked to transfer the gathered kinetic energy from the flywheel to the frame [9]. This approach is used in this paper.

Depending on the selected way of controlling the movement of the cube, there are different approaches to modelling the balancing system. The system can be modelled as a three-dimensional system described with positions in the X, Y, and Z axes and the respective angles: pitch, yaw, and roll [6], [7], [8]. The other way takes into account that, when using the principle of conservation of energy and modelling with Euler-Lagrange equations, the movement of one face of the cube does not affect the other faces. This allows for modelling a simpler system in which only one of the faces is controlled with the flywheel, and for using an identical system to control every axis independently [9]. This approach is applied in the paper.

The remainder of this paper is organized as follows. The derivation and implementation of the mathematical model of the balancing cube are described in Section 2. The details of constructing the physical model of the cube are presented in Section 3. The design and implementation of the control system are illustrated in Section 4. In Section 5, the conducted verification tests and comparison between models are shown. The last section presents the conclusions.
2. Mathematical Model of a Balancing Cube
The balancing cube consists of six faces. Three of them are equipped with drive systems with flywheels. These faces are used for controlling the movement of the cube in three axes. Each face of the cube with the flywheel is a separate control system. The scheme of the approach to modelling is presented in Figs. 1a and 1b. Symbols used in the paper are presented in Table 1.
2022 ® Kowalczyk and Piotrowski. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)
Fig. 1(a). Scheme of the balancing cube – representation of $\theta_{cube}$ as angular position of the cube

Fig. 1(b). Scheme of the balancing cube – representation of $\theta_{wheel}$ as angular position of the flywheel

Tab. 1. Symbols of variables and parameters

| No. | Description | Symbol | Unit |
|-----|-------------|--------|------|
| 1. | Mass of the face | $m$ | kg |
| 2. | Mass of the flywheel | $m_w$ | kg |
| 3. | Distance between vertex and center of mass | $l$ | m |
| 4. | Gravitational acceleration | $g$ | m·s⁻² |
| 5. | Angular deviation of diagonal of the face from the upper equilibrium point | $\theta(t)$ | rad |
| 6. | Angular deviation of the flywheel from the diagonal of the face | $\theta_w(t)$ | rad |
| 7. | Frictional force of the cube | $F_{tc}$ | kg·m²·s⁻¹ |
| 8. | Frictional force of the flywheel | $F_{tw}$ | kg·m²·s⁻¹ |
| 9. | Moment of inertia of the face | $I_{frame}$ | kg·m² |
| 10. | Moment of inertia of the flywheel | $I_w$ | kg·m² |

2.1. Model Derivation

The model was derived using the Euler-Lagrange method [10] with the following generalized coordinates: $q_1 = \theta$, $q_2 = \theta_w$. The Lagrangian $L$ is defined as the difference of the kinetic energy $T(\dot{q}_i)$ and the potential energy $V(q_i)$ of the system in the defined coordinates:

$$L = T(\dot{q}_i) - V(q_i) \tag{1}$$

To solve the Lagrangian, the Euler-Lagrange equation is defined, from which the equations of motion are derived:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} - \frac{\partial L}{\partial q_k} + \frac{\partial R}{\partial \dot{q}_k} = \tau_k \tag{2}$$

where $\frac{\partial R}{\partial \dot{q}_k}$ models dissipative forces and $\tau_k$ models external torques applied to the system.

The potential energy of the system can be represented as:

$$V = m_{tot} \cdot g \cdot l \cdot \cos\theta \tag{3}$$

where $m_{tot} = m + m_w$.

Whereas kinetic energy is defined as the sum of kinetic energies of flywheel and face frame:

$$T = \frac{1}{2} \cdot I_{frame} \cdot \dot{\theta}^2 + \frac{1}{2} \cdot I_w \cdot \left(\dot{\theta}_w + \dot{\theta}\right)^2 \tag{4}$$

Thus, using (3) and (4), the Lagrangian of the system can be defined as:

$$L = \frac{1}{2} \cdot I_{frame} \cdot \dot{\theta}^2 + \frac{1}{2} \cdot I_w \cdot \left(\dot{\theta}_w + \dot{\theta}\right)^2 - m_{tot} \cdot g \cdot l \cdot \cos\theta \tag{5}$$

Solution of the defined Lagrangian is found using Euler-Lagrange equations:

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) = \left(I_{frame} + I_w\right) \cdot \ddot{\theta} + I_w \cdot \ddot{\theta}_w \tag{6}$$

$$\frac{\partial L}{\partial \theta} = m_{tot} \cdot g \cdot l \cdot \sin\theta \tag{7}$$

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}_w}\right) = I_w \cdot \ddot{\theta}_w + I_w \cdot \ddot{\theta} \tag{8}$$

$$\frac{\partial L}{\partial \theta_w} = 0 \tag{9}$$

Dissipative forces are defined as the sum of kinetic energies produced by the friction of face and flywheel movement:

$$R = \frac{1}{2} \cdot F_{tc} \cdot \dot{\theta}^2 + \frac{1}{2} \cdot F_{tw} \cdot \dot{\theta}_w^2 \tag{10}$$

$$\frac{\partial R}{\partial \dot{\theta}} = F_{tc} \cdot \dot{\theta} \tag{11}$$

$$\frac{\partial R}{\partial \dot{\theta}_w} = F_{tw} \cdot \dot{\theta}_w \tag{12}$$

Then, adding external torque from the flywheel drive and disturbance torque on cube face, Euler-Lagrange equations are derived:

$$M_z = \left(I_{frame} + I_w\right) \cdot \ddot{\theta} + I_w \cdot \ddot{\theta}_w - m_{tot} \cdot g \cdot l \cdot \sin\theta + F_{tc} \cdot \dot{\theta} \tag{13}$$

$$T_{motor} = I_w \cdot \ddot{\theta}_w + I_w \cdot \ddot{\theta} + F_{tw} \cdot \dot{\theta}_w \tag{14}$$

Transferring the highest derivatives of angular positions on one side of the equations gives implicit equations of motion, describing dynamics of the modelled system:

$$
\begin{cases}
\left(I_{frame} + I_w\right) \cdot \ddot{\theta} = -I_w \cdot \ddot{\theta}_w + m_{tot} \cdot g \cdot l \cdot \sin\theta + M_z - F_{tc} \cdot \dot{\theta} \\
I_w \cdot \ddot{\theta}_w = -I_w \cdot \ddot{\theta} + T_{motor} - F_{tw} \cdot \dot{\theta}_w
\end{cases}
\tag{15}
$$
2.2. Implementation of Non-Linear Model

The model of the balancing cube is implemented in the Matlab environment, thus the model has to be explicit to avoid algebraic loops. After transformation, the explicit equations of motion are obtained:

$$
\begin{cases}
\ddot{\theta} = \dfrac{m_{tot} \cdot g \cdot l \cdot \sin\theta - T_{motor} + F_{tw} \cdot \dot{\theta}_w - F_{tc} \cdot \dot{\theta} + M_z}{I_{frame}} \\[2ex]
\ddot{\theta}_w = \dfrac{T_{motor} \cdot \left(I_{frame} + I_w\right) - F_{tw} \cdot \dot{\theta}_w \cdot \left(I_{frame} + I_w\right) - m_{tot} \cdot g \cdot l \cdot \sin\theta \cdot I_w + F_{tc} \cdot \dot{\theta} \cdot I_w - M_z \cdot I_w}{I_{frame} \cdot I_w}
\end{cases}
\tag{16}
$$

The model (16) is implemented in Matlab with numerical values of the parameters (see Table 2) to conduct a series of experiments and simulations leading to the synthesis of the control system.

Tab. 2. Values of the parameters used in simulation

| No. | Parameter | Value | Unit |
|-----|-----------|-------|------|
| 1. | $m_w$ | 0.2 | kg |
| 2. | $m$ | 0.4 | kg |
| 3. | $g$ | 9.81 | m·s⁻² |
| 4. | $l$ | 0.106 | m |
| 5. | $I_{frame}$ | 0.57 · 10⁻³ | kg·m² |
| 6. | $I_w$ | 3.34 · 10⁻³ | kg·m² |
| 7. | $F_{tc}$ | 0.15 · 10⁻³ | kg·m²·s⁻¹ |
| 8. | $F_{tw}$ | 0.5 · 10⁻³ | kg·m²·s⁻¹ |

A series of experiments were conducted to prove that the mathematical model of the cube acts similarly to the actual physical system.

Experiment 1. Impulse response
The first experiment checks the behavior of the model when the face of the cube resting in the unstable equilibrium point of the system ($\theta = 0$) is disturbed. The expected result would be exiting from the unstable point and, after the expiring of transitional states, resting in the stable equilibrium point, which is $\theta = \pm\pi$ and $\dot{\theta} = 0$. The results prove that the behavior of the mathematical model is similar to what is expected from the physical model (see Fig. 2).

Fig. 2. Results of simulation of disturbing the cube's frame resting at an unstable point of balance with the mathematical model

Experiment 2. Sinusoidal input
The second experiment checks if the input to the system is transferred linearly to the output. The base position of the face is in the lower equilibrium point ($\theta = \pi$). The result shows that the input is almost linearly transferred to the output of the system (see Fig. 3).

Fig. 3. Mathematical model response to a sinusoidal input.
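The explicit model (16) and the disturbance test of Experiment 1 can be sketched outside Matlab as well. The following is a minimal Python illustration, not the authors' implementation: the parameter values are taken from Table 2 as read from the source, the fourth state `theta_w` and the fixed-step Runge-Kutta integrator are choices made here for the sketch, and all function names are hypothetical.

```python
import math

# Parameter values as read from Table 2 (the symbol/value pairing is assumed).
M_W, M = 0.2, 0.4                  # flywheel and face mass [kg]
G, L = 9.81, 0.106                 # gravity [m/s^2], vertex-to-COM distance [m]
I_FRAME, I_W = 0.57e-3, 3.34e-3    # moments of inertia [kg m^2]
F_TC, F_TW = 0.15e-3, 0.5e-3       # friction coefficients [kg m^2/s]
M_TOT = M + M_W


def accelerations(theta, dtheta, dtheta_w, t_motor=0.0, m_z=0.0):
    """Explicit equations of motion (16): returns (ddtheta, ddtheta_w)."""
    gravity = M_TOT * G * L * math.sin(theta)
    ddtheta = (gravity - t_motor + F_TW * dtheta_w - F_TC * dtheta + m_z) / I_FRAME
    ddtheta_w = ((t_motor - F_TW * dtheta_w) * (I_FRAME + I_W)
                 - gravity * I_W + F_TC * dtheta * I_W - m_z * I_W) / (I_FRAME * I_W)
    return ddtheta, ddtheta_w


def derivative(state, t_motor=0.0, m_z=0.0):
    """Time derivative of the state (theta, dtheta, theta_w, dtheta_w)."""
    theta, dtheta, _theta_w, dtheta_w = state
    ddtheta, ddtheta_w = accelerations(theta, dtheta, dtheta_w, t_motor, m_z)
    return (dtheta, ddtheta, dtheta_w, ddtheta_w)


def rk4_step(state, dt, t_motor=0.0, m_z=0.0):
    """One classical fourth-order Runge-Kutta step with constant inputs."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = derivative(state, t_motor, m_z)
    k2 = derivative(shifted(k1, dt / 2), t_motor, m_z)
    k3 = derivative(shifted(k2, dt / 2), t_motor, m_z)
    k4 = derivative(shifted(k3, dt), t_motor, m_z)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(state, duration, dt=1e-3):
    """Integrate the unforced model for `duration` seconds."""
    for _ in range(int(round(duration / dt))):
        state = rk4_step(state, dt)
    return state
```

Calling `simulate((0.01, 0.0, 0.0, 0.0), duration=0.1)` starts the face slightly off the upper equilibrium; under the stated assumptions the angle grows, which is the qualitative behavior described for Experiment 1. Longer runs would be needed to observe settling at the lower equilibrium.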
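2.3. Derivation of the Linear Model for the Synthesis of Control System

The considered structure of the control system is LQR. To properly apply the structure, a linear state-space model of the system is needed. Hence, linearization of the non-linear model (16) is carried out. The non-linear state-space model is derived from the equations of motion and linearized around the unstable equilibrium point. Considering the state variables vector:

$$x = \left[x_1, x_2, x_3\right]^T = \left[\theta, \dot{\theta}, \dot{\theta}_w\right]^T \tag{17}$$

and using (16), state equations are derived:

$$
\begin{cases}
\dot{x}_1 = x_2 \\[1ex]
\dot{x}_2 = \dfrac{m_{tot} \cdot g \cdot l \cdot \sin x_1 - T_{motor} + F_{tw} \cdot x_3 - F_{tc} \cdot x_2 + M_z}{I_{frame}} \\[2ex]
\dot{x}_3 = \dfrac{T_{motor} \cdot \left(I_{frame} + I_w\right) - F_{tw} \cdot x_3 \cdot \left(I_{frame} + I_w\right) - m_{tot} \cdot g \cdot l \cdot \sin x_1 \cdot I_w + F_{tc} \cdot x_2 \cdot I_w - M_z \cdot I_w}{I_{frame} \cdot I_w}
\end{cases}
\tag{18}
$$

along with controlled outputs equations:

$$
\begin{cases}
y_1 = x_1 \\
y_2 = x_2 \\
y_3 = x_3
\end{cases}
\tag{19}
$$

Using (18) and (19), the non-linear state-space model can be described:

$$
\begin{cases}
\dot{x} = f\left(x(t), u(t)\right) \\
y = g\left(x(t)\right)
\end{cases}
\tag{20}
$$

Linearization is carried out around the unstable equilibrium point $x_u = \left[x_{u1}, x_{u2}, x_{u3}\right]^T = \left[0, 0, 0\right]^T$. State-space matrices for the considered system are specified as [11]:

State matrix A:

$$
A = \begin{bmatrix}
\left.\dfrac{\partial f_1}{\partial x_1}\right|_{x_u} & \left.\dfrac{\partial f_1}{\partial x_2}\right|_{x_u} & \left.\dfrac{\partial f_1}{\partial x_3}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial f_2}{\partial x_1}\right|_{x_u} & \left.\dfrac{\partial f_2}{\partial x_2}\right|_{x_u} & \left.\dfrac{\partial f_2}{\partial x_3}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial f_3}{\partial x_1}\right|_{x_u} & \left.\dfrac{\partial f_3}{\partial x_2}\right|_{x_u} & \left.\dfrac{\partial f_3}{\partial x_3}\right|_{x_u}
\end{bmatrix}
\tag{21}
$$

Control matrix B:

$$
B = \begin{bmatrix}
\left.\dfrac{\partial f_1}{\partial u}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial f_2}{\partial u}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial f_3}{\partial u}\right|_{x_u}
\end{bmatrix}
\tag{22}
$$

Output matrix C:

$$
C = \begin{bmatrix}
\left.\dfrac{\partial g_1}{\partial x_1}\right|_{x_u} & \left.\dfrac{\partial g_1}{\partial x_2}\right|_{x_u} & \left.\dfrac{\partial g_1}{\partial x_3}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial g_2}{\partial x_1}\right|_{x_u} & \left.\dfrac{\partial g_2}{\partial x_2}\right|_{x_u} & \left.\dfrac{\partial g_2}{\partial x_3}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial g_3}{\partial x_1}\right|_{x_u} & \left.\dfrac{\partial g_3}{\partial x_2}\right|_{x_u} & \left.\dfrac{\partial g_3}{\partial x_3}\right|_{x_u}
\end{bmatrix}
\tag{23}
$$

Transfer matrix D:

$$
D = \begin{bmatrix}
\left.\dfrac{\partial g_1}{\partial u}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial g_2}{\partial u}\right|_{x_u} \\[2ex]
\left.\dfrac{\partial g_3}{\partial u}\right|_{x_u}
\end{bmatrix}
\tag{24}
$$

Additionally, for small angles $\sin\theta \approx \theta$ as $\theta \to 0$; hence, using the aforementioned equations and (21)-(24), state-space matrices are derived:

$$
A = \begin{bmatrix}
0 & 1 & 0 \\[1ex]
\dfrac{m_{tot} \cdot g \cdot l}{I_{frame}} & \dfrac{-F_{tc}}{I_{frame}} & \dfrac{F_{tw}}{I_{frame}} \\[2ex]
\dfrac{-m_{tot} \cdot g \cdot l}{I_{frame}} & \dfrac{F_{tc}}{I_{frame}} & \dfrac{-F_{tw} \cdot \left(I_{frame} + I_w\right)}{I_{frame} \cdot I_w}
\end{bmatrix}
\tag{25}
$$

$$
B = \begin{bmatrix}
0 \\[1ex]
\dfrac{-K_m}{I_{frame}} \\[2ex]
\dfrac{K_m \cdot \left(I_{frame} + I_w\right)}{I_{frame} \cdot I_w}
\end{bmatrix}
\tag{26}
$$

$$
C = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
\tag{27}
$$

$$
D = \begin{bmatrix}
0 \\
0 \\
0
\end{bmatrix}
\tag{28}
$$

Using matrices (25)-(28), the linear state-space model can be described:

$$
\begin{cases}
\dot{x} = A \cdot x(t) + B \cdot u(t) \\
y = C \cdot x(t) + D \cdot u(t)
\end{cases}
\tag{29}
$$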
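One way to sanity-check the analytic matrices (25) and (26) is to compare them with a finite-difference Jacobian of the state equations (18) at the unstable equilibrium. The sketch below does this in plain Python. It assumes the Table 2 values as read from the source, zero disturbance torque, and a motor torque of the form T_motor = K_m · u with a placeholder gain K_m = 1; that relation and value are assumptions made here, since the source does not define K_m. Function names are hypothetical.

```python
import math

# Table 2 values (pairing assumed) and a hypothetical motor gain K_M,
# assuming the motor torque is T_motor = K_M * u.
M_TOT = 0.4 + 0.2
G, L = 9.81, 0.106
I_FRAME, I_W = 0.57e-3, 3.34e-3
F_TC, F_TW = 0.15e-3, 0.5e-3
K_M = 1.0


def f(x, u):
    """Non-linear state equations (18) with M_z = 0 and T_motor = K_M * u."""
    x1, x2, x3 = x
    t_motor = K_M * u
    gravity = M_TOT * G * L * math.sin(x1)
    dx2 = (gravity - t_motor + F_TW * x3 - F_TC * x2) / I_FRAME
    dx3 = ((t_motor - F_TW * x3) * (I_FRAME + I_W)
           - gravity * I_W + F_TC * x2 * I_W) / (I_FRAME * I_W)
    return [x2, dx2, dx3]


def analytic_matrices():
    """State matrix (25) and control matrix (26)."""
    a = M_TOT * G * L / I_FRAME
    A = [[0.0, 1.0, 0.0],
         [a, -F_TC / I_FRAME, F_TW / I_FRAME],
         [-a, F_TC / I_FRAME, -F_TW * (I_FRAME + I_W) / (I_FRAME * I_W)]]
    B = [0.0, -K_M / I_FRAME, K_M * (I_FRAME + I_W) / (I_FRAME * I_W)]
    return A, B


def numeric_jacobians(eps=1e-6):
    """Central-difference Jacobians of f at x_u = [0, 0, 0], u = 0."""
    x0, u0 = [0.0, 0.0, 0.0], 0.0
    A = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x0), list(x0)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp, u0), f(xm, u0)
        for i in range(3):
            A[i][j] = (fp[i] - fm[i]) / (2 * eps)
    fp, fm = f(x0, u0 + eps), f(x0, u0 - eps)
    B = [(fp[i] - fm[i]) / (2 * eps) for i in range(3)]
    return A, B
```

Comparing `analytic_matrices()` with `numeric_jacobians()` entry by entry should show agreement up to finite-difference error, which supports the signs and index placement in (25) and (26).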
3. Physical Model of a Balancing Cube
The considered system is a cube with a side length of 15 cm. Three faces of the cube are equipped with drive systems and flywheels which allow for controlling the movement of the cube in all axes (see Fig. 4). Faces and flywheels are cut from an aluminum plate with a thickness of 2 mm. Elements linking the faces are 3D printed.

Fig. 4. Visualization of the physical model of the cube

As actuators, three-phase brushless DC motors (BLDC) were selected due to their high precision in controlling the angular velocity and low motion resistance. The motors were equipped with Hall sensors which improved control of the angular velocity and allowed direct measurement of angular velocity.

For controlling the motion of the cube, the measurements of the angular velocity and position of the cube in three axes as well as the angular velocity of the flywheels are needed. The angular velocity of the flywheels is measured with the aforementioned Hall sensors. The system measures the value of the magnetic field going through each sensor; by measuring the time between extremes, it calculates the angular velocity of the drive. Angular velocity and position of the cube were measured with the MPU6050 module. It is a system equipped with an accelerometer and a gyroscope that can calculate the needed values with great precision.

The chosen control device was the STM32 microcontroller. For the implementation of the main control algorithm, a microcontroller of the STM32F4xx family was selected due to its high computing speed and 100 MHz clock speed. The microcontroller was used for gathering data from the MPU module and calculating control variables for each flywheel. After processing the measurement data, the microcontroller outputs the set velocities for the flywheels and transfers them to the system responsible for controlling the drives. Each drive is controlled by a system consisting of an STM32F103C8T6 microcontroller and a DRV8313 driver controller which supplies each phase of the drive. For linking all elements needed to control the drives, it was decided to create a custom PCB board (see Figs. 5-6), the prototype for which was designed in the Eagle environment.

Fig. 5. Project of PCB board

Fig. 6. Connection project in custom PCB board

4. Control System of a Balancing Cube

Controlling the movement of a balancing cube is achieved in two steps: getting the frame close to the upper equilibrium point (swing-up) and intercepting it with an LQR to control the frame around this point.

4.1. Swing-up

One of the ways to swing the frame up is to increase the total energy of the system to the point where it equals the potential energy of the system resting in the upper equilibrium. It can be achieved by imposing an external torque with the BLDC drive on the flywheel, thus increasing the kinetic energy of the system. Then, the rapid braking of the flywheel transfers the kinetic energy of the flywheel to the frame of the cube, which results in the cube "jumping up" close to the upper equilibrium point. The total energy of the cube is described as:

$$E_t = T + V \tag{30}$$

where: $E_t$ – total energy of the system, $T$ – kinetic energy, $V$ – potential energy.

In resting condition, it is assumed that total energy is $E_{ref} = 0$. In the upper equilibrium point, total energy is:

$$E_g = m_{tot} \cdot g \cdot \Delta h \tag{31}$$

where: $\Delta h$ – change of mass center height of the system (see Fig. 7).

Fig. 7. Change of height of the centre of mass of the cube

Hence, the energy needed for swinging up the cube equals:

$$E = E_g - E_{ref} = m_{tot} \cdot g \cdot \Delta h \tag{32}$$

where: $m_{tot}$ – total mass of the cube.

The flywheel has to be accelerated to a velocity that allows the system to gather the needed value of kinetic energy. The minimum angular velocity which meets this condition can be calculated by comparing the energy needed for swing-up with the energy of the flywheel:

$$m_{tot} \cdot g \cdot \Delta h = \frac{1}{2} \cdot I_w \cdot \dot{\theta}_w^2 \tag{33}$$

where: $I_w$ – moment of inertia of the flywheel, $\dot{\theta}_w$ – angular velocity of the flywheel.

After transformations, the minimal angular velocity is derived:

$$\dot{\theta}_w = \sqrt{\frac{2 \cdot m_{tot} \cdot g \cdot \Delta h}{I_w}} \tag{34}$$

A simulation in the Matlab environment was carried out in which the flywheel is accelerated for 5 seconds and then rapidly braked, transferring the gathered kinetic energy to the frame of the cube. The result shows that the aim of the swing-up action was achieved by bringing the frame near the upper equilibrium point (see Fig. 8).

Fig. 8. Results of the swing-up experiment for the mathematical model

4.2. Stabilization in the Upper Equilibrium Point

The selected structure of the control system for stabilization is LQR. This structure is linear, hence during the synthesis, the linear state-space model (25)-(29) was applied. The LQR structure and its principles are widely and well described, e.g., in [11], [12], [13], so they are omitted in this paper. Weight matrices Q and R were chosen iteratively, starting at the lowest weight and observing the system response in a series of simulations (see Section 5.3). Finally, the following weight matrices were chosen:

$$
Q = \begin{bmatrix}
100 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 10
\end{bmatrix}
\tag{35}
$$

$$R = 10 \tag{36}$$

The gain matrix K was calculated:

$$K = \begin{bmatrix} K_1 & K_2 & K_3 \end{bmatrix} = \begin{bmatrix} -112.2122 & -8.0649 & -0.3182 \end{bmatrix} \tag{37}$$

The simulation with matrices (35), (36) was carried out with the angular position of the cube at 0.25 rad (around 14°) from the upper equilibrium. Results show that the linear and non-linear systems behave almost identically (see Fig. 9). The designed control system achieves the aim, which is to keep the cube in the upper equilibrium point with good control quality.

Fig. 9. Results of the LQR control for the mathematical model

4.3. Switching Between Algorithms
To obtain a complete control system of the cube, both the swing-up algorithm and LQR control need to be used together. As the switching condition, after a series of simulations, the angular position of the face at 0.2 rad (around 11°) was selected. The switching algorithm was implemented as a sequential algorithm (see Figs. 10, 11). First, the swing-up is executed and then, if the conditions are met, the control signal switches to LQR stabilizing in upper equilibrium.
Fig. 10. Block Chart used for the switching algorithm

The block has 3 inputs: theta – the angular position of the face, swing – control signal from the swing-up regulator and LQR – control signal from the stabilization regulator. The block switches the active step of the sequence based on the angular position of the face – indicated by the conditions above the arrows between steps (see Fig. 11). The starting step is Swing_up which is indicated by the arrow with the dot. When one of the steps is active the block feeds the control signal swing or LQR to the output (control_var) to close the feedback loop.

Fig. 11. Implementation of the switching algorithm in the block

The simulation of the complete control system with the implemented switching algorithm (see Fig. 12) shows that it achieves the aim to swing up the face and stabilize it in upper equilibrium. To control the movement of the whole cube, the derived model should be implemented in the faces with flywheels.

Fig. 12. Results of the complete control system for the mathematical model
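The energy balance (33)-(34) and the sequential switching block can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the block implemented by the authors: the rest angle of π/4 used to obtain Δh is an assumption made here (the source only refers to Fig. 7), the switch is assumed to latch in the LQR step without a return transition, and the names `com_height_change`, `min_flywheel_speed` and `Switcher` are hypothetical.

```python
import math

M_TOT = 0.6          # total mass m + m_w [kg], values as read from Table 2
G = 9.81             # gravitational acceleration [m/s^2]
L = 0.106            # distance between vertex and center of mass [m]
I_W = 3.34e-3        # flywheel moment of inertia [kg m^2], as read from Table 2
SWITCH_ANGLE = 0.2   # switching condition from Section 4.3 [rad]


def com_height_change(rest_angle=math.pi / 4):
    """Delta h, ASSUMING the face rests tilted by rest_angle from the upright."""
    return L * (1.0 - math.cos(rest_angle))


def min_flywheel_speed(delta_h):
    """Minimum flywheel angular velocity from equation (34) [rad/s]."""
    return math.sqrt(2.0 * M_TOT * G * delta_h / I_W)


class Switcher:
    """Sequential two-step switch: starts in Swing_up, then latches in LQR.

    The one-way latch is an assumption; the conditions drawn in Fig. 11
    are not reproduced in the text.
    """

    def __init__(self):
        self.step = "Swing_up"

    def update(self, theta, swing, lqr):
        """Return the control signal of the active step (control_var)."""
        if self.step == "Swing_up" and abs(theta) <= SWITCH_ANGLE:
            self.step = "LQR"
        return swing if self.step == "Swing_up" else lqr
```

In use, `min_flywheel_speed(com_height_change())` gives the speed the flywheel must reach before braking, and `Switcher.update` is called once per control cycle with the current face angle and both candidate control signals.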
4.4. Implementation of the Control System

The main control algorithm was implemented in the STM32F411 microcontroller and the drive control in STM32F103 microcontrollers. For the configuration of the peripherals used in the project, the STM32CubeMX environment was applied (see Figs. 13-14).

Fig. 13. Configuration of peripheries in STM32F411

Fig. 14. Configuration of peripheries in STM32F103

4.4.1. Measurements

The implementation of the control system in a physical model can be divided into a few steps. The first step is measurement handling. The MPU6050 module communicates with the microcontroller using the I2C protocol as a slave device. This is a synchronous protocol with one data line. The communication is controlled by the microcontroller, which acts as a master device. The process is based on reading and writing the required registers of the MPU module. The protocol clock frequency was set to 400 kHz due to the necessity of handling a high number of measurements. The MPU module was configured as well. Temperature measurement was disabled to speed up the module. The clock speed was set to 1 kHz. The operating range was set to a minimum to achieve high precision of measurement. After the configuration of both the I2C protocol and the MPU module, the required measurements can be read from suitable registers by the microcontroller. The angular velocity in each axis is measured directly and the angular position is calculated based on the velocity and sample rate.

For measuring the angular velocity of the flywheels, the Hall sensors with which the drives are equipped were used. Generally, such sensors are integrated with a comparator and transistor, but the sensor used in the project was not. A custom PCB board (see Section 3) was used for proper connection. Measuring the angular velocity is based on handling external interrupts from the Hall sensors in the microcontroller. The algorithm counts the travelled impulses of the drive between the interrupts and, with a constant sample rate, calculates the angular velocity.

4.4.2. Control Loop

Implementation of the main control loop requires implementing the control law of the LQR in the microcontroller. New control variables for the drive are calculated in the loop with constant frequency using gain matrix K (see Subsection 4.2). The designed control system is identical for each of the three faces of the cube with the drive system.

4.4.3. Actuators

The control loop is implemented in the main microcontroller, but for controlling the actuators, the calculated values of the control variable are required to be transferred to the master driver microcontrollers. The Universal Asynchronous Receiver and Transmitter (UART) protocol with half-duplex was used. Each motor controller is connected to its master microcontroller and has isolated communication. For configuring the connection, the speed of transfer, length of the word, stop and parity bit are required to be set. To calculate the control variable for the driver controller, the actual angular velocity of the flywheels is needed. The drivers are controlled with pulse-width modulation (PWM) signals; the control is based on enabling suitable phases of the motor with the required frequency to achieve a set angular velocity. Calculations are performed in the master driver microcontroller.
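The measurement and control-loop steps of Subsections 4.4.1 and 4.4.2 can be summarized in a few lines. The sketch below is written in Python for readability, while the actual firmware runs on STM32 microcontrollers and is not shown in the source. It assumes the standard LQR convention u = -K·x with the gain (37), a loop period of 1 ms (an assumption suggested by the 1 kHz setting), and simple rectangular integration of the gyroscope rate; the function names are hypothetical.

```python
K = (-112.2122, -8.0649, -0.3182)   # gain matrix (37): K1, K2, K3
SAMPLE_PERIOD = 1e-3                # ASSUMED loop period [s] (1 kHz)


def integrate_angle(theta, gyro_rate, dt=SAMPLE_PERIOD):
    """Angular position from the measured angular velocity and sample period."""
    return theta + gyro_rate * dt


def lqr_control(theta, dtheta, dtheta_w, gain=K):
    """State-feedback law u = -K x for x = [theta, dtheta, dtheta_w]."""
    return -(gain[0] * theta + gain[1] * dtheta + gain[2] * dtheta_w)


def control_step(theta, gyro_rate, flywheel_rate):
    """One loop iteration: update the angle estimate, then compute u."""
    theta = integrate_angle(theta, gyro_rate)
    u = lqr_control(theta, gyro_rate, flywheel_rate)
    return theta, u
```

The same step would be repeated independently for each of the three faces, since the designed control system is identical for every face with a drive. Pure integration of the gyroscope rate drifts over time, so a practical implementation would likely combine it with accelerometer data; the source does not describe how this is handled.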
5. Verification Tests

To prove the functionality of the physical model of the cube with the implemented control system, a series of tests was conducted. The measurements of relevant values were performed in the STMStudio environment (see Fig. 15).

Fig. 15. View of the STMStudio environment

5.1. Implementation of the Control System

The tests consist of stabilization of the cube on the edge (one flywheel used) and on the vertex (all flywheels used). Results of the experiments show that the implemented control system allows the cube to stabilize in upper equilibrium and rejects small disturbances up to around 0.15 rad (around 8.5°) (see Figs. 16, 17).

Fig. 16. Stabilization of one face of the cube for the physical model

Fig. 17. Stabilization of the whole cube for the physical model

5.2. Comparison Between Physical and Mathematical Models

Control results of the tests on the physical model were compared to simulations with similar conditions (see Figs. 18, 19). The comparison shows that both models' behavior is alike. In the simulated model, the settling time is shorter due to achieving higher angular velocities. Other discrepancies are the result of inaccuracy of simulating the disturbance, slight differences in parameters, and difficulties in achieving the same experimental conditions for both models.

Fig. 18. Comparison of the mathematical and physical model – 1D

Fig. 19. Comparison of the mathematical and physical model – 3D

5.3. Analysis of LQR Weights Influence on Control Quality

In the LQR structure, the weight matrices Q and R are selected manually. Each value on the diagonal of the matrix is a weight in the penalty function for the corresponding state variable or control variable, so it is important to know the influence of each weight on the control quality. Here, it is determined by conducting a series of experiments (see Figs. 20-23).

Fig. 20. Influence of change in weight Q11 – responsible for the angular position of the face $\theta$

Fig. 21. Influence of change in weight Q22 – responsible for the angular velocity of the face $\dot{\theta}$

Fig. 22. Influence of change in weight Q33 – responsible for the angular velocity of the flywheel $\dot{\theta}_w$
Fig. 23. Influence of change in weight R – responsible for the control variable Tmotor.

The control results show that the influence of a change in the value of Q11 on control quality is almost negligible; an increase in its value results in a slight decrease of overshoot. Q11 has the greatest influence: increasing its value causes a great extension of settling time as well as an increase in control variable value. Q33 is indirectly linked with the flywheel motor and, when increased, the value of the control variable also increases rapidly. However, it causes the settling time to shorten. The weight R is linked to a penalty function on the usage of input resources, so increasing it results in a lower value of the control variable but at the cost of a longer settling time. Knowing the influence of each weight, the control system designer can select the values based on the requirements and limitations of the system.

6. Conclusion

This paper presented the modelling and construction of a balancing cube robot using flywheels for controlling its movement and keeping balance in the upper equilibrium point. The aim of the designed control system was to get the cube close to the upper equilibrium and to stabilize it at this point. Then, a series of tests on both the mathematical and physical models was conducted. The simulations have shown that the constructed model acts similarly to the mathematical one in terms of stabilizing the cube. Differences between the models are the result of inaccuracy of simulating the disturbance, slight differences in parameters and difficulties in achieving the same experimental conditions for both models. The selected control system structure was LQR and, despite the non-linear system, the control system achieves the requirement set for the system. The non-linearity of the system around the upper equilibrium was negligible, which is confirmed in the simulation. Because of technical problems, the braking system of the cube was unfinished and hence not used in the project. It is a subject of future work on the system, as is improving the control system performance. The constructed cube can also be applied as a benchmark for researching non-linear control systems, e.g., fuzzy logic.

AUTHORS

Adam Kowalczyk – Gdańsk University of Technology, Faculty of Electrical and Control Engineering, E-mail: Kowalczyk.adam.1996@wp.pl.
Robert Piotrowski* – Gdańsk University of Technology, Faculty of Electrical and Control Engineering, E-mail: robert.piotrowski@pg.edu.pl. *Corresponding author
VOLUME 16, N° 3 2022 Journal of Automation, Mobile Robotics and Intelligent Systems
Neurocontrolled Car Speed System
Submitted: 5th October 2021; accepted: 3rd May 2022
Markiyan Nakonechnyi, Orest Ivakhiv, Dariusz Świsulski
DOI: 10.14313/JAMRIS/3-2022/20
Abstract: The features of the synthesis of neural controllers for a car speed control system are considered in this article. The task of synthesis is to determine the weight coefficients of neural networks that implement proportional and proportional-integral-derivative control laws. The synthesis of controllers is based on an approach that uses an inverted model of the reference (standard). A model of the car speed control system has been developed with the use of permitting subsystems, by means of which the synthesized controller is connected under certain specified conditions. Using the MATLAB programming and mathematical modeling environment with the Simulink package, a structural scheme for controlling the speed of the car was constructed and simulated with the synthesized neural controllers.
Keywords: neural controller, PID-algorithm of control, dynamic object, neural networks, electric car, speed control
1. Introduction
In recent years, computerized devices and systems have been widely incorporated in various fields of human activity, especially in the automotive and avionics industries [1-12]. They are applied during the design, simulation, and testing processes [13-16] as well as during the ordinary operation of a particular product [17-23]. Controllers that use neural networks achieve effective speed control in both electric and traditional cars [24-30].
2. Mathematical Model of the Car Movement
The task in creating any automatic control system is to supplement the controlled object with external links that allow processes to proceed according to certain predefined criteria. The choice of these criteria is primarily determined by the purpose of the automatic control system: to ensure that, at any point in time, the output of the controlled object (the controlled value) is as close as possible to the specified value.
For the most part, automatic control systems consist of non-linear elements that are covered by complex feedback loops. The operation of such systems in the real world is affected by a variety of noises, interferences, and other disturbing factors, which significantly limit the use of modern and classical control theory in the construction of controllers [31-36]. In recent decades, control strategies have used theories based on the idea of system linearization, which does not fully reflect the physical properties of the system. In some cases, even when the dependencies between the inputs and outputs of the system are accurately reproduced, their use cannot provide adequate control of the system. Therefore, artificial neural networks are increasingly used in synthesizing control algorithms. This method considers the object features that the network must reproduce, and the network is trained on the input and output data that characterize the processes running in the object [37, 38].
In general, the neural controller is implemented into the automatic control system (Fig. 1). The main attention here is focused on choosing the proper controller type. The creation of the controller architecture itself is out of the scope of this research because this task has already been described in detail [37].
Let us assume that the architecture of the neural controller is known and that, during training, it is only necessary to determine its weighting factors. In this case, the neural controller complements the nonlinear object so that, for any valid input sequence $r_k$, the resulting system is as close as possible to the reference (ideally $y_{rk} = y_k$). Since the input and output signals must be known to train a neural network, a neural controller can be trained if the following are known:
1. the setting signal at the input of the neural controller (sequence $r_k$);
2. the feedback signal from the object output (sequence $y_k$);
3. the object's input and the corresponding sequence $y_k$, which is taken from its output.
Using the model of the inverted reference, based on the sequence $y_k$, it is possible to obtain an input sequence of the reference $r_k$, which, when fed to the controller input of the system, will cause its reaction $u_k$.
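The training idea described above can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the authors' implementation: the object is a hypothetical linear first-order system, the reference is a hypothetical first-order model that can be inverted in closed form, the "neural controller" is reduced to a single linear neuron with two weights, and ordinary least squares replaces network training. All names and numbers are introduced only for this example.

```python
import random


def fit_two_weights(rows, targets):
    """Least-squares fit of targets ~ w1*a + w2*b for rows of (a, b)."""
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * t for (a, _), t in zip(rows, targets))
    t2 = sum(b * t for (_, b), t in zip(rows, targets))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det


def train_controller(steps=200, seed=0):
    rng = random.Random(seed)
    # 1) Excite the object with a test input u_k and record its output y_k.
    #    Hypothetical object: y[k+1] = 0.9*y[k] + 0.1*u[k].
    u = [rng.uniform(-1.0, 1.0) for _ in range(steps)]
    y = [0.0]
    for k in range(steps):
        y.append(0.9 * y[k] + 0.1 * u[k])
    # 2) Inverted reference: the hypothetical reference model
    #    y[k+1] = 0.8*y[k] + 0.2*r[k] is solved for r[k], which gives the
    #    setpoint sequence that would have produced the recorded output.
    r = [(y[k + 1] - 0.8 * y[k]) / 0.2 for k in range(steps)]
    # 3) Fit the controller u_k = w_r*r_k + w_y*y_k to the triples (r_k, y_k, u_k).
    rows = [(r[k], y[k]) for k in range(steps)]
    return fit_two_weights(rows, u)


w_r, w_y = train_controller()
print(f"w_r = {w_r:.3f}, w_y = {w_y:.3f}")
```

With these weights the closed loop of the hypothetical object reproduces the hypothetical reference model, which is the property that the training scheme aims for; a real neural controller would replace the two-weight neuron and the least-squares step.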
2022 ® Nakonechnyi et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)
Fig. 1. Automatic control system with neural controller implementation
Fig. 2. The general scheme of the neural controller training, using the neural model of the inverted reference
When the reference is simple, this training scheme has several significant advantages. The process of building a model of the inverted reference is much simpler than the process of building a model of the inverted object. There are two ways to invert the reference in the classical sense: the first is to solve the reference equation for the input variable, and the second is to construct a model inversion using SIMULINK [39, 40]. One of the most effective ways to invert a reference is to build a neural network-based inverted model (Fig. 2).
Let us consider using a neural network controller to control the speed of a car [39]. As an example, consider the movement of a car on an inclined plane surface (Fig. 3). The main external forces applied to the car are the thrust force of the engine $F_e$ (or, when its value is negative, the braking force) transmitted through the wheels; the aerodynamic force due to the wind action $F_w$; and the projection of gravity on the longitudinal axis of the car $F_h$. The equation of the car's motion under Newton's second law can be written as follows:
$$m\ddot{x} = F_e - F_w - F_h, \tag{1}$$
where m is the mass of the car, x is the displacement, and Fe is the magnitude of the force for which the maximum and minimum values are given, i.e., the maximum thrust of the engine and the maximum braking force, respectively.
Fig. 3. Car position on an inclined plane surface
There are accepted limits of thrust change and car weight [39]:
$$-2000 \le F_e \le 2000 \quad \text{and} \quad m = 1000\ \text{kg}. \tag{2}$$
The aerodynamic force is directly proportional to the drag coefficient $C_D$, the frontal area of the car $A$, and the dynamic pressure $P$, which is determined as follows:
$$P = \frac{\rho V^2}{2}, \tag{3}$$
where $\rho$ is the air density, and $V$ is the speed, which includes the speed of the car and the speed of the wind $V_w$. Suppose that the following relation is given [39]:
$$\frac{C_D A \rho}{2} = 0.001, \tag{4}$$
and the wind speed is described by the following expression [39]:
$$V_w = 20\sin(0.01t). \tag{5}$$
Thus, the aerodynamic force is determined by the following equation:
$$F_w = 0.001\left(\dot{x} + 20\sin(0.01t)\right)^2. \tag{6}$$
Since the considered surface of the road is not horizontal, the angle between the longitudinal axis of the car and the horizontal plane surface is given by the following equation [39]:
$$\theta = 0.0093\sin(0.0001x). \tag{7}$$
The equation of the projected gravity force is written in the following form:
$$F_h = 30\sin(0.0001x). \tag{8}$$
As can be seen from equations (1)-(8), the car speed simulation system cannot be represented by a linear differential equation or a transfer function. The nonlinearity of the model eliminates the possibility of using classical methods of controller synthesis [35, 36]. That is why the synthesis of controllers is carried out using artificial neural network technology [38]. To ensure high performance using the principle of a variable structure [39], the implementation is carried out by a permitting subsystem. When operating a car speed control system, any controller used must fulfill the following requirements:
1. ensuring that the car reaches the set speed value without overshoot;
2. providing the specified system performance.
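Equations (1), (2), (6) and (8) are sufficient to simulate the car model numerically. The following Python sketch is only an illustration: the article uses a Simulink model, whereas here an explicit Euler scheme with a fixed step is assumed, the thrust is held constant, and the function names, step size and simulation time are hypothetical choices made for this example.

```python
import math

MASS = 1000.0      # car mass m in kg, from (2)
F_LIMIT = 2000.0   # thrust and braking limit, from (2)


def car_acceleration(t, x, v, f_e):
    """Acceleration from (1) with forces (6) and (8); v is the car speed dx/dt."""
    f_e = max(-F_LIMIT, min(F_LIMIT, f_e))               # saturation (2)
    f_w = 0.001 * (v + 20.0 * math.sin(0.01 * t)) ** 2   # aerodynamic force (6)
    f_h = 30.0 * math.sin(0.0001 * x)                    # gravity projection (8)
    return (f_e - f_w - f_h) / MASS


def simulate(f_e, t_end=60.0, dt=0.01):
    """Explicit Euler integration (an assumed, simple scheme) from rest at x = 0."""
    t, x, v = 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        a = car_acceleration(t, x, v, f_e)
        x += v * dt
        v += a * dt
        t += dt
    return x, v


x_end, v_end = simulate(f_e=1000.0)
print(f"after 60 s: x = {x_end:.1f} m, v = {v_end:.2f} m/s")
```

A controller, neural or classical, would replace the constant `f_e` with a value computed at every step from the set and actual speeds.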
The dynamics of the processes in the car speed control system depend both on the type and parameters of the selected controller and on the nonlinear model of the car. It is therefore necessary to determine how effectively the system controls the nonlinear object under various control laws [41]. To this end, proportional (P) and proportional-integral-derivative (PID) controllers based on neural networks are tested in the speed control system model. The prepared model of the car speed control system with neural P and PID controllers is presented in Fig. 4. The simulation results in Fig. 5 show the advantages of the control system using a PID controller (Fig. 5b) over the system based on a P controller (Fig. 5a): a shorter transition time and practically identical behavior of the set and controlled quantities in the transient mode. It should also be noted that the weight coefficients of the PID and P controllers, obtained during neural network training, ensure the absence of oscillations of the output value in both transient and steady-state modes. Thus, based on the simulation results, it can be concluded that, to provide a predetermined performance of the car speed control system using P and PID controllers, it is advisable to switch on the PID controller in the first stage of the control process and to use the P controller once the difference between the set and actual speeds becomes small. Today, variable structure systems are used to solve such problems [39]. To provide the necessary parameters of the control process, individual functional units are switched when the difference between the reference and the output value reaches a specified level.
In such systems, the controlled value is monitored during the operation of the system and, when it reaches a certain value, the corresponding control algorithm is switched on by way of logical blocks. The switching procedure is possible due to the modular structure of the system, which can be used as a subsystem within more complex systems. These systems are used in robotic complexes and in transport, as well as in controlling the operation of electric motors and generators. In such cases, considering the features of the object, the model of one object can be replaced with the model of any other object.
Fig. 4. Model of the car speed control system using P controller and PID controller based on neural networks
Fig. 5. Results of system modeling using neural PID controller (a) and P controller (b)
Let us consider the features needed to create a system with a variable structure to control the speed of the car [39]. A model of the car speed control system applying the principle of variable structure has been constructed (Fig. 6). The model consists of a speed setting unit, a car model, controllers based on neural networks, a subsystem for choosing the operating mode of the controllers, and output units (oscilloscopes and display). The system model uses the mode selection subsystem (Fig. 7). The set and actual values of the speed are fed to the inputs of the subsystem and their difference is determined; this difference is simultaneously fed to the inputs of the absolute value (modulus) block and the differentiator. The resulting absolute value is fed to one of the inputs of a relational operator “<=”, the second input of which is connected to the output of block C1. The output of the differentiator is fed, via an absolute value block, to the first input of a second relational operator “<=”, the second input of which is connected to the output of block C2.
The operation of the mode selection subsystem is as follows. If the absolute value of the speed error is greater than the threshold value set in block C1 and the rate of change of the error signal is greater than the threshold value set in block C2, then the PID controller is used. The PID controller cycle continues until the difference between the set and actual speeds reaches the value set in block C1 and the rate of change of the error signal is less than the value set in block C2. In all other cases, the P controller is used. The limits and thresholds were chosen without specific justification, only to demonstrate the operation of the system with variable structure.
Fig. 6. Model of the car speed control system using the principle of variable structure
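Fig. 7. The controller operation mode choosing subsystem
Relational operators “<=” determine the activation of a controller as follows:
PID controller, if
$$C_1 < \left|X_{set} - X_{real}\right| \quad \text{and} \quad C_2 < \left|\frac{d}{dt}\left(X_{set} - X_{real}\right)\right|, \tag{9}$$
P controller, if
$$C_1 > \left|X_{set} - X_{real}\right| \quad \text{and} \quad C_2 > \left|\frac{d}{dt}\left(X_{set} - X_{real}\right)\right|. \tag{10}$$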
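The switching rule can be written as a small function. The sketch below follows the verbal description of the mode selection subsystem (PID when both thresholds are exceeded, P in all other cases); the function name, its arguments and the numerical thresholds are hypothetical and serve only as an illustration, since the article implements this logic with Simulink blocks.

```python
def select_controller(x_set, x_real, error_rate, c1, c2):
    """Return "PID" when both thresholds are exceeded, otherwise "P".

    error_rate is the time derivative of the error x_set - x_real;
    c1 and c2 are the thresholds of blocks C1 and C2.
    """
    error = x_set - x_real
    if abs(error) > c1 and abs(error_rate) > c2:
        return "PID"
    return "P"


# Hypothetical thresholds and signal values, chosen only for illustration.
print(select_controller(x_set=20.0, x_real=5.0, error_rate=-1.5, c1=1.0, c2=0.1))
print(select_controller(x_set=20.0, x_real=19.8, error_rate=-0.01, c1=1.0, c2=0.1))
```

The first call corresponds to the start of a transient (large, fast-changing error) and the second to operation near the setpoint.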
The result of the model operation with this mode selection subsystem when using the synthesized neurocontrollers is shown in Fig. 8, and the graphs of the permission signals of the PID (b) and P (c) controller subsystems are shown in Fig. 9. The created mode selection subsystem accelerates the speed setting and provides the required static error value (Fig. 9a). Therefore, the synthesis of controllers based on artificial neural networks and the use of criteria for switching the controllers on and off are effective. It is advisable to use this type of subsystem in the model of the car speed control system using the principle of variable structure.
Fig. 8. Operational diagram of the car speed control system with the mode selection subsystem using synthesized neurocontrollers
Fig. 9. Graphs of error (a), switch permissions of PID controller (b) and P controller (c) subsystem signals
3. Conclusion
We have investigated, using the SIMULINK environment, a system with a variable structure that can be used to control the speed of a car described by a nonlinear differential equation. The structure of the system uses both proportional and proportional-integral-derivative controllers based on neural networks. In the simulation, the allowable difference between the reference and the output values (for example, of the car speed as well as the car acceleration) can be achieved. To provide the necessary parameters of the control process, separate functional units of the system are switched by means of permitting subsystems. This model can be used not only in transport applications but also in several other areas, especially in robotic complexes, as well as in controlling the operation of electric motors and generators. In those cases, the model of the car may be replaced by the model of any other object. The switching procedure is possible due to the modular structure of the control system, which can be used as a subsystem within more complex systems.
AUTHORS
Markiyan Nakonechnyi – Computerized Automatic Systems Department, Computer Technology, Automation and Metrology Institute, Lviv Polytechnic National University, 12 Bandera str., Lviv 79013, Ukraine, markian.v.nakonechnyi@lpnu.ua.
Orest Ivakhiv* – Intelligent Mechatronics and Robotics Department, Computer Technology, Automation and Metrology Institute, Lviv Polytechnic National University, 3 kn. Romana str., Lviv 79008, Ukraine, orest.v.ivakhiv@lpnu.ua.
Dariusz Świsulski – Faculty of Electrical and Control Engineering, Gdańsk University of Technology, Gdańsk 80-233, Poland, dariusz.swisulski@pg.edu.pl.
*Corresponding author
REFERENCES
[1] F. J. Maldonado, S. Oonk, T. Politopoulos, “Enhancing Vibration Analysis by Embedded Sensor Data Validation Technologies”, IEEE Instrumentation & Measurement Magazine, vol. 16, no. 4, August 2013, pp. 50-60, doi: 10.1109/MIM.2013.6572957.
[2] C. Teal, C. Satterlee, “Managed Aircraft Wiring Health Directly Relates to Improved Avionics Performance”, Proceedings of 19th Digital Avionics Systems Conference. (Cat. No.00CH37126), 7-13 October 2000, doi: 10.1109/DASC.2000.886926.
[3] L. Silver, D. W. Christenson, “Developing a Stable Architecture for Interfacing Aircraft to Commercial Personal Computers”, Proceedings AUTOTESTCON 2003. IEEE Systems Readiness Technology Conference, 22-25 September 2003, doi: 10.1109/AUTEST.2003.1243560.
[4] K. F. Roosendaal, D. W. Christenson, “Embedded Computer Software Loader/verifier Implementation Using a Hardware and Software Architecture Based upon Best Practices Derived from Multiple Spiral Developments and the Joint Technical Architecture”, Proceedings AUTOTESTCON 2003. IEEE Systems Readiness Technology Conference, 22-25 September 2003, doi: 10.1109/AUTEST.2003.1243622.
[5] A. N. Srivastava, R. W. Mah, C. Meyer, “Integrated Vehicle Health Management”, National Aeronautics and Space Administration, Aeronautics Research Mission Directorate, Aviation Safety Program, Technical Plan, Version 2.03, November 2009, pp. 1-73.
[6] F. Gustafsson, “Automotive Safety Systems, Replacing Costly Sensors with Software Algorithms”, IEEE Signal Processing Magazine, vol. 26, 2009, pp. 32-47.
[7] A. Leone, “Automotive Design: At the Beginning Only was Light”, IEEE Instrumentation & Measurement Magazine, vol. 22, iss. 1, February 2019, pp. 28-32, doi: 10.1109/MIM.2019.8633348.
[8] D. Serritiello, “Human-machine Interaction, Methods and International Standards”, IEEE Instrumentation & Measurement Magazine, vol. 22, iss. 1, February 2019, pp. 33-35, doi: 10.1109/MIM.2019.8633349.
[9] ISO 15006:2011, “Road Vehicles Ergonomics Aspects of Transport Information and Control Systems: Specifications for In-vehicle Auditory Presentation”, International Organization of Standardization, https://www.iso.org/standard/55322.html.
[10] A. M. Di Natale, “The Evolution of Passive Safety in I&M”, IEEE Instrumentation & Measurement Magazine, vol. 22, iss. 1, February 2019, pp. 5-10, doi: 10.1109/MIM.2019.8633324.
[11] S. Stahlschmidt, A. Gromer, M. Walz, “WorldSID 50th vs. ES-2. A Comparison Based on Simulations”, Proceedings LS-Dyna Forum, Bamberg 2010, pp. 13-32.
[12] M. Dudzik, “Methodology of calculating maximal possible acceleration limited by the adhesion condition for a traction vehicle on the example of the FLIRT ED 160 model produced by Stadler”, 2018 International Symposium on Electrical Machines (SME), Andrychów, Poland, 10-13 June 2018, doi: 10.1109/ISEM.2018.8442891.
[13] C. R. Parkey, D. B. Chester, M. T. Hunter, W. B. Mikhael, “Simulink modeling of analog to digital converters for post conversion correction development and evaluation”, IEEE 54-th International Midwest Symposium on Circuits and Systems (MWSCAS), 2011, pp. 1-4, 7-10, doi: 10.1109/MWSCAS.2011.6026634.
[14] D. A. Tagliente, C. Lyding, J. Zawislak, D. Marston, “Expanding Emulation from Test to Create Realistic Virtual Training Environments”, 2014 IEEE AUTOTEST, 15-18 September 2014, doi: 10.1109/AUTEST.2014.6935140.
[15] I. McGregor, “The Relationship between Simulation and Emulation”, Proceedings of 2002 Winter Simulation Conference, vol. 2, 8-11 December 2002, pp. 1683-1688, doi: 10.1109/WSC.2002.1166451.
[16] W. J. Headrick, G. Garcia, “Automated Configuration of Modern ATE”, IEEE Instrumentation & Measurement Magazine, vol. 21, iss. 4, August 2018, pp. 22-26, doi: 10.1109/MIM.2018.8423742.
[17] J. W. Sheppard, S. Strasser, “Multiple Fault Diagnosis Using Factored Evolutionary Algorithms”, IEEE Instrumentation & Measurement Magazine, vol. 21, iss. 4, August 2018, pp. 27-38, doi: 10.1109/MIM.2018.8423743.
[18] H. King, N. Fortier, J. W. Sheppard, “An AI-ESTATE Conformant Interface for Net-Centric Diagnostic and Prognostic Reasoning”, IEEE Instrumentation & Measurement Magazine, vol. 18, iss. 4, August 2015, pp. 18-24, doi: 10.1109/MIM.2015.7155768.
[19] IEEE Standard for Artificial Intelligence Exchange and Service Tie to All Test Environments (AI-ESTATE), IEEE Standard 1232, 2010, https://standards.ieee.org/standard/1232-2010.html.
[20] M. Galvani, “History and Future of Driver Assistance”, IEEE Instrumentation & Measurement Magazine, vol. 22, iss. 1, February 2019, pp. 11-16, doi: 10.1109/MIM.2019.8633345.
[21] A. Secondi, “Vehicle Suspension”, IEEE Instrumentation & Measurement Magazine, vol. 22, iss. 1, February 2019, pp. 19-27, doi: 10.1109/MIM.2019.8633347.
[22] T. D. Gillespie, Fundamentals of Vehicle Dynamics, Society of Automotive Engineers, Inc.: Warrendale, PA, USA, 1992.
[23] E. Helmers, P. Marx, “Electric cars: technical characteristics and environmental impacts”, Environmental Sciences: Europe 24, 14, 2012, doi: 10.1186/2190-4715-24-14.
[24] G. Zhou, “A Neural Network Approach to Fault Diagnosis for Power Systems”, Proceedings of TENCON ‘93. IEEE Region 10 International Conference on Computers, Communications and Automation, 19-21 October 1993, pp. 885-888, doi: 10.1109/TENCON.1993.320155.
[25] M. Nakonechnyi, O. Ivakhiv, T. Repetylo, I. Strepko, “Car speed control with neurocontroller”, Computer technology of printing, no. 33, 2015, pp. 18-27 (in Ukrainian).
[26] S. Omatu, B. M. Khalid, R. Yusof, Neuro-Control and its Applications. Advances in Industrial Control, Springer Verlag: New York, 1996.
[27] W. T. Miller, R. S. Sutton, P. J. Werbos, Neural Networks for Control. MIT Press: Cambridge, MA, 1990.
[28] M. Norgaard, O. Ravn, N. Poulsen, L. Hansen, Neural Networks for Modelling and Control of Dynamic Systems. Springer: London, 2000.
[29] Y. Hirnyak, O. Ivakhiv, M. Nakonechnyi, T. Repetylo, “Control System of Robot Movement”, IEEE 7th International Conference on Intelligent Data Acquisition and Advanced Computing Systems (IDAACS), 12-14 September 2013, pp. 334-337, doi: 10.1109/IDAACS.2013.6662700.
[30] M. Nakonechnyi, O. Ivakhiv, T. Repetylo, “Car Speed Control with Different Types of Controllers”, XI International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH), 2-6 September 2015, pp. 72-74.
[31] S. M. Shinners, Modern Control System Theory and Design. John Wiley and Sons, Inc., New York / Chichester / Brisbane / Toronto / Singapore, 1992.
[32] B. C. Kuo, Automatic Control Systems. Prentice Hall, New Jersey, 1996.
[33] C. L. Phillips, R. D. Harbor, Feedback Control Systems. Prentice Hall, Upper Saddle River, New Jersey, 2000.
[34] G. C. Goodwin, S. F. Graebe, M. E. Salgado, Control System Design. Prentice Hall, Upper Saddle River, New Jersey, 2001.
[35] M. H. Popovich, O. V. Kovalchuk, The theory of automatic control. Textbook. Lybid, Kyiv, 2007 (in Ukrainian).
[36] K. Ogata, Modern Control Engineering, Pearson, 2010.
[37] J. Su, M. Nakonechnyi, O. Ivakhiv, A. Sachenko, “Developing the Automatic Control System Based on Neural Controller”, Information Technology and Control, vol. 44, no. 3, 2015, pp. 262-270, doi: 10.5755/j01.itc.44.3.7717.
[38] M. Nakonechnyi, O. Ivakhiv, Y. Nakonechnyi, Neural Network Control Systems for Nonlinear Objects: Monograph. Raster - 7. Publishing House: Lviv, 2017 (in Ukrainian).
[39] J. B. Dabney, T. L. Harman, Mastering Simulink 4. Prentice Hall, Upper Saddle River, New Jersey, 2001.
[40] H. Demuth, M. Beale, Neural Network Toolbox for use with MATLAB. The Math-Works, Inc., Natick, 1992.
[41] M. Nakonechnyi, O. Ivakhiv, R. Velgan, M. Geraimchuk, O. Viter, “Investigation of the control law influence on the dynamic characteristics of vehicle movement control system model”, 21st International Conference on Research and Education in Mechatronics (REM), 9-11 December 2020, pp. 100-107, doi: 10.1109/REM49740.2020.9313908.
VOLUME 16, N° 3 2022 Journal of Automation, Mobile Robotics and Intelligent Systems
ROBUST H∞ FUZZY APPROACH DESIGN VIA TAKAGI-SUGENO DESCRIPTOR MODEL. APPLICATION FOR 2-DOF SERIAL MANIPULATOR TRACKING CONTROL
Submitted: 12th May 2022; accepted: 20th June 2022
Van Anh Nguyen Thi, Duc Binh Pham, Danh Huy Nguyen, Tung Lam Nguyen
DOI: 10.14313/JAMRIS/3-2022/21
Abstract: This paper focuses on trajectory tracking control for robot manipulators. While much research has been done on this issue, many aspects of this field have not been fully addressed. Here, we present a new solution that uses a feedforward controller to eliminate parametric uncertainties and unknown disturbances. The Takagi-Sugeno fuzzy descriptor system (TSFDS) is chosen to describe the dynamic characteristics of the robot. The combination of this fuzzy system and the robust H∞ performance makes the system almost isolated from external factors. Linear matrix inequalities based on Lyapunov stability theory are used for control design. Simulation results demonstrate the effectiveness of the proposed method.
Keywords: Tracking control, serial manipulator robot, fuzzy control, Takagi-Sugeno, Lyapunov stability
1. Introduction
The output regulation issue, known as one of the core problems in control theory, involves following specified tracking signals and rejecting undesired disturbances in a dynamical system's output while preserving closed-loop stability. Numerous research investigations have been devoted to tracking control using a fuzzy approach [1, 2]. Examples include works applied to spacecraft systems [3], robotic manipulator systems [4], stochastic synthetic biology systems [5], air-breathing hypersonic vehicles [6], servo motors [7], linear motor systems [8], tower cranes [9], nuclear reactors [10], memristive recurrent neural networks [11], electrically driven free-floating space manipulators [12], and networked control systems [13]. Our goal is to propose a new design framework for robot tracking control that ensures trajectory tracking and disturbance rejection.
In an experimental environment, the influence of external disturbances on the system is inevitable. Researchers have studied this issue carefully, commonly using active disturbance rejection control (ADRC). Many applications of the ADRC controller and its variations have shown positive results. For instance, a linear ADRC has been applied to an electro-mechanical actuator (EMA) [14], an ADRC-based backstepping control to fractional-order systems [15], and a sliding mode ADRC to the trajectory tracking of a quadrotor UAV [16]. Another approach, the one used in this paper, is to use H∞ performance.
This paper addresses manipulator tracking control using the Takagi-Sugeno (T-S) fuzzy approach [17–19]. The T-S model in descriptor form, introduced in [18, 20] and used in many robotics applications [21], is adopted here. Although both the standard form and the descriptor form can describe the object model well, the latter has the advantage of reducing the complexity of the description equations. The trajectory tracking issue has been one of the foci of the control field for many decades, and intensive research on this topic has yielded productive results [1, 2, 13]. These works solve the tracking problem with H∞ tracking performance [22–24]. Nevertheless, it is not possible to ensure tracking and rejection specifications using H∞ performance in the case of nonlinear closed-loop systems [25]. As the objective is to ensure tracking and rejection specifications for a nonlinear system, we propose a new control structure that includes a feedback part and a feedforward part. The feedforward part plays a vital role in rejecting the reference input, and the feedback part maintains the stability of the closed-loop tracking error. If a suitable Lyapunov function is chosen, the closed-loop system can be proved to be stable, and the stability conditions can be rewritten as linear matrix inequalities (LMIs). The LMIs are built from the existing conditional equations, and they can be solved through some simple programming steps. Once the feedback gains are obtained from the LMIs, a parallel distributed compensation (PDC) controller [26], which is commonly used in T-S fuzzy systems, is generated as the feedback part of the control structure. Many studies use feedforward without achieving high efficiency in the tracking control problem. In the case without disturbance of the dynamical system, the robot tracking control is asymptotically stable.
In the case of any bounded disturbance, the tracking error can be reduced to a minimal value. This paper makes the following major contributions:
1) Descriptor equations are used to model the control object. When the descriptor model is used, the number of fuzzy rules is smaller than with the standard model.
2) New formulations are proposed to generate a control law that includes a feedforward part and a feedback part. Hence, the control signal not only stabilizes the system but also rejects the disturbance of the reference model.
3) Disturbance rejection for the fuzzy system is handled using H∞ performance.
The paper is organized as follows: Section 2 presents the T-S fuzzy descriptor system with external disturbance. The problem formulation and a new control approach are presented in Section 3. Section 4 deals with the linear matrix inequalities and the PDC controller. Simulation results obtained by applying the proposed control theory to the 2-DoF manipulator are shown in Section 5. Conclusions are given in Section 6.
2. Fuzzy system description
The T-S fuzzy descriptor system (TSFDS) can be derived as follows:
$$E_v \dot{x}(t) = A_h x(t) + B_h u(t) + D_h w(t) \tag{1}$$
$$y(t) = C_h x(t) \tag{2}$$
with
$$E_v = \sum_{k=1}^{l_{le}} v_k(z(t)) E_k \tag{3}$$
$$A_h = \sum_{i=1}^{l_{ri}} h_i(z(t)) A_i \tag{4}$$
$$B_h = \sum_{i=1}^{l_{ri}} h_i(z(t)) B_i \tag{5}$$
$$D_h = \sum_{i=1}^{l_{ri}} h_i(z(t)) D_i \tag{6}$$
$$C_h = \sum_{i=1}^{l_{ri}} h_i(z(t)) C_i \tag{7}$$
where in (1) and (2), $x(t)$ is the state vector, $u(t)$ is the control input, $y(t)$ is the system output, and $w(t)$ is the additional disturbance; $k \in \{1, 2, \ldots, l_{le}\}$ and $i \in \{1, 2, \ldots, l_{ri}\}$. The variable $z(t)$ is the premise vector, which consists of the premise variables. If $z(t)$ has $n$ elements, that is, there are $n$ premise variables, the system has $2^n$ fuzzy rules in total. $h_i(z(t))$ and $v_k(z(t))$ are the membership functions on the right-hand and left-hand sides of the above equations, respectively. Also note that the number of fuzzy rules on the right-hand side of (1) is $l_{ri}$ and on the left-hand side is $l_{le}$. The membership functions can be calculated as follows:
$$h_i(z(t)) = \prod_{j=1}^{\log_2(l_{ri})} w_j^{k_j}(z_j(t)) \tag{8}$$
$$v_k(z(t)) = \prod_{j=1}^{\log_2(l_{le})} w_j^{k_j}(z_j(t)) \tag{9}$$
Assumption 1. There exists a positive number $\vartheta$ that bounds the disturbance function $w(t)$:
$$\|w(t)\| \le \vartheta. \tag{10}$$
Remark 1. To reject the disturbance of the model $w(t)$, the H∞ performance inequality [27] is considered, which has the following form:
$$\int_0^\infty p(t)^T p(t)\,dt \le \zeta^2 \int_0^\infty w(t)^T w(t)\,dt \tag{11}$$
where $p(t)$ is the desired control signal and $\zeta$ is a real number.
The augmented form of the system in (1) can be inferred in the following equations:
$$E^\star \dot{x}^\star(t) = A^\star_{hv} x^\star(t) + B^\star_h u(t) + D^\star_h w(t) \tag{12}$$
$$y(t) = C^\star_h x^\star(t) \tag{13}$$
with
$$A^\star_{hv} = \sum_{i=1}^{l_{ri}} \sum_{k=1}^{l_{le}} h_i(z(t)) v_k(z(t)) A^\star_{ik} \tag{14}$$
$$B^\star_h = \sum_{i=1}^{l_{ri}} h_i(z(t)) B^\star_i \tag{15}$$
$$C^\star_h = \sum_{i=1}^{l_{ri}} h_i(z(t)) C^\star_i \tag{16}$$
$$D^\star_h = \sum_{i=1}^{l_{ri}} h_i(z(t)) D^\star_i \tag{17}$$
where
$$x^\star(t) = \begin{bmatrix} x(t) \\ \dot{x}(t) \end{bmatrix},\quad E^\star = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix},\quad A^\star_{ik} = \begin{bmatrix} 0 & I \\ A_i & -E_k \end{bmatrix},\quad B^\star_i = \begin{bmatrix} 0 \\ B_i \end{bmatrix},\quad D^\star_i = \begin{bmatrix} 0 \\ D_i \end{bmatrix},\quad C^\star_i = \begin{bmatrix} C_i & 0 \end{bmatrix}.$$
Remark 2. Although (1) and (12) are two similar TSFDSs, the conversion from (1) to (12) makes the process of calculating and proving the formulas below more convenient. Many studies have also applied this transformation [28, 29].
Assumption 2. The descriptor reference model can be presented in the following form:
$$E_r \dot{x}_r = A_r x_r + B_r r. \tag{18}$$
This model can be used as a sample trajectory for the tracking control problem. In the same way as for (12), the system (18) is rewritten as:
$$E^\star \dot{x}^\star_r(t) = A^\star_r x^\star_r(t) + B^\star_r r(t) \tag{19}$$
where
$$x^\star_r(t) = \begin{bmatrix} x_r(t) \\ \dot{x}_r(t) \end{bmatrix},\quad E^\star = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix},\quad A^\star_r = \begin{bmatrix} 0 & I \\ A_r & -E_r \end{bmatrix},\quad B^\star_r = \begin{bmatrix} 0 \\ B_r \end{bmatrix}.$$
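The product form of the membership functions in (8) and (9) can be illustrated with a short sketch. It assumes the common sector-nonlinearity weighting, in which each premise variable is bounded by a minimum and a maximum and receives two weights that sum to one. This weighting, the function name `membership_functions`, and the numerical bounds are illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def membership_functions(z, bounds):
    """Return the 2**n rule weights for n premise variables.

    Assumption (not from the paper): sector-nonlinearity weights
    w0 = (z_max - z) / (z_max - z_min) and w1 = 1 - w0.
    """
    pairs = []
    for z_j, (z_min, z_max) in zip(z, bounds):
        w0 = (z_max - z_j) / (z_max - z_min)
        pairs.append((w0, 1.0 - w0))

    weights = []
    # Each rule picks one of the two weights for every premise variable.
    for choice in product((0, 1), repeat=len(z)):
        h = 1.0
        for j, bit in enumerate(choice):
            h *= pairs[j][bit]
        weights.append(h)
    return weights


# Hypothetical example: two premise variables, hence 2**2 = 4 rules.
h = membership_functions([0.2, -0.5], [(-1.0, 1.0), (-1.0, 1.0)])
```

Under this assumption the weights are non-negative and sum to one, so the blended matrices in (3) to (7) are convex combinations of the vertex matrices.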
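3. Problem formulation and control
In this section, some transformations are carried out to generate new formulations and ensure reference tracking control for the system (1). From (12) and (19), we have:
$$E^\star \dot{e}^\star(t) = A^\star_{hv} e^\star(t) + B^\star_h u(t) + D^\star_h w(t) + (A^\star_{hv} - A^\star_r) x^\star_r(t) - B^\star_r r(t) \tag{20}$$
where $e^\star(t) = \begin{bmatrix} e(t) \\ \dot{e}(t) \end{bmatrix} = \begin{bmatrix} x(t) - x_r(t) \\ \dot{x}(t) - \dot{x}_r(t) \end{bmatrix}$, in which $e(t)$ is the tracking error that needs to converge to zero.
Remark 3. The control input $u(t)$ can be synthesized from two components, the feedforward signal $u_{ff}$ and the feedback signal $u_{fb}$, as shown in Fig. 1:
$$u(t) = u_{ff} + u_{fb}. \tag{21}$$
Fig. 1. The block diagram of fuzzy controller for the TSFDS
The main objective of the feedforward part is to reject the reference input, and the feedback part is used to maintain the closed-loop tracking error stability. From (20) and (21):
$$E^\star \dot{e}^\star = \begin{bmatrix} 0 & I \\ A_h & -E_v \end{bmatrix} e^\star + \begin{bmatrix} 0 \\ B_h \end{bmatrix} u_{ff} + \begin{bmatrix} 0 \\ B_h \end{bmatrix} u_{fb} + \begin{bmatrix} 0 \\ D_h \end{bmatrix} w + \begin{bmatrix} 0 & 0 \\ A_h - A_r & E_r - E_v \end{bmatrix} \begin{bmatrix} x_r \\ \dot{x}_r \end{bmatrix} - \begin{bmatrix} 0 \\ B_r \end{bmatrix} r. \tag{22}$$
Assumption 3. In order to have the reference input rejected, we assume that:
$$\begin{bmatrix} 0 \\ B_h \end{bmatrix} u_{ff} + \begin{bmatrix} 0 & 0 \\ A_h - A_r & E_r - E_v \end{bmatrix} \begin{bmatrix} x_r \\ \dot{x}_r \end{bmatrix} - \begin{bmatrix} 0 \\ B_r \end{bmatrix} r = 0. \tag{23}$$
Using the reference model (18), the feedforward part can now be inferred as follows:
$$u_{ff} = B_h^{-1} (E_v \dot{x}_r - A_h x_r). \tag{24}$$
Substituting (24) into (20):
$$E^\star \dot{e}^\star(t) = A^\star_{hv} e^\star(t) + B^\star_h u_{fb}(t) + D^\star_h w(t). \tag{25}$$
It is obvious that the system (25) has a form similar to the TSFDS in (1), so the following parallel distributed compensation (PDC) control law can be considered:
$$u_{fb}(t) = -F^\star_{hv} e^\star(t) \tag{26}$$
with:
$$F^\star_{hv} = \sum_{i=1}^{l_{ri}} \sum_{k=1}^{l_{le}} h_i(z(t)) v_k(z(t)) F^\star_{ik} \tag{27}$$
where the local control gains $F^\star_{ik} = \begin{bmatrix} F_{ik} & 0 \end{bmatrix}$ are to be designed.
The closed-loop descriptor T-S fuzzy system in extended form can be rewritten as:
$$E^\star \dot{e}^\star(t) = (A^\star_{hv} - B^\star_h F^\star_{hv}) e^\star(t) + D^\star_h w(t). \tag{28}$$
4. Main result
The stability of the system is an extremely important factor and must be satisfied in the design of the controller. For a fuzzy system, LMIs are an effective way to find the stability conditions. From there, the LMI-based control gains for the 2-DoF robot can be calculated.
Theorem 1. The closed-loop descriptor system in (28) with the feedback control signal $u_{fb}(t) = -F^\star_{hv} e^\star(t)$ is asymptotically stable if there exist matrices $P_3$, $P_4$, $M_{ik}$ of appropriate dimensions and a positive definite matrix $P_1$ such that:
$$\Psi_{iik} = \begin{bmatrix} \Xi_{ii} & \star & \star & \star & \star & \star \\ \Phi_{iik} & \Pi_{iik} & \star & \star & \star & \star \\ D_i^{T} & 0 & -\zeta^2 I & \star & \star & \star \\ 0 & 0 & 0 & -\zeta^2 I & \star & \star \\ P_1 & 0 & 0 & 0 & -I & \star \\ 0 & P_4 & 0 & 0 & 0 & -I \end{bmatrix} < 0 \tag{29}$$
where $\Xi_{ii} = -P_3 - P_3^\top + 2\alpha P_1$, $\Phi_{iik} = A_i P_1 - B_i M_{ik} + E_k P_3 + P_4^\top$, and $\Pi_{iik} = -E_k P_4 - (E_k P_4)^\top$. Furthermore, the control gains of the PDC controller (26) can be computed as follows:
$$F_{ik} = M_{ik} P_1^{-1}. \tag{30}$$
Proof. Let $P = \begin{bmatrix} P_1 & 0 \\ -P_3 & P_4 \end{bmatrix}$.
Consider the following Lyapunov function candidate:
$$V(e^\star) = e^{\star\top} E^\star P^{-1} e^\star \tag{31}$$
Based on the defined matrices $E^\star$ and $P$, (31) has the following time derivative:
$$\dot{V}(e^\star) = \dot{e}^{\star\top} E^{\star\top} P^{-1} e^\star + e^{\star\top} (P^{-1})^\top E^\star \dot{e}^\star \tag{32}$$
Then
$$\begin{aligned} \dot{V}(e^\star) &= \left[ (A^\star_{hv} - B^\star_h F^\star_{hv}) e^\star + D^\star_h w \right]^\top P^{-1} e^\star + e^{\star\top} (P^{-1})^\top \left[ (A^\star_{hv} - B^\star_h F^\star_{hv}) e^\star + D^\star_h w \right] \\ &= e^{\star\top} \left[ (A^\star_{hv} - B^\star_h F^\star_{hv})^\top P^{-1} + (P^{-1})^\top (A^\star_{hv} - B^\star_h F^\star_{hv}) \right] e^\star + w^\top D^{\star\top}_h P^{-1} e^\star + e^{\star\top} (P^{-1})^\top D^\star_h w \end{aligned} \tag{33}$$
Since the control objective is for the tracking errors to converge to zero, the $p(t)$ term in (11) now equals $e^\star$. Then the stability condition of the closed-loop system (28) can be inferred as follows:
$$\dot{V}(e^\star) + e^{\star\top} e^\star - \zeta^2 w^\top w < -2\alpha V(e^\star) \tag{34}$$
Some simple computation leads to
$$\Psi_{iik} = \begin{bmatrix} \hat{\xi} & \star & \star \\ D^{\star\top}_i & -\zeta^2 I & 0 \\ C^\star_z P & 0 & -I \end{bmatrix} \tag{35}$$
where $C^\star_z = \begin{bmatrix} I & 0 & 0 & 0 \end{bmatrix}$, and note that:
$$\hat{\xi} = (A^\star_{hv} - B^\star_h F^\star_{hv})^\top P^{-1} + \alpha E^\star P^{-1} + (P^{-1})^\top (A^\star_{hv} - B^\star_h F^\star_{hv}) + \alpha (P^{-1})^\top E^\star < 0 \tag{36}$$
Consider the expression on the right-hand side of (36). Multiplying it on the left and right by $P^\top$ and $P$, respectively, we get:
$$P^\top (A^\star_{hv} - B^\star_h F^\star_{hv})^\top + \alpha P^\top E^\star + (A^\star_{hv} - B^\star_h F^\star_{hv}) P + \alpha E^\star P < 0 \tag{37}$$
then
$$A^\star_{hv} P - B^\star_h F^\star_{hv} P + \alpha E^\star P + P^\top A^{\star\top}_{hv} - P^\top F^{\star\top}_{hv} B^{\star\top}_h + \alpha P^\top E^\star < 0 \tag{38}$$
5. Illustrative Results and Discussions
In this paper, the H∞ performance and the new fuzzy control in the form of a T-S descriptor system are applied to the 2-DoF robot. The model of the robot and its parameters are taken from [30]. The dynamic equations of this manipulator were converted to the descriptor model form (1) with the following matrices:
$$E = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \alpha + 2 m_2 r_1 L_2 z_5(x) & \beta + m_2 L_1 r_2 z_5(x) \\ 0 & 0 & \beta + m_2 L_1 r_2 z_5(x) & \beta \end{bmatrix},\quad A = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ z_3(x) & z_4(x) & 2 z_1(x) - f_{v1} & z_1(x) \\ z_4(x) & z_4(x) & z_2(x) & -f_{v2} \end{bmatrix},$$
$$B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix},\quad C = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix},\quad D = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$
with $x = \begin{bmatrix} \theta_1 & \theta_2 & \dot{\theta}_1 & \dot{\theta}_2 \end{bmatrix}^\top$, $\alpha = m_1 r_1^2 + I_1 + m_2 L_1^2 + m_2 r_2^2 + I_2$, $\beta = m_2 r_2^2 + I_2$, $z_1(x) = m_2 L_1 r_2 \dot{\theta}_2 \sin\theta_2$, $z_2(x) = -m_2 L_1 r_2 \dot{\theta}_1 \sin\theta_2$, $z_3(x) = -(m_1 g r_1 + m_2 g L_1) \dfrac{\sin\theta_1}{\theta_1} - m_2 g r_2 \dfrac{\sin\theta_{12}}{\theta_{12}}$, $z_4(x) = -m_2 g r_2 \dfrac{\sin\theta_{12}}{\theta_{12}}$, $z_5(x) = \cos(\theta_2)$.
Matrices $E$ and $A$ contain 1 and 4 premise variables $z$, so they have 2 and 16 rules, respectively. For instance, the vertex matrices of $E$ can be written as follows:
$$E_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \alpha + 2 m_2 r_1 L_2 z_{5\max} & \beta + m_2 L_1 r_2 z_{5\max} \\ 0 & 0 & \beta + m_2 L_1 r_2 z_{5\max} & \beta \end{bmatrix},\quad E_2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \alpha + 2 m_2 r_1 L_2 z_{5\min} & \beta + m_2 L_1 r_2 z_{5\min} \\ 0 & 0 & \beta + m_2 L_1 r_2 z_{5\min} & \beta \end{bmatrix} \tag{39}$$
The remaining $A$ matrices are represented in the same way.
Fig. 2. Trajectory tracking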
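The two-rule representation of E in (39) can be checked with a short sketch. It assumes the usual sector-nonlinearity weights for z5 = cos(θ2) on the interval [−1, 1]. The parameter values and the names `e_matrix` and `blended_e` are hypothetical and chosen only for illustration; they are not the robot parameters of [30].

```python
import math

# Hypothetical parameter values, chosen only for illustration.
m2, r1, r2, L1, L2 = 1.0, 0.5, 0.5, 1.0, 1.0
alpha, beta = 2.0, 0.4


def e_matrix(z5):
    """Descriptor matrix E evaluated at a given value of z5 = cos(theta2)."""
    e33 = alpha + 2 * m2 * r1 * L2 * z5
    e34 = beta + m2 * L1 * r2 * z5
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, e33, e34],
        [0.0, 0.0, e34, beta],
    ]


# Assumed bounds of the premise variable z5 = cos(theta2).
z5_min, z5_max = -1.0, 1.0
E1, E2 = e_matrix(z5_max), e_matrix(z5_min)  # vertex matrices as in (39)


def blended_e(theta2):
    """Rebuild E(theta2) as v1 * E1 + v2 * E2 with v1 + v2 = 1."""
    z5 = math.cos(theta2)
    v1 = (z5 - z5_min) / (z5_max - z5_min)  # weight of the z5_max vertex
    v2 = 1.0 - v1
    return [
        [v1 * a + v2 * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(E1, E2)
    ]
```

Because E depends affinely on z5, the weighted sum of the two vertex matrices reproduces E exactly for any joint angle, which is why a single premise variable needs only two rules.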
The sample trajectories are designed so that the end effector moves sideways, then perpendicular, and finally into a circle. From Fig. 2, it is clear that the simulated trajectory coincides with the sample trajectory. Using inverse kinematics, it is possible to construct sample angular and velocity trajectories for the two joint angles. Simulation results in Figs. 3 and 4 show that, in order for the end effector to move along the set path, the two joints of the robot also change almost identically to the results calculated from the inverse kinematics. Figs. 5 and 6 clearly show the trajectory tracking ability of the system. The magnitude of the errors is also quantified through the root mean square error (RMSE). The errors of the variables are calculated in detail and presented in Tab. 1. Even with the effect of the disturbance, the errors in the angles of the two joints are tiny, at only 4.6657 × 10^−4 rad and 4.8859 × 10^−4 rad, respectively. The PDC controller does an excellent job of stabilizing the whole system, and the feedforward component is also effective in eliminating the reference disturbance.
Fig. 3. Angular position of Joint 1
Fig. 4. Angular position of Joint 2
Fig. 5. Tracking errors of angular position at Joint 1
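The inverse kinematics step mentioned above can be sketched for a generic planar two-link arm. The closed-form solution below is the standard textbook one and is not taken from the paper; the link lengths, the target point, and the function names are hypothetical.

```python
import math


def inverse_kinematics(x, y, l1, l2, negative_elbow=False):
    """Joint angles (theta1, theta2) of a planar two-link arm reaching (x, y)."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        raise ValueError("target is outside the reachable workspace")
    s2 = math.sqrt(1.0 - c2 * c2)
    if negative_elbow:
        s2 = -s2  # second solution with the opposite elbow configuration
    theta2 = math.atan2(s2, c2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
    return theta1, theta2


def forward_kinematics(theta1, theta2, l1, l2):
    """End-effector position for given joint angles, used as a consistency check."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


# Hypothetical link lengths and target point on the sample trajectory.
theta1, theta2 = inverse_kinematics(0.35, 0.5, 0.4, 0.4)
```

Applying `inverse_kinematics` to every sample point of the end-effector path gives reference joint angles; reference joint velocities could then be obtained by differentiating them in time.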
Error          RMSE
θ1 (rad)       4.6657 × 10^−4
θ2 (rad)       4.8859 × 10^−4
θ̇1 (rad/s)     0.0176
θ̇2 (rad/s)     0.0361
Tab. 1. Root Mean Square Errors
As seen in Figs. 7 and 8, the joint velocities also follow the calculated velocities. The deviation between the actual and the reference velocities is minimal and close to zero, see Figs. 9 and 10. Tab. 1 also gives the RMSE of the joint velocities: 0.0176 rad/s for the first joint and 0.0361 rad/s for the second. The torque oscillations of both joints in Fig. 11 are not too large, indicating that the system will not jerk during motion. This also suggests that the control method is suitable for the actual model and can be applied to the control of robotic mechanisms.
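For reference, the RMSE values reported in Tab. 1 follow the standard definition, which can be sketched as below. The function name `rmse` and the sample numbers are illustrative assumptions and are not the simulation data of the paper.

```python
import math


def rmse(actual, reference):
    """Root mean square error between two equally long sample sequences."""
    if len(actual) != len(reference) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - r) ** 2 for a, r in zip(actual, reference))
    return math.sqrt(squared / len(actual))


# Hypothetical joint-angle samples (rad), for illustration only.
error = rmse([0.1000, 0.2004, 0.2997], [0.1, 0.2, 0.3])
```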
Fig. 6. Tracking errors of angular position at Joint 2
Fig. 7. Velocity profile of Joint 1
Fig. 8. Velocity profile of Joint 2
Fig. 9. Tracking errors of velocity profile at Joint 1
Fig. 10. Tracking errors of velocity profile at Joint 2
Fig. 11. Torques at Joints 1 and 2
6. Conclusions
This study was conducted with the primary objective of trajectory tracking control for robot manipulators. Because of its advantages over the conventional T-S fuzzy system, the T-S fuzzy descriptor model was chosen to describe the dynamic behaviour of the control object. The article also considers external factors such as the disturbance of the model and uses H∞ performance to handle that problem. The novelty of this paper is the replacement of a single common controller with two separate controllers with different functions. The feedforward controller is intended to remove the influence of the reference model, and the feedback controller is used to stabilize the system. With a few simple transformations, the feedforward component can be easily deduced. Meanwhile, the feedback part, which is the PDC controller, is designed based on the Lyapunov and LMI stability conditions. Successful results were obtained by applying this control theory to the 2-DoF robot model in simulation. Not only does the trajectory of the end effector almost wholly coincide with the sample trajectory, but the positions and velocities of the joints also closely follow their references.
AUTHORS
Van Anh Nguyen Thi∗ – Hanoi University of Science and Technology, Dai Co Viet, Ha Noi, Viet Nam, e-mail: anh.nguyenthivan1@hust.edu.vn.
Duc Binh Pham – Hanoi University of Science and Technology, Dai Co Viet, Ha Noi, Viet Nam, e-mail: binh.pd181343@sis.hust.edu.vn.
Danh Huy Nguyen – Hanoi University of Science and Technology, Dai Co Viet, Ha Noi, Viet Nam, e-mail: huy.nguyendanh@hust.edu.vn.
Tung Lam Nguyen – Hanoi University of Science and Technology, Dai Co Viet, Ha Noi, Viet Nam, e-mail: lam.nguyentung@hust.edu.vn.
∗ Corresponding author
ACKNOWLEDGEMENTS
This research is funded by Hanoi University of Science and Technology (HUST) under project number T2021-TT-002.
REFERENCES
[1] X. Yuan, B. Chen, and C. Lin, “Fuzzy adaptive output-feedback tracking control for nonlinear strict-feedback systems in prescribed finite time,” Journal of the Franklin Institute, vol. 358, no. 15, pp. 7309–7332, 2021.
[2] D. Cui, Y. Wang, H. Su, Z. Xu, and H. Que, “Fuzzy-model-based tracking control of Markov jump nonlinear systems with incomplete mode information,” Journal of the Franklin Institute, vol. 358, no. 7, pp. 3633–3650, 2021.
[3] A. Li, M. Liu, and Y. Shi, “Adaptive sliding mode attitude tracking control for flexible spacecraft systems based on the Takagi-Sugeno fuzzy modelling method,” Acta Astronautica, vol. 175, pp. 570–581, 2020.
[4] X. Yin, L. Pan, and S. Cai, “Robust adaptive fuzzy sliding mode trajectory tracking control for serial robotic manipulators,” Robotics and Computer-Integrated Manufacturing, vol. 72, p. 101884, 2021.
[5] B.-S. Chen, C.-H. Chang, and H.-C. Lee, “Robust synthetic biology design: stochastic game theory approach,” Bioinformatics, vol. 25, no. 14, pp. 1822–1830, 2009.
[6] X. Cheng, P. Wang, and G. Tang, “Fuzzy-reconstruction-based robust tracking control of an air-breathing hypersonic vehicle,” Aerospace Science and Technology, vol. 86, pp. 694–703, 2019.
[7] Y. Liu, Z. Wang, Y. Wang, D. Wang, and J. Xu, “Cascade tracking control of servo motor with robust adaptive fuzzy compensation,” Information Sciences, vol. 569, pp. 450–468, 2021.
[8] G. Sun, Z. Ma, and J. Yu, “Discrete-time fractional order terminal sliding mode tracking control for linear motor,” IEEE Trans. Ind. Electron., vol. 65, no. 4, pp. 3386–3394, 2018.
[9] H. Ouyang, Z. Tian, L. Yu, and G. Zhang, “Adaptive tracking controller design for double-pendulum tower cranes,” Mechanism and Machine Theory, vol. 153, p. 103980, 2020.
[10] A. Aftab and X. Luan, “A fuzzy-PID series feedback self-tuned adaptive control of reactor power using nonlinear multipoint kinetic model under reference tracking and disturbance rejection,” Annals of Nuclear Energy, vol. 166, p. 108696, 2022.
[11] G. Bao, Z. Zeng, and Y. Shen, “Region stability analysis and tracking control of memristive recurrent neural network,” Neural Netw., vol. 98, pp. 51–58, 2018.
[12] L. Li, Z. Chen, and Y. Wang, “Robust task-space tracking for free-floating space manipulators by cerebellar model articulation controller,” Emerald Publishing Limited, vol. 39, pp. 26–33, 2019.
[13] Y. Pan and G.-H. Yang, “Event-based output tracking control for fuzzy networked control systems with network-induced delays,” Applied Mathematics and Computation, vol. 346, pp. 513–530, 2019.
[14] C. Liu, G. Luo, Z. Chen, W. Tu, and C. Qiu, “A linear ADRC-based robust high-dynamic double-loop servo system for aircraft electro-mechanical actuators,” Chinese Journal of Aeronautics, vol. 32, no. 9, pp. 2174–2187, 2019.
[15] F. Doostdar and H. Mojallali, “An ADRC-based backstepping control design for a class of fractional-order systems,” ISA Transactions, 2021.
[16] Y. Zhang, Z. Chen, M. Sun, and X. Zhang, “Trajectory tracking control of a quadrotor UAV based on sliding mode active disturbance rejection control,” Nonlinear Analysis: Modelling and Control, vol. 24, no. 4, pp. 545–560, 2019.
[17] J.-J. Yan, G.-H. Yang, and X.-J. Li, “Fault detection in finite frequency domain for T-S fuzzy systems with partly unmeasurable premise variables,” Fuzzy Sets and Systems, vol. 421, pp. 158–177, 2021.
[18] C. Han, G. Zhang, L. Wu, and Q. Zeng, “Sliding mode control of T-S fuzzy descriptor systems with time-delay,” Journal of the Franklin Institute, vol. 349, no. 4, pp. 1430–1444, 2012.
[19] W. Zheng, H. Wang, H. Wang, and S. Wen, “Stability analysis and dynamic output feedback controller design of T-S fuzzy systems with time-varying delays and external disturbances,” Journal of Computational and Applied Mathematics, vol. 358, pp. 111–135, 2019.
[20] J. Wang, S. Ma, and C. Zhang, “Finite-time H∞ control for T-S fuzzy descriptor semi-Markov jump systems via static output feedback,” Fuzzy Sets and Systems, vol. 365, pp. 60–80, 2019.
[21] H. Schulte and K. Guelton, “Descriptor modelling towards control of a two link pneumatic robot manipulator: A T-S multimodel approach,” Nonlinear Analysis: Hybrid Systems, vol. 3, no. 2, pp. 124–132, 2009.
[22] H.-N. Wu, Z.-P. Wang, and L. Guo, “Disturbance observer based reliable H∞ fuzzy attitude tracking control for Mars entry vehicles with actuator failures,” Aerospace Science and Technology, vol. 77, pp. 92–104, 2018.
[23] H.-N. Wu, S. Feng, Z.-Y. Liu, and L. Guo, “Disturbance observer based robust mixed H2/H∞ fuzzy tracking control for hypersonic vehicles,” Fuzzy Sets and Systems, vol. 306, pp. 118–136, 2017.
[24] J. Dong and S. Wang, “Robust H∞-tracking control design for T-S fuzzy systems with partly immeasurable premise variables,” Journal of the Franklin Institute, vol. 354, no. 10, pp. 3919–3944, 2017.
[25] G. Scorletti, V. Fromion, and S. De Hillerin, “Toward nonlinear tracking and rejection using LPV control,” IFAC-PapersOnLine, vol. 48, no. 26, pp. 13–18, 2015.
[26] R. Maiti, K. Das Sharma, and G. Sarkar, “Linear consequence-based fuzzy parallel distributed compensation TL1 adaptive controller for two link robot manipulator,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 10, pp. 3978–3990, 2019.
[27] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, “Linear matrix inequalities in system and control theory,” pp. 154–155, 1994.
[28] J. He, F. Xu, X. Wang, and B. Liang, “Admissibility analysis and robust stabilization via state feedback for uncertain T-S fuzzy descriptor systems,” pp. 1–8, 2020.
[29] J. Li, Q. Zhang, X.-G. Yan, and S. K. Spurgeon, “Observer-based fuzzy integral sliding mode control for nonlinear descriptor systems,” IEEE Transactions on Fuzzy Systems, vol. 26, no. 5, pp. 2818–2832, 2018.
[30] K. Lochan and B. K. Roy, “Control of two-link 2-DOF robot manipulator using fuzzy logic techniques: A review,” Proceedings of Fourth International Conference on Soft Computing for Problem Solving, vol. 335, pp. 499–511, 2014.
Integrated and Deep Learning–Based Social Surveillance System: a Novel Approach
Submitted: 6th April 2022; accepted: 3rd May 2022
Ratnesh Litoriya, Dev Ramchandani, Dhruvansh Moyal, Dhruv Bothra
DOI: 10.14313/JAMRIS/3-2022/22
Abstract: In industry and research, big data applications are gaining a lot of traction and space. Surveillance videos contribute significantly to big unlabelled data. The aim of visual surveillance is to understand and determine object behavior. It includes static and moving object detection, as well as video tracking to comprehend scene events. Object detection algorithms may be used to identify items in any video scene. Any video surveillance system faces a significant challenge in detecting moving objects and differentiating between objects with the same shapes or features. The primary goal of this work is to provide an integrated framework for a quick overview of video analysis, utilizing deep learning algorithms to detect suspicious activity. In larger applications, the detection method is used to determine the region where items are present and the form of the objects in each frame. This video analysis also aids in the attainment of security. Security may be characterized in a variety of ways, such as identifying theft or violation of COVID protocols. The obtained results are encouraging and superior to existing solutions, with 97% accuracy.
Keywords: Video Surveillance, object detection, object tracking, YOLO v4 algorithm, OpenCV
1. Introduction
In this day and age, people have begun to rely more on technologies that are called smart, meaning that they can operate and learn on their own at the command of humans and carry out the needed duties. We have achieved this new step with the assistance of computer vision and deep learning. In the planning, operation, and sustainability of contemporary industrial and urban areas, video surveillance is a major factor. The efficiency, safety, security, and optimality of the region, infrastructures, persons, operations, and activities are all aided by video surveillance [1]. Autonomous equipment, cyber-physical systems, and energy-efficient architectures are becoming more common in industrial settings. With the rising use of multi-level structures and greater traffic, pedestrian, and crowd movements, urban landscapes are becoming more densely inhabited. In both industrial
and urban contexts, this vertical and horizontal growth of asset and area utilization has led to considerable growth in the deployment of closed-circuit television (CCTV) camera systems to ensure the safety of humans and assets and the surveillance of activities. Analysis that was previously confined to textual data is now performed on photos. However, analyzing static images is no longer sufficient; we now need to examine videos using an approach similar to that used for still images. Producing a real-time output for a live video feed is a new difficulty in the industry. With the use of CCTV, one can watch an area 24/7, or the video may be retrieved when needed, provided it is kept in a secure location. It can be used to prevent crime and assist law enforcement in identifying and solving crimes. YOLO opens a new approach to detection with an extremely quick architecture and real-time image computation, complemented by the open-source computer vision (OpenCV) software, a powerful library of image processing tools. In our study, we have taken up that challenge and attempted to apply it as much as possible to the field of surveillance in order to tackle the challenges of the modern world. We look for abnormalities in real-time footage. In our research, we selected three real-world anomalies to identify: violations of social distance norms, face mask usage monitoring, and theft detection to determine if a valuable object is being taken in real time and from where in the frame. One of the main worries of today's citizens is security. In our system, we have combined three functionalities that have proven to be important. In order to address some problems of the new common world, we built a working prototype of a multi-purpose smart surveillance system that can detect multiple anomalies in real time and can be used anywhere, from a crowded public space to a quiet museum with a reasonable number of people in the room.
The proposed system is less complex and more accurate than the existing solutions. This article also includes a literature survey with information about past research in this field. The Methodology section covers the implementation phase of this research, with a diagrammatic description of the approach. The Result and Discussion section includes all the outcomes drawn from the research. Our conclusion includes a summary of the research made.
2022 ® Litoriya et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)
2. Literature Review
We conducted a literature search covering intelligent systems, related or preceding working prototypes, and algorithms for human and feature recognition. The papers mentioned below were very helpful in understanding the problems and their solutions in the context of the algorithms and the proposed system. These works appeared in publications from IEEE, Springer, Elsevier, Wiley, and Cambridge Press, to mention a few. Papers based on deep neural networks, computer networks, the YOLO algorithm, social distance identification, face mask detection, and video analysis were the focus of our search. In the field of video analysis and object detection, many researchers have done a substantial amount of work and shared their findings with the world. Some of these are discussed below. Redmon et al. [2] proposed a system using the YOLO algorithm. In earlier approaches, classifiers were repurposed to perform detection. Their design is very fast: at 45 frames per second, their YOLO model analyzes pictures in real time. With YOLO, you only need to look at an image once to predict what objects are present and where they are. YOLO is a simple and quick algorithm that uses complete images to train the detection process. It does not employ a complicated pipeline. It uses a neural network to predict detections on a new picture at test time, which allowed us to handle live streams. The model is simple to build and can be trained on complete images. YOLO is the fastest object classification detector in the literature, and it pushes the boundaries of real-time object detection. YOLO is also adaptable to new domains, making it excellent for applications that need quick and reliable object recognition. Suwarna Gothane [3] notes that humans can quickly detect and identify the items of interest when looking at photographs or movies.
Object detection training is the process of passing this knowledge to computers. The YOLO model is quite precise and can recognise the objects in a picture. YOLO takes an entirely new approach. Instead of focusing on individual regions, it employs a neural network to predict anchor boxes and their probabilities over the whole image [4]. YOLO uses a single deep neural network that divides the input image into grid cells. Unlike image classification or face detection, each grid cell in the YOLO algorithm has a related matrix in the output that indicates whether an object is found in that cell and the class of that object. Gupta and Devi [5] studied the YOLOv2 method for the identification of objects in pictures with geolocation and in video recordings. The primary goal of their study is to detect things in real time, that is, live identification, using a camera and video recordings. COCO, a dataset with 80 classes, was employed in this work [6]. Using the YOLOv2 model, it is simple to recognize items with grids and boundary prediction, and it also aids in predicting exceptionally small objects or objects that are
far away in the image. Darknet makes it simpler to recognize moving objects in video recordings, and it generates .avi files containing the detections. Jayashri et al. [7] suggest a real-time system for monitoring social distance and avoiding congestion by employing the concept of a critical social density. The work aims to provide advancements that protect persons and networks, and it has practical value under the circumstances of the COVID-19 pandemic. The pipeline is capable of detecting persons with, without, and incorrectly wearing face coverings with reasonable accuracy. Gupta et al. [8] propose a simpler way to accomplish this goal by using fundamental deep learning tools such as TensorFlow, Keras, and OpenCV. The suggested technology recognizes the face in the image or video stream and determines whether or not it is wearing a mask. As a surveillance task, it can recognize a face and a mask in motion. Optimal parameter settings need to be determined for the CNN model in order to identify the presence of masks accurately. Kakadiya et al. [9] suggested a system using deep learning to establish a smart camera that observes bank activities and can identify any suspicious conduct. Criminals can be followed based on mobility and weapon presence. The SmartCam immediately transmits a message to the safety committee if any suspicious weapons or actions are observed. The message specifies the sort of warning that has been issued, the type and number of weapons identified, and a web link to a live picture that may be viewed by security personnel. Patil et al. [10] note that theft is among the most widespread criminal activities and is on the rise. It has become one of the world's never-ending issues. Their study used a CNN, which provides an advantage over more standard algorithms such as RNN, SVM, and others.
This program recognizes emotional expressions with a higher percentage of accuracy. The Keras toolbox is used for this. They arrived at their final model after experimenting with various layer combinations and iterations. Chandan G. et al. [11] suggest an approach using the SSD method. In real-time applications, the SSD method is used to identify objects, and it has demonstrated results with a high level of confidence. The main goal of the SSD method is to recognize and track numerous objects in a real-time video stream. This model performed well on the objects it was trained on in terms of detection and tracking, and it may be used in certain circumstances to identify, track, and respond to specifically targeted objects in video surveillance. Kumar et al. [12] suggest that object detection algorithms such as You Only Look Once (YOLOv3 and YOLOv4) be used for traffic and surveillance applications. A neural network is made up of an input layer, at least one hidden layer, and an output layer. Multiple object detection in surveillance cameras is a difficult
task that is influenced by the density of items in the monitoring area or on the road, as well as timings. The multiple object detection technique implemented in this work is useful for traffic and various surveillance applications. The dataset is made up of pictures and videos with different levels of light. The system efficiently recognizes several items with high accuracy, according to the results. Bochkovskiy et al. [13] stated in their research that there are a slew of factors that are thought to increase the accuracy of convolutional neural networks (CNNs). Theoretical explanation of the conclusion, as well as experimental evaluation of combinations of such characteristics on large datasets, is required. Some characteristics, such as batch normalization and residual connections, are appropriate for most models, tasks, and datasets, while others are only suitable to particular models and issues, or only for small-scale datasets. Weighted-residual-connections (WRC), cross-stage-partial-connections (CSP), cross mini-batch normalization (CmBN), self-adversarial-training (SAT), and Mish-activation are all assumed to be universal properties. As can be seen from the preceding literature study, our fellow researchers saw the need and worked diligently to meet it. Some compared a variety of algorithms in order to arrive at a speedier answer, while others attempted novel techniques to get better results. By reading these, one can get a solid sense of how things work when it comes to recognizing abnormalities in real time and comprehending the obstacles that come with it. Despite having amazing methodologies, observations, and findings, reading the above work reveals that none of our predecessors worked with or attempted to merge multiple-use cases to
Fig. 1. Block diagram of methodology
create a technology that would be multifunctional and adaptable to the demands of the modern world. In our research, we attempted to achieve exactly that, creating a multifunctional smart CCTV to meet today’s demands by analyzing past practices, and developing our own practices that work well together to become an effective method with real-time output.
3. Methodology
The proposed system is built using Python 3, OpenCV, and the Flask framework for Python. Using OpenCV's machine learning methods, we can train the machine to distinguish between the users' diverse use cases and the undesired conduct of unauthorized offenders, allowing us to take appropriate action based on the context. Using image processing strategies and mathematical deductions, we are able to recognize and classify the events that occur. Additionally, the system is capable of taking action in response to the current occurrence [14]. The primary goal of this system is to analyze collected video footage for human detection and then further analyze it for any anomalies. An anomaly can be anything the user sets the software to detect. For this research, a breach of social distancing rules, a person not wearing a mask, or the detection of theft in video footage are all considered anomalies. The procedure begins by scanning each frame of a video stream one by one. The entire sequence of actions is depicted in the block diagram in Fig. 1. The object detection framework is the most essential aspect of this research, because a core component of the study is establishing a person's position in the input frame. As a result, selecting the most appropriate object detection model is critical in order to prevent any issues with recognizing people [15].
Monitoring: As soon as the system is turned on, it begins scanning its surroundings for any movement that could occur in the scene under consideration. The primary goal of motion analysis is to avoid storing large amounts of redundant footage. Recording begins as soon as the camera detects any unknown entity approaching the target.
Masking Frame: Masking is an image processing technique in which a small image fragment is defined and used to affect a bigger image. Setting some of the pixel values of an image to zero, or to another background value, is known as masking [16]. This isolates the part of the picture that is of interest. A video is a collection of images that are played in a specific order over a period of time. In this study, the OpenCV masking approach is employed to build a region of interest (ROI) for each frame of the input video.
Motion Detection: The surveillance system keeps watching silently until it detects an unknown entity approaching, at which point the camera begins recording the scene in question, which includes a clear view of the item.
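The masking and motion detection steps described above can be sketched as follows. This is a minimal pure-Python illustration on nested lists that stand in for OpenCV image arrays; the function names `apply_mask` and `motion_detected`, the rectangular ROI format, and both thresholds are assumptions made for the example, not the authors' implementation.

```python
def apply_mask(frame, roi):
    """Zero every pixel outside the rectangular ROI (top, left, bottom, right)."""
    top, left, bottom, right = roi
    return [
        [pix if top <= r < bottom and left <= c < right else 0
         for c, pix in enumerate(row)]
        for r, row in enumerate(frame)
    ]


def motion_detected(prev, curr, pixel_delta=25, min_changed=2):
    """Report motion when enough pixels differ by more than pixel_delta."""
    changed = sum(
        1
        for row_p, row_c in zip(prev, curr)
        for p, c in zip(row_p, row_c)
        if abs(p - c) > pixel_delta
    )
    return changed >= min_changed


# A 4 x 6 greyscale background and a frame in which something enters the ROI.
background = [[10] * 6 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][2] = frame[1][3] = frame[2][2] = 200

roi = (0, 0, 4, 4)  # hypothetical ROI: all rows, first four columns
masked_bg = apply_mask(background, roi)
masked_frame = apply_mask(frame, roi)
print(motion_detected(masked_bg, masked_frame))
```

Changes that fall outside the ROI are zeroed in both frames before comparison, so they never trigger recording.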
3.1. Detecting Anomaly
3.1.1. Theft Detection
After motion is detected, we save the frame just before the motion and the next static frame after the motion. Both frames are then converted to greyscale and blurred so that they can be compared. The frames are converted to greyscale because less information is then required for each pixel. Converting to greyscale also separates luminance from chrominance, and luminance is more important for identifying visual features in a picture. The picture is blurred by convolving it with a low-pass filter kernel, which reduces noise. We use the averaging approach, which convolves the image with a normalized box filter: it simply replaces the central element with the average of all pixels under the kernel region. Then we calculate the image similarity score using the structural similarity function of skimage. The mean squared error (MSE) is a straightforward way to compare images, but it is not a good indicator of perceived similarity. By taking texture into consideration, structural similarity functions seek to remedy this problem. If the similarity score is above the desired threshold, we conclude that nothing has been stolen. If the similarity score is below the threshold, the two frames are structurally dissimilar, which suggests that something has been stolen or is missing. In this second case, we again use the greyscale images, apply thresholding to them, and draw a rectangular box where the dissimilarity exists.
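The comparison above can be sketched in a few lines. This is a simplified stand-in under stated assumptions: the box blur is written out by hand, and `global_ssim` evaluates the structural similarity formula once over the whole image, whereas skimage's function averages it over local windows. The names `box_blur`, `global_ssim`, `object_missing` and the 0.9 threshold are hypothetical.

```python
def box_blur(img, k=3):
    """Replace each pixel by the mean of its k x k neighbourhood (edges clamped)."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def global_ssim(a, b, dynamic_range=255):
    """Structural similarity computed over the whole image as a single window."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) * (x - mx) for x in xs) / n
    vy = sum((y - my) * (y - my) for y in ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2)
    )


def object_missing(before, after, threshold=0.9):
    """Flag a missing object when the blurred frames are structurally dissimilar."""
    score = global_ssim(box_blur(before), box_blur(after))
    return score < threshold, score


# Greyscale 8 x 8 scene: a bright 3 x 3 object on a dark table, then removed.
empty = [[50] * 8 for _ in range(8)]
with_object = [row[:] for row in empty]
for y in range(2, 5):
    for x in range(2, 5):
        with_object[y][x] = 220

missing, score = object_missing(with_object, empty)
unchanged, same_score = object_missing(with_object, with_object)
print(missing, unchanged)
```

In practice the threshold would be tuned for the camera and lighting, since noise alone lowers the score slightly.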
3.1.2. Face Mask Detection
The picture, or a frame of the video if the input is a video stream, is first delivered to the default face detection module for detection of human faces. This is accomplished by first enlarging the picture or video frame and then identifying the blob inside it [17]. The face detector model receives this blob and outputs just the cropped human face without the background. This face is used as input to the model that we previously trained, which determines whether or not a mask is present [18]. To implement the face mask detection function, we use convolutional neural networks to train our model, with one exception: we skip the convolution layer that produces the feature map and replace it with MobileNets. MobileNets are low-latency, low-power models that have been parameterized to match the resource restrictions of various use cases. After converting the input picture to an array, we pass it to MobileNets. We then perform max pooling, a pooling operation that takes the maximum value of each patch of a feature map and uses those values to produce a down-sampled (pooled) feature map. Next, we flatten the result and feed it to a fully connected layer that generates the output. We chose MobileNets because they have been shown to be faster than traditional convolutional neural networks in terms of processing speed and parameter use. Although MobileNets are a good option, they have their own drawbacks. They are occasionally less accurate, but for our purpose of creating a model for real-time use, they have proven more effective because we preferred speed over a minor gain in accuracy.
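The pooling and flattening steps can be illustrated on a tiny feature map. The sketch below is plain Python rather than the Keras layers a real model would use, and the helper names `max_pool` and `flatten` as well as the sample values are assumptions for the example.

```python
def max_pool(fmap, size=2):
    """Down-sample a 2-D feature map by taking the maximum of each size x size patch."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[y + dy][x + dx] for dy in range(size) for dx in range(size))
            for x in range(0, w - size + 1, size)
        ]
        for y in range(0, h - size + 1, size)
    ]


def flatten(fmap):
    """Turn the pooled 2-D map into the 1-D vector fed to a fully connected layer."""
    return [value for row in fmap for value in row]


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
pooled = max_pool(feature_map)  # [[6, 5], [7, 9]]
vector = flatten(pooled)        # [6, 5, 7, 9]
print(pooled, vector)
```

Each 2 x 2 patch collapses to its largest activation, so the 4 x 4 map becomes a 2 x 2 map and then a 4-element vector.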
3.1.3. Social Distancing Detection
To detect humans in the frame in this module, we employed the YOLOv4 method, an object detection system that is a development of the YOLOv3 model. It is twice as fast as EfficientDet with comparable performance, so it is a good fit for our task of providing real-time output. YOLO is an acronym for "You Only Look Once." Because of its simplified construction, it operates much faster than R-CNN. Unlike Faster R-CNN, it is trained to perform classification and bounding-box regression at the same time. We therefore use it to find persons in our model. After detecting individuals, we save all instances of the output in a set and bound each person in a rectangular box so that the centroid of each person can be determined later. The centroid of each person detected in the frame is found using the formula below, where $(x_1, y_1)$ and $(x_2, y_2)$ are opposite corners of the bounding box.
Centroid of rectangle:
$$\left(x_{\mathrm{center}},\; y_{\mathrm{center}}\right) = \left(\frac{x_1 + x_2}{2},\; \frac{y_1 + y_2}{2}\right)$$
After finding the centroids, we compute the pairwise Euclidean distances between all detected people.
Euclidean distance:
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
where,
• $(x_1, y_1)$ are the coordinates of one point.
• $(x_2, y_2)$ are the coordinates of the other point.
• $d$ is the distance between $(x_1, y_1)$ and $(x_2, y_2)$.
Based on these measurements, we determine whether any two persons are fewer than N pixels apart, where N is the accepted threshold pixel distance. The minimum pixel distance varies depending on the camera's height and angle. The social distancing protocol is broken if the distance between two centroids is smaller than the threshold distance; otherwise it is being followed.
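The centroid and distance check can be sketched as follows. The bounding boxes are assumed to arrive as `(x1, y1, x2, y2)` corner tuples from the detector; the function names and the 50-pixel threshold are illustrative assumptions rather than values taken from the study.

```python
from itertools import combinations
from math import hypot


def centroid(box):
    """Centre of a bounding box given as (x1, y1, x2, y2) corner coordinates."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def find_violations(boxes, min_pixels):
    """Return index pairs of people whose centroids are closer than min_pixels."""
    centers = [centroid(b) for b in boxes]
    violations = []
    for (i, a), (j, b) in combinations(enumerate(centers), 2):
        if hypot(b[0] - a[0], b[1] - a[1]) < min_pixels:
            violations.append((i, j))
    return violations


# Three detected people: the first two stand close together, the third is far away.
boxes = [(0, 0, 20, 40), (30, 0, 50, 40), (200, 0, 220, 40)]
print(find_violations(boxes, min_pixels=50))  # [(0, 1)]
```

Every pair is compared once, so the cost grows quadratically with the number of people in the frame, which is acceptable for the crowd sizes reported later in the results.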
4. Results and Discussion
4.1. Theft Detection
This method is used to find whether an object has been moved from its place. If the object is moved or is in motion, the next static frame after the motion is taken to check whether the object has moved. If the object is missing from the static frame after the motion, a box is drawn on the initial frame taken for comparison. The procedure is illustrated in Figures 2 to 4. Figure 2 shows an object resting on the table with no motion detected at this time. However, when the object is moved, as shown in Figure 3, motion is detected, and our system provides us with information about the object being moved as well as where the object was moved from, as shown in Figure 4.
Fig. 4. Missing object highlighted in green boundary and grey-scale image of missing object
As mentioned in the literature review, Patil et al. [10] proposed a solution to the problem of theft detection in which they drew a link between an object and its owner and flagged a theft when the length of that link increased. Even though it is a reliable and effective method, it does not identify the position of the missing object, as our proposed approach does. Also, the previous solution was designed with the goal of detecting theft at airports or similar locations, whereas our proposed solution is better suited to places like bank lockers and museums, where objects of interest are at rest and untouched for the majority of the time, which makes it easier to detect motion and missing objects. In our theft detection function, motion was identified 100 percent of the time in decent lighting, but only 91 percent of the time in low or dim illumination. Each time an object was taken or moved in the frame, it was detected with 100 percent accuracy and the precise location of the missing object was noted. Although the software's recognition of the object's shape was not flawless, it indicated the shape of the missing object accurately more often than not.
4.2 Face Mask Detection
Fig. 2. Frame with object
In Figs. 5 and 6, a bounding box is drawn around the ROI to indicate whether the person is wearing a mask or not. A green bounding box indicates that the person is wearing a mask, and a red bounding box indicates a person without
Fig. 3. Frame without object
Fig. 5. Person with mask on
mask. Fig. 5 shows that when a person wears a mask correctly, our system predicts that the person is wearing the mask with an accuracy of 99.88 percent, and when the person is not wearing a mask, it predicts this with 100 percent accuracy, as shown in Fig. 6. This method works on a group of people in the frame as well, as we can see in Figs. 7 and 8. An ROI is created around each face and bounding boxes are assigned accordingly. The green box is assigned only if a person is wearing the mask correctly. In all other cases the red bounding box is assigned. Fig. 7 shows that when a person is not wearing a mask properly, it
Fig. 6. Person without mask
is accurately recognized as such with a high degree of accuracy, whereas Fig. 8 shows that our system can recognize faces from a fair distance and correctly classify several persons as wearing masks or not. In contrast to S. Gupta et al.'s work, which used computer vision to detect face masks, we chose to use deep neural networks because a deeper network can learn a more complex, non-linear function, which improves performance. This allows the networks to discriminate between different classes more easily if they have enough training data. We used MobileNets because, compared with a network of the same depth built from regular convolutions, they significantly reduce the number of parameters. As a result, lightweight deep neural networks are created. A depthwise separable convolution is built from two operations. We present the results of our trials in a table of observations (Table 1), which we used to calculate the accuracy of our system. When a human face was at an acceptable distance from the camera, the face mask detection algorithm was able to detect the face with an accuracy of 97 percent and determine whether the person was wearing a mask or not based on the features that could be seen. Over repeated tests, the accuracy of the mask/no-mask classification ranged from 92 percent to 100 percent. The optimal setting in which to run the system was bright light, which produced the best accuracy of up to 100 percent; when the light was dim, the accuracy dropped and usually stayed between 86 percent and 95 percent. Overall, the technology correctly predicted whether or not people were wearing masks 96 percent of the time.
Fig. 7. Two people with and without mask
Fig. 8. Group of people with and without masks
Tab. 1. Outcomes of face mask detection method

| Sr. no. | People in frame | Lighting conditions | Faces detected | Correct predictions |
|---|---|---|---|---|
| 1 | 1 | Standard lighting | 1 | 1 |
| 2 | 1 | Dim lighting | 1 | 1 |
| 3 | 2 | Standard lighting | 2 | 2 |
| 4 | 2 | Dim lighting | 2 | 2 |
| 5 | 3 | Standard lighting | 3 | 3 |
| 6 | 3 | Dim lighting | 3 | 3 |
| 7 | 5 | Standard lighting | 5 | 5 |
| 8 | 5 | Dim lighting | 5 | 5 |
| 9 | 7 | Standard lighting | 7 | 7 |
| 10 | 7 | Dim lighting | 6 | 6 |
| 11 | 8 | Standard lighting | 8 | 8 |
| 12 | 8 | Dim lighting | 8 | 8 |
| 13 | 10 | Standard lighting | 10 | 10 |
| 14 | 10 | Dim lighting | 9 | 9 |
| 15 | 11 | Standard lighting | 11 | 11 |
| 16 | 11 | Dim lighting | 11 | 10 |
| 17 | 12 | Standard lighting | 12 | 12 |
| 18 | 12 | Dim lighting | 11 | 11 |
| 19 | 15 | Standard lighting | 15 | 15 |
| 20 | 15 | Dim lighting | 14 | 13 |
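The parameter saving attributed to MobileNets in this section comes from the depthwise separable convolution, whose two operations are a per-channel (depthwise) spatial filter followed by a 1 x 1 (pointwise) convolution that mixes channels. The sketch below counts weights for both designs, ignoring biases; the layer sizes are arbitrary illustrative values, not the configuration used in the study.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a regular convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


kernel, c_in, c_out = 3, 32, 64  # hypothetical layer sizes
standard = standard_conv_params(kernel, c_in, c_out)
separable = separable_conv_params(kernel, c_in, c_out)
print(standard, separable, round(separable / standard, 3))
```

For these sizes the separable layer needs 2,336 weights instead of 18,432, roughly one eighth, and the ratio equals 1/c_out + 1/kernel² in general.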
4.3. Social Distancing Detection
As seen in Fig. 9, our system was able to recognize human figures in video frames rather effectively, even when given a frame of crowded individuals, and to assess whether they were obeying or breaching social distancing regulations. The experiment was conducted with a camera positioned for a top-down perspective; the observations are listed in Table 2. Our system was able to recognize practically all human figures from a decent height, with 98 percent efficiency. Its efficiency drops by only 2 percent, to 96 percent, when the same camera is placed at the same angle but under different environmental conditions, in this case dim lighting. We also tested our surveillance system against shadows, which might look human-like from afar. Here our algorithm performed admirably, with 97 percent accuracy in distinguishing between a genuine human and a shadow. Our system can still detect humans effectively when the camera is not placed top-down but directly in front of them, as shown in Fig. 10. However, in that position it cannot estimate distance correctly because it does not account for depth, and thus gives poor predictions. As a result, it is recommended that the camera be placed high in order to obtain better and more precise findings. The rapid spread of coronavirus leaves the major population vulnerable to getting infected.
Fig. 9. Predicted simulated results of SSD Inception V2 for 3 classes of chest pain facial expression detection
The preventive health care team along with the technology specialists must remain vigilant and focus on strategic areas. Social distancing is a proven method used to control the spread of any contagious disease. As the name suggests, social distancing implies that people should physically distance themselves from one another, reducing close contact and thereby reducing the spread of a contagious disease. The presented research focuses on assuring safe distance among people while at the same time providing the ability to detect theft. This integrated solution works efficiently and accurately. This contribution utilizes advanced image processing concepts along with efficient computing algorithms to solve the issues of social distancing and theft detection and to prevent the spread of pandemics.
Fig. 10. Social distancing module used at a non-inclined angle
Tab. 2. Outcome for social distancing detection method

| Sr. no. | People in frame | Lighting conditions | Humans detected | Correct predictions |
|---|---|---|---|---|
| 1 | 10 | Standard lighting | 10 | 10 |
| 2 | 10 | Dim lighting | 10 | 10 |
| 3 | 12 | Standard lighting | 12 | 12 |
| 4 | 12 | Dim lighting | 12 | 12 |
| 5 | 18 | Standard lighting | 18 | 18 |
| 6 | 18 | Dim lighting | 18 | 18 |
| 7 | 22 | Standard lighting | 22 | 22 |
| 8 | 22 | Dim lighting | 22 | 21 |
| 9 | 29 | Standard lighting | 29 | 29 |
| 10 | 29 | Dim lighting | 29 | 29 |
| 11 | 32 | Standard lighting | 32 | 32 |
| 12 | 32 | Dim lighting | 31 | 30 |
| 13 | 35 | Standard lighting | 34 | 33 |
| 14 | 35 | Dim lighting | 33 | 33 |
| 15 | 40 | Standard lighting | 40 | 40 |
| 16 | 40 | Dim lighting | 37 | 37 |
| 17 | 44 | Standard lighting | 42 | 40 |
| 18 | 44 | Dim lighting | 43 | 43 |
| 19 | 50 | Standard lighting | 50 | 49 |
| 20 | 50 | Dim lighting | 48 | 48 |

5. Conclusion
Social distancing is one of the most significant precautions for avoiding the physical contact that could contribute to the spread of coronavirus. Noncompliance with these rules increases viral transmission rates. To implement the proposed features that are crucial to stopping the spread of coronavirus (social distancing and wearing face masks), a system was created using Python and the OpenCV library. In the first and second features, we also employed the YOLO method to recognize humans and classify faces, respectively. The current research examines whether or not people were wearing face masks. Real-time video streams and images were used to test the models. Optimizing the model is a continual process, and we are fine-tuning the hyperparameters to provide a very accurate answer. Because of its high precision and low error rate, the
proposed method can be easily implemented in real scenarios, such as schools, public places like airports, bus stations, banks, tourist attractions, museums, and many more. However, it does not work equally well for all camera angles or setups; a top-down camera angle is recommended. In this work, we have addressed the need for more intelligent surveillance technologies in public and private spaces, so that the persons monitoring those areas can identify abnormalities more quickly and accurately. We then presented a solution to the problem in the form of a functional prototype. For the sake of our research, we categorized actions such as violation of COVID protocols and theft as abnormalities, and we attempted to develop a multipurpose intelligent surveillance system that has several modes and can work in any of them depending on the user's requirements. In the future, we intend to test our concept in a variety of industries, including service, commerce, and security.
AUTHORS
Ratnesh Litoriya* – Computer Science Engineering Dept., Medi-Caps University, Indore, India, Email: litoriya.ratnesh@gmail.com.
Dhruvansh Moyal – Computer Science Engineering Dept., Medi-Caps University, Indore, India.
Dev Ramchandani – Computer Science Engineering Dept., Medi-Caps University, Indore, India.
Dhruv Bothra – Computer Science Engineering Dept., Medi-Caps University, Indore, India.
*Corresponding author

REFERENCES
[1] H. Liu, S. Chen and N. Kubota, "Intelligent Video System and Analytics: A Survey," IEEE Transactions on Industrial Informatics, vol. 9, no. 3, 2013, pp. 1222-1233.
[2] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779-788.
[3] D. S. Gothane, "A Practice for Object Detection Using YOLO Algorithm," International Journal of Science Research in Computer Science, Engineering and Information Technology, vol. 7, no. 2, 2021, pp. 268-272.
[4] B. Qiang, R. Chen, M. Zhou, Y. Pang, Y. Zhai and M. Yang, "Convolutional Neural Networks-Based Object Detection Algorithm by Jointing Semantic Segmentation for Images," Sensors, 2020.
[5] S. Gupta and D. T. U. Devi, "YOLOv2 Based Real Time Object Detection," International Journal of Computer Science Trends and Technology, vol. 8, no. 3, 2020.
[6] G. S., "Real-Time Object Detection with Yolo," International Journal of Engineering and Advanced Technology (IJEAT), 2019.
[7] T. K. M, V. P. M., Y. B., J. S. and L. Dr. K., "Video Analytics on Social Distancing and Detecting Mask - A detailed Analysis," International Journal of Advanced Engineering Research and Science (IJAERS), vol. 8, no. 5, 2021.
[8] S. Gupta, V. Dhok, A. Chandrayan and S. Tiwari, "Facemask Detection using OpenCv," International Journal of Advanced Research in Computer and Communication Engineering, vol. 10, no. 6, 2021.
[9] R. Kakadiya, R. Lemos, S. Mangalan, M. Pillai and S. Nikam, "AI Based Automatic Robbery/Theft Detection using Smart Surveillance in Banks," Proceedings of the Third International Conference on Electronics Communication and Aerospace Technology, 2019.
[10] S. Patil, M. Shidore, T. Prabhu, S. Yenare and V. Somkuwar, "Theft detection using computer vision," International Journal of Advance Research, Ideas and Innovations in Technology, vol. 5, no. 1, 2019, pp. 567-569.
[11] C. G, A. Jain, H. Jain and M. , "Real Time Object Detection and Tracking Using Deep Learning and OpenCV," Proceedings of the International Conference on Inventive Research in Computing Applications, 2018.
[12] C. Kumar B, P. R and Mohana, "YOLOv3 and YOLOv4: Multiple Object Detection for Surveillance Applications," Proceedings of the Third International Conference on Smart Systems and Inventive Technology, 2020.
[13] A. Bochkovskiy, C.-Y. Wang and H.-Y. Mark Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv:2004.10934v1 [cs.CV], 2020.
[14] P. K. Mishra and G. P. Saroha, "A Study on Video Surveillance System for Object Detection and Tracking," IEEE, 2016.
[15] A. Bari, S. Waseem, S. and S., "Social Distancing Through Image Processing, Video Analysis, and CNN," in International Conference on Computational Intelligence and Emerging Power System, 2022.
[16] A. H. Ahamad, N. Zaini and M. F. A. Latip, "Person Detection for Social Distancing and Safety Violation Alert based on Segmented ROI," IEEE International Conference on Control System, Computing and Engineering (ICCSCE2020), 2020.
[17] D. k. R. S. M. P. V. R. D. P. R. Harish Adusumalli, "If the input is a video stream, the picture or a frame of the video is initially delivered to the default face detection module for detection of human faces. This is accomplished by first enlarging the picture or video frame, then identifying the blob ins," Proceeding of the third international conference on intelligent communication technologies and virtual mobile networks, 2021.
[18] H. Adusumalli, D. Kalyani and R. K. Sri, "Face Mask Detection Using OpenCV," Proceedings of the Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV 2021), 2021.
Edge Artificial Intelligence-Based Facial Pain Recognition During Myocardial Infarction
Submitted: 23rd April 2021; accepted: 11th October 2021
Mohan H M, Shivaraj Kumara H C, Mallikarjun S H, Dr. Prasad A Y
DOI: 10.14313/JAMRIS/3-2022/23
Abstract: Medical history highlights that myocardial infarction (MI) is one of the leading causes of death in human beings. Angina pectoris is a prominent vital sign of myocardial infarction. Medical reports suggest that experiencing chest pain during heart attacks causes changes in facial muscles, resulting in variations in patterns of facial expression. This work intends to develop automatic facial expression detection to identify the severity of chest pain as a vital sign of MI, using an algorithmic approach implemented with a state-of-the-art convolutional neural network (CNN). Two advanced lightweight CNN object detection models, Single Shot Detector Mobile Net V2 and Single Shot Detector Inception V2, were used to design the MI vital-sign model from a private dataset of 500 RGB (red, green, blue) color images. The authors developed a cardiac emergency health monitoring system using Edge Artificial Intelligence ("Edge AI") on NVIDIA's Jetson Nano embedded GPU platform. The proposed model focuses mainly on low cost and low power consumption for onboard real-time detection of vital signs of myocardial infarction. The evaluated metrics achieve a mean Average Precision of 85.18%, an Average Recall of 88.32%, and 6.85 frames per second for the generated detections.
Keywords: Vital Signs, Myocardial Infarction, Facial Pain Expression, Computer Vision, Medical Assistance, Convolution Neural Network
1. Introduction
The primary cause of human death across the globe is heart disease, specifically ischemia, and angina pectoris (chest pain) is its most common symptom [1]. Pain is a distressing experience, with actual or potential tissue damage associated with sensory, emotional, cognitive, and social components [2]. Pain is a publicly displayed, visible event (usually demanding attention), and facial expressions allow the observer to respond appropriately. Characteristics may be discovered on the observational scale of pain (sharp, intense, or unusual) that may help to identify a warning signal of potential danger. The patterns in facial pain expression induce social responses such as care, empathy, and nursing [3, 4]. Without pain, the total lifespan of human beings would be reduced drastically [5]. Pain is a multi-dimensional representation that incorporates behavioral, physiological, sociocultural, cognitive, and affective characteristics [6].
Clinicians perform their initial investigation through the patient's self-report, taking into consideration aspects such as location, severity, sensory quality, temporal features, and factors that escalate or diminish pain. A conscious, verbal patient's self-report can be gathered in different modes, such as verbal communication, gestures, nodding the head in answer to a question, or writing. For a non-verbal patient, information collection can include searching for causes of pain, keen observation of patient behavior, a report from the patient's caretaker, or an analgesic trial. In a manual process, certain information may be missed in a short interval of time due to high patient density in hospitals, miscommunication, late detection, and human reading errors, leading to a faulty diagnosis. The diagnosis process therefore needs an accurate and automatic pain detection model that can eliminate all these human errors during the pain monitoring period [7].
Chest pain is an evident clinical attribute of myocardial ischemia during the suspected acute phase of myocardial infarction [8]. Both cardiac functional failure and pain fall under myocardial ischemia, caused by the imperative pumping of the heart against hypertensive pressure. It is almost certain that severe angina pectoris can occur in the presence of myocardial hypertrophy or pre-existing coronary artery disease [9]. One study tried to establish a relationship between MI pain duration and the mortality rate, and reported that the highest mortality rate existed among the patients who suffered the longest duration of pain [10]. An investigation by researchers found a common pattern in the facial expressions of patients with chest pain diagnosed with necrosis or cardiac ischemia [11].
Pain is an individual experience and proves to be a complex phenomenon for precise automatic measurement and effective medical diagnosis using facial expressions. In recent decades, researchers have shown keen interest in exploring the challenges and problems of facial expression recognition (FER) in growing medical application areas of research domains such as computer vision and artificial intelligence [12]. The existing clinical practices in diagnosing chest pain are time-consuming and involve
2022 ® Mohan et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)
costly procedures in the treatment process that need to be adapted. Researchers have focused on developing more effective methods for evaluating chest pain caused by myocardial infarction. This idea led us to begin developing an edge-based AI model that interprets the expressions of the face to evaluate the intensity of angina pectoris. Patients with severe MI complications who need immediate attention can then be called in for an emergency procedure and admitted to the hospital as early as possible. For more than a decade, the research community has been exploring integral solutions for remote monitoring of patients that generate reports for clinicians. The primary motivation is to address healthcare issues at all levels: pediatric care, disease monitoring, elderly supervision, emergency patient handling, fitness, and private health management. In recent years, efficient combinations of cloud computing and effective Internet of Things (IoT) architectures, along with algorithmic approaches of artificial intelligence (AI), have been exploited to develop product models for real-time smart health care applications. Data captured by embedded sensors, wearable devices, smartphones, and IoT devices may help to explore the habits and patterns of a person and be effectively utilized in the healthcare domain to solve existing problems through state-of-the-art AI-based approaches [13]. The concept of edge intelligence ("Edge AI") is to "provide AI for every person, anytime, anywhere at all concerns." Edge IoT devices were developed with built-in AI models to acquire the sensor data and decode its behavior in order to make accurate decisions and near-precise predictions. IoT devices with a cloud-based architecture have disadvantages, such as weak safety guarantees, high latency, and only soft real-time capability, which are critical for IoT healthcare applications.
However, patients in critical, time-bound emergency conditions need systems with high robustness, low latency, high bandwidth, and a high degree of reliability to avoid fatal consequences. Traditional cloud computing techniques based on IoT devices pose bigger challenges in health monitoring applications, such as bandwidth, reliability, latency, and privacy problems. In order to overcome these challenges, the concept of Edge AI has been introduced [14]. Presently, there are three main edge computing platforms: i) on-device computation, in which AI computations are done locally on the end device; ii) edge-server architecture, in which the edge server performs the computation after gathering data from the edge devices, called nodes; and iii) edge-cloud-based joint computation. Recently, considerable work has been carried out in the healthcare segment on edge platforms. Ghulam Muhammad et al. developed a voice disorder detection and classification system in
smart health frameworks [15]. The voice signals were acquired through IoT smart sensors, and processing and computing were done through edge computing and a cloud platform. J. Pena Queralta et al. proposed an LSTM recurrent neural network technique for a fall detection system, utilizing state-of-the-art Low Power Wide Area Network (LPWAN) technology to overcome the network limitations in Edge AI [16]. Xiangfeng Dai et al. presented a mobile health platform for skin cancer detection, developing an on-device inferencing platform that combines deep learning and mobile health technology into a single technique for cancer detection and classification [17]. In this work, we have implemented an automatic facial pain recognition system using an embedded edge GPU platform, the Jetson Nano, and evaluated the state-of-the-art lightweight CNN architectures SSD Inception V2 and SSD MobileNet V2 using the following performance metrics: precision, recall, and frames per second. The aim of this research work is to classify the severity of MI pain states, expressed as a vital sign in facial expressions, into normal, mild, and severe levels. Our major contributions in this research paper are: (i) to create a chest pain facial expression dataset following the benchmark metrics of the Facial Action Coding System (FACS), along with suitable annotation; (ii) to choose a suitable real-time detection model for embedded platforms, considering the DCNN models SSD MobileNet V2 and SSD Inception V2; and (iii) to perform model optimization and performance tests on the NVIDIA Jetson Nano, evaluating the inference speed of the real-time detection model on an embedded platform. This research work is organized as follows. Section 2 explores the recent research works carried out in automatic pain expression extraction from facial expressions. Section 3 gives an overview of our proposed work and the Facial Action Coding System, and describes our dataset and the performance metrics used for evaluation. Section 4 elaborates and discusses the obtained simulation results. Lastly, Section 5 presents the overall conclusion.
2. Related Work
Automatic facial expression analysis systems have been studied for three decades. Early attempts failed to detect spontaneous facial expressions or to perform in real-time environments due to a lack of powerful algorithms, quality datasets, and efficient hardware to process large datasets [18]. The recent revolution in computer vision techniques has made it possible to extract and analyze various health indicators from facial expressions, such as mental state, as well as physiological parameters like respiratory rate, blood pressure, and ECG signals. Automatic facial detection has high relevance and has attracted considerable interest in many applications like medical diagnosis, biometrics, forensics, defense, and surveillance. Automatic pain recognition requires at least one sensory input channel, called a modality, for extracting relevant information from the patient; the data is then processed further in an embedded device or computer. The modality could be a behavioral or physiological feature of the person under observation. Behavioral features are based on body movements (head movements or restlessness), facial expressions, paralinguistic vocalizations (moaning or crying), or speech. Physiological modalities might be electrodermal activity, cardiovascular activity signals (ECG), or brain signals (EEG). A recognition system can be unimodal or multimodal [19]. Table 1 summarizes the pain recognition modalities and machine learning approaches adopted by different authors. One of the main hurdles in automatic pain detection (APD) research was the lack of a widely accepted scientific dataset until 2009. When the publicly available UNBC-McMaster Shoulder Pain Expression Dataset was later released, interest in this field increased and noteworthy publications followed. However, compared to automatic facial expression recognition (AFER) research, APD-related works are few and still in their infancy [19]. Some of the benchmark datasets designed particularly for pain-related research, which have encouraged research in automatic pain detection and classification techniques, are listed in Table 2.
Tab. 1. Various pain recognition modalities and approaches

| Paper | Modality | Clinical Context | Age | Stimuli | Model | Samples |
|---|---|---|---|---|---|---|
| Adibuzzaman [20] | Smartphone (for facial expression) | Breast cancer | 35-48 | - | SVM-KNN | 454+513 |
| Ashouri [21] | Inertial sensor | Lower back pain | 20-50 | Trunk motion | SVM | 52 |
| Haque [22] | RGB, depth, thermal | - | 22-42 | Electrical | CNN+LSTM | 2k |
| Rivas [23] | Hand movement, finger pressure | Stroke patients | Adult | Rehabilitation exercises | Semi-naïve Bayesian classifier | 6K |
| Yang [24] | Physiological data | ICU | Adult | - | Boltzmann machine | 1K |

Tab. 2. Publicly available pain recognition databases

| Sl No | Database | Subjects | Stimuli |
|---|---|---|---|
| 1 | UNBC-McMaster shoulder pain [27] | Shoulder pain patients: 25 adults | 200 range of motion tests with affected and unaffected limbs |
| 2 | EmoPain [28] | 22 adults with chronic lower back pain aged 50, 28 healthy adults aged ~37 | Physical exercises (therapy scenarios) |
| 3 | BioVid Heat Pain [29] | 90 healthy adults aged between 18-29 | 14k heat pain; posed expression; emotion elicitation |
| 4 | IIIT-S ICSD [30] | 33 infants aged between 3-24 months | Immunization pain causes; non-painful cry causes |
| 5 | BP4D Spontaneous [31] | 41 healthy adults aged between 18-29 | 41 cold pressor tasks; emotion elicitation |
| 6 | Sense Emotion [32] | 45 healthy adults aged around 26 | 8k heat pain (3 intensities x 30 repetitions x 2 stimulus sites x 45 participants) |

The UNBC-McMaster database comprises 200 videos captured from 25 patients suffering from shoulder pain. The video frames were labeled using the gold standard for facial expressions, the Facial Action Coding System (FACS), and scored with the FACS-based pain metric introduced by Prkachin and Solomon [26]. Researchers have adopted various approaches to Automatic Pain Detection (APD), ranging from traditional supervised learning on handcrafted features to recent AI algorithmic approaches [25]. A few supervised learning techniques, such as linear regression, logistic regression, and decision trees, are the least often used learning methods. Other supervised machine learning techniques, like the Support Vector Regressor (SVR) and Multiple Kernel SVM, are widely used. Semi-supervised learning methods have not been used so far [19]. Apart from AI-based approaches, several traditional approaches have been proposed by researchers. These methods are developed as emergency-based models by combining several facial descriptors such as shape appearance, facial texture, geometry, etc. Yang et al. presented a novel approach to automatic pain assessment based on appearance-based facial descriptors, analyzing the role of spatio-temporal information. Two kinds of descriptors, namely spatial texture features and spatio-temporal features, are extracted from video frames and video sequences, respectively. The spatial descriptors extracted mainly consist of binarized statistical image features (BSIF), local binary patterns (LBP), and local phase quantization (LPQ), analyzed
from the videos utilizing Three Orthogonal Planes [33]. Juho Kannala et al. advocated an approach for constructing local image descriptors that encode texture information and are suitable for histogram-based representation of image regions. This method generates a binary code for each pixel, inspired by local binary patterns and local phase quantization, and improves overall performance compared to the LBP and LPQ techniques [34]. Almost all research approaches have focused on facial pain evaluation where the input signal is either images or video samples. Initially, preprocessing is applied to each input frame using normalization or localization techniques. In the next stage, characteristic features are extracted based on facial Action Units (AUs) or on fiducial points chosen from specific facial landmarks. Finally, different algorithmic approaches are utilized to optimize a specific inference model. Lately, especially this decade, the most widely used approaches have included Deep Neural Networks such as Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks, which have delivered highly accurate results. Ghazal Bargshady et al. implemented an enhanced deep neural network structure with four threshold levels for facial pain intensity detection. A pre-trained VGG model was adopted for feature extraction from the UNBC-McMaster Shoulder Pain Archive Database, and principal component analysis was applied for dimensionality reduction to improve efficiency. A hybrid CNN-BiLSTM deep learning model was incorporated for pain classification, achieving an accuracy of 90% and an AUC of 98.4% [35]. Jing Zhou et al. proposed a novel technique for real-time automatic frame-level pain intensity estimation using an RNN within a regression framework.
This work uses a sliding window technique to acquire fixed-length input samples for the RNN. The regression approach provides a continuous score for the pain classification problem instead of discrete labels [36]. Marco Bellantonio et al. highlighted three major factors in the implementation of automatic pain detection: spatial information, temporal information, and variation in face resolution during pain expressions in video frames. A fusion of deep learning networks (CNN and RNN) was used to extract the features of pain patterns from the UNBC-McMaster Shoulder Pain database, using a super-resolution algorithm to generate the video frames of facial expressions, through a downsampling process, at different resolution setups [37]. Paul et al. explored pain detection techniques to improve the medical diagnosis process with high accuracy and less computing time using the UNBC-McMaster Shoulder Pain database. The mechanism was adapted to improve metrics such as accuracy and AUC in subject-exclusive and non-exclusive settings, with a trained deep CNN model for estimating the percentage of pain
level using RNN. These results showed much better performance on this database than the analysis reported for the CK+ facial motion recognition database. The aligned-crop LSTM approach showed much better accuracy with an emotion classifier built on top of the CNN. The results also highlight the correct classification of frames with and without pain, with different facial gestures under various pain conditions for each subject, and report the resulting classification error [38]. Mohammad Tavakolian et al. encoded facial pain frames as a compact binary code for intensity level classification by dividing the facial videos into overlapping sequences. Features were extracted from the frames using a CNN and aggregated as low-level structural information and high-level patterns [39]. Patrick Thiam et al. explored the spatial and temporal features of pain facial expressions using attention networks, feeding sequences of Optical Flow Images (OFIs) and Motion History Images (MHIs) to an attention-based hybrid CNN and Bidirectional Long Short-Term Memory Recurrent Neural Network (BiLSTM RCNN) for the classification task. Performance analysis was carried out on the BioVid Heat Pain and SenseEmotion databases, achieving improved performance compared to state-of-the-art methods [40]. Xiaojing Xu et al. established a three-stage DNN approach to estimate the visual analog scale (VAS), a video-level measure of pain intensity. The three-stage model includes i) a VGG Face neural network model for predicting the frame-level PSPI, ii) a fully connected neural network for estimating the sequence-level pain score, and iii) a linear combination of multidimensional pain estimates to predict the VAS [41]. Frerk Saxen et al. adopted lightweight CNN architectures for automatic face attribute detection, with the aim of deploying the models on smartphones.
NasNet-Mobile and MobileNetV2 models were used for classifying the custom facial dataset, achieving better accuracy and speed, and easier implementation on mobile devices, than other state-of-the-art methods [42]. According to the literature survey, the majority of current research works on deep learning have adopted the Keras and TensorFlow libraries for facial condition diagnosis. There are powerful alternative libraries in the Python and C++ programming languages, such as Microsoft CNTK, Theano, Caffe, Torch, and scikit-learn, that can be adopted for facial pain analysis. In recent years, various types of DCNNs have been utilized in object detection architectures, dramatically increasing the performance of object detection algorithms. CNN-based object detectors have solved complex real-world problems in, e.g., medical imaging, autonomous navigation, video surveillance, and machine vision [43]. Jiaxing Li designed a facial recognition system using the Faster R-CNN object detection algorithm [44]. Facial image features were extracted using CNN layers and passed to the Region Proposal Network for generating region proposals, followed by a classification layer consisting of SoftMax and a regression layer. The Faster R-CNN network, applied to video data from the Chinese Linguistic Data Consortium (CLDC) dataset for facial expression classification, achieved a mean Average Precision (mAP) of 0.82 [44]. In spite of the significant progress in APD through facial expressions, more effort should be made to collect accurate pain databases and to improve modelling for effective use in real-time clinical practice. The literature survey also showed that little attention has been paid to using chest pain-related facial expressions for pain detection. In this research work, we utilize the Deep Convolutional Neural Network object detection networks SSD MobileNet V2 and SSD Inception V2 to obtain effective real-time performance on an embedded platform with limited computing resources. Lightweight, high-capacity CNN models have been adopted for feature selection, feature extraction, and effective transfer learning, thereby implementing an automatic pain recognition model using facial expression images. Recently, some researchers have worked on MI and cardiovascular diseases as medical emergency conditions, aiming to automatically detect early symptoms in humans and prevent mortality. A deep learning-based artificial intelligence algorithm (DLA) has been adopted to detect MI using six-lead electrocardiography. A variational autoencoder was developed using the TensorFlow library to enhance the performance of the DLA. The results highlight that MI can be detected with high accuracy even with a 6-lead ECG device [45]. Mandair et al. utilized logistic regression and DNN techniques for predicting MI from known risk factors. ML packages such as scikit-learn and Keras were utilized and implemented on the Google Cloud platform.
Compared to the DNN algorithm, the traditional method of logistic regression offered greater benefits in evaluating the disease factor from harmonized EHR data [46]. Kwon et al. proposed a novel approach to risk stratification for the mortality of patients with acute MI (AMI). The authors identified the potential limitations of traditional methods and employed a deep learning-based approach using a multilayer perceptron built with the TensorFlow library. The prediction performance of the deep learning model designed for AMI patient outcomes was excellent [47]. A wearable ECG MI classifier was developed using CNNs and recurrent neural networks with only a single-lead recording. A stacking decoding method was adopted to classify ECG signals as “MI,” “healthy,” “other,” or “noisy” and achieve superior performance [48]. Jyoti Metan et al. adapted an automatic detection technique based on a sandpiper-optimized CNN for detecting cardiovascular disease using cardiovascular magnetic resonance imaging [49].
Fig. 1. Generated default boxes from the SSD Model
3. Methodology
3.1. Overview of the Proposed Model
This work implements two high-performing deep learning CNN architectures merged into a single architectural model for efficient implementation on a resource-constrained embedded platform.
3.1.1. Single Shot Detector (SSD)
The Single Shot Detector (SSD) model is designed to perform localization and classification tasks simultaneously. The SSD architectural framework consists of two stages: a backbone structure and an SSD head. The first stage is the backbone structure, a pre-trained CNN network that acts as an image feature extractor. The backbone is pre-trained on a large-scale benchmark dataset such as COCO or ImageNet, which allows it to learn a rich set of features. In this work, the MobileNet V2 [50] and Inception V2 models were pre-trained on the COCO dataset for feature extraction and object prediction. The second stage, the SSD head, extracts semantic information from the image without losing the spatial information needed for classification. Here, the core objective of the SSD MultiBox approach is to discretize the output space of bounding boxes into a set of default boxes with different aspect ratios and scales [51]. The SSD predicts objects of different classes even when bounding boxes overlap. During prediction, the model generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. To achieve better detection, the model merges the predictions from multiple feature maps with different resolutions to handle objects of various sizes. Figure 1 shows the generated default boxes for various aspect ratios vs. cell sizes. The adapted SSD MobileNet V2 architecture, shown in Figure 2, consists of a MobileNet V2 base CNN for image feature extraction, an SSD module for bounding box regression, and a final classification step for accurate facial pain detection.
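The default-box idea described above can be illustrated with a minimal sketch. It assumes the linear scale rule and the width/height convention from the original SSD paper [51]; the values `s_min=0.2`, `s_max=0.9`, the three aspect ratios, and the function names are illustrative assumptions, not the configuration used in this work.

```python
import math

def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced box scales, one per feature map (assumed SSD convention)."""
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]

def default_boxes(map_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (cx, cy, w, h) default boxes, normalized to [0, 1],
    for one square feature map of map_size x map_size cells."""
    boxes = []
    for row in range(map_size):
        for col in range(map_size):
            # Each default box is centered on its feature-map cell.
            cx = (col + 0.5) / map_size
            cy = (row + 0.5) / map_size
            for ar in aspect_ratios:
                # Same area for every aspect ratio: w * h == scale ** 2.
                w = scale * math.sqrt(ar)
                h = scale / math.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return boxes

scales = default_box_scales(6)
boxes = default_boxes(map_size=3, scale=scales[0])
print(len(boxes))  # 3 x 3 cells x 3 aspect ratios = 27
```

Coarser feature maps use larger scales, which is how a single forward pass can cover both small and large faces.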
3.2. The Proposed Method
A block diagram of the proposed architectural framework is shown in Figure 3. The methodology incorporates three stages to improve the efficacy of the proposed algorithm: i) an input stage, ii) a training stage, and iii) a detection or output stage. In the first stage, the original RGB images from the chest pain dataset are passed to preprocessing, where cropping and resizing are applied. In the subsequent step, the region of interest of the facial expression is marked and prepared for the feature extraction and model training phase. During the second stage, the pre-trained CNN networks SSD MobileNet V2 and SSD Inception V2 are selected for feature extraction and training on the custom data. Training is carried out on a workstation, as more powerful hardware is required for training deep neural network models. Later, the trained model is transferred to the Jetson Nano embedded GPU board, and real-time detection is carried out in the final detection stage to obtain three distinct classes of chest pain facial expressions as vital signs of MI.
Fig. 2. Proposed SSD Mobile-Net V2 Architectural Model designed for facial expression chest pain detection
Fig. 3. Method for Training and Detecting Real-Time Vital Signs of Myocardial Infarction
3.2.1. Facial Action Coding System (FACS)
The Facial Action Coding System (FACS), designed by Ekman and Friesen in 1976, is the most widely accepted set of standard criteria for facial expression research. FACS was proposed to provide a set of fine-grained, unified criteria for 6 basic emotions: surprise, joy, fear, sadness, disgust, and anger. The Action Units (AUs) it defines can describe all anatomically possible human facial expressions, and gave researchers a powerful new analytic tool [18]. Investigations by researchers have revealed that pain-related AUs can also be formulated. The pain evaluation metric PSPI was derived by Prkachin and Solomon [26]; it is computed from the intensities of different pain-related AUs using Equation 1.
$$\mathrm{PSPI} = \mathrm{AU4} + \max(\mathrm{AU6}, \mathrm{AU7}) + \max(\mathrm{AU9}, \mathrm{AU10}) + \mathrm{AU43} \tag{1}$$
The intensity values of the AUs are measured on a scale of 0-5, from the weakest trace to maximum intensity. AU43, the closing of the eyes, is scored as either 0 or 1. Researchers adopt either the frequency-of-occurrence criterion or the pain baseline criterion to differentiate patients with and without pain expressions:
i) Frequency of occurrence criterion: A critical frequency level, typically around 5-10%, is set for a particular AU, and the AU is considered pain-related if its frequency of occurrence exceeds this level.
ii) Pain baseline criterion: A pain baseline condition is set. The AU is defined as pain-related when it occurs more frequently in pain patients compared to non-pain patients [52].
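As a worked illustration of Equation 1, the following minimal sketch computes the PSPI score from AU intensities. The function name and the input validation are illustrative assumptions; only the formula and the value ranges (0-5 for the graded AUs, 0 or 1 for AU43) come from the text.

```python
def pspi(au4, au6, au7, au9, au10, au43):
    """Prkachin and Solomon Pain Intensity, following Equation 1."""
    graded = {"AU4": au4, "AU6": au6, "AU7": au7, "AU9": au9, "AU10": au10}
    for name, value in graded.items():
        if not 0 <= value <= 5:
            raise ValueError(f"{name} intensity must be in the range 0-5")
    if au43 not in (0, 1):
        raise ValueError("AU43 (eyes closing) must be 0 or 1")
    return au4 + max(au6, au7) + max(au9, au10) + au43

print(pspi(au4=2, au6=3, au7=1, au9=0, au10=2, au43=1))  # 2 + 3 + 2 + 1 = 8
```

With these ranges the score lies between 0 (no pain-related activity) and 16.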
From the private chest pain facial expression dataset PM, the 11 AUs with the most relevant and plausible connection to chest pain facial expressions [14] are listed in Table 3. Pain-related studies differ in the AUs they select for their specific medical application, so only a few AUs are listed here. Figure 4 shows a simulated chest pain facial expression taken from the private PM dataset, considering a few AUs chosen from Table 3 [11, 52].
Fig. 4. Facial expression related to pain from PM dataset
Tab. 3. Pain Related Action Units

| FACS Action Units | Description | Muscular Basis |
|---|---|---|
| AU4 | Eyebrow lowering | Depressor glabellae, depressor supercilii, corrugator supercilii |
| AU6 | Cheek raising | Orbicularis oculi; pars orbitalis |
| AU7 | Eyelid tightening | Orbicularis oculi; pars palpebralis |
| AU9 | Nose wrinkling | Levator labii superioris alaeque nasi |
| AU10 | Upper lip raising | Levator labii superioris; caput infraorbitalis |
| AU20 | Lip stretching | Risorius |
| AU26 | Jaw dropping | Masseter; temporal and internal pterygoid relaxed |
| AU27 | Mouth stretching | Pterygoids, digastric |
| AU43 | Eyes closing | Relaxation of levator palpebrae superioris |
| AU51 | Head turning left | Sternocleidomastoid |
| AU55 | Head tilting left | Sternocleidomastoid |

3.2.2. Dataset
One of the major challenges researchers face in automatic pain recognition is the availability of suitable publicly available datasets. Pain-induced or simulated facial expressions are hard to collect as custom-made datasets. An optimal dataset has to include high-quality annotations, be multimodal, and also contain other relevant states in order to assess specificity to pain against the false alarm rate trade-off [54]. The UNBC-McMaster database is a challenging dataset, where in some cases it is difficult to predict whether a person is in pain or not, even for medical professionals [55]. Facial expression analysis models that are designed for young adults may not generalize to older age groups [19]. Thus, participants aged more than 65 years have been included in the dataset so that performance can be evaluated for the age group of 16-80 years. Table 4 indicates the classification scheme with different pain score levels for the private PM database.
Tab. 4. Different pain intensity levels in the proposed work database

| VAS Score/ PSPI Score | Pain Level | Number of Images |
|---|---|---|
| 0 | No Pain/ Normal | 160 |
| 0-3 | Mild Pain | 165 |
| 3-5 | Severe Pain | 175 |
While acquiring our custom-made chest pain dataset PM, the Action Units listed in Table 3 and the following points were considered: i) the FACS coding pattern for pain fulfilling the critical frequency level, ii) male and female subjects in equal proportion, iii) participants aged 16–80 years, and iv) the type of pain. The images were captured using a OnePlus 5 smartphone camera with a resolution of 16 MP and an original frame size of 4608x3456 pixels. The images were scaled down to 1067x800 pixels to minimize computational complexity and, in turn, increase processing speed. The simulated pain facial expressions were made to look like real scenarios of a heart attack to an observer. The dataset consists of three classes: i) normal pose, ii) mild pain pose, and iii) severe pain pose, as shown in Figure 5.
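The reported frame sizes are consistent with an aspect-ratio-preserving downscale to a height of 800 pixels. The short sketch below checks this arithmetic; it assumes, hypothetically, that the height was fixed and the width was rounded to the nearest pixel, which the text does not state explicitly.

```python
def resize_keep_aspect(width, height, target_height):
    """Scale (width, height) to target_height while preserving aspect ratio."""
    scale = target_height / height
    return round(width * scale), target_height, scale

new_w, new_h, scale = resize_keep_aspect(4608, 3456, 800)
print(new_w, new_h)  # 1067 800

# Pixel count shrinks by roughly the square of the linear scale factor.
reduction = (4608 * 3456) / (new_w * new_h)
print(round(reduction, 1))
```

The pixel count drops by well over an order of magnitude, which is what makes the smaller frames attractive for training and for inference on an embedded board.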
Fig. 5. Chest pain facial expression dataset consisting of i) normal, ii) mild, iii) severe expressions
3.2.3. Hardware Description
With advancements in complex deep learning architectures for object detection and classification tasks, high-speed parallel computing architectures play a major role. The choice of edge device and the choice of AI algorithm for a specific application are coupled with each other. Hardware architectures have to be chosen after a careful analysis of factors such as cost, energy consumption, accuracy, and throughput. To improve the performance of deep learning models, Nvidia Corporation has in recent years developed embedded boards based on a GPU-enabled, parallel-processing CUDA core architecture. In this research work, the low-cost, powerful Nvidia Jetson Nano embedded platform is adopted as an Edge AI device to achieve high accuracy and throughput. Utilizing the Jetson Nano’s full potential involves optimizing both the algorithms and the hardware to achieve impressive real-time performance. Figure 6 shows the Jetson Nano board and system interfacing.
Fig. 6. a) Jetson Nano b) Jetson Nano board configuration
3.2.4. Training
The lightweight deep convolutional network models in this work are SSD Inception V2 and SSD MobileNet V2, which are pre-trained networks downloaded from the TensorFlow Model Zoo. The pre-trained weights were obtained by training the models on the COCO dataset. Our PM dataset is organized into three main facial expression classes: i) normal, ii) mild pain, and iii) severe pain. The dataset consists of 350 training images and 150 test images. Training a Deep Neural Network model requires high-performance systems with GPUs for effective advanced training with a high computation speed. With the help of the class labels mentioned, the goal is to train a DCNN model that detects chest pain facial expressions directly from video sequences. Hyper-parameter tuning of the neural network models is based on the evaluation of training/validation learning curves. The number of epochs, training time, and stopping criteria for training the model were decided based on careful examination of the learning curves.
3.2.5. Performance Evaluation Metrics
COCO object detection evaluation metrics have been used in this work to validate the effectiveness of the CNN models [56]. The classification of chest pain facial expressions as normal, mild, and severe conditions has been evaluated, and the performance of the SSD Inception and SSD MobileNet models has been tested using the COCO performance metrics. The mean Average Precision (mAP), Average Recall (AR), and F1 score are used in this evaluation process. Frames per second is also used as a key parameter in the evaluation, since the models target real-time embedded applications. The bounding box location and class confidence are the predicted outputs. Intersection over Union (IoU) indicates the degree to which the predicted bounding box matches the ground truth box. It is the ratio of the area of their intersection to the area of their union, as given by Equation 2, where A and B denote the two boxes.
$$\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|} \tag{2}$$
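Equation 2 can be sketched for axis-aligned boxes as follows. The corner format `(x_min, y_min, x_max, y_max)` and the function name are assumptions made for this illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # intersection 1, union 7, so about 0.143
```

Note that the union is obtained as the sum of both areas minus the intersection, so the overlapping region is not counted twice.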
For evaluation, the preferred performance metrics in object detection algorithms are: precision (P), Average Precision (AP), mean Average Precision (mAP), Average Recall (AR), and F1 Score. The criteria for performance optimization are designed to identify mispredictions, wrong localizations, and duplicate detections. Considering the test for the object detection problem, for the i-th image and j-th prediction, the algorithm is expected to find a predicted bounding box $b_{ij}$. Denoting the confidence value as $c_{ij}$ and the confidence threshold as $t$, $s_{ij} = 1$ if $c_{ij} \geq t$; otherwise, $s_{ij} = 0$. The value $z_{ij} = 1$ if the confidence value reaches the threshold $t$ and detection $j$ on image $i$ matches a ground truth box; otherwise, $z_{ij} = 0$. Four metrics are defined as follows.
Recall ($R_{ot}$) is the proportion of correct predictions with respect to the total number of objects in the images for a given object class. Recall $R_{ot}$ is given by Equation 3 as
$$R_{ot} = \frac{\sum_{i=1}^{N_i} \sum_{j=1}^{N_{ij}} z_{ij}}{N} \tag{3}$$
where $ot$ is the object threshold value, $N_i$ is the number of images, $N_{ij}$ is the total number of detections on image $i$, and $N$ is the maximum number of objects in a given class considering all images.
Precision ($P_{ot}$) is the proportion of correct predictions of objects over the total number of predictions. Precision $P_{ot}$ is given by Equation 4 as
$$P_{ot} = \frac{\sum_{i=1}^{N_i} \sum_{j=1}^{N_{ij}} z_{ij}}{\sum_{i=1}^{N_i} \sum_{j=1}^{N_{ij}} s_{ij}} \tag{4}$$
Here, $s_{ij}$ is set to 1 when the algorithm determines that detection $j$ in image $i$ is an object in the given class, and $z_{ij}$ indicates whether it is a correct object of that class.
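Equations 3 and 4 can be traced with a small sketch. The nested-list layout (one inner list of detections per image), the precomputed match flags, and the example values are assumptions for illustration; in practice the match flags would come from an IoU test against the ground truth boxes.

```python
def recall_precision(confidences, matches, num_ground_truth, t):
    """confidences[i][j] is c_ij; matches[i][j] is True when detection j on
    image i matches a ground truth box; num_ground_truth is N in Equation 3."""
    sum_s = 0  # number of detections kept at threshold t (sum of s_ij)
    sum_z = 0  # number of kept detections that are correct (sum of z_ij)
    for conf_i, match_i in zip(confidences, matches):
        for c_ij, matched in zip(conf_i, match_i):
            s_ij = 1 if c_ij >= t else 0
            z_ij = 1 if (s_ij and matched) else 0
            sum_s += s_ij
            sum_z += z_ij
    recall = sum_z / num_ground_truth if num_ground_truth else 0.0
    precision = sum_z / sum_s if sum_s else 0.0
    return recall, precision

# Two images with three and two detections (made-up values).
confidences = [[0.9, 0.6, 0.3], [0.8, 0.4]]
matches = [[True, False, True], [True, True]]
recall, precision = recall_precision(confidences, matches, num_ground_truth=4, t=0.5)
print(recall, precision)  # recall = 2/4, precision = 2/3
```

Raising the threshold `t` usually trades recall for precision, which is why the two are evaluated together.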
Average Precision (AP): For multiple object detection, precision tends to decrease as the recall threshold increases. Thus, the Average Precision metric is generally adopted for evaluation. It is the integral of precision with respect to recall over the interval [0, 1], where r stands for recall. Average Precision (AP) is given by Equations 5 and 6.
$$AP = \int_{0}^{1} \mathrm{precision}(r)\,dr \tag{5}$$
In practice, Average Precision is calculated at discrete recall levels. Let dr be the difference between two adjacent recall levels. Average Precision can then be defined as the average of the precision results over the different recall levels.
$$AP = \frac{1}{\frac{1}{dr} + 1} \sum_{\mathrm{recall} \in \{0,\, dr,\, \ldots,\, 1\}} \mathrm{precision}(\mathrm{recall}) \tag{6}$$
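The discrete form in Equation 6 can be sketched as below. Reading the precision at each recall level as the maximum precision among curve points with at least that recall (the interpolation used in PASCAL VOC style evaluation) is an assumption added here, since the equation itself does not say how precision is sampled between measured points. The function name and sample curve are likewise illustrative.

```python
def average_precision(recalls, precisions, dr=0.1):
    """Average of interpolated precision at recall levels 0, dr, ..., 1."""
    num_levels = round(1 / dr) + 1  # 1/dr + 1 levels, e.g. 11 for dr = 0.1
    total = 0.0
    for k in range(num_levels):
        level = k * dr
        # Interpolated precision: best precision at recall >= level
        # (small tolerance guards against floating-point drift in k * dr).
        candidates = [p for r, p in zip(recalls, precisions) if r >= level - 1e-9]
        total += max(candidates) if candidates else 0.0
    return total / num_levels

# A made-up precision-recall curve with four measured points.
recalls = [0.25, 0.5, 0.75, 1.0]
precisions = [1.0, 1.0, 0.75, 0.5]
ap = average_precision(recalls, precisions)
print(round(ap, 4))  # (6 * 1.0 + 2 * 0.75 + 3 * 0.5) / 11 = 0.8182
```

Smaller values of `dr` approximate the integral in Equation 5 more closely at the cost of more evaluation points.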
Mean Average Precision (mAP): Let the total number of object classes in the measured images be $T_o$. mAP is defined as the mean of the APs over all $T_o$ classes, and it is adopted as the main metric for object detection applications. mAP is given by Equation 7 as
$$mAP = \frac{\sum_{n=1}^{T_o} AP_n}{T_o} \tag{7}$$
The mAP metric evaluates the algorithm’s performance over all recall levels and all classes.
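For the three classes used in this work, Equation 7 reduces to a simple mean, as the following sketch shows. The per-class AP values are made-up placeholders, not results from the paper.

```python
def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values (Equation 7)."""
    if not ap_per_class:
        raise ValueError("at least one class is required")
    return sum(ap_per_class.values()) / len(ap_per_class)

# Placeholder AP values for illustration only.
ap_per_class = {"normal": 0.9, "mild": 0.6, "severe": 0.75}
map_value = mean_average_precision(ap_per_class)
print(round(map_value, 2))  # 0.75
```

Because every class carries equal weight, a weak class lowers mAP regardless of how many images it contains.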
Losses (L): The total loss consists of two main losses: localization loss and confidence loss. The localization loss gives an estimate of the mismatch between the final predicted bounding box and the ground truth box. The SSD model uses the predictions from positive matches, which are closer to ground truth boxes, and ignores the negative matches. The confidence loss is the loss incurred when performing the class prediction. It measures how confident the network is in the objectness of the computed bounding box. Let $x_{ij}^{p} \in \{1, 0\}$ be an indicator for matching the i-th default box to the j-th ground truth box of category p. In this matching strategy, $\sum_i x_{ij}^{p} \geq 1$. The overall loss function is a weighted sum of the localization loss (loc) and the confidence loss (conf):
$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right) \tag{8}$$
where N is the number of matched default boxes. If N = 0, the loss is set to 0. The localization loss is a smooth L1 loss between the parameters of the predicted box (l) and the ground truth box (g), expressed as offsets for the center (cx, cy) of the default bounding box (d) and for its width (w) and height (h).
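$$L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k}\, \mathrm{smooth}_{L1}\left(l_i^{m} - \hat{g}_j^{m}\right) \tag{9}$$
$$\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^{w}}, \qquad \hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^{h}}$$
$$\hat{g}_j^{w} = \log\left(\frac{g_j^{w}}{d_i^{w}}\right), \qquad \hat{g}_j^{h} = \log\left(\frac{g_j^{h}}{d_i^{h}}\right)$$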
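The box encoding and the localization loss of Equation 9 can be sketched for a single matched pair of default box and ground truth box. The smooth L1 definition used here (quadratic below 1, linear above) is the standard one and is an assumption, since the text does not spell it out; function names and example boxes are illustrative.

```python
import math

def encode_box(gt, default):
    """Offsets of a ground truth box relative to a default box.
    Both boxes are (cx, cy, w, h)."""
    g_cx, g_cy, g_w, g_h = gt
    d_cx, d_cy, d_w, d_h = default
    return (
        (g_cx - d_cx) / d_w,
        (g_cy - d_cy) / d_h,
        math.log(g_w / d_w),
        math.log(g_h / d_h),
    )

def smooth_l1(x):
    """Standard smooth L1: 0.5 * x^2 for |x| < 1, otherwise |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def localization_loss(predicted_offsets, gt, default):
    """Smooth L1 loss summed over cx, cy, w, h for one matched box pair."""
    targets = encode_box(gt, default)
    return sum(smooth_l1(l - g) for l, g in zip(predicted_offsets, targets))

default = (0.5, 0.5, 0.2, 0.2)
gt = (0.55, 0.5, 0.2, 0.4)
loss = localization_loss((0.0, 0.0, 0.0, 0.0), gt, default)
print(round(loss, 4))
```

Predicting offsets relative to the default box, with log-scaled width and height, keeps the regression targets in a similar numeric range for boxes of very different sizes.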
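The confidence loss is the softmax loss over multiple class confidences (c):
$$L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_i^{p}\right) - \sum_{i \in Neg} \log\left(\hat{c}_i^{0}\right), \qquad \hat{c}_i^{p} = \frac{\exp\left(c_i^{p}\right)}{\sum_{p} \exp\left(c_i^{p}\right)} \tag{10}$$
where the weight term $\alpha$ of the overall loss is set to 1 by cross validation.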
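A minimal sketch of the confidence loss in Equation 10 and of the weighted total in Equation 8 follows. It assumes that class index 0 is the background class, that positive boxes come with their matched class index, and that the logits shown are made-up values; the function names are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_loss(positives, negatives):
    """positives: list of (logits, matched_class_index) for matched boxes.
    negatives: list of logits for unmatched boxes (class 0 = background)."""
    loss = 0.0
    for logits, cls in positives:
        loss -= math.log(softmax(logits)[cls])
    for logits in negatives:
        loss -= math.log(softmax(logits)[0])
    return loss

def total_loss(l_conf, l_loc, num_matched, alpha=1.0):
    """Weighted sum of Equation 8; defined as 0 when nothing is matched."""
    if num_matched == 0:
        return 0.0
    return (l_conf + alpha * l_loc) / num_matched

# Four classes: background, normal, mild pain, severe pain (made-up logits).
positives = [([0.0, 2.0, 0.0, 0.0], 1)]
negatives = [[3.0, 0.0, 0.0, 0.0]]
l_conf = confidence_loss(positives, negatives)
print(round(total_loss(l_conf, l_loc=0.25, num_matched=1), 4))
```

The loss shrinks as matched boxes put more probability on their true class and unmatched boxes put more probability on the background.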
4. Results and Discussion
Exploratory modelling experiments were performed to analyze the efficacy of the proposed DCNN models, implemented on an Intel(R) Core(TM) i7-7700 CPU @ 3.60 GHz with 12 GB DDR4 RAM. The current work implements the algorithmic model with the TensorFlow Object Detection API, and the prototype model was developed in the Python programming language. The deep learning libraries TensorFlow and Keras were used for easy and fast prototyping. Figure 7 shows the experimental results of the trained SSD Inception V2 model. The ground truth boxes and the predicted box scores are displayed in the image section of the TensorBoard visualization. Figure 8 depicts three facial expressions captured from the camera. The TensorBoard visualization toolkit provides a scalar dashboard that is used for visualizing the performance metrics. The user-friendly TensorFlow Object Detection API is used for visualizing evaluated metrics such as mean Average Precision, Average Recall, and the loss function, as well as the ground truth and predicted values of test images. Figure 9 shows the mAP graph plotted in TensorBoard.
4.1. Precision, Average Recall and F1 Score Evaluation
The metrics are logged at regular checkpoint intervals and visualized graphically in TensorBoard. The intersection over union (IoU) determines whether a detection is counted as correct or incorrect for a given threshold. An IoU threshold of 0.5 is used, where IoU measures the overlap between the ground truth box and the predicted bounding box. Precision indicates whether the object detection model identifies only relevant objects in a given class, and gives the percentage of positive predictions that are correct. Average Precision takes into account the maximum number of detections per image and the predefined object-size categories of the Common Objects in Context (COCO) metrics: small, medium, and large objects, distinguished by pixel area. In the standard COCO metrics, no distinction is made between AP and mAP. Figure 9a shows mAP values over the range of IoU thresholds from 0.5 to 0.95, with a step size of 0.05. The main classes considered are normal, mild pain, and severe pain.

Fig. 7. Predicted simulated results of SSD Inception V2 for 3 classes of chest pain facial expression detection

Fig. 8. Real-time detection of facial pain expression

Average Recall shows the ability of the proposed model to find the correct positive predictions among all ground truths in a given class and is expressed as a percentage. Figure 9b shows Average Recall values plotted over categories and IoUs. Based on the Precision and Recall results, model detections fall into three types: i) most detections are correct and most ground truth objects are found (high precision and high recall); ii) most objects are detected but many detections are incorrect, indicating many false positives (high recall and low precision); and iii) the predicted boxes are correct but many ground truth objects are missed, indicating many false negatives (low recall and high precision). A high-performing system aims for both high recall and high precision. Table 5 gives the values of mean Average Precision, Average Recall, and F1 Score as percentages.
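The IoU test and the precision and recall definitions discussed above can be sketched as follows. This is a minimal illustration, assuming boxes are given as (x_min, y_min, x_max, y_max) corner coordinates and that the counts of true positives, false positives, and false negatives are already known; the function names are hypothetical and this is not the evaluation code of the TensorFlow Object Detection API.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred_box, gt_box, threshold=0.5):
    """A detection counts as correct when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Raising the threshold from 0.5 towards 0.95 makes the correctness test stricter, which is why mAP is reported over a range of IoU values.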
Tab. 5. Measurement of mean Average Precision, Recall and F1 Score values of SSD MobileNet V2 and SSD Inception V2 Convolutional Neural Networks

| Model | Mean Average Precision (%) | Average Recall (%) | F1 Score (%) |
|---|---|---|---|
| SSD Inception V2 COCO | 85.18 | 88.32 | 86.72 |
| SSD MobileNet V2 COCO | 82.7 | 85.5 | 84.07 |
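The F1 values in Table 5 are consistent with the harmonic mean of the reported mean Average Precision and Average Recall. The short check below assumes that this is how the score was derived (the text does not state the formula); the function name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from Table 5 (percentages).
reported = {
    "SSD Inception V2 COCO": (85.18, 88.32),
    "SSD MobileNet V2 COCO": (82.7, 85.5),
}

for model, (precision, recall) in reported.items():
    print(f"{model}: F1 = {f1_score(precision, recall):.2f}")
```

Under this assumption the computed scores agree with the tabulated F1 values to within 0.01 percentage points.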
4.2. Loss Function Evaluation
The total loss of the SSD Inception V2 and SSD MobileNet V2 models at different stages of the training process is composed of three main losses: classification, regularization, and localization loss. The loss functions were simulated using the TensorFlow framework. The three component losses are shown in Figure 10a-c, and the overall loss function is shown in Figure 10d. Table 6 gives the different loss values obtained from simulation results for two different CNNs.

Fig. 9. Mean Average Precision and Average Recall values of SSD InceptionNet V2 COCO: a) mean Average Precision, b) Average Recall (large)

Tab. 6. Various loss factors measured by SSD Inception V2

| Loss Type | Loss | Loss value |
|---|---|---|
| A | Classification loss | 1.36578 |
| B | Localization loss | 0.124921 |
| C | Regularization loss | 0.25272 |
| E | Final total loss | 1.7434 |

4.3. Training Time Comparison
Training time is the total time taken to train the deep learning model. Training for 12,000 steps took approximately 49 hours for SSD MobileNet V2 COCO. SSD Inception V2 COCO took less time, approximately 35 hours for the same number of steps. Table 7 shows the training time against the number of steps for the different DCNN SSD COCO models.

Tab. 7. Measurement of training time v/s number of steps for different SSD models

| Convolutional Neural Network | Training Time (hours) | Number of Steps (×1000) |
|---|---|---|
| SSD Inception V2 COCO | 35 | 12 |
| SSD MobileNet V2 COCO | 49 | 12 |
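The training times above can be converted into an average time per training step, which makes the two models easier to compare. The following is a small arithmetic sketch using only the reported hours and step counts; the helper name `seconds_per_step` is hypothetical.

```python
def seconds_per_step(hours, steps):
    """Average wall-clock seconds spent on one training step."""
    return hours * 3600 / steps


# Reported training time (hours) and number of steps for each model.
training_runs = {
    "SSD Inception V2 COCO": (35, 12_000),
    "SSD MobileNet V2 COCO": (49, 12_000),
}

for model, (hours, steps) in training_runs.items():
    print(f"{model}: {seconds_per_step(hours, steps):.1f} s/step")

# Relative training time saved by the faster model.
saving = (49 - 35) / 49
print(f"Relative saving: {saving:.1%}")
```

With these figures, SSD Inception V2 averages 10.5 seconds per step against 14.7 seconds for SSD MobileNet V2, a reduction of roughly 29% in training time.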
4.4. Embedded Implementation
After training on a high-performance system, the model was deployed to the Jetson Nano GPU board. The accuracy of the proposed chest pain face detection CNN model was estimated on the custom-made chest pain dataset, and its inference speed was measured on the Jetson Nano board to verify real-time performance on an edge AI embedded device. The results indicate that the tested CNN models offer a balance between inference speed and accuracy on embedded platforms. Figure 4 shows the experimental setup for inference evaluation of the model. Table 8 shows the Jetson Nano's performance in terms of inference speed, measured in frames per second.

Tab. 8. Measurement of Frames per Second for 2 CNN models
Device
Jetson NANO
CNN Model
Power Consumption (Watts)
Frame per Second (FPS)
SSD MobileNet V2
10
6.85
10
6.26
SSD Inception V2
5 5
3.32 3.18
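Inference speed in frames per second can be measured by timing a loop over frames, as in the minimal sketch below. The `infer` callable stands in for whatever detection function is deployed on the board and is purely hypothetical; the warm-up count is also an assumption, added so that one-off start-up costs do not distort the average.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Return the average frames per second of `infer` over `frames`."""
    # Warm-up runs are excluded from the timing.
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def latency_ms(fps):
    """Average per-frame latency in milliseconds for a given frame rate."""
    return 1000.0 / fps
```

As a usage example, `latency_ms(6.85)` gives about 146 ms per frame, which is the per-frame processing time implied by the highest frame rate reported for the Jetson Nano.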
4.5. Results Comparison With Other Work
Even when authors of different papers have used identical databases, directly comparing their results is generally not a good approach. Comparability is limited by differences such as: i) the use of subsets of custom-made data, ii) different performance measures, iii) different evaluation methodologies, and iv) different prediction tasks. According to the authors' extensive literature survey, none of the pain facial expression databases has been evaluated using object detection CNN models. However, facial expression recognition has been carried out using the popular Faster RCNN object detection algorithm. Table 9 compares the performance metrics of the proposed work with the results of Jiaxing Li [44].

Fig. 10. Loss curves of SSD Inception V2 model

Tab. 9. Measurement of mAP metric of this proposed work

| Paper | Model | Dataset | mean Average Precision |
|---|---|---|---|
| Jiaxing Li [44] | Faster RCNN with VGG 16 backbone | Chinese Linguistic Data Consortium [CLDC] | 81.6 |
| Proposed work | SSD Inception V2 | Custom chest pain database | 85.18 |

4.6. Discussion
Automatic pain detection is a much-anticipated aid to acute and chronic pain management in the medical domain. Computer vision-based analysis offers a promising and viable approach to efficient pain detection from chest pain facial expressions. In this work, the authors evaluated two state-of-the-art DCNN algorithms, SSD Inception V2 and SSD MobileNet V2, on chest pain facial expressions, considered as a vital sign of heart attacks. The image dataset consists of custom-made RGB chest pain facial expression images captured with a high-resolution camera. Three main facial expression categories were annotated manually and used to interpret the severity of the pain as a sign of heart attack. To evaluate the CNN algorithms, the experiment was carried out to find the best training model, incorporating the best test-train split ratio for satisfying the minimum loss criterion. Automatic feature extraction from the training images is an advantage over traditional facial expression feature extraction algorithms. The main limitation is the lack of publicly available standard databases for chest pain facial expressions. It was challenging to gather the custom-made dataset, annotate it, and build an accurate pain-based facial recognition system for the object detection algorithm.
5. Conclusion
Automating pain detection and estimating the pain intensity level from facial images, combined with suitable pain management strategies, can emerge as a lifesaver in medical health informatics. The artificial intelligence approach adopted in this research can play a significant role for researchers and medical professionals working with pain management practices. The authors have developed a real-time detector of chest pain-based facial expressions that provides pain estimation and detection under myocardial infarction emergency conditions, with the aim of saving lives. Two deep convolutional neural networks, SSD Inception V2 and SSD MobileNet V2, were trained and deployed to evaluate the chest pain facial expression recognition task. The results show state-of-the-art performance using the TensorFlow Object Detection API. Training the CNN model end to end achieved a mean Average Precision of 85.18% and an Average Recall of 88.32%. On the Jetson Nano embedded GPU platform, real-time performance was evaluated with the object detection models, reaching 6.85 frames per second for pain detection. In the future, this work can serve as a road map for researchers who incorporate knowledge-based ideas to develop an embedded solution, based on our model, that acts as an emergency alarm indicator driven by the severity of the pain score during potentially life-threatening cardiac arrest situations.
AUTHORS
Dr. Mohan H M* – HMI, Digital Shark Technology Pvt. Ltd, Bangalore, Karnataka, India, e-mail: mohanhm@gmail.com.
Shivaraj Kumara H C – Tecplix Technologies Pvt. Ltd, Bangalore, Karnataka, India, e-mail: hivaraj.k@tecplix.co.
Mallikarjun S H – Government Polytechnic, Kampli, Karnataka, India, e-mail: vmakdree@gmail.com.
Prasad A Y – Dept. of CSE, SJB Institute of Technology, Bangalore, VTU, India, e-mail: prasadadguary@gmail.com.
*Corresponding author
REFERENCES
[1] Cristina Balla, Rita Pavasini, Roberto Ferrari, “Treatment of Angina: Where Are We?”, Cardiology journal, vol. 140, 2018, pp. 52–67, 10.1159/000487936.
[2] Amanda Williams, Kenneth D. Craig, “Updating the definition of pain”, PAIN, vol. 157, no. 11, 2016, pp. 2420–2423, 10.1097/j.pain.
[3] George R. Hansen, Jon Streltzer, “Psychology of pain”, Emerg Med Clin N, vol. 23, 2005, pp. 339–348, 10.1016/j.emc.2004.12.005.
[4] Steven J. Linton, William S. Shaw, “Impact of Psychological Factors in the Experience of Pain”, Physical Therapy, vol. 91, no. 5, 2011, pp. 700–711, 10.2522/ptj.20100330.
[5] James C Eisenach, “Textbook of Pain, 4th edition”, vol. 12, no. 3, July 2000, pp. 276–278.
[6] Xiaojing Xu, Jeannie S Huang, Virginia R De Sa, “Pain Evaluation in Video using Extended Multitask Learning from Multidimensional Measurements”, Proceedings of Machine Learning for Health NeurIPS Workshop, PMLR 116, 2020, pp. 141–154.
[7] Philipp Werner, Daniel Lopez-M, Walter St, “Automatic Recognition Methods Supporting Pain Assessment: A Survey”, IEEE Trans on Affective Computing, vol. 13, no. 1, Oct 2019, pp. 530–552, 10.1109/TAFFC.2019.2946774.
[8] Johan Herlitz, Ake Hjalmarson, Finn W, “Treatment of pain in acute myocardial infarction”, British Heart Journal, vol. 61, 1989, pp. 9–13, 10.1136/hrt.61.1.9.
[9] Richard Gorlin, “Pathophysiology of Cardiac Pain”, Circulation, vol. 32, July 1965.
[10] James H Behrmann, Harold R Hipp, Howard E Heyer, “Pain Patterns in Acute Myocardial Infarction”, American Journal of Medicine, vol. 9, no. 2, Aug 1950, pp. 156–163, 10.1016/0002-9343(50)90018-0.
[11] J A Dalton, L Brown, J Carlson, R McNutt, S M Greer, “An evaluation of facial expression displayed by patients with chest pain”, vol. 28, no. 3, May-Jun 1999, pp. 168–74, 10.1016/s0147-9563(99)70056-7.
[12] Patrick Thiam, Hans A Kesler, “Two-Stream Attention Network for Pain Recognition from Video Sequences”, Sensors, vol. 20, no. 3, 2020, pp. 839, 10.3390/s20030839.
[13] Luca Greco, Gennalo Percannella, Pierluigi Ritrovato, “Trends in iot based solutions for health care moving ai to the edge”, Pattern Recognition Letters, vol. 135, July 2020, pp. 346–353, 10.1016/j.patrec.2020.05.016.
[14] Massimo Merenda, Carlo Porcaro, Demetrio Lero, “Edge machine learning for ai-enabled iot devices a review”, Sensors, vol. 20, no. 9, 2020, pp. 25-33, 10.3390/s20092533.
[15] Ghulam Muhammed, Mohammed F Alhamid, “Edge computing with cloud for voice disorder assessment and treatment”, IEEE Communications Magazine, vol. 56, no. 4, April 2018, pp. 60–65, 10.1109/MCOM.2018.1700790.
[16] J. P. Queralta, T. N. Gia, H. Tenhunen and T. Westerlund, “Edge-AI in LoRa-based Health Monitoring: Fall Detection System with Fog Computing and LSTM Recurrent Neural Networks,” 2019 42nd International Conference on Telecommunications and Signal Processing, Budapest, Hungary, 2019, pp. 601–604, doi: 10.1109/TSP.2019.8768883.
[17] X. Dai, I. Spasić, B. Meyer, S. Chapman and F. Andres, “Machine Learning on Mobile: An On-device Inference App for Skin Cancer Detection,” 2019 Fourth International Conference on Fog and Mobile Edge Computing (FMEC), Rome, Italy, 2019, pp. 301–305, doi: 10.1109/FMEC.2019.8795362.
[18] Dianbo Liu, Dan Cheng, Timothy T Houle, Lucy Chen, Wei Zhang, Hao Deng, “Machine learning methods for automatic pain assessment using facial expression information”, Medicine, vol. 97, no. 49, 2018, 10.1097/MD.0000000000013421.
[19] Teena Hassan, Dominik Seuß, Johannes Wollenberg, Katharina Weitz, Miriam Kunz, Stefan Lautenbacher, Jens-Uwe Garbas, Ute Schmid, “Automatic Detection of Pain from Facial Expressions: A Survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 10.1109/TPAMI.2019.2958341.
[20] M Adibuzzaman, Colin Ostberg, S Ahamed, Richard P, “Assessment of Pain Using Facial Pictures Taken with a Smartphone”, IEEE 39th Annual Computer Software and Applications Conference, vol. 2, July 2015, pp. 726–731, 10.1109/COMPSAC.2015.150.
[21] Sajad Ashouri, Mohsen Abedi, Masoud Abdollahi, “A novel approach to spinal 3-D kinematic assessment using inertial sensors: Towards effective quantitative evaluation of low back pain in clinical settings”, Comput Biol Med, vol. 89, Aug 2017, pp. 144–149, 10.1016/j.compbiomed.2017.08.002.
[22] M. A. Haque et al., “Deep Multimodal Pain Recognition: A Database and Comparison of Spatio-Temporal Visual Modalities,” 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 2018, pp. 250–257, doi: 10.1109/FG.2018.00044.
[23] J. J. Rivas et al., “Automatic recognition of pain, anxiety, engagement and tiredness for virtual rehabilitation from stroke: A marginalization approach,” 2017 Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), San Antonio, TX, USA, 2017, pp. 159–164, doi: 10.1109/ACIIW.2017.8272607.
[24] Lei Yang, Shuang Wang, Xiaoqian Jiang, “PATTERN: Pain Assessment for patients who can’t tell using Restricted Boltzmann machine,” BMC Medical Informatics and Decision Making, vol. 73, July 2016, pp. 190–208, 10.1186/s12911-016-0317-0.
[25] Zhanli Chen, Rashid Ansari, Diana J Wilkie, “Automated Pain Detection from Facial Expressions using FACS: A Review”, Nov 2018, pp. 1–19.
[26] Kenneth M Prkachin, Patricia E Solomon, “The structure, reliability and validity of pain expression: Evidence from patients with shoulder pain”, Pain, vol. 139, Oct 2008, pp. 267–74, 10.1016/j.pain.2008.04.010.
[27] Patrick Lucey, Jeffrey F. Cohn, Kenneth M. Prkachin, “Painful data: The UNBC-McMaster shoulder pain expression archive database”, 2011 IEEE International Conference on Automatic Face & Gesture Recognition, March 2011, pp. 1–9, 10.1109/FG.2011.5771462.
[28] Min S. H. Aung, Sebastian Kaltwang, Bernardino Romera-Paredes, “The Automatic Detection of Chronic Pain-Related Expression: Requirements, Challenges and the Multimodal Emo Pain Dataset”, 2016 IEEE Transactions on Affective Computing, vol. 7, no. 4, Oct 2016, pp. 435–451, 10.1109/TAFFC.2015.246283.
[29] S. Walter, S Gruss, H Ehleiter, “The biovid heat pain database data for the advancement and systematic validation of an automated pain recognition system,” 2013 IEEE International Conference on Cybernetics (CYBCO), Lausanne, Switzerland, 2013, pp. 128–131, doi: 10.1109/CYBConf.2013.6617456.
[30] Mittal V K., “Discriminating the Infant Cry Sounds Due to Pain vs. Discomfort Towards Assisted Clinical Diagnosis”, SLPAT 2016- on Speech and Language Processing for Assistive Technologies, 2016, pp. 37–42, 10.21437/SLPAT.2016-7.
[31] Xing Zhang, Lijun Y, “BP4D-Spontaneous: a high-resolution spontaneous 3D dynamic facial expression database”, Image and Vision Computing, vol. 32, no. 10, Oct 2014, pp. 692–706, 10.1016/j.imavis.2014.06.002.
[32] Maria Velana, Sascha G, G Layher, “The Sense Emotion Database: A Multimodal Database for the Development and Systematic Validation of an Automatic Pain- and Emotion-Recognition System”, Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction, vol. 10183, June 2017, pp. 127–139.
[33] Ruijing Yang, Shujun Tong, Miguel Bordallo, “On Pain Assessment from Facial Videos Using Spatio-Temporal Local Descriptors”, IEEE 6th International Conference on Image Processing Theory, Tools and Applications, IEEE 2016, pp. 1–6, 10.1109/IPTA.2016.7820930.
[34] Juho Kannala, Esa Rahtu, “BSIF: Binarized Statistical Image Features”, Proceedings of 21st International Conference on Pattern Recognition, ICPR2012, Nov 2012, pp. 1363–1366.
[35] Ghazal Bargshady, Xujuan Zhou, Ravinesh, “Enhanced deep learning algorithm development to detect pain intensity from facial expression images”, Expert Systems with Applications, vol. 149, 1 July 2020.
[36] Jing Zhou, Xiaopeng Hong, “Recurrent Convolutional Neural Network Regression for Continuous Pain Intensity Estimation in Video”, 2016 IEEE Conf on Computer Vision and Pattern Recognition, 10.1109/CVPRW.2016.191.
[37] Marco Bellantonio, Mohammad A. Haque, Pau Rodriguez, “Spatio-temporal Pain Recognition in CNN-Based Super-Resolved Facial Images”, FFER 2016: Video Analytics. Face and Facial Expression Recognition and Audience Measurement, March 2017, pp. 151–162.
[38] Pau Rodriguez, Guillem C, Jordi Gonalez, “Deep Pain: Exploiting long short Term Memory Networks for facial expression classification”, IEEE Transactions on Cybernetics, Feb 2017, pp. 1–11, 10.1109/tcyb.2017.2662199.
[39] M. Tavakolian and A. Hadid, “Deep Binary Representation of Facial Expressions: A Novel Framework for Automatic Pain Intensity Recognition,” 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 2018, pp. 1952–1956, doi: 10.1109/ICIP.2018.8451681.
[40] Patrick Thiam, Hans A. Kestler, Friedhelm Schwenker, “Two-Stream Attention Network for Pain Recognition from Video Sequences”, Sensors, vol. 20, no. 3, 2020, 10.3390/s20030839.
[41] Xiaojing Xu, Jeannie S. Huang, Virginia R. de Sa, “Pain Evaluation in Video using Extended Multitask Learning from Multidimensional Measurements”, Proceedings of Machine Learning Research, vol. 116, 2020, pp. 141–154.
[42] F. Saxen, P. Werner, S. Handrich, E. Othman, L. Dinges and A. Al-Hamadi, “Face Attribute Detection with MobileNetV2 and NasNet-Mobile,” 2019 11th International Symposium on Image and Signal Processing and Analysis (ISPA), Dubrovnik, Croatia, 2019, pp. 176–180, doi: 10.1109/ISPA.2019.8868585.
[43] Min-Kook Choi, Jaehyung Park, Heechul Jung, “Fast and Accurate Convolutional Object Detectors for Real-time Embedded Platforms”, Computer Vision and Pattern Recognition, 2019, arXiv:1909.10798.
[44] Jiaxing Li, Dexiang Zhang, Jinging Zhang, “Facial Expression Recognition with Faster R-CNN”, Procedia Computer Science, vol. 107, 2017, pp. 135–140, 10.1016/j.procs.2017.03.069.
[45] Cho, Y. et al. (2020) ‘Artificial intelligence algorithm for detecting myocardial infarction using six-lead electrocardiography’, Scientific Reports 2020 10:1. Nature Publishing Group, vol. 10, no. 1, pp. 1–10. 10.1038/s41598-020-77599-6.
[46] Mandair, D. ‘Prediction of incident myocardial infarction using machine learning applied to harmonized electronic health record data’, BMC Medical Informatics and Decision Making. BioMed Central, vol. 20, no. 1, 2020, pp. 1–10. 10.1186/S12911-020-01268-X.
[47] Kwon, J. ‘Deep-learning-based risk stratification for mortality of patients with acute myocardial infarction’, PLOS ONE. Public Library of Science, vol. 14, no. 10, 10.1371/JOURNAL.PONE.0224502.
[48] Lui, H. W. and Chow, K. L. (2018) ‘Multiclass classification of myocardial infarction with convolutional and recurrent neural networks for portable ECG devices’, Informatics in Medicine Unlocked. Elsevier, vol. 13, pp. 26–33. 10.1016/J.IMU.2018.08.002.
[49] Jyoti Metan, A.Y. Prasad, K.S. Ananda Kumar et al., “Cardiovascular MRI image analysis by using the bio inspired (sand piper optimized) fully deep convolutional network (Bio-FDCN) architecture for an automated detection of cardiac disorders”, Biomedical Signal Processing and Control, vol. 70, 2021, 10.1016/j.bspc.2021.103002.
[50] Howard, Andrew, Zhu, Menglong, Chen, Bo, Kalenichenko, “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications”, 2017, pp. 1–9, arXiv preprint arXiv:1704.04861.
[51] Wei Liu, Dragomir Anguelov, Dumitru Erhan, “SSD: Single Shot MultiBox Detector”, Computer Vision and Pattern Recognition, 2016, 10.1007/978-3-319-46448-0_2.
[52] M. Kunz, D. Meixner, S. Lautenbacher, “Facial muscle movements encoding pain–a systematic review,” Pain, vol. 160, no. 3, March 2019, pp. 535–549.
[53] Wilkie, Diana J., “Facial Expressions of Pain in Lung Cancer”, Analgesia, vol. 1, no. 2, 1995, pp. 91–99, 10.3727/107156995819564301.
[54] P. Werner, D. Lopez-Martinez, S. Walter, A. Al-Hamadi, S. Gruss and R. W. Picard, “Automatic Recognition Methods Supporting Pain Assessment: A Survey,” in IEEE Transactions on Affective Computing, vol. 13, no. 1, pp. 530–552, 1 Jan.-March 2022, doi: 10.1109/TAFFC.2019.2946774.
[55] Pau Rodriguez, Guillem Cucurull, Jordi Gonzàlez P., “Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification,” in IEEE Transactions on Cybernetics, vol. 52, no. 5, pp. 3314–3324, May 2022, doi: 10.1109/TCYB.2017.2662199.
[56] Lin, Zitnick, Doll, “Microsoft COCO: Common Objects in Context”, Computer Vision, ECCV, vol. 8693, 2014, pp. 740–755.
VOLUME 16, N° 3 2022 Journal of Automation, Mobile Robotics and Intelligent Systems
Skin Lesion Detection Using Deep Learning
Submitted: 12th May 2022; accepted: 28th July 2022
Rajit Chandra, Mohammadreza Hajiarbabi
DOI: 10.14313/JAMRIS/3-2022/24
Abstract: Skin lesions can be deadly if not detected early, and early detection can save many lives. Artificial intelligence and machine learning are helping healthcare in many ways, including in the diagnosis of skin lesions. Computer-aided diagnosis helps clinicians detect cancer. This study was conducted to classify seven classes of skin lesion using powerful convolutional neural networks. Two pre-trained models, DenseNet and Inception-v3, were employed, and accuracy, precision, recall, F1 score, and ROC-AUC were calculated for every class prediction. Moreover, gradient class activation maps were used to help clinicians determine which regions of an image influence the model to make a certain decision. These visualizations are used for explainability of the model. Experiments showed that DenseNet performed better than Inception-v3. It was also noted that gradient class activation maps highlighted different regions when predicting the same class. The main contribution is the introduction of visualizations that help clinicians understand the decisions of the lesion classification model, which will enhance the reliability of the model. In addition, different optimizers were employed with both models to compare the accuracies.
Keywords: Skin lesion, DenseNet, Inception V3
1. Introduction
Dermatologists use technological approaches to facilitate the early detection of skin cancer. Such lesions are produced by aberrant melanocyte cell formation, which usually happens when skin is exposed to the sun more than necessary. Melanocyte cells generate melanin, the substance responsible for pigmentation in the skin. Moreover, the number of skin cancer cases has risen dramatically, resulting in a growth in the mortality rate from the condition, notably from melanoma. That is why skin lesions are a big concern all over the world. There are many kinds of skin lesion, and some kinds can become skin cancer if not detected early, so it is important to detect this disease at an early stage. As in every other field, technology is also used in this area to assist clinicians and to contribute to human health.

Machine learning is a subfield of artificial intelligence and has been shown to perform well in various fields. With the increase in computational power and the availability of huge amounts of data, it became possible to use deep learning models. Deep learning models are able to take in the complex structure of images and learn patterns from it. The process of building a deep learning model includes collecting the data and pre-processing it. The image data is then segmented and features are extracted. These features are fed into the model and probabilities are calculated. The class label with the highest probability is predicted.

Data is the most important factor for machine learning algorithms. Experts use various strategies to collect the data. Two types of images are used in medical AI: dermoscopic images and macroscopic images. For this study, the dataset provided by the International Skin Imaging Collaboration (ISIC) is used. ISIC has provided various versions of the dataset, and the ISIC-2018 dataset is used for building the model. The 2018 archive contains seven different classes of skin lesion, so this is a multiclass classification problem. The images provided by ISIC are dermoscopic images of the lesion. Convolutional Neural Networks (CNNs) are neural networks that are primarily used for computer vision tasks, because they are able to capture the complex structure of images.

Dermoscopy is the state-of-the-art procedure for skin cancer screening, with a diagnostic accuracy that is higher than that of the naked eye [2]. In that paper, the researchers offered a method for improving the accuracy of automated skin lesion identification by combining different imaging modalities with patient metadata. Only those cases were kept that had patient metadata, a macroscopic image, a dermatoscopic image, and histological diagnosis details. Moreover, only cases whose input images were of adequate quality and free of identifying traits (i.e., eyes, facial landmarks, jewellery, or garments) were selected by repeated manual screening of all images. ResNet-50 was used to extract the features of the images. Three kinds of experiments were conducted.
1.1. Full Multimodality Classification
When all three modes (macroscopic images of lesions, dermatoscopic images, and patient metadata) were provided, the researchers built a network with two image feature extractors, one for dermatoscopic input images and the other for macroscopic input images.

2022 ® Chandra and Hajiarbabi. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)
1.2. Partial multimodality classification
When only one image modality (macroscopic images or dermatoscopic images) and the patient metadata were supplied for classification, the researchers excluded the unused parts of the complete network. They generated a single image feature vector and combined it with the metadata feature vector before passing it through the embedding network.
1.3. Single image classification
When there was only one image type for classification and no metadata, the image was sent through the image feature extraction network, and the extracted features were then passed through the rest of the network. In the testing phase, it turned out that patient metadata such as age, sex, and location did not appreciably enhance precision for pigmented skin lesions. As a result, it was concluded that available models rely substantially on tight image criteria and may be unstable in clinical practice. Furthermore, selected datasets may contain unintended biases for specific input patterns.

Using image representations produced by Google’s Inception-v3 model, the automated approach proposed in [3] aims to detect the kind and cause of cancer directly. The researchers used a feed-forward neural network with two layers and a softmax activation function in the output layer to perform two-phase classification based on the representation vector. Two separate neural networks with the same representation vector were used to perform the two-phase classification. In phase one, the researchers determined the type of cancer, whether it was malignant or benign, and in phase two, they determined whether the cancer was caused by melanocytic or nonmelanocytic cells. The training dataset included 2000 JPEG dermoscopic images of skin lesions, as well as ground truth values. The validation set had 150 images, whereas the testing set contained 600. The method identifies the images automatically using Google’s Inception model and the image representation produced from the dermoscopic images.

The paper in [4] had two major contributions. First, the researchers offered a classification model that used a deep convolutional neural network and data augmentation to evaluate the classification of skin lesion images. Second, they showed how data augmentation could be used to overcome data scarcity, and looked at how varying numbers of augmented data samples affect the performance of different models. The researchers used three methods of data augmentation in melanoma classification.
1.4. Geometric augmentation
The semantic interpretation of the skin lesion is preserved regardless of the position and scale of the lesion within the image; therefore, its ultimate classification is unaffected. As a result, input images were randomly cropped, and horizontal and vertical flips were used to produce new samples under the same label as the original.
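The geometric augmentation described above can be sketched in a few lines. This is a minimal illustration that represents an image as a nested list of pixel values and applies each flip with probability 0.5; the crop size, the flip probability, and all function names are assumptions for illustration, not details taken from the cited study.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in reversed(image)]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def augment(image, label, crop_h, crop_w, rng):
    """Produce one new sample that keeps the label of the original."""
    out = random_crop(image, crop_h, crop_w, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    if rng.random() < 0.5:
        out = vertical_flip(out)
    return out, label
```

Calling `augment` repeatedly with a seeded `random.Random` instance yields a reproducible set of new samples, each carrying the same class label as its source image.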
1.5. Color augmentation
The images of skin lesions were gathered from various sources and made using various devices. As a result, while using photographs for training and testing any system, it is critical to scale the colors of the images to increase the classification system’s performance.
1.6. Data warping based on the knowledge of specialist
Clinicians diagnose melanoma by looking at the patterns that surround the lesion. So, affine transformations, including distorting, shearing, and scaling the data, can be helpful in classifying the images. As a result, warping is an excellent way to supplement data in order to improve performance and reduce overfitting in melanoma classification. In [5], three classifiers, namely SVM, random forests, and neural networks, were used to classify the image dataset. The results showed that different augmentations performed differently in this case. The neural networks performed best for the classification task.

In image recognition nowadays, two basic types of feature sets are routinely used [5]. The traditional kind is based on what are known as “hand-crafted features”, which are designed by researchers with the goal of capturing visual aspects of an image, such as texture or color. A newer kind of feature set, motivated by how the brain decodes images, is derived from powerful Convolutional Neural Networks. These new features beat “hand-crafted” features when combined with deep learning, and as a result, they are increasingly popular in computer vision. In that study, the researchers proposed to use a mix of both sorts of features to classify skin lesions. They extracted “RSurf features” for image description. The concept of this feature set is to divide the input image into “parallel sequences of intensity values from the upper-left corner to the bottom-right corner”. This extraction technique is based on the texture unit model, in which an input image’s texture spectrum is defined. A support vector machine with a Gaussian kernel and standardized predictors was used as the first classifier. It estimated the class for a given input image using RSurf features and LBP features with R = 1, 3, 5. CNN features were used in the second SVM classifier, which also had a Gaussian kernel and standardized predictors.
The researchers used AlexNet to extract the CNN features. For each tested image, they chose the label with the greatest absolute score value. The final classifier thus incorporated both approaches: hand-crafted features as well as features obtained by deep learning.
It is critical to distinguish malignant skin lesions from benign lesions such as seborrheic keratosis or benign nevi, and accurate computerized classification of skin lesion images can support diagnosis [6]. The researchers in [6] offer a fully automated method for classifying skin lesions from dermoscopic images, based on an ensemble scheme for convolutional neural networks (CNNs). For tasks like object detection and natural image classification, deep neural networks, particularly CNNs, have outperformed alternative methods. Well-established CNN architectures were used to attain high accuracy, and transfer learning has been applied to other tasks in the medical field as well. The pipeline consists of data pre-processing, fine-tuning of the neural networks, extraction of features that are fed into an SVM model, and assembling of the model outputs. To facilitate better generalization on additional datasets, the researchers kept the data pre-processing in the suggested pipeline to a minimum. Only one pre-processing step was specific to skin lesion classification, while the rest were typical steps that prepare the images before they are fed to the model: normalization, resizing and color standardization. VGG16, which has 16 weight layers (13 convolutional layers and 3 fully connected layers), was employed. In addition to VGG16, ResNet-18 and ResNet-101, which have different depths, were used for extracting the features.
To solve the three-class classification problem (malignant melanoma / seborrheic keratosis / benign nevi), the final fully connected layers and the output layer of all pre-trained networks were removed and replaced by two new fully connected layers of 64 nodes and 3 nodes. The weights of the new fully connected layers were initialised at random from a normal distribution with a mean of zero and a standard deviation of 0.01. The researchers froze the weights of the earliest layers of the deep models, which addressed the issue of overfitting and also decreased the training time. They froze the early layers up to the 4th layer for AlexNet and up to the 10th layer for VGG16, and up to the 4th and 30th residual blocks for ResNet-18 and ResNet-101, respectively. To avoid overfitting on the small training dataset, the researchers used data augmentation to increase the training size artificially. As key augmentation approaches, they used rotations of 90, 180 and 270 degrees as well as horizontal flipping. A ternary SVM classifier was trained using the collected deep features and the related labels defining
the lesion types. The researchers examined linear and radial basis function (RBF) kernels and found that the RBF kernel performed marginally better. In the final models, they used one-vs-all multiclass SVM classifiers with RBF kernels. The main contribution of the method is a hybrid deep neural network approach for skin lesion classification that extracts deep features from images using multiple DNNs and assembles the features in a support vector machine classifier, producing very accurate results without exhaustive pre-processing or lesion area segmentation. The results demonstrated that combining information in this way improves discrimination and that the information from the individual networks is complementary.
The “attention residual learning convolutional neural network (ARL-CNN)” model for skin lesion classification is proposed in [7]. The researchers combined a residual learning framework, for training a deep convolutional neural network (DCNN) with a small number of images, with an attention learning mechanism that improves the DCNN's representation capacity by allowing it to focus more on semantically important regions of dermoscopy images (i.e. lesions). The attention learning mechanism makes full use of the innate self-attention capacity of classification-trained DCNNs, and it can work under any DCNN framework without additional “attention” layers, which is important for learning problems with small datasets such as the image classification problem at hand. Each so-called ARL block may include both residual learning and attention learning. By stacking multiple ARL blocks and training the model end-to-end, an ARL-CNN model of any depth can be created.
The researchers tested the ARL-CNN model on the ISIC-skin 2017 dataset, where it outperformed the competing methods. The work contributed in several respects. The researchers proposed a novel ARL-CNN model for accurate skin lesion classification that incorporates both residual learning and attention learning. They created an effective attention mechanism that takes full advantage of the inherent “self-attention” ability of DCNNs: instead of learning the attention mask with extra layers, they used the feature maps produced by an upper layer as the attention mask of a lower layer. They also achieved “state-of-the-art” lesion classification accuracy on the ISIC-skin 2017 dataset using only one model with 50 layers, which matters for computer-aided diagnosis (CAD) of skin cancer.
The researchers in [1] addressed two tasks. The first task entailed classifying skin lesions using dermoscopic images only. Dermoscopic images and patient metadata were used for the second task [1]. For the first task, the researchers used a variety of CNNs to classify dermoscopic images. The deep learning models for task 2 are divided into two sections:
a convolutional neural network for dermoscopy images and a “dense neural network” for processing the patients' metadata. First, the researchers trained only the convolutional neural network on image data (task 1). The weights of the CNN are then frozen and the metadata neural network is attached. In the second step, only the weights of the metadata neural network and the classification layer are trained. The researchers rely heavily on EfficientNets (EN), which were pre-trained on the very large ImageNet dataset. The EfficientNet family consists of eight architecturally similar models that follow particular scaling rules as the input image size grows. Version B0, the smallest of all, uses an input size of 224 × 224. In the bigger versions, up to B7, the input size is increased while the network width and depth are scaled up. The researchers used EfficientNet versions B0 to B6. They also trained SENet154 and two ResNet variants.
In developing the model, three optimizers were used to compare the results:
1. Stochastic gradient descent
2. RMSprop
3. Adam
1.6.1. Stochastic gradient descent
Stochastic gradient descent is an iterative method for optimizing a differentiable loss function. The goal of machine learning is to optimize the loss (objective) function, which can be written as
$$Q(w) = \frac{1}{n} \sum_{i=1}^{n} Q_i(w)$$
Here w is the parameter to be estimated, namely the one that minimizes Q. Because the method is iterative, it repeats the following update to minimize the objective function:
$$w := w - \eta \nabla Q(w) = w - \frac{\eta}{n} \sum_{i=1}^{n} \nabla Q_i(w)$$
where η is the learning rate.
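As an illustration of the update w := w − η∇Q(w), the following minimal sketch fits a single weight to toy data. The squared-error loss Q_i(w) = (w·x_i − y_i)², the data, the learning rate and all function names are assumptions chosen for this example and are not taken from the experiments. The first function applies the averaged gradient exactly as in the formula, while the second uses the gradient of one randomly chosen Q_i, which is the stochastic variant.

```python
import random


def full_gradient_step(w, xs, ys, eta):
    """One update w := w - (eta / n) * sum_i grad Q_i(w) for Q_i = (w*x_i - y_i)^2."""
    n = len(xs)
    grad_sum = sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys))
    return w - (eta / n) * grad_sum


def stochastic_step(w, x, y, eta):
    """One stochastic update that uses the gradient of a single Q_i."""
    return w - eta * 2.0 * x * (w * x - y)


def fit_sgd(xs, ys, eta=0.1, steps=200, seed=0):
    """Run stochastic updates, sampling one training example per step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))
        w = stochastic_step(w, xs[i], ys[i], eta)
    return w


# Hypothetical noise-free data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_batch = 0.0
for _ in range(50):
    w_batch = full_gradient_step(w_batch, xs, ys, eta=0.1)

w_sgd = fit_sgd(xs, ys)
```

Both variants should approach w = 2 on this toy problem; with noisy data the stochastic variant would fluctuate around the minimum instead of settling exactly.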
1.6.2. RMSProp
Root mean square propagation (RMSProp) is also an optimization algorithm, in which the learning rate is adapted for each parameter. The running average is calculated as follows:
$$v(w, t) := \gamma \, v(w, t - 1) + (1 - \gamma) \left( \nabla Q_i(w) \right)^2$$
where γ is the forgetting factor. The parameters are then updated as follows:
$$w := w - \frac{\eta}{\sqrt{v(w, t)}} \nabla Q_i(w)$$
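The RMSProp rule (a running average v of squared gradients, followed by a step scaled by 1/√v) can be sketched as follows. The quadratic objective, the hyperparameter values and the small constant eps that guards against division by zero are assumptions made for this illustration only.

```python
import math


def rmsprop_minimize(grad, w0, eta=0.01, gamma=0.9, eps=1e-8, steps=2000):
    """Minimize a 1-D function given its gradient, using the RMSProp update."""
    w, v = w0, 0.0
    for _ in range(steps):
        g = grad(w)
        # Running average of the squared gradient: v := gamma*v + (1-gamma)*g^2.
        v = gamma * v + (1.0 - gamma) * g * g
        # Parameter update: w := w - eta / sqrt(v) * g (eps avoids division by zero).
        w = w - eta / (math.sqrt(v) + eps) * g
    return w


# Hypothetical objective Q(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_final = rmsprop_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Because the step is normalised by the gradient magnitude, the iterate typically ends up within roughly one learning-rate step of the minimum rather than exactly on it.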
1.6.3. Adam
Adam is an optimization algorithm that is used in place of the standard stochastic gradient descent procedure to iteratively update the weights of a neural network from training data. Diederik Kingma of OpenAI and Jimmy Ba of the University of Toronto presented Adam in their 2015 ICLR paper (poster) titled “Adam: A Method for Stochastic Optimization”. As the authors explain, Adam integrates the benefits of two stochastic gradient descent enhancements: the Adaptive Gradient Algorithm (AdaGrad), which maintains a per-parameter learning rate and hence improves performance on problems with sparse gradients (e.g. computer vision and natural language processing problems), and RMSProp, described in the previous subsection.
To experiment with the skin lesion classification model, Python 3.6 was used as the programming language, with TensorFlow and Keras as frameworks.
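For comparison with the two optimizers above, the Adam update can be sketched in the same style. This is a minimal illustration of the update rule as published by Kingma and Ba, with their suggested defaults for beta1, beta2 and eps; the objective, the learning rate and the function name are assumptions for this example, and the sketch is not the Keras implementation used in the experiments.

```python
import math


def adam_minimize(grad, w0, eta=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a 1-D function given its gradient, using the Adam update."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment (mean) estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        w = w - eta * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Hypothetical objective Q(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_adam = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```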
2. Methods
2.1. Method 1
The model was trained from scratch: the network was initialised with random weights and then trained for a number of epochs. The algorithm learns features from the input and updates the weights by backpropagation during every epoch. If the dataset is not very large, this strategy is unlikely to yield the most accurate results. However, it can still be used as a comparison point for the two other methods.
2.2. Method 2
For the second experiment, a ConvNet was used as a feature extractor. Because most dermatological datasets contain only a small number of skin lesion photos, this method used the weights of the available pre-trained VGG16 model, which was trained on a bigger dataset (i.e. ImageNet); this practice is called “transfer learning”. The core idea underpinning transfer learning is that the pre-trained model has already learnt features that could be relevant for classifying skin lesion images.
2.3. Method 3
Another frequent transfer learning strategy entails not only initialising the model with pre-trained weights, but also fine-tuning it by training only the upper layers of the convolutional network with backpropagation. The researchers recommended freezing the lower layers of the network, since they capture more generic properties of the data, and were mainly interested in training the top layers because of their ability to extract more specific features. In this method, the parameters learnt on the ImageNet dataset were used to initialise the first four layers of the convolutional neural network in the final framework. The weights saved from the matching convolutional layers in Method 1 were used to initialise the fifth and final convolutional block. The evaluation metrics showed that the third method performed better than Method 1 and Method 2.
3. Results
The data was divided into training, validation and test splits.
Split             Images
Train set         9714
Validation set    100
Test set          201
The training set was augmented with images generated by applying transformations to the original dataset. The images were flipped horizontally, the rotation range was 90 degrees and the zoom range was 0.2. The images were also rescaled before being fed into the model.
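The augmentation settings listed above can be illustrated with a small pure-Python sketch. It assumes the usual Keras-style reading of the parameters (a rotation range of 90 means an angle drawn from [-90, 90] degrees, a zoom range of 0.2 means a factor drawn from [0.8, 1.2]) and a rescaling factor of 1/255 for 8-bit images; the rescaling factor, the toy image and the function names are hypothetical and are not stated in the text.

```python
import random


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def rescale(image, factor=1.0 / 255.0):
    """Multiply every pixel by a constant factor (assumed 1/255 here)."""
    return [[pixel * factor for pixel in row] for row in image]


def sample_augmentation(rng, rotation_range=90.0, zoom_range=0.2):
    """Draw random augmentation parameters for one training image."""
    return {
        "flip": rng.random() < 0.5,
        "angle": rng.uniform(-rotation_range, rotation_range),
        "zoom": rng.uniform(1.0 - zoom_range, 1.0 + zoom_range),
    }


# Hypothetical 2x2 grayscale image with 8-bit intensities.
image = [[0, 255], [51, 102]]
flipped = horizontal_flip(image)
scaled = rescale(image)
params = sample_augmentation(random.Random(0))
```

Only the flip and the rescaling are actually applied here; rotating and zooming an image require interpolation and are normally left to the image library.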
3.1. Evaluation Metrics
The following evaluation metrics were used to evaluate the models. The Receiver Operating Characteristic (ROC) curve is used to evaluate machine learning classification models. It is a probability curve that plots the true positive rate against the false positive rate at many threshold values, and it essentially separates the ‘signal’ from the ‘noise’. The true positive rate and the false positive rate are defined as follows:
$$\text{True positive rate} = \frac{\text{true positive}}{\text{true positive} + \text{false negative}}$$
$$\text{False positive rate} = \frac{\text{false positive}}{\text{false positive} + \text{true negative}}$$
The Area Under the Curve (AUC) measures the performance of the classifier by evaluating its ability to differentiate between classes. It is used as a summary of the ROC curve. A higher AUC value means that the classification model is better at separating the negative and positive classes. Accuracy is another evaluation metric for classification models. It represents the fraction of predictions that the model gets right:
$$\text{Accuracy} = \frac{\text{total number of correct predictions}}{\text{total predictions}}$$
Precision indicates the fraction of positive predictions that were actually correct:
$$\text{Precision} = \frac{\text{true positive}}{\text{true positive} + \text{false positive}}$$
Recall indicates the fraction of actual positives that were predicted correctly:
$$\text{Recall} = \frac{\text{true positive}}{\text{true positive} + \text{false negative}}$$
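The metrics of this section can be computed directly from the four confusion-matrix counts, as in the sketch below. The counts are hypothetical and the function name is an assumption; for a multiclass problem such as this one, the counts would be taken per class in a one-vs-rest fashion. The F1 score, the harmonic mean of precision and recall, is included for completeness.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the evaluation metrics of this section from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # identical to the true positive rate
    return {
        "true_positive_rate": recall,
        "false_positive_rate": fp / (fp + tn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for one class treated as the positive class.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```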
The F1 score shows the balance between recall and precision. The formula of the F1 score is as follows:
$$\text{F1 Score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
3.2. L2 Regularization
L2 regularization is applied to models to combat overfitting. Overfitting describes a situation where the training loss decreases while the validation loss increases. In other words, the model fits the training data well but does not predict accurately on validation data, so it is not able to generalize. This is serious because a model that does not generalize will not produce accurate results when it is deployed in a real-world scenario. Different techniques can be used to control overfitting. Regularization is used to control the complexity of the model: when regularization is added, the model minimizes not only the loss but also its own complexity. So, the goal of the machine learning model after adding regularization is minimize(Loss(Data|Model) + complexity(Model)).
The complexity of the models used in this paper was controlled with L2 regularization. The L2 regularization term is the sum of the squares of all the weights:
$$\text{L2 regularization term} = \lVert w \rVert_2^2 = w_1^2 + w_2^2 + \dots + w_n^2$$
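A minimal sketch of how the L2 term enters the training objective is given below. The regularization strength lam, the weight values and the data loss are hypothetical numbers chosen for illustration; the strength used in the experiments is not stated in the text.

```python
def l2_term(weights):
    """Sum of squared weights: w_1^2 + w_2^2 + ... + w_n^2."""
    return sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam):
    """Objective with L2 regularization: Loss(Data|Model) + lam * ||w||_2^2."""
    return data_loss + lam * l2_term(weights)


# Hypothetical weights and data loss.
weights = [1.0, -2.0, 3.0]
penalty = l2_term(weights)                            # 1 + 4 + 9
total = regularized_loss(0.5, weights, lam=0.01)      # 0.5 + 0.01 * 14
```

Larger weights increase the penalty, so minimizing the combined objective pushes the model towards smaller weights and a simpler fit.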
In the models, two layers with L2 regularization were used before the final softmax layer. A total of 12 experiments were conducted using different optimizers. The three optimizers (Adam, RMSprop and stochastic gradient descent) were used with DenseNet and Inception V3. Moreover, experiments were conducted with and without augmentations to see whether augmentations are useful in this case. The details of the experiments are given below.
3.2.1. With Augmentation
Different augmentations were applied to the dataset to increase the amount of image data and avoid overfitting. If the model is trained on too little data, it will learn the pattern of the training data but will not generalize to unseen data; in other words, the training accuracy is higher than the testing accuracy. The augmentations applied to the dataset were rotation range, horizontal flip and zoom range. Six experiments were performed with augmentations:
1. DenseNet [RMSprop]
2. DenseNet [Adam]
3. DenseNet [SGD]
4. Inception V3 [RMSprop]
5. Inception V3 [Adam]
6. Inception V3 [SGD]
3.2.2. Without Augmentation
These experiments were also conducted without augmentations to see if the model can generalize well without them:
1. DenseNet [RMSprop]
2. DenseNet [Adam]
3. DenseNet [SGD]
4. Inception V3 [RMSprop]
5. Inception V3 [Adam]
6. Inception V3 [SGD]
4. Discussion
Early detection of skin lesions can save many lives, and artificial intelligence is helping medical science serve this purpose. Convolutional neural networks are useful in medical imaging. Two state-of-the-art convolutional neural network architectures were tested in this paper and both showed good results overall. DenseNet performed better than Inception V3 in classifying the images into the different classes. To evaluate model performance, AUC-ROC curves, precision, recall, F1 score and accuracy were employed. Multiple metrics were chosen because the data was highly imbalanced, so accuracy alone could be a misleading metric. The data imbalance issue was addressed by using focal loss. The per-class ROC curves of the DenseNet model are better than those of the Inception V3 model, and the overall accuracy, precision, recall and F1 score figures are also better for DenseNet.
The models were run for up to 60 epochs with an early stopping criterion, which was applied to ensure that the model does not overfit. If the model is trained for too many epochs, it may overlearn the pattern; if it is run for too few epochs, it can underfit, i.e. it will not learn the pattern completely. Since the number of epochs is a hyperparameter, it has to be tuned. Normally, the model is run with a large number of epochs and stopped when it stops learning. Keras provides an early stopping callback, which was used in the experiments. The result tables also give the termination epoch, which shows at which epoch each optimizer converged and therefore which optimizer converged relatively fast. For the DenseNet model, Adam converged at the 39th epoch with an accuracy of 79%, whereas stochastic gradient descent converged at the 35th epoch with an accuracy of 81%.
This means that stochastic gradient descent performed better from both perspectives: it gave higher accuracy in fewer epochs. In the experiments where augmentations were not applied, the accuracies were comparatively better than in the experiments with augmentations. However, the experiments without augmentations suffered from overfitting, because the dataset was small and the model learnt the training data but did not generalize well to the test data. The purpose of applying augmentations in deep learning is
to increase the amount of data, because deep learning models require large amounts of data to learn. The training accuracies of the experiments without augmentations were more than 90%, even though L2 regularization was also applied to counter overfitting.
In the case of Inception V3, interesting figures were produced. The Adam optimizer achieved 75% test accuracy in 22 epochs, while stochastic gradient descent produced the same accuracy in 60 epochs. Moreover, the RMSprop optimizer produced 76% accuracy in 30 epochs. So for the given problem, stochastic gradient descent with Inception V3 is not a suitable choice. The experiments without augmentations showed that RMSprop is a better choice: it gave 81% accuracy in 38 epochs, while Adam and SGD ran for the same number of epochs as each other and gave accuracies of 80% and 79%, respectively. Another interesting observation concerns the per-class AUC-ROC of the Dermatofibroma class. It was around 60% in the experiments without augmentations and around 70% in the experiments with augmentations. This pattern did not occur in the DenseNet experiments, where all the AUC-ROC scores are around 90%. It shows that the Inception V3 architecture did not learn the pattern of the Dermatofibroma class very well.
The loss function used for the experiments was focal loss, which performed well. It was used to overcome the class imbalance issue. In deep learning, it is important to have a balanced distribution of the classes. If one class has more data entries than the others, the model will learn the class with more examples better, and once deployed it tends to predict that every image belongs to that class. The data was highly imbalanced. There are multiple ways to solve this issue. One method is to use a weighted loss. Recently, another loss function called focal loss was introduced, which focuses more on the classes with few examples than on the classes with many examples. It showed good performance overall.
In the given problem, the Vascular class had very few examples in the training dataset. Focal loss focused on this class, and on the test dataset almost all experiments classified the Vascular class accurately. The accuracies are better for DenseNet than for Inception V3. Moreover, the Grad-CAM activation maps show that the two models looked at different regions to classify the same image: the focus region of Inception V3 differs from that of DenseNet. The Inception V3 model misclassified the Vascular class, as shown in the figure. The activation maps cannot tell us why a model focuses on a certain region; this remains a black box. However, these visualizations can help medical staff understand why the model predicts that a certain image belongs to a certain class. Explainability of machine learning models is important, especially in the sensitive area of medical science. It helps medical staff understand the model's predictions without knowing much about artificial intelligence, machine learning and convolutional neural networks.
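The behaviour of the focal loss discussed in this section can be illustrated with its standard per-example form, FL(p_t) = −α (1 − p_t)^γ log(p_t), where p_t is the probability the model assigns to the true class. The sketch below is a plain-Python illustration; the values γ = 2 and α = 1 are assumptions, since the settings used in the experiments are not stated in the text.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)^gamma * log(p_t)."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


easy = focal_loss(0.9)               # confident, correct prediction
hard = focal_loss(0.1)               # poorly classified example
plain_ce = focal_loss(0.5, gamma=0)  # gamma = 0 reduces to cross-entropy
```

The factor (1 − p_t)^γ shrinks the loss of well-classified examples, which usually come from the majority classes, so the hard examples of rare classes contribute relatively more to the gradient.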
5. Future Work
In the future, the focus will be on improving model accuracy by experimenting with other models such as AlexNet and VGG-16. The accuracies of the models will be compared and the most accurate model will be chosen. In addition, skin lesions follow a certain hierarchy (Fig. 1) that can be incorporated in future research. In this paper, the seven classes from the third level are used. A total of eight classes belong to the third level, but the skin lesion 2018 dataset provides seven of them. In the future, the focus will be on the complete hierarchy: in the first stage the first level will be classified, in the second stage the second level, and in the third stage all seven classes will be classified by the model.
6. Figures and Tables
The per-class AUC-ROC is highly accurate. The results of the other experiments are as follows.
Tab. 1. DenseNet Comparison Table
Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam        0.79       0.82        0.79     0.79       39
SGD         0.81       0.82        0.81     0.81       34
RMSProp     0.80       0.80        0.80     0.79       35
Tab. 2. Per class AUC-ROC [DenseNet, RMSProp, focal Loss, with Augmentations]
Class            AUC-ROC
Actinic          0.957
Dermatofibroma   0.985
Carcinoma        0.98
Melanoma         0.921
Nevus            0.962
Seborrheic       0.958
Vascular         1.0
Tab. 3. Per class AUC-ROC [DenseNet, SGD, focal Loss, with Augmentations]
Class            AUC-ROC
Actinic          0.918
Dermatofibroma   0.93
Carcinoma        0.981
Melanoma         0.879
Nevus            0.965
Seborrheic       0.973
Vascular         1.0
Fig. 1. Skin Lesion Hierarchy
DenseNet model:
Tab. 4. Focal Loss – Without Augmentation, [DenseNet]
Class            AUC-ROC
Actinic          0.971
Dermatofibroma   0.915
Carcinoma        0.977
Melanoma         0.864
Nevus            0.959
Seborrheic       0.945
Vascular         1.0
Fig. 2. ROC Curve for RMSProp
Tab. 5. DenseNet without Augmentation
Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam
0.81
0.80
SGD
0.81
RMSprop
0.82
0.79
0.82
0.80
0.81
0.82 0.81
0.82
0.80
38
38
29
Tab. 6. Per class AUC-ROC [DenseNet, Adam, focal Loss, without Augmentations]
Class            AUC-ROC
Actinic          0.965
Carcinoma        0.979
Dermatofibroma   0.869
Melanoma         0.924
Nevus            0.94
Seborrheic       0.957
Vascular         1.0
Tab. 7. Per class AUC-ROC [DenseNet, RMSProp, focal Loss, without Augmentations]
Class            AUC-ROC
Actinic          0.946
Melanoma         0.905
Carcinoma        0.986
Dermatofibroma   0.982
Nevus            0.96
Seborrheic       0.956
Vascular         1.0
Tab. 8. Per class AUC-ROC [DenseNet, SGD, focal Loss, without Augmentations]
Class            AUC-ROC
Actinic          0.944
Melanoma         0.928
Carcinoma        0.975
Dermatofibroma   0.975
Nevus            0.958
Seborrheic       0.964
Vascular         1.0
Tab. 9. Inception V3 comparison table
Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam        0.75       0.78        0.75     0.75       22
SGD         0.75       0.74        0.75     0.74       60
RMSprop     0.76       0.71        0.76     0.73       30
Tab. 10. Per class AUC-ROC [Inception, Adam, focal Loss, with Augmentations]
Class            AUC-ROC
Actinic          0.887
Dermatofibroma   0.859
Carcinoma        0.959
Melanoma         0.791
Nevus            0.92
Seborrheic       0.911
Vascular         0.99
Tab. 11. Per class AUC-ROC [Inception, RMSprop, focal Loss, with Augmentations]
Class            AUC-ROC
Actinic          0.912
Dermatofibroma   0.719
Carcinoma        0.953
Melanoma         0.751
Nevus            0.935
Seborrheic       0.914
Vascular         0.985
Tab. 12. Per class AUC-ROC [Inception, SGD, focal Loss, with Augmentations]
Class            AUC-ROC
Actinic          0.929
Dermatofibroma   0.786
Carcinoma        0.953
Melanoma         0.826
Nevus            0.94
Seborrheic       0.905
Vascular         0.998
Fig. 3. Grad-CAM of DenseNet model
Tab. 13. Focal Loss – Without Augmentations, [Inception v3]
Optimizer   Accuracy   Precision   Recall   F1-Score   Termination epoch #
Adam        0.80       0.80        0.80     0.80       43
SGD         0.79       0.79        0.79     0.79       43
RMSprop     0.81       0.81        0.81     0.80       38
Fig. 4. Focal Loss – With Augmentations, [Inception v3]
Tab. 14. Per class AUC-ROC [Inception, Adam, focal Loss, without Augmentations]
Class            AUC-ROC
Actinic          0.921
Dermatofibroma   0.613
Carcinoma        0.937
Melanoma         0.868
Nevus            0.947
Seborrheic       0.928
Vascular         0.998
Tab. 15. Per class AUC-ROC [Inception, RMSProp, focal Loss, without Augmentations]
Class            AUC-ROC
Actinic          0.903
Dermatofibroma   0.673
Carcinoma, Melanoma, Nevus, Seborrheic, Vascular: 0.933, 0.997, 0.946, 0.906, 0.864 (the row order of these five values is not recoverable from the source layout)
Tab. 16. Per class AUC-ROC [Inception, SGD, focal Loss, without Augmentations]
Class            AUC-ROC
Actinic          0.909
Dermatofibroma   0.671
Carcinoma        0.946
Melanoma         0.863
Nevus            0.954
Seborrheic       0.932
Vascular         0.997
Fig. 5. Grad-CAM of Inception V3
AUTHORS
Rajit Chandra – Computer Science Department, Purdue Fort Wayne, Fort Wayne, 46805, USA, E-mail: chanr02@pfw.edu.
Mohammadreza Hajiarbabi* – Computer Science Department, Purdue Fort Wayne, Fort Wayne, 46805, USA, E-mail: hajiarbm@pfw.edu.
*Corresponding author
References
[1] N. Gessert, M. Nielsen, M. Shaikh, R. Werner, and A. Schlaefer, “Skin lesion classification using ensembles of multi-resolution EfficientNets with meta data,” MethodsX, vol. 7, p. 100864, 2020, DOI: 10.1016/j.mex.2020.100864.
[2] J. Yap, W. Yolland, and P. Tschandl, “Multimodal skin lesion classification using deep learning,” Exp. Dermatol., vol. 27, no. 11, pp. 1261–1267, 2018, DOI: 10.1111/exd.13777.
[3] P. Mirunalini, A. Chandrabose, V. Gokul, and S. M. Jaisakthi, “Deep Learning for Skin Lesion Classification,” 2017, [Online]. Available: http://arxiv.org/abs/1703.04364.
[4] T. C. Pham, C. M. Luong, M. Visani, and V. D. Hoang, “Deep CNN and Data Augmentation for Skin Lesion Classification,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 10752 LNAI, no. June, pp. 573–582, 2018, DOI: 10.1007/978-3-319-75420-8_54.
[5] T. Majtner, S. Yildirim-Yayilgan, and J. Y. Hardeberg, “Combining deep learning and hand-crafted features for skin lesion classification,” 2016 6th Int. Conf. Image Process. Theory, Tools Appl. IPTA 2016, no. December, 2017, DOI: 10.1109/IPTA.2016.7821017.
[6] A. Mahbod, G. Schaefer, I. Ellinger, R. Ecker, A. Pitiot, and C. Wang, “Fusing fine-tuned deep features for skin lesion classification,” Comput. Med. Imaging Graph., vol. 71, pp. 19–29, 2019, DOI: 10.1016/j.compmedimag.2018.10.007.
[7] J. Zhang, Y. Xie, Y. Xia, and C. Shen, “Attention Residual Learning for Skin Lesion Classification,” IEEE Trans. Med. Imaging, vol. 38, no. 9, pp. 2092–2103, 2019, DOI: 10.1109/TMI.2019.2893944.
APPLICABILITY OF AUGMENTED AND VIRTUAL REALITY FOR EDUCATION IN ROBOTICS Submitted: 1st February 2022; accepted: 3rd May 2022
Norbert Prokopiuk, Piotr Falkowski DOI: 10.14313/JAMRIS/3‐2022/25 Abstract: The rapid development of automatic control and robo‐ tics requires an innovative approach to teaching. This is especially important in the case of studies at the acade‐ mic level and in vocational schools, where training with expensive robotic stations is necessary. A solution to this problem may be substituting these with augmented rea‐ lity (AR) and virtual reality (VR) due to their significantly lower costs. AR/VR technologies have an advantage in terms of low‐cost and effective practices in robotics. The described content is an outcome of the MILAN project, wherein those technologies are used for the purpose of online courses. This includes providing access to virtual laboratories via a mobile application and the real‐life training stations connected to the network. This paper provides an overview of the available AR/VR applications implemented in robotics education and a detailed des‐ cription of related best practices. The main sections con‐ tain a detailed methodology for designing an AR mobile tool for learning robotics in the AR environment. Moreo‐ ver, the main challenges encountered during the develop‐ ment phase were listed and analysed. Additionally, this paper presents the possible future use of the application with the associated benefits. The literature overview and conclusions may be used to design similar online courses with an interactive form of teaching practical industrial robots programming. Keywords: Augmented Reality, Education, Industry 4.0, Robotics, Virtual Reality
with enough robots and industrial devices required for effective practice. The most promising trend rela‑ ted to this is implementing augmented reality techno‑ logy using smartphones with the Android system as the most available and easiest scalable.
2. Overview of the existing solutions AR / VR technologies are widely applied for edu‑ cational purposes due to their innovative capabilities. However, most developed applications demand highly specialised equipment such as VR / AR goggles or ad‑ ditional controllers. This section contains an overview of already used solutions. As to create an up‑to‑date application, the selected best practices were analysed. The Scopus and Google Scholar databases were invol‑ ved in searching for available papers. When using the Scopus database, the following keywords were impu‑ ted: AR, VR, Robotic, Education. During the search, the publication time of papers was limited to 2018 and above, while the ield of science was limited to engi‑ neering. Such a search resulted in ten papers, of which only three were selected as considerably related to the topic of research and had a detailed description of cre‑ ated applications. The Google Scholar database was searched based on the following queries: AR in robo‑ tic education, VR in robotic education, AR in education, VR in education, AR VR industrial robotic training. In this case, the publication time of the paper was also limited to 2018 and above. After selecting the papers corresponding to the topic, ten were chosen. 2.1. VR in robot control
1. Introduction The rapid development of automatic control and robotics, as well as the spread of the Industry 4.0 pa‑ radigms , made current teaching methodologies inef‑ fective [1]. Meanwhile, the current labour market si‑ tuation forces the creation of new approaches to tea‑ ching. The effective methods should be widely availa‑ ble. Moreover, they should be possible to be used even without specialised equipment. Creating them accor‑ ding to the proposed methodology would support on‑ line learning and, thus, stop the deterioration of the education quality caused by the COVID‑19 pandemic [2]. AR applications in particular may help to keep education at the highest level. The described metho‑ dology is based on substituting real‑life industrial se‑ tups with their interactive models in AR/VR. This will lead to raising the quality of teaching at universities in small towns, which do not have suf icient resour‑ ces to provide a proper learning setup, e.g., a setup
The application described by Ibáñez et al. [3] is an example of a virtual laboratory created in MATLAB. It allowed the students to operate several different robotic arms without purchasing machines and without the risk of accidental damage to equipment. According to the authors, such possibilities and the immersion of VR technology allow a profound comprehension of robotic issues.
Another example of using VR and AR technologies to teach robotics is the application described by Rukangu et al. [4]. It allows students to connect remotely to the UR-10 robot. Thanks to this, they may complete laboratory tasks during the COVID-19 pandemic, thus maintaining the appropriate quality of remote education. VR technology was also used in the application described by Perez et al. [5], which allows users to operate a digital copy of a real-life robotic setup. The created environment may be used for training purposes
2022 ® Prokopiuk and Falkowski This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)
or for simulating human-robot collaboration with no hazards for operators. According to the survey conducted by the authors, the created VR environment is realistic and encourages users to work with robots.
2.2. AR in robot control
AR technology can also be used for educational purposes. An example is the educational platform introduced in the paper written by Martin Hernandez-Ordoñez et al. [6]. It consists of two main components: a robot with two degrees of freedom and an AR application that enables operating the robot and visualising relevant data. The control algorithms are implemented in MATLAB, and the generated motion instructions are sent directly to the robot. Every segment of the device is equipped with AR markers, which enable scanning it with the camera and calculating its configuration. The method thus allows for comprehensive education in the field of manipulator control algorithms, which may be tested on the prepared setup. Moreover, it gives a more immersive insight into robot kinematics, as the joint angles may be intuitively monitored thanks to AR.
Another example of using AR technology for teaching robotics is described by Bogosian et al. [7]. The application displays selected industrial manipulators with a corresponding task in augmented reality. With these, lessons based on various case studies in the fields of architecture, engineering, and construction may be studied. This approach facilitates understanding of the purposes of industrial robots and their applications beyond industry.
AR technology was also used within the application described by Su et al. [8]. It displays digital twins of industrial robots in AR, so that the real-life robot may be controlled via the application with the motion of the HTC Vive controllers. The movements registered with the Lighthouse sensors are transferred to the application with trajectory planning algorithms.
Thanks to this, a user may move the characteristic point of a manipulator with their hands. Due to its mobility and offline capabilities, it is a suitable solution for training in robot control. An online module that allows for the mobilisation of a robot is highly desired for learning about Industry 4.0.
The last example of using AR technology is the application described by Vener [9]. This application is used to control a humanoid robot that performs the task of lifting a load. It displays a digital copy of the robot and essential parameters such as its centre of gravity. In addition, the system provides control of the robot using the buttons available on the user interface and a simulation of the expected robot movements. These functions let students understand the dynamics of a humanoid robot more easily.
2.3. VR in mobile robotics
According to Zhong, Zheng, and Zhan, the use of VR technology for training in the field of mobile robotics has a similar impact as for industrial robotics [10]. The authors describe involving a virtual environment to operate the IRobotQ3D robot. The study shows that
VR technology as an additional part of the educational course brings better results than conventional training with a physical robot only. The educational process carried out within this methodology contributes to reducing the stress level of students and enhancing their design capabilities.
2.4. AR in mobile robotics
Mobile robotics also takes advantage of AR technology. An example is the application described by Herrera et al. [11]. It is designed to simulate the work of a mobile robot in augmented reality. Thanks to this, the users may get acquainted with control algorithms and robot kinematics. Moreover, the application includes a feature facilitating a profound understanding of the robot’s mechanical structure. Consequently, the described solution allows for a comprehensive presentation of mobile robots regarding their control and mechanics.
Another relevant example is the application described by Mallik et al. [12]. It allows for controlling a real-life mobile robot with a differential drive. To start, a student has to scan a visual tag and then define the model of the robot by locating the centre point of the actual device. Then, the real robot can be controlled by pointing at the target point on the screen to build its motion paths. Students using this method found that it was intuitive and helped them understand the topic of robot kinematics. Similarly to the previously described applications, this one increases the efficacy of education in the field of robotics.
2.5. VR in safety
VR technology is also applicable to training in human-robot cooperation and its safety. One such application is described in a paper by Vladimir Kuts [13]. This work presents a system for simulating a real-life factory in virtual reality, including robots, CNC milling machines, and other industrial devices.
The person using the application has the opportunity to observe how robots perform their tasks with deeply immersive human-machine interaction. In addition, the application allows several users to connect to the system simultaneously. Thus, the trainer can run occupational health and safety (OHS) group training for a given robotic setup.
Another example of using VR for OHS training is the application described by Kaarlela [14]. The presented system displays a robotic setup in an immersive VR environment. The workstation model has visually marked safety areas updated during the robot’s operation. Additionally, a user can stop or rerun the task performed by the machinery to analyse the movement of the manipulator during the executed instructions. Thanks to these functions, such an approach may be used for complex safety training without a production line stoppage. Therefore, the cost of such training is significantly decreased, and the risk of not providing customers with the purchased products is mitigated.
2.6. AR in robot construction
AR technology may also be used to explain robot construction and the purposes of operating robotised setups. One such approach is presented by Michalos et al. [15]. As AR technology is only a layer superimposed on the real-life image, it can be used to familiarise students with invisible elements of the machinery. Thus, it allows them to analyse a part without the complication of disassembling devices. Also, thanks to this, students may learn where the selected element is located in the real-life setup and hence deeply understand the construction of modern industrial robots. Such an approach may be extended to maintenance and service to accelerate repairs and decrease the number of human errors [16].
2.7. Summary of current solutions
Although the previously described solutions may significantly improve the quality of education, they are not free from disadvantages. The main one is that some applications require additional equipment, such as VR goggles or physical robotic devices (e.g., the application described in subsection 2.2). This may limit a solution’s usability by increasing the cost of using it or forcing the students to find specialised institutions providing the necessary hardware. Therefore, the main goal of the project described in this paper is to eliminate such difficulties by creating an application involving only the devices already possessed by most of the target group.
3. Case description
This paper presents a solution that requires only an Android device for full functionality. The requirements to be met by a smartphone are listed in Table 1. The described application was created to allow students to learn robotics without access to hardware facilities. This is especially important due to the consequent significant cost reduction and the possibility of studying in the real world, anywhere, and with any scenery. An environment like this may simulate potential applications of industrial robots involving real-life objects. Hence, it will encourage users to seek and implement robots in new areas of life.
Tab. 1. Phone requirements
System version          Android 7.0 or higher
Required applications   Google Play Services for AR
Processor               ARMv7
Memory                  2 GB
Storage                 80 MB
Besides the requirements listed in the table, the phone must support the ARCore system. Information on whether a particular phone has ARCore support can be found on the page for developers [17]. In addition, Table 2 contains information on the tested devices. This application was created as a component complementary to online courses on robotics. Its aim is to
Tab. 2. Tested devices
Phone model                 Processor                    RAM
Honor 10                    HiSilicon Kirin 970          4 GB
Xiaomi Redmi Note 8 Pro     Mediatek Helio G90T          6 GB
Xiaomi Redmi Note 10 Pro    Qualcomm Snapdragon 732G     6 GB
LG G8s                      Qualcomm Snapdragon 855      6 GB
Samsung Galaxy A71          Qualcomm Snapdragon 730      8 GB
familiarise students with the robotic workstation and teach them the motion capabilities of currently used industrial manipulators. The whole programme was created within the MILAN project.
3.1. MILAN project
The MILAN project aims to use augmented reality and virtual reality to develop innovative and widely available training materials and tools in the field of automatic control and robotics. MILAN training materials are placed on an interactive and freely accessible e-learning platform. This platform provides high-quality training in Advanced Manufacturing, targeted primarily at operators of complex automatic devices and robots, teachers, consultants, and students at technical universities. The MILAN programme includes an e-learning course. The MILAN project was focused on developing:
1) Case studies overview - collecting information about the most effective technologies and practices used for distance learning, particularly for the field of automatic control and robotics;
2) Curricula of learning blocks;
3) Training content involving virtual reality and augmented reality technologies;
4) Educational platform containing all the necessary elements for online learning;
5) Teaching methodology.
Thanks to the MILAN project, everyone interested in learning automatic control and robotics can have free access to materials published on the e-learning platform, the MILAN website, and the YouTube channel. In addition, the programme will be enriched with access to the cameras at the robot setups at the ŁUKASIEWICZ Research Network – Industrial Research Institute for Automation and Measurements PIAP and at the University of Technology in Kosice, as well as remote access to the virtual robotics laboratory. The application described in this paper will be used to test students’ skills in practice.
All the educational material provided by the MILAN training system can be divided into four main categories:
- Basics of Advanced Manufacturing;
- Automation and Robotics in Advanced Manufacturing;
- ICT in Advanced Manufacturing;
- Occupational Safety and Health.
The point of the MILAN project was to combine the availability and high quality of the courses. Therefore, reaching a broad audience was considered when choosing an approach to knowledge presentation and practice [18], [19].
3.2. Description of the robotic station
When using the application, course participants may operate a station for laying down gaskets with the ABB IRB 4600 robot. The industrial manipulator is equipped with a mixing head that pours two-component polyurethane along the gasket path. As the mixture dries, it swells and sticks to the base material (e.g., the metal door of an electric cabinet). Using a 6-axis robot is innovative in that it enables laying down gaskets at any angle, which may be used for complex geometries (e.g., for 3D-printed custom parts). Moreover, the station allows potential use of the robot for various Advanced Manufacturing tasks with multiple tools. For such an application, both work tables should be involved. The work tables also enable simultaneous work of the device and an operator. Thanks to the safety system, these two can work within physical reach of each other. The developed AR application contains an interactive model of the setup described above. It allows the virtual robot to move within its workspace, which may be used to check the programmed scope of operation without the risk of collision with the fence. The corresponding 3D model developed in Autodesk Inventor is presented in Figure 1.
Fig. 1. Station for laying down gaskets
Tab. 3. Information about the robotic setup
Width          5213 mm
Length         5040 mm
Height         max. 3527 mm
Power supply   200-600 V, 50-60 Hz
The station presented in Figure 1 contains all the safety measures required for industrial purposes, including fencing and optoelectronic curtains. By using the complete model, participants will learn about the construction of industrial robotic stations and their operation principles. Additionally, the presentation of this complex setup may be used to explain the basics of the safety systems or to analyse the room required in production halls. As a consequence of using AR-based courses, students are not exposed to the risk of injuries, while the space and financial costs are decreased. Hence, such an approach to teaching robotics should result in higher availability of high-tech education, staying in line with equal opportunities.
4. Methods of designing the application
The mobile application was prepared for Android phones using the Unity engine (v.2020.1.17f1) with ARCore (v.4.15) and AR Foundation (v.4.1.5) packages. The development of the solution started with pre-processing of the CAD model. This phase is visualised in Figure 2.
Fig. 2. Diagram of preparing a 3D model
The processed model was ready to be placed in the Unity engine. To obtain an accurate visual of the real-life station, the robot’s main dimensions and angular ranges of joints were based on the manufacturer’s documentation [20]. Based on these, the Denavit-Hartenberg parameters were introduced [21]. These are shown in Table 4 with the model in Figure 3, while Table 5 contains data on the angular ranges of the robot’s joints.
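As an illustration of how Denavit-Hartenberg parameters are typically turned into a tool position, the sketch below chains the link transform of formula (1) over the six rows of Table 4 and reads off the flange position as in formula (2). It is not the authors’ Unity script: the function names are hypothetical, the Θ column is treated as a fixed offset added to each joint angle, and degrees and millimetres are assumed as units.

```python
import math

# Rows of Table 4 as (Psi_i, Theta_i offset, D_i, a_i).
# Assumption: angles are in degrees and lengths in millimetres.
DH_ROWS = [
    (-90.0, 0.0, 495.0, 175.0),
    (0.0, -90.0, 0.0, 1095.0),
    (-90.0, 0.0, 0.0, 175.0),
    (90.0, 0.0, 1270.0, 0.0),
    (-90.0, 0.0, 0.0, 0.0),
    (0.0, 0.0, 135.0, 0.0),
]


def link_transform(psi_deg, theta_deg, d, a):
    """Return the 4x4 homogeneous matrix of formula (1) for one link."""
    psi, theta = math.radians(psi_deg), math.radians(theta_deg)
    cp, sp = math.cos(psi), math.sin(psi)
    ct, st = math.cos(theta), math.sin(theta)
    return [
        [ct, -cp * st, sp * st, a * ct],
        [st, cp * ct, -sp * ct, a * st],
        [0.0, sp, cp, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def multiply(m1, m2):
    """Plain 4x4 matrix product (kept dependency-free on purpose)."""
    return [
        [sum(m1[r][k] * m2[k][c] for k in range(4)) for c in range(4)]
        for r in range(4)
    ]


def flange_position(joint_angles_deg):
    """Chain the six link transforms and return the flange (x, y, z)."""
    total = [[float(r == c) for c in range(4)] for r in range(4)]  # identity
    for (psi, theta_offset, d, a), q in zip(DH_ROWS, joint_angles_deg):
        total = multiply(total, link_transform(psi, theta_offset + q, d, a))
    # Multiplying by [0 0 0 1]^T, as in formula (2), selects the last column.
    return total[0][3], total[1][3], total[2][3]
```

Under these assumptions, calling `flange_position([0, 0, 0, 0, 0, 0])` places the flange at roughly x = 1580, y = 0, z = 1765 in the base frame, i.e. 175 + 1270 + 135 forward and 495 + 1095 + 175 up. Rotating only the first joint by 90° swings the same point onto the y axis.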
Fig. 3. Denavit-Hartenberg model
Tab. 4. Denavit-Hartenberg parameters
Joint   Ψi [°]   Θi [°]   Di [mm]   ai [mm]
1       -90      0        495       175
2       0        -90      0         1095
3       -90      0        0         175
4       90       0        1270      0
5       -90      0        0         0
6       0        0        135       0
Tab. 5. Angular reach of the robot’s joints
Joint   Range
1       +180° to -180°
2       +150° to -90°
3       +75° to -180°
4       +400° to -400°
5       +125° to -125°
6       +400° to -400°
The Denavit-Hartenberg parameters presented beforehand were introduced to derive the kinematics equations of the ABB IRB 4600 robot. They are used to calculate the visualised robot’s position based on the controls. For deriving the equation, the homogeneous transformation matrix (1) was used.
\[
T_i^{i-1}(\Theta_i) =
\begin{bmatrix}
\cos\Theta_i & -\cos\Psi_i \sin\Theta_i & \sin\Psi_i \sin\Theta_i & a_i \cos\Theta_i \\
\sin\Theta_i & \cos\Psi_i \cos\Theta_i & -\sin\Psi_i \cos\Theta_i & a_i \sin\Theta_i \\
0 & \sin\Psi_i & \cos\Psi_i & D_i \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{1}
\]
The position of the robot’s tool characteristic point (TCP) located in the centre of the device’s flange is calculated with the transformation matrix according to formula (2). It depends on the rotation angles in the respective kinematic pairs. Even though the robot has a rigidly assembled mixing head, it is neglected within the computations. This approach is intended to enable the exchange of virtual tools without modifying the equation itself.
\[
T_6^0 = T_1^0 \times T_2^1 \times T_3^2 \times T_4^3 \times T_5^4 \times T_6^5 \times [0\ 0\ 0\ 1]^T \tag{2}
\]
To transform the desired position of the TCP into the robot’s joint configuration, formula (3) must be solved. The symbol q_i used in this equation is the rotation of the i-th joint, while x, y, and z are the translational position components of the TCP, and a, b, and c are its rotation components about the x, y, and z axes.
\[
\left(T_6^0\right)^{-1} [x, y, z, a, b, c]^T = [q_1, q_2, q_3, q_4, q_5, q_6]^T \tag{3}
\]
After these stages, the development within the Unity environment was conducted. It comprised the following steps:
1) Configuring the augmented reality
- Adding the AR camera;
- Adding the indicator appearing at the application start on the detected plane;
- Implementing the script detecting a flat surface and adding a robotic station in the place of the indicator.
2) Adding the robotic station model
- Placing and scaling the model in the scene;
- Adding Cube elements that would be responsible for the motion;
- Creating a hierarchy of models.
3) Designing the user interface
- Adding buttons and sliders;
- Adding labels.
4) Attaching the scripts responsible for the robot’s motion
- Implementing scripts rotating and scaling the model;
- Implementing scripts moving the robot’s subcomponents.
5) Generating a .prefab file and including it as a display object
4.1. AR configuration
The augmented reality configuration was primarily performed with the ARFoundation package. Then, the described initial indicator was designed as a red crosshair placed on the plane estimation of the detected floor. The applied detection functions tracking the position of the model and the light changes are commonly available as pre-made scripts. Nevertheless, the function responsible for placing the object can be described with the following pseudocode:
AR objects and model initialisation
PoseIsValid ← false
function START
    Find Raycast Manager
end function
function UPDATE
    if Model is not placed and PoseIsValid = true then
        Place object
    end if
    Update pose
    Update placement indicator
end function
4.2. Adding robotic station model
The critical aspect of adding the model is the accurate initial scaling of the device, so that it corresponds to its actual size while being projected in the augmented environment. Six Cube objects were added to ensure natural motion, each corresponding to one rotary joint of the robot. Moreover, the appropriate hierarchy of Cube models and robot components was implemented (see Figure 4).
Fig. 4. Models hierarchy
All the files prepared within this stage were placed in an empty GameObject. To enhance the immersion of the AR application, the realism of the object visualisation was increased by modifying the lighting of the Unity engine.
4.3. User interface
An important aspect of educational applications is their user interfaces. They should be relatively simple and intuitive to enable learning without the need for reading a manual. Therefore, all the buttons and sliders were described with labels clearly explaining their functions. In the described solution, the user interface was designed as presented in Figure 5, where:
- A - Station name;
- B - Button to lead the robot to the base pose;
- C - Sliders to rotate and re-scale the entire model;
- D - Buttons to control the robot’s joints and the angular measurements of each.
4.4. Robot motion
The mobile educational application has the functionality of jogging the robot (controlling each of its joint coordinates individually). Thanks to the hierarchy presented in Figure 4, the motion of one axis affects the position of the following ones. Furthermore, there is a possibility of rotating and scaling the entire station. Thanks to this, the solution enables learning even within limited space. The robot’s motion is controlled with seven scripts:
- Six scripts corresponding to the motion of particular axes of the robot;
- One script responsible for rotating and scaling the model.
Each of the scripts is connected to the corresponding user interface elements and activates the required functions upon click. This is either rotating around the appropriate axis or setting zero rotation for the Base Position button. The last script is connected with two sliders, assigning them to rotating (-180° to 180°) and re-scaling (10% to 100% of the real-life size) the entire station. The sliders react with a resolution of ±2°.
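The jogging and slider behaviour described in Section 4.4 can be pictured with the short sketch below. It is a simplified stand-in written in Python rather than the Unity scripts themselves: the class and function names are hypothetical, clamping each joint to the ranges of Table 5 is an assumption about how limits might be enforced, and the 2° slider resolution is modelled as snapping to the nearest multiple of 2°.

```python
# Joint ranges in degrees, copied from Table 5 as (lower, upper).
JOINT_LIMITS = [
    (-180.0, 180.0),
    (-90.0, 150.0),
    (-180.0, 75.0),
    (-400.0, 400.0),
    (-125.0, 125.0),
    (-400.0, 400.0),
]


def clamp(value, lower, upper):
    """Keep value inside the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


class JoggedRobot:
    """Holds six joint angles and changes them one joint at a time."""

    def __init__(self):
        self.angles = [0.0] * 6

    def jog(self, joint, step_deg):
        """Rotate joint (1..6) by step_deg, stopping at its limit (assumed)."""
        lower, upper = JOINT_LIMITS[joint - 1]
        self.angles[joint - 1] = clamp(self.angles[joint - 1] + step_deg, lower, upper)
        return self.angles[joint - 1]

    def base_position(self):
        """Mimic the Base Position button: zero rotation on every joint."""
        self.angles = [0.0] * 6


def station_rotation(slider_deg, resolution_deg=2.0):
    """Map the rotation slider to -180..180 degrees in steps of resolution_deg."""
    clamped = clamp(slider_deg, -180.0, 180.0)
    return round(clamped / resolution_deg) * resolution_deg


def station_scale(slider_value):
    """Map the scale slider to 10%..100% of the real-life size (0.1..1.0)."""
    return clamp(slider_value, 0.1, 1.0)
```

In a scene graph such as the one in Figure 4, only the local angle of the jogged joint needs to be stored; the child links follow because their poses are expressed relative to the parent.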
4.5. .prefab file generation
Afterwards, the scripts had to be attached to the appropriate elements. The ones responsible for the robot’s motion were connected with the Cube objects. After assigning these, a .prefab file was generated, and the script responsible for manipulating the entire station was added. The final version of the application is presented in Figure 5.
Fig. 5. The application main screen
5. Benefits of using AR/VR
Involving AR technology for education in automatic control and robotics results in the rising availability of courses. This is especially significant for less developed regions or areas located at a distance from specialist training centres or universities. The mentioned impacts are possible due to the following benefits of AR-aided systems:
- Cost reduction;
- Increased quality of education;
- Capability of checking the robot’s compatibility with the environment;
- Possible use of the real-life elements for the tasks within the practicals;
- Compliance of the applications with massive open online courses (MOOC) standards and automatic assessments.
5.1. Cost analysis
To perform the cost analysis, solutions available on the market were first reviewed. Depending on the manufacturer, the price of an industrial robot without accessories is estimated at $37,000 on average [22]. The devices equipped with typical manufacturing accessories, such as grippers, tool changers, or sensors, cost between $50,000 and $80,000 [23], [24]. The reach, payload, and application are the key factors affecting the cost of the robot. The presented ranges cover a robot with basic accessories only. However, the price of the whole station essential for performing manufacturing processes is approximately $120,000. Moreover, the initial purchase is not the only cost, as the station also generates continuous operating costs. These are the expenses related to power and media consumption, maintenance, and upkeep of the machinery. According to Asari, the cost of operating the robotic station for seven years depends on the region of the world [25]. It varies from 15.1% to 42.5% of the total initial purchase price. This means that the total expenses on a device similar to those featured in the app range from $141,000 to $208,000 within its 7-year operating period. In comparison, the cost of creating an AR application was also analysed. This cost mainly depends on the application type and range of development. However, for cases similar to the one described in a 2019 paper focusing on marker-less AR applications, prices range between $10,000 and $11,500 [26]. The resulting costs of the AR application and the real-life robotic station are compared in Table 6. As an additional advantage, the application can be installed on an unlimited number of devices. This is a significant advantage over the physical station, which may be operated by only one person at a time. Based on the gathered data, involving AR-based systems for education results in much lower costs, enabling simultaneous training for multiple students at their own pace. Consequently, institutions with limited budgets could improve the quality of their courses at relatively low expense. On the other hand, the ones that can buy physical devices can save funds thanks to AR applications, e.g., by using fewer robots and offering blended courses.
Tab. 6. Costs comparison [23]-[27]
                    Robotic Station        AR Application
Purchase Costs      $120,000               $10,000 - $11,500
Maintenance Costs   $21,000 - $88,000      $1,680 - $5,040
Sum of Costs        $141,000 - $208,000    $11,680 - $16,540
Possible Savings    $129,320 - $191,460
5.2. Impact on education quality
Apart from the economic effects, the quality of education also needs to be assessed, as it may demonstrate the impact of the suggested teaching methodology. Therefore, an analysis of the impact of AR/VR applications on the efficacy of learning was performed. This analysis is based on the recent works of Li, Fu, and Wang [28]; Alzahrani [29]; and Criollo-C [30]. In all of these papers, the authors found an improvement in the results for the students using AR/VR technologies.
The first study involved substituting learning with a real-life station with a virtual one [28]. This allowed students to work independently instead of sharing one device with a group. After completing the course, students were surveyed. According to the obtained results, 90% of participants stated that an improvement in the quality of the course was noticeable, while 95% said that VR helped them learn faster. In addition, the vast majority of respondents concluded that virtual reality increased their interest in the subject and satisfaction.
The second paper deliberates on the benefits of AR applications introduced at different studying phases and different levels of education [29]. Based on the cited sources, the author concludes that using augmented reality had a positive effect. Involved students described learning as easy, pleasant, and useful compared to traditional methods. Besides, AR also increased concentration and kept the constant attention of the courses’ audiences. Altogether, these factors resulted in a more effective acquisition of new practical skills.
The last paper describes the impact of AR technology on learning in engineering [30]. The study consisted of a comparison of two groups: one learning with traditional lectures and the other using AR applications during classes. After completing the course, students were asked to describe the course with one adjective. The following were the most common: motivating, easy to use, usable. The impact of the AR application on the educational process was also verified through an exam testing students on the acquired knowledge. The group using the AR application obtained, on average, 36.7% higher results than the group using traditional learning. These results indicate the positive influence of AR technology on increasing interest and the efficacy of acquiring knowledge.
The MILAN project, for which the presented application was designed, was published on the Coursevo platform. By the 19th of March 2022, 282 users had registered for the course, which saw 320,631 hits.
5.3. Involvement of the real-life environment
In addition to the previously mentioned benefits, AR also allows testing the collaboration of devices with the environment. Moreover, it enables using real-life elements for the courses without the risk of damage. Thanks to this, a given station may be validated in terms of fitting into its intended working space. Moreover, the robot’s operational range can be projected for the desired program. Based on this, the environment, including walls and other machinery, may be assessed for possible collisions with the moving elements. The possibility of manipulating the model also allows users to check whether the placement of the devices limits the free work of the operators. Such use of the AR application enables learning the best practices within the early phases of designing robotic stations, e.g., cooperation with the environment and employee safety. Moreover, unlike trials with real-life setups, this does not create any risk of harm to humans. These crucial aspects are rarely included in robotics courses.
5.4. Possibility to use the application in massive open online courses
Since the application can be installed on an unlimited number of devices, its use may be easily scaled for MOOCs. These are free online courses available to everyone, during which the student has access to traditional materials such as videos and lectures as well as interactive forums and practicals. Thanks to these courses, students may acquire new knowledge and gain experience at their own pace. Research has shown that MOOCs have a positive impact on student learning. This is mainly due to the availability of detailed tutorials and the constant gathering of data on learners’ difficulties and progress [31]. Given the analysis from the previous section, a similar course with the AR application could popularise the field of robotics and help already interested students effectively develop their skills without expensive setups.
5.5. Benefit summary
Based on the examples and analyses provided, AR/VR technology significantly reduces the cost of learning while maintaining or even improving the quality of education. It also allows for the explanation of issues that could not be presented during traditional courses. Moreover, the audience of this sort of training could be significantly broadened if AR/VR applications were combined with MOOCs. Then, the students would also benefit from learning at their own pace, while the organisers could constantly improve the content based on the gathered data.
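The figures of the cost comparison in Table 6 can be re-derived with a few lines of arithmetic, sketched below. The variable and function names are hypothetical. The helper `maintenance_from_share` rests on an assumption made here, not on the cited source: the station's maintenance range in Table 6 is matched most closely when the quoted 15.1% and 42.5% are read as shares of the total seven-year cost.

```python
# (low, high) cost ranges in USD, taken from Table 6.
STATION_PURCHASE = (120_000, 120_000)
STATION_MAINTENANCE = (21_000, 88_000)
AR_PURCHASE = (10_000, 11_500)
AR_MAINTENANCE = (1_680, 5_040)


def add_ranges(first, second):
    """Add two (low, high) ranges element by element."""
    return (first[0] + second[0], first[1] + second[1])


def subtract_ranges(first, second):
    """Subtract low from low and high from high, as Table 6 does."""
    return (first[0] - second[0], first[1] - second[1])


def maintenance_from_share(purchase, share_of_total):
    """Maintenance m such that m / (purchase + m) equals share_of_total.

    Assumption: the percentage is a share of the total cost, not of the
    purchase price alone.
    """
    return purchase * share_of_total / (1.0 - share_of_total)


station_total = add_ranges(STATION_PURCHASE, STATION_MAINTENANCE)
ar_total = add_ranges(AR_PURCHASE, AR_MAINTENANCE)
possible_savings = subtract_ranges(station_total, ar_total)
```

Here `station_total` evaluates to (141000, 208000), `ar_total` to (11680, 16540), and `possible_savings` to (129320, 191460), matching the last two rows of the table. Under the stated assumption, `maintenance_from_share(120_000, 0.151)` and `maintenance_from_share(120_000, 0.425)` give about $21,300 and $88,700, close to the rounded maintenance range.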
6. Summary
6.1. Possible future use
As the described application is designed for Android devices, it can be used for a wide range of purposes. Its minimum requirement is a compatible smartphone. Therefore, it is suitable for institutions with a low budget and individuals who want to enhance their competencies. This application or similar ones may be deployed directly in robotics courses targeting these groups. This will allow for enriching theoretical lectures with practical exercises involving a virtual laboratory. Due to the low cost of such a solution, it may contribute to equalising learning opportunities in less developed countries. Using the presented teaching methodology may become a remedy for the lack of access to machinery at educational institutions. As learning capabilities increase in the long term, employers’ hiring requirements are also expected to rise.
In addition to educational use, similar AR applications can be used for commercial purposes. These include presenting the robotised station design to a customer prior to manufacturing. Thanks to this, the client is aware of the system’s visuals and may validate the amount of free space reserved for the machinery. Moreover, this enables improvements at the design stage to meet the customer’s needs and expectations without additional expenses on later modifications. Hence, the final solution is fully client-tailored.
6.2. Teaching with the AR application
Assuming that this application is a complete tool, the model approach to the teaching process has to be specified, whether for on-site or remote courses. First, the solution can be used for presenting students with the general design of industrial robots and their motion capabilities.
Then, the kinematics of the industrial manipulators may also be explained with the exercises based on the application. Courses focused on functional and safe designing of robotic workstations may also be organised within the same framework. To ensure comprehensive learning capabilities, it would be advantageous to broaden the powers of the application by enabling the loading of custom models. This approach would familiarise users with widely applied solutions and let them test their mechatronic designs.

6.3. Conclusions
The development of modern smartphones allowed for the implementation of advanced technologies such as augmented reality into multiple fields, including education. However, most solutions involving AR require the use of professional equipment. This significantly limits the group of people benefiting from this learning methodology.
With the application described in this paper, however, this methodology can reach more people, as it only requires an Android smartphone. Thanks to this accessibility, the application may be used stationary or remotely, which is particularly important during pandemics. Because the application is based on augmented reality, the exercises may include real-life elements, e.g., boards with the paths for the robot's trajectory programming. Moreover, the use of AR allows for training in operating the robot while it is working without a need for experienced staff monitoring safety. Similar applications can also be used to teach about Industry 4.0, which is generally not mentioned in education. As a result, the use of this presented solution for educational purposes will increase the competencies of students and people working for the industry while equalising learning opportunities. Thus, their skills will be better suited to the constantly expanding scope of robotics activities in their regions.
The use of AR for learning automatic control and robotics is expected to rise in popularity, as it has proven to be significantly more effective for many current needs.
AUTHORS
Norbert Prokopiuk – Warsaw University of Technology, Plac Politechniki 1, Warsaw, 00-661, e-mail: norbert.prokopiuk.stud@pw.edu.pl.
Piotr Falkowski – ŁUKASIEWICZ Research Network – Industrial Research Institute for Automation and Measurements PIAP, Al. Jerozolimskie 202, Warsaw, 02-486; Warsaw University of Technology, Plac Politechniki 1, Warsaw, 00-661, e-mail: pfalkowski@piap.pl.
ACKNOWLEDGEMENTS
The paper presents research results supported by the EU within the project MILAN "Multifunctional Innovative Learning Assisting Network for VET in Advanced Manufacturing", 2018-1-PL01-KA202-050812, under the ERASMUS+ Programme. This publication represents only the author's opinion, and neither the European Commission nor the National Agency is responsible for any of the information contained in it.

REFERENCES
[1] S. Vaidya, P. Ambad, and S. Bhosle, "Industry 4.0–a glimpse," Procedia manufacturing, vol. 20, pp. 233–238, 2018.
[2] E. M. Onyema, N. C. Eucheria, F. A. Obafemi, S. Sen, F. G. Atonye, A. Sharma, and A. O. Alsayed, "Impact of coronavirus pandemic on education," Journal of Education and Practice, vol. 11, no. 13, pp. 108–121, 2020.
[3] V. Román-Ibáñez, F. A. Pujol-López, H. Mora-Mora, M. L. Pertegal-Felices, and A. Jimeno-Morenilla, "A low-cost immersive virtual reality system for teaching robotic manipulators programming," Sustainability, vol. 10, no. 4, p. 1102, 2018.
[4] A. Rukangu, A. Tuttle, and K. Johnsen, "Virtual reality for remote controlled robotics in engineering education," in 2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). IEEE, 2021, pp. 751–752.
[5] L. Pérez, E. Diez, R. Usamentiaga, and D. F. García, "Industrial robot control and operator training using virtual reality interfaces," Computers in Industry, vol. 109, pp. 114–120, 2019.
[6] M. Hernández-Ordoñez, M. A. Nuño-Maganda, C. A. Calles-Arriaga, O. Montaño-Rivas, and K. E. Bautista Hernández, "An education application for teaching robot arm manipulator concepts using augmented reality," Mobile Information Systems, vol. 2018, 2018.
[7] B. Bogosian, L. Bobadilla, M. Alonso, A. Elias, G. Perez, H. Alhaffar, and S. Vassigh, "Work in progress: Towards an immersive robotics training for the future of architecture, engineering, and construction workforce," in 2020 IEEE World Conference on Engineering Education (EDUNINE). IEEE, 2020, pp. 1–4.
[8] Y.-H. Su, C.-Y. Chen, S.-L. Cheng, C.-H. Ko, and K.-Y. Young, "Development of a 3d ar-based interface for industrial robot manipulators," in 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2018, pp. 1809–1814.
[9] I. Verner, M. Reitman, D. Cuperman, T. Yan, E. Finkelstein, and T. Romm, "Exposing robot learning to students in augmented reality experience," in International Conference on Remote Engineering and Virtual Instrumentation. Springer, 2018, pp. 610–619.
[10] B. Zhong, J. Zheng, and Z. Zhan, "An exploration of combining virtual and physical robots in robotics education," Interactive Learning Environments, pp. 1–13, 2020.
[11] K. A. Herrera, J. A. Rocha, F. M. Silva, and V. H. Andaluz, "Training systems for control of mobile manipulator robots in augmented reality," in 2020 15th Iberian Conference on Information Systems and Technologies (CISTI). IEEE, 2020, pp. 1–7.
[12] A. Mallik and V. Kapila, "Interactive learning of mobile robots kinematics using ARCore," in 2020 5th International Conference on Robotics and Automation Engineering (ICRAE). IEEE, 2020, pp. 1–6.
[13] V. Kuts, T. Otto, E. G. Caldarola, G. E. Modoni, and M. Sacco, "Enabling the teaching factory leveraging a virtual reality system based on the digital twin," in Proceedings of the 15th Annual EuroVr Conference. VTT Technical Research Centre of Finland, Ltd, 2018.
[14] T. Kaarlela, S. Pieskä, and T. Pitkäaho, "Digital twin and virtual reality for safety training," in 2020 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom). IEEE, 2020, pp. 000115–000120.
[15] G. Michalos, P. Karagiannis, S. Makris, O. Tokçalar, and G. Chryssolouris, "Augmented reality (AR) applications for supporting human-robot interactive cooperation," Procedia CIRP, vol. 41, pp. 370–375, 2016.
[16] M. Jasiulewicz-Kaczmarek and A. Gola, "Maintenance 4.0 technologies for sustainable manufacturing-an overview," IFAC-PapersOnLine, vol. 52, no. 10, pp. 91–96, 2019.
[17] "ARCore supported devices," https://developers.google.com/ar/devices, accessed: 2021-7-6.
[18] "Multifunctional innovative learning assisting network for VET in advanced manufacturing (MILAN) – multifunctional innovative learning assisting network for VET in advanced manufacturing (MILAN)," http://milan-project.eu/en/, accessed: 2021-7-6.
[19] P. Falkowski, Z. Pilat, P. Arapi, M. Tamre, P. Dulencin, J. Homza, and M. Hajduk, "The concept of using AR and VR technologies in the vocational training system in robotics and automation," in International Conference on Robotics in Education (RiE). Springer, 2020, pp. 318–325.
[20] "ABB IRB 4600 product specification," https://search.abb.com/library/Download.aspx?DocumentID=3HAC032885-001&LanguageCode=en&DocumentPartId=&Action=Launch, accessed: 2021-7-6.
[21] R. R. Serrezuela, M. A. T. Cardozo, D. L. Ardila, and C. A. C. Perdomo, "A consistent methodology for the development of inverse and direct kinematics of robust industrial robots," ARPN Journal of Engineering and Applied Sciences, vol. 13, no. 01, pp. 293–301, 2018.
[22] Robot installations 2019: Global economic downturn and trade tensions leave their marks, "Executive summary world robotics 2020 industrial robots," https://ifr.org/img/worldrobotics/Executive_Summary_WR_2020_Industrial_Robots_1.pdf, accessed: 2021-7-6.
[23] RobotWorx, "How much do industrial robots cost?" https://www.robots.com/faq/how-much-do-industrial-robots-cost, accessed: 2021-7-6.
[24] "How much do industrial robots cost?" https://sp-automation.co.uk/how-much-do-industrial-robots-cost-3/, Sep. 2019, accessed: 2021-7-6.
[25] R. Asari, "Automotive industrial robot - total cost of ownership," 08 2018.
[26] "How much does it cost to build an AR/VR application: Estimating an augmented reality app development cost," https://www.clavax.com/blog/how-much-ar-vr-app-development-cost-in-2019, accessed: 2021-7-6.
[27] "How much does app maintenance cost in 2021?" https://www.mobileappdaily.com/cost-to-maintain-an-app, accessed: 2021-7-6.
[28] C. Li, L. Fu, and L. Wang, "Innovate engineering education by using virtual laboratory platform based industrial robot," in 2018 Chinese Control And Decision Conference (CCDC). IEEE, 2018, pp. 3467–3472.
[29] N. M. Alzahrani, "Augmented reality: A systematic review of its benefits and challenges in e-learning contexts," Applied Sciences, vol. 10, no. 16, p. 5660, 2020.
[30] S. Criollo-C, D. Abad-Vásquez, M. Martic-Nieto, F. A. Velásquez-G, J.-L. Pérez-Medina, and S. Luján-Mora, "Towards a new learning experience through a mobile application with augmented reality in engineering education," Applied Sciences, vol. 11, no. 11, 2021. [Online]. Available: https://www.mdpi.com/2076-3417/11/11/4921
[31] P. Plaza, E. Sancristobal, G. Carro, M. Blazquez, A. Menacho, F. García-Loro, C. Perez, J. Muñoz, E. Tovar, J. Sluss et al., "Portable blended mooc laboratory," in 2019 IEEE Learning With MOOCS (LWMOOCS). IEEE, 2019, pp. 15–20.
Design of a Linear Quadratic Regulator Based on Genetic Model Reference Adaptive Control Submitted: 6th January 2022; accepted: 7th May 2022
Abdullah I. Abdullah, Ali Mahmood, Mohammad A. Thanoon
DOI: 10.14313/JAMRIS/3-2022/26
Abstract: A conventional control system uses a controller to regulate the dynamics of another process. Such a controller may not always behave appropriately online because of factors such as variation in the dynamics of the process itself, unexpected changes in the environment, or undefined parameters of the system model. To overcome this problem, we have designed and implemented an adaptive controller. This paper discusses the design of a controller for a ball and beam system using Genetic Model Reference Adaptive Control (GMRAC), with the MIT rule as the adaptive mechanism. Parameter adjustment (selection) should rely on optimization methods to obtain optimal performance, so the genetic algorithm (GA) is used to obtain the optimum values of these parameters. The Linear Quadratic Regulator (LQR) is used as the controller, as it is one of the most popular controllers. The proposed controller is simulated with the ball and beam system in MATLAB Simulink in order to evaluate its effectiveness. The results show satisfactory performance, with the position of the ball tracking the desired model reference.
Keywords: model reference adaptive control, gradient approach, Linear Quadratic Regulator, genetic algorithm
1. Introduction
Adaptive control is a method of control that uses a controller with adaptable parameters that change with respect to the variation in system response. Compared with conventional control, it offers better performance and accuracy in advanced control system design, particularly for systems with uncertain or unknown parameter variations and environmental changes. These characteristics have led to numerous applications of adaptive control in problems where the controller must automatically compensate for changes in the plant dynamics [1]. Model Reference Adaptive Control (MRAC) is considered one of the most popular types of adaptive controllers because of its straightforward adaptive strategy with adjustable parameters [2].
This adaptive effect is achieved through a reference model: the error between the real plant (system) and the reference model is used to modify the controller parameters so that the plant output follows the reference model response [3]. As a result, MRAC will force the real plant to track the reference system, which has been chosen precisely. In the area of self-tuning controllers, MRAC is considered very popular. It is a robust control method that can deal with disturbances and rapid changes in the parameters without needing a priori information about the bounds of the uncertainties or the time-varying parameters [4]. An example of a system requiring adaptive control is an aircraft, whose mass decreases slowly during flight as it consumes fuel. In this case, the controller needs to adapt itself continuously. As for the controller to be used inside the MRAC, the Linear Quadratic Regulator (LQR) was chosen, as it is one of the most utilized techniques for feedback control design [5]. Optimal feedback LQR is one of the tools that can be used to improve the stability and performance of the system, where a set of optimal feedback gains is found by minimizing a quadratic index [6]. The challenge in applying the LQR is the adjustment process used to find the elements of both weighting matrices Q and R. Therefore, the genetic algorithm (GA) optimization method will be used for the adjustment of Q and R. The GA has advantages over other optimization methods due to its ability to deal with complex problems and different kinds of optimization problems. For instance, it can deal with linear or nonlinear systems, or with a system with random noise [7-10]. The MRAC that uses the GA to optimize its parameters and mechanism is called Genetic Model Reference Adaptive Control (GMRAC). The proposed GMRAC will be applied to a ball and beam system, which is inherently unstable in open loop.
This system has some uncertainty in its model due to the many assumptions made when deriving it. Also, this system is linked directly to real control problems, such as the horizontal stabilization of an airplane during landing and in turbulent airflow, and automatic ball balancers in optical disk drives [11]. Another problem of this system is nonlinearity, as the open-loop dynamics are nonlinear; to overcome this problem, linearization with the modern state space method will be used around the horizontal region [12]. Although the model of the system has been linearized, it still represents typical systems in real life (e.g., horizontal stabilization of airplanes during landing) [12-14]. This paper deals with the design of the adaptive controller with a model reference scheme using the MIT rule. The principle of this work is to adjust the controller parameters in order to make the output of the plant (process) follow the output of the reference model for the same input.

2022 © Abdullah et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)
2. Mathematical Model of Ball and Beam System
The ball-beam system involves a beam whose position can be adjusted by an electrical motor and a ball that rolls on top of the beam. This system has two degrees of freedom: one is the rolling of the ball up and down the beam, while the second is the rotation of the beam around its central axis. For this system, the torque generated by the motor is used to control the ball position on the beam. The mathematical model of the ball and beam system has been explained in detail by many researchers, based on the equations from which the model of the system is derived [12, 15-20]. Figure 1 shows the sketch of the system that this mathematical model is drawn from, including the torque balance of the beam as well as the force balance.
Fig. 1. Sketch map of the system [11]

Several analyses must be completed to derive the mathematical model of the system. Firstly, the force balance (Fb), based on Newton's law, and the torque balance of the motor (Tmotor) must be analyzed. Next, the equation of the DC motor used must be derived. All equations are then represented in state space form, which can be used with state space control methods. The state variables to be controlled are the beam tilt angle (θ), the rate of change in θ, the ball position (x), and the rate of change in x.
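$$
\begin{bmatrix} \dot{x} \\ \ddot{x} \\ \dot{\theta} \\ \ddot{\theta} \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & \dfrac{g}{1 + \frac{2}{5}\left(\frac{R_b}{a_1}\right)^2} & 0 \\
0 & 0 & 0 & 1 \\
-\dfrac{M_{ball}\, g}{J_{bm}} & 0 & 0 & -\dfrac{K K_e}{R J_{bm}}
\end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ 0 \\ \dfrac{K}{R J_{bm}} \end{bmatrix} V
\tag{1}
$$

$$
y =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix}
\tag{2}
$$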
where x is the position of the ball (m) and V is the control voltage [12]. Table 1 shows the parameter values that have been used.

Tab. 1. Parameters of the ball-beam system
Parameter                              Value
Mass of ball (Mball)                   0.0327 kg
Electromotive force constant (K)       4.91 Nm/A
Moment of inertia (Jbm)                0.062 kg·m2
Electric resistor (R)                  4.7 ohms
Ball radius (Rb)                       0.01 m
It is important to note that the system model in Equations 1 and 2, which is based on physical and electrical laws, relies on some assumptions. For instance, there is no slip between the ball and the beam, and the gearbox of the motor has no backlash. The final model after these assumptions and simplifications is given in Equations 3 and 4.
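$$
\begin{bmatrix} \dot{x} \\ \ddot{x} \\ \dot{\theta} \\ \ddot{\theta} \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 3.7731 & 0 \\
0 & 0 & 0 & 1 \\
-5.17 & 0 & 0 & -105.1
\end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 16.85 \end{bmatrix} V
\tag{3}
$$

$$
y =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix}
\tag{4}
$$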
where the input is the voltage V and the outputs are x and θ.
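As a quick plausibility check, the numeric entries of Equation 3 can be recomputed from the Table 1 parameters. The sketch below is illustrative only: the gravitational acceleration g = 9.81 m/s2 and the rolling radius a1 = 0.005 m are assumptions (neither is listed in Table 1; a1 is back-computed so that the 3.7731 entry is reproduced), and the -105.1 damping entry is copied from Equation 3 because the back-EMF constant is not given.

```python
# Parameters from Table 1 (SI units).
M_BALL = 0.0327    # mass of ball, kg
K_MOTOR = 4.91     # electromotive force constant K, N*m/A
J_BM = 0.062       # moment of inertia Jbm
R_ARM = 4.7        # electric resistance R, ohm
R_BALL = 0.01      # ball radius Rb, m

# ASSUMPTIONS, not stated in the paper:
G = 9.81           # gravitational acceleration, m/s^2
A1 = 0.005         # rolling radius a1, m (chosen to reproduce 3.7731)
A44 = -105.1       # copied from Equation 3 (back-EMF constant not listed)


def model_entries():
    """Return the entries A[1][2], A[3][0] and B[3] of the state-space model."""
    a23 = G / (1.0 + 0.4 * (R_BALL / A1) ** 2)   # ball acceleration per unit tilt
    a41 = -M_BALL * G / J_BM                     # ball weight acting on the beam
    b4 = K_MOTOR / (R_ARM * J_BM)                # voltage-to-acceleration gain
    return a23, a41, b4


def build_state_space():
    """Assemble A, B, C for the state [x, x_dot, theta, theta_dot]."""
    a23, a41, b4 = model_entries()
    A = [[0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, a23, 0.0],
         [0.0, 0.0, 0.0, 1.0],
         [a41, 0.0, 0.0, A44]]
    B = [0.0, 0.0, 0.0, b4]
    C = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    return A, B, C


A, B, C = build_state_space()
```

Under these assumptions the computed entries agree with the rounded values 3.7731, -5.17 and 16.85 quoted in Equation 3.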
3. MRAC Methodology
Adaptive controllers generally consist of two loops: an inner loop (the normal feedback loop) and an outer loop (the parameter adjustment loop). The traditional MRAC strategy adjusts the controller parameters so that the response of the actual plant follows the response of the reference model, where both have the same reference input. Whitaker proposed an MRAC in 1958 whose block diagram is illustrated below (Fig. 2) [21].

Fig. 2. Basic block diagram of a model-reference adaptive control (MRAC) system

The reference model Gm(s) is used to create an optimal response of the adaptive system to the reference input Uc(s). The adjustable parameters θ describe the controller, and their values depend on the adaptation gain. The most important block in the system is the "adjustment mechanism," which is considered the heart of the MRAC, and its determination is crucial. For this work, the MIT rule, the adjustment mechanism originally used in MRAC, has been chosen. For perfect tracking between the output of the plant (y) and the output of the reference model (ym), the squared-error cost function must be minimized. The error between y and ym is given by Equation 5.

$$E(s) = y(s) - y_m(s) \tag{5}$$

According to the MIT rule, the cost function is defined as:

$$J(\theta) = \frac{e^2(\theta)}{2} \tag{6}$$

where θ, the controller parameter vector, is adjusted to drive J toward zero. The parameter adjustment mechanism shown in Equation 7 is called the MIT rule.

$$\frac{d\theta}{dt} = -\gamma \frac{\partial J}{\partial \theta} = -\gamma e \frac{\partial e}{\partial \theta} \tag{7}$$

where ∂e/∂θ is the sensitivity derivative of the error, which indicates how the error changes with respect to θ, and γ is the adaptation gain. The selection of γ is crucial to reduce the error, and there are many methods to select its value. In this paper, we have used the GA adjustment mechanism. As the MIT rule is a gradient scheme which aims to minimize the squared model error e² [22], the parameter is changed in the direction of the negative gradient of J. If the process is linear with the transfer function k·G(s), where the gain k is unknown, the underlying design aims to provide a system with the transfer function km·G(s), where the value of km is known [2, 23]. From Equation 5:

$$E(s) = k G(s) U(s) - k_m G(s) U_c(s) \tag{8}$$

Defining a control law:

$$U(s) = \theta U_c(s) \tag{9}$$

Substituting Equation 9 in Equation 8:

$$E(s) = k G(s)\, \theta\, U_c(s) - k_m G(s) U_c(s) \tag{10}$$

Taking the partial derivative:

$$\frac{\partial E(s)}{\partial \theta} = k G(s) U_c(s) \tag{11}$$

From Figure 2:

$$y_m = k_m G(s) U_c(s), \quad \text{so} \quad G(s) U_c(s) = \frac{y_m}{k_m}$$

Substituting this in Equation 11:

$$\frac{\partial E(s)}{\partial \theta} = \frac{k}{k_m}\, y_m(s) \tag{12}$$

Substituting Equation 12 in Equation 7,

$$\frac{d\theta}{dt} = -\gamma e \frac{k}{k_m} y_m = -\gamma' e\, y_m \tag{13}$$

where γ′ = γk/km. Equation 13, which gives the law for adjusting the parameter θ, is represented in Figure 3.

Fig. 3. MIT rule for adjusting feed forward gain

A reference model, whose pole positions determine the stability of the whole system, has been selected. Its output ym is the desired position of the ball on the beam, x, and its input is Uc(s), the voltage to the motor that rotates the beam forward and backward. Reference model parameters have been selected so that the poles of the transfer function, x1 and x2, lie in the left half of the s-plane. For the ball-beam system, the most important specifications are the overshoot and the settling time, so that the ball reaches its desired position within a specific time (settling time) without going far beyond it (overshoot). For this research, the requirements are %OS ≤ 10% and Ts ≤ 3 s. To achieve these values, the transfer function with poles x1,2 = −2.5 ± i1.3229 is chosen for the reference model. These poles correspond to ωn ≈ 2.83 rad/s and ζ ≈ 0.88, for which the standard second-order estimates give an overshoot below 1% and a 2% settling time of about 4/(ζωn) = 1.6 s. The transfer function Gm(s) of the reference model is defined as:

$$G_m(s) = \frac{y_m}{U_c} = \frac{k_m}{(s - x_1)(s - x_2)} = \frac{8}{s^2 + 5s + 8} \tag{14}$$
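The MIT-rule update of Equation 13 can be illustrated with a minimal simulation sketch. It is not the paper's Simulink model: for brevity it assumes a first-order G(s) = 1/(s + 1) instead of the ball-beam dynamics, a constant reference input, forward-Euler integration, and hypothetical values k = 2, km = 1 and γ′ = 0.5. With these choices the feedforward gain θ should settle near km/k.

```python
def simulate_mit_rule(k=2.0, km=1.0, gamma=0.5, uc=1.0, dt=1e-3, t_end=40.0):
    """Forward-Euler simulation of MIT-rule gain adaptation.

    Plant:           y'  = -y  + k  * u    (k*G(s),  G(s) = 1/(s + 1), assumed)
    Reference model: ym' = -ym + km * uc   (km*G(s))
    All numeric defaults are illustrative assumptions.
    """
    y = ym = theta = 0.0
    for _ in range(int(round(t_end / dt))):
        u = theta * uc              # control law, Equation 9
        e = y - ym                  # model error, Equation 5
        dy = -y + k * u             # plant dynamics
        dym = -ym + km * uc         # reference model dynamics
        dtheta = -gamma * e * ym    # MIT rule, Equation 13
        y += dt * dy
        ym += dt * dym
        theta += dt * dtheta
    return theta, y, ym


theta, y, ym = simulate_mit_rule()
```

In this toy setting the adapted gain approaches km/k = 0.5 and the plant output converges to the reference model output; a larger γ′ speeds up adaptation at the cost of a more oscillatory transient.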
4. MRAC with LQR Controller and GA Optimization Method
As the adjustment mechanism requires a method to choose θ values, the GA optimization method will be used with the MRAC. This combination is called the Genetic Model Reference Adaptive Control (GMRAC). Figure 4 shows the schematic representation of GMRAC. The error between the outputs of the reference model and the plant is used to drive the linear quadratic regulator (LQR) controller parameters. The reference model is designed based on both control specifications and the position controller. An appropriate selection of the reference model leads to the stabilization of the entire system. To design the genetic adaptive controller, the behavior of the ball and beam system is compared with the output of the reference model. The genetic algorithm (GA) can be applied to tune the weight matrices Q and R of the LQR controller, which are unknown and approximated to reference values per requirement, to ensure optimal control performance at nominal operating conditions. By using the approximation and adaptation of the reference model, the error derivatives will be calculated based on the GA [24].
Fig. 4. GA-tuned LQR controller based on model-reference approach
For the LQR controller, the cost function used to find the values of Q and R, which form the adjustable parameter vector θ, is given in Equation 15. The goal is to minimize its value.

$$J = \int_0^{\infty} \left( x^T Q x + u^T R u \right) dt \tag{15}$$

where R is the control-weighting matrix and Q is the state-weighting matrix. They are usually square and symmetric, and they are chosen to penalize the control signal and the state variables respectively. Choosing a larger R means keeping the control input u(t) smaller to keep J small, while choosing a larger Q means keeping the state variables x(t) smaller. The other element that needs to be found is the P matrix, which is the solution of the algebraic Riccati equation given in Equation 16. The K matrix is then found using Equation 17, and the optimal control signal u using Equation 18.

$$A^T P + P A + Q - P B R^{-1} B^T P = 0 \tag{16}$$

$$K = R^{-1} B^T P \tag{17}$$

$$u = -K X(t) \tag{18}$$

where K = [k_1 k_2 … k_n] and X = [x_1 x_2 … x_n]^T. For our system, n = 4, as the system has four state variables. The closed-loop system that has the optimal eigenvalues is given by:

$$\dot{x} = A_c x = (A - B K) x \tag{19}$$
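To make the roles of Q, R, P and K in Equations 15 to 19 concrete, the following sketch solves the scalar case (n = 1), where the Riccati equation reduces to a quadratic with a closed-form positive root. It is an illustration only: the plant numbers a and b are hypothetical and are not taken from the ball-beam model, for which the paper relies on MATLAB.

```python
import math


def scalar_lqr(a, b, q, r):
    """LQR for the scalar plant x' = a*x + b*u with cost integral(q*x^2 + r*u^2).

    Equation 16 becomes 2*a*p + q - (b*p)^2 / r = 0; its positive root is
    the stabilizing solution. Returns (p, k) with k from Equation 17.
    """
    p = r * (a + math.sqrt(a * a + q * b * b / r)) / (b * b)
    k = b * p / r
    return p, k


def riccati_residual(a, b, q, r, p):
    """Left-hand side of the scalar Riccati equation; zero at the solution."""
    return 2.0 * a * p + q - (b * p) ** 2 / r


# Hypothetical unstable plant (a > 0) with unit weights.
a, b = 1.0, 1.0
p, k = scalar_lqr(a, b, q=1.0, r=1.0)
closed_loop_pole = a - b * k          # scalar version of Equation 19

# Effect of the weights: a larger r penalizes control effort (smaller gain),
# a larger q penalizes state deviation (larger gain).
_, k_large_r = scalar_lqr(a, b, q=1.0, r=10.0)
_, k_large_q = scalar_lqr(a, b, q=10.0, r=1.0)
```

The closed-loop pole a − bk is negative for any q > 0 and r > 0, which mirrors the stabilizing property of the matrix solution used for the four-state system.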
The genetic algorithm is a random search method that mimics the process of natural evolution. The GA begins with no knowledge of the correct solution and relies on the response from its environment and on evolution operators to find the best solution. The application of the basic operations permits the creation of new individuals, which have the opportunity to be better than their parents. This process is repeated until it reaches individuals that represent the optimal solution. The architecture of the GA is shown in Figure 5 [25, 26, 27]. The tuning procedure using the GA starts with the definition of the chromosome representation θ = [q11, q22, q33, q44, R]. As illustrated in Figure 6, the chromosome is defined by five values that correspond to the five gains to be adjusted in order to achieve satisfactory behavior [28].
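The GA loop described above can be sketched in a few lines. This is a simplified illustration rather than the MATLAB implementation used in the paper: the cost function is a hypothetical quadratic stand-in for the simulation-based tracking error, the gene bounds, the seed and the use of elitism are assumptions, and interpreting the selection probability as the parameter q of normalized geometric ranking (rank r selected with probability proportional to (1 − q)^r) is also an assumption. The population size, generation count, selection probability and mutation probability follow the settings reported in Table 2.

```python
import random


def normalized_geometric_probs(n, q):
    """Selection probability for rank r (0 = best): q_norm * (1 - q)**r."""
    q_norm = q / (1.0 - (1.0 - q) ** n)
    return [q_norm * (1.0 - q) ** r for r in range(n)]


def run_ga(cost, bounds, pop_size=50, generations=100,
           q_select=0.05, p_mut=0.01, seed=0):
    """Minimize `cost` over chromosomes bounded gene-by-gene by `bounds`."""
    rng = random.Random(seed)

    def new_gene(i):
        return rng.uniform(bounds[i][0], bounds[i][1])

    pop = [[new_gene(i) for i in range(len(bounds))] for _ in range(pop_size)]
    probs = normalized_geometric_probs(pop_size, q_select)
    history = []
    for _ in range(generations):
        pop.sort(key=cost)                    # rank individuals, best first
        history.append(cost(pop[0]))
        children = [pop[0][:]]                # elitism (assumption)
        while len(children) < pop_size:
            p1, p2 = rng.choices(pop, weights=probs, k=2)
            # Scattered crossover: each gene comes from a random parent.
            child = [g1 if rng.random() < 0.5 else g2
                     for g1, g2 in zip(p1, p2)]
            # Uniform mutation: redraw a gene inside its bounds.
            child = [new_gene(i) if rng.random() < p_mut else g
                     for i, g in enumerate(child)]
            children.append(child)
        pop = children
    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history


# Hypothetical stand-in for J(theta); the real cost requires simulating the
# closed loop and comparing it with the reference model.
TARGET = [1.0, 2.0, 3.0, 4.0, 5.0]


def toy_cost(theta):
    return sum((t - g) ** 2 for t, g in zip(TARGET, theta)) / 2.0


BOUNDS = [(0.0, 50.0)] * 5                    # [q11, q22, q33, q44, R]
best, history = run_ga(toy_cost, BOUNDS)
```

Because the best individual is copied unchanged into each new generation, the recorded best cost never increases from one generation to the next.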
Fig. 5. Simulation flow chart for the computation of GA-LQR controller parameters

Fig. 6. Chromosome definition θ = [Q, R]

Choosing a suitable objective function is considered the most essential step of the GA tuning strategy, as it is used to assess the fitness value of every chromosome. The objective function, J, is half the sum of the squared errors between the ball-beam response and the reference model response along the same trajectory. It is crucial to use the squared error in the objective function in order to have more accurate results for smaller values of error.

$$J = \sum \frac{(y - y_m)^2}{2}$$

where y is the system response, while ym is the model response. The genetic algorithm parameters chosen for the purpose of tuning are shown in Table 2. They were chosen depending on the system specifications, as these parameters differ from one system to another.

Tab. 2. Parameters of GA
GA property                        Value/Method
Population Size                    50
Maximum Number Of Generations      100
Fitness Function                   J(θ) = e²(θ)/2
Selection Method                   Normalized Geometric Selection
Probability Of Selection           0.05
Crossover Method                   Scattering
Mutation Method                    Uniform Mutation
Mutation Probability               0.01

5. MRAC Simulation and Results
The simulation of the GMRAC with the system has been carried out with MATLAB and Simulink in order to examine its effectiveness. MATLAB m-files can be used to build the controller and the optimization method; at the same time, Simulink can be used to show the results and analyze them. The first step of the simulation is running the GA program. For the GA, the convergence curve for each gain is called a particle. These particles, q11, q22, q33, q44 and R, are plotted in Figure 7 with population size 50 to give an initial idea of how the GA converged to its final values.
Fig. 7. Convergence curve for Q and R matrices of LQR Controller with population size 50 and reference model Gm

The control weighting matrix R and the state weighting matrix Q obtained from the GA are:

$$
Q = \begin{bmatrix}
35.1 & 0 & 0 & 0 \\
0 & 0.000191313 & 0 & 0 \\
0 & 0 & 0.99505 & 0 \\
0 & 0 & 0 & 0.998911
\end{bmatrix}, \qquad R = 0.349986
$$

After finding the Q and R values, the closed-loop poles of the system and controller are:

$$P_{cl} = \{-108.87,\; -1.81,\; -0.9 \pm i\,1.55\}$$

In addition, the solution of the algebraic Riccati equation, matrix P, is found to be:

$$
P = \begin{bmatrix}
39.4176 & 22.1122 & 22.7911 & 0.2022 \\
22.1122 & 18.7683 & 26.1233 & 0.2340 \\
22.7911 & 26.1233 & 54.8376 & 0.4954 \\
0.2022 & 0.2340 & 0.4954 & 0.0091
\end{bmatrix}
$$
Finally, the feedback gain matrix K can be found using Equation 17. After running the simulation for the whole system, the response of the control law u is shown in Figure 8.
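As an illustrative cross-check, Equation 17 can be applied to the reported P and R, and the trace of the resulting closed-loop matrix A − BK can be compared with the sum of the reported closed-loop poles (the trace of a matrix equals the sum of its eigenvalues). The sketch below uses only the rounded values printed in the paper, so small discrepancies are expected; the variable and function names are hypothetical.

```python
# Values as reported in the paper (rounded).
A = [[0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 3.7731, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [-5.17, 0.0, 0.0, -105.1]]
B = [0.0, 0.0, 0.0, 16.85]
R = 0.349986
P = [[39.4176, 22.1122, 22.7911, 0.2022],
     [22.1122, 18.7683, 26.1233, 0.2340],
     [22.7911, 26.1233, 54.8376, 0.4954],
     [0.2022, 0.2340, 0.4954, 0.0091]]

# Sum of the reported closed-loop poles; the imaginary parts cancel.
REPORTED_POLE_SUM = -108.87 - 1.81 + 2 * (-0.9)


def lqr_gain(P, B, R):
    """K = R^-1 * B^T * P for a single-input system (Equation 17)."""
    n = len(B)
    return [sum(B[i] * P[i][j] for i in range(n)) / R for j in range(n)]


def closed_loop(A, B, K):
    """A_c = A - B*K (Equation 19)."""
    n = len(B)
    return [[A[i][j] - B[i] * K[j] for j in range(n)] for i in range(n)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


K = lqr_gain(P, B, R)
A_c = closed_loop(A, B, K)
trace_gap = abs(trace(A_c) - REPORTED_POLE_SUM)
```

With the rounded inputs, the trace of A − BK and the sum of the reported poles agree to within a few hundredths, which suggests the reported P, R and pole locations are mutually consistent.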
Fig. 8. Response of control law u

The response of the LQR controller tuned by the GA, according to the fitness function with reference model (Gm), is illustrated in Figure 9.

Fig. 9. Response of ball-beam using GMRAC-LQR controller for reference model Gm

6. Conclusion
In this paper, the model of the ball-beam system has been presented and discussed in detail, and a linearized model with the modern state-space method has been used around the horizontal region. A Model Reference Adaptive Control (MRAC) using the MIT rule with the LQR controller was designed to control the position of a ball over a beam. As an optimization method, the genetic algorithm (GA) has been used for parameter tuning of the LQR controller. These results have been tested in Simulink, and they show satisfactory performance. Adaptation of LQR based on GMRAC techniques improves the performance of the system, providing quick tracking and steady-state control (%OS ≤ 10% and Ts ≤ 3 s).

AUTHORS
Abdulla I. Abdullah* – Systems and Control Engineering Department, Ninevah University, Mosul, 40001, Iraq, E-mail: abdullah.abdullah@uoninevah.edu.iq.
Ali Mahmood – Systems and Control Engineering Department, Ninevah University, Mosul, 40001, Iraq, E-mail: ali.mahmood@uoninevah.edu.iq.
Mohammad A. Thanoon – Systems and Control Engineering Department, Ninevah University, Mosul, 40001, Iraq, E-mail: mohammed.alsayed@uoninevah.edu.iq.
*Corresponding author

References
[1] P. Jain and M. J. Nigam, "Design of a Model Reference Adaptive Controller Using Modified MIT Rule for a Second Order System," vol. 3, no. 4, 2013, pp. 477–484.
[2] M. Mohan and P. CP, "A model reference adaptive pi controller for the speed control of three phase induction motor," International Journal of Engineering Research and, vol. V5, no. 07, 2016.
[3] X.-J. Liu, F. Lara-Rosano, and C. W. Chan, "Model-reference adaptive control based on Neurofuzzy Networks," IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), vol. 34, no. 3, 2004, pp. 302–309.
[4] S. Kersting and M. Buss, "Direct and indirect model reference adaptive control for multivariable piecewise affine systems," IEEE Transactions on Automatic Control, vol. 62, no. 11, 2017, pp. 5634–5649.
[5] A. Abdulla, I. Mohammed, and A. Jasim, "Roll control system design using auto tuning LQR technique," International Journal of Engineering and Innovative Technology, V7, no. 01, 2017.
[6] L. Lublin and M. Athans, "Linear quadratic regulator control," in The Control Systems Handbook: Control System Advanced Methods, Second Edition, 2010.
[7] Y. S. Dawood, A. K. Mahmood, and M. A. Ibrahim, "Comparison of PID, GA and Fuzzy Logic Controllers for Cruise Control System," Int. J. Comput. Digit. Syst., vol. 7, no. 05, 2018, pp. 311–319.
[8] V. Dhiman, G. Singh, and M. Kumar, "Modeling and control of underactuated system using LQR controller based on GA," in Lecture Notes in Mechanical Engineering, 2019.
[9] A. G. Pillai, E. R. Samuel, and A. Unnikrishnan, "Analysis of optimised LQR controller using genetic algorithm for isolated power system," in Advances in Intelligent Systems and Computing, 2019, vol. 939.
[10] X.-S. Yang, “Genetic Algorithms,” Nature-Inspired Optim. Algorithms, pp. 91–100, Jan. 2021
[11] M. Rezaee and R. Fathi, “A new design for automatic ball balancer to improve its performance,” Mech. Mach. Theory, vol. 94, 2015, pp. 165–176. [12] E. A.Rosales, “A Ball-on-Beam Project Kit,” Proc. 22nd …, 2004.
[13] M. Shah, R. Ali, and F. M. Malik, “Control of ball and beam with LQR control scheme using flatness based approach,” 2019
[14] X. Li and W. Yu, “Synchronization of ball and beam systems with neural compensation,” Int. J. Control. Autom. Syst., vol. 8, no. 3, 2010
[15] C. G. Bolívar-Vincenty and Beauchamp-Báez, “Modelling the Ball-and-Beam System From Newtonian Mechanics and from Lagrange Methods,” Twelfth LACCEI Lat. Am. Caribb. Conf. Eng. Technol., vol. 1, 2014.
[16] H. R. Shirke and N. R. Kulkarni, “Mathematical Modeling, Simulation and Control of Ball and Beam System,” Int. J. Eng. Res., vol. V4, no. 03, Mar. 2015.
[17] F. A. Salem, “Mechatronics design of ball and beam system: education and research,” Mechatronics, vol. 5, no. 4, 2015.
[18] M. Keshmiri, A. F. Jahromi, A. Mohebbi, M. H. Amoozgar, and W. F. Xie, “Modeling and control of ball and beam system using model based and non-model based control approaches,” Int. J. Smart Sens. Intell. Syst., vol. 5, no. 1, 2012.
[19] D. Colón, Y. Smiljanic Andrade, A. M. Bueno, I. Severino Diniz, and J. Manoel Balthazar, “Modeling, control and implementation of a Ball and Beam system,” in 22nd International Congress of Mechanical Engineering-COBEM, 2013.
[20] M. Nokhbeh and D. Khashabi, “Modelling and Control of Ball-Plate System,” Math. Model., 2011.
[21] K. B. Pathak Scholar, “MRAC BASED DC SERVO MOTOR MOTION CONTROL,” Int. J. Adv. Res. Eng. Technol., vol. 7, no. 2, 2016.
[22] S. A. Kochummen, N. E. Jaffar, and A. Nasar, “Model Reference Adaptive Controller designs of steam turbine speed based on MIT Rule,” 2016.
[23] M. Swathi and P. Ramesh, “Modeling and analysis of model reference adaptive control by using MIT and modified MIT rule for speed control of DC motor,” 2017.
[24] W. Alharbi and B. Gomm, “Genetic Algorithm Optimisation of PID Controllers for a Multivariable Process,” Int. J. Recent Contrib. from Eng. Sci. IT, vol. 5, no. 1, 2017.
[25] N. Razmjooy, M. Ramezani, and A. Namadchian, “A new LQR optimal control for a single-link flexible joint robot manipulator based on grey wolf optimizer,” Majlesi J. Electr. Eng., vol. 10, no. 3, 2016.
[26] A. Mahmood, M. Almaged, and A. Abdulla, “Antenna azimuth position control using fractional order PID controller based on genetic algorithm,” in IOP Conference Series: Materials Science and Engineering, vol. 1152, no. 1, p. 012016, IOP Publishing, 2021.
[27] A. Mahmood, A. Abdulla, and I. Mohammed, “Helicopter Stabilization Using Integer and Fractional Order PID Controller Based on Genetic Algorithm,” 2020.
[28] P. Shen, “LQR control of double inverted pendulum based on genetic algorithm,” in 2011 9th World Congress on Intelligent Control and Automation, IEEE, 2011, pp. 386–389.