Journal of Automation, Mobile Robotics and Intelligent Systems, vol. 16, no. 4 (2022)


Journal of Automation, Mobile Robotics and Intelligent Systems WWW.JAMRIS.ORG  pISSN 1897-8649 (PRINT)/eISSN 2080-2145 (ONLINE)  VOLUME 16, N° 4, 2022

Indexed in SCOPUS


Journal of Automation, Mobile Robotics and Intelligent Systems
A peer-reviewed quarterly focusing on new achievements in the following fields: automation, systems and control, autonomous systems, multiagent systems, decision-making and decision support, robotics, mechatronics, data sciences, new computing paradigms.

Editor-in-Chief
Janusz Kacprzyk (Polish Academy of Sciences, Łukasiewicz-PIAP, Poland)

Advisory Board
Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA)
Kaoru Hirota (Tokyo Institute of Technology, Japan)
Witold Pedrycz (ECERF, University of Alberta, Canada)

Typesetting
PanDawer, www.pandawer.pl

Webmaster
TOMP, www.tomp.pl

Editorial Office ŁUKASIEWICZ Research Network – Industrial Research Institute for Automation and Measurements PIAP Al. Jerozolimskie 202, 02-486 Warsaw, Poland (www.jamris.org) tel. +48-22-8740109, e-mail: office@jamris.org

Co-Editors
Roman Szewczyk (Łukasiewicz-PIAP, Warsaw University of Technology, Poland)
Oscar Castillo (Tijuana Institute of Technology, Mexico)
Marek Zaremba (University of Quebec, Canada)

The reference version of the journal is the e-version. Printed in 100 copies.

Executive Editor

Katarzyna Rzeplińska-Rykała, e-mail: office@jamris.org (Łukasiewicz-PIAP, Poland)

Articles are reviewed, excluding advertisements and descriptions of products. Papers published currently are available for non-commercial use under the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) license. Details are available at: https://www.jamris.org/index.php/JAMRIS/LicenseToPublish

Associate Editor
Piotr Skrzypczyński (Poznań University of Technology, Poland)

Statistical Editor
Małgorzata Kaliczyńska (Łukasiewicz-PIAP, Poland)

Editorial Board:
Chairman – Janusz Kacprzyk (Polish Academy of Sciences, Łukasiewicz-PIAP, Poland)
Plamen Angelov (Lancaster University, UK)
Adam Borkowski (Polish Academy of Sciences, Poland)
Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany)
Bice Cavallo (University of Naples Federico II, Italy)
Chin Chen Chang (Feng Chia University, Taiwan)
Jorge Manuel Miranda Dias (University of Coimbra, Portugal)
Andries Engelbrecht (University of Stellenbosch, Republic of South Africa)
Pablo Estévez (University of Chile)
Bogdan Gabrys (Bournemouth University, UK)
Fernando Gomide (University of Campinas, Brazil)
Aboul Ella Hassanien (Cairo University, Egypt)
Joachim Hertzberg (Osnabrück University, Germany)
Tadeusz Kaczorek (Białystok University of Technology, Poland)
Nikola Kasabov (Auckland University of Technology, New Zealand)
Marian P. Kaźmierkowski (Warsaw University of Technology, Poland)
Laszlo T. Kóczy (Szechenyi Istvan University, Gyor and Budapest University of Technology and Economics, Hungary)
Józef Korbicz (University of Zielona Góra, Poland)
Eckart Kramer (Fachhochschule Eberswalde, Germany)
Rudolf Kruse (Otto-von-Guericke-Universität, Germany)
Ching-Teng Lin (National Chiao-Tung University, Taiwan)
Piotr Kulczycki (AGH University of Science and Technology, Poland)
Andrew Kusiak (University of Iowa, USA)
Mark Last (Ben-Gurion University, Israel)
Anthony Maciejewski (Colorado State University, USA)
Krzysztof Malinowski (Warsaw University of Technology, Poland)
Andrzej Masłowski (Warsaw University of Technology, Poland)
Patricia Melin (Tijuana Institute of Technology, Mexico)
Fazel Naghdy (University of Wollongong, Australia)
Zbigniew Nahorski (Polish Academy of Sciences, Poland)
Nadia Nedjah (State University of Rio de Janeiro, Brazil)
Dmitry A. Novikov (Institute of Control Sciences, Russian Academy of Sciences, Russia)
Duc Truong Pham (Birmingham University, UK)
Lech Polkowski (University of Warmia and Mazury, Poland)
Alain Pruski (University of Metz, France)
Rita Ribeiro (UNINOVA, Instituto de Desenvolvimento de Novas Tecnologias, Portugal)
Imre Rudas (Óbuda University, Hungary)
Leszek Rutkowski (Częstochowa University of Technology, Poland)
Alessandro Saffiotti (Örebro University, Sweden)
Klaus Schilling (Julius-Maximilians-University Wuerzburg, Germany)
Vassil Sgurev (Bulgarian Academy of Sciences, Department of Intelligent Systems, Bulgaria)
Helena Szczerbicka (Leibniz Universität, Germany)
Ryszard Tadeusiewicz (AGH University of Science and Technology, Poland)
Stanisław Tarasiewicz (University of Laval, Canada)
Piotr Tatjewski (Warsaw University of Technology, Poland)
Rene Wamkeue (University of Quebec, Canada)
Janusz Zalewski (Florida Gulf Coast University, USA)
Teresa Zielińska (Warsaw University of Technology, Poland)

Publisher: ŁUKASIEWICZ Research Network – Industrial Research Institute for Automation and Measurements PIAP

All rights reserved ©




Concept of Using the Brain-Computer Interface to Control Hand Prosthesis
Submitted: 8th November 2022; accepted: 7th February 2023

Julia Żaba and Szczepan Paszkiel

DOI: 10.14313/JAMRIS/4-2022/27

Abstract: This study examines the possibility of implementing intelligent artificial limbs for patients after injuries or amputations. Brain-computer technology allows signals to be acquired and sent between the brain and an external device. Upper limb prostheses, however, are quite a complicated tool, because the hand itself has a very complex structure and consists of several joints. The most complicated joint is undoubtedly the saddle joint, which is located at the base of the thumb. Adequate anatomical knowledge is needed to construct a prosthesis that is easy to use and resembles a human hand as closely as possible. It is also important to create the right control system, with the right software, that will easily work together with the brain-computer interface. Therefore, the proposed solution in this work consists of three parts: the Emotiv EPOC+ NeuroHeadset, a control system made of a servo and an Arduino UNO board (with dedicated software), and a hand prosthesis model made in the three-dimensional graphics program Blender and printed using a 3D printer. Such a hand prosthesis controlled by a signal from the brain could help people with disabilities after amputations and people who have damaged innervation at the stump site.

Keywords: BCI, EEG, hand prosthesis, hand, prosthesis, 3D printing

© 2022 Żaba and Paszkiel. This is an open access article licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license (https://creativecommons.org/licenses/by-nc-nd/4.0).

1. Introduction

Brain testing uses several methods, one of which is the measurement of brain waves. These brain waves can be collected in the form of electrical signals. The acquisition of brain signals can be done invasively or non-invasively. The invasive method involves placing sensors inside the skull, but this is a risky course of action. The other method is non-invasive, with the sensors placed on the scalp. However, this method is noisy, making it difficult to extract useful information. The connection between the brain and an external device is called the brain-computer interface (BCI) [1-3]. Currently, the most popular data source for BCI is EEG signals from surface brain activity, because this type of measurement is non-invasive [4, 5].

BCI can improve the quality of life for people with severe motor disabilities. BCI captures the user's brain activity and translates it into commands that control an effector such as a computer cursor, a robotic limb, or a functional electrical stimulation device [6]. BCI has many applications, for example in medicine. R. Na et al. [7] presented the control of an electric wheelchair using BCI. In their design they used steady-state visual evoked potentials (SSVEP). The wheelchair uses a hybrid visual stimulator that combines the advantages of a liquid crystal display (LCD) and light emitting diodes (LED). M. Vilela and L. R. Hochberg [6] described new developments to improve the user experience of BCI with effector robots. Fully efficient manipulation of robots and prosthetic arms via a BCI system is challenging due to the inherent need to decode multidimensional, preferably real-time, control commands from the user's neural activity. Such functionality is fundamental if BCI-controlled robotic or prosthetic limbs are to be used for daily activities. BCI also has applications in rehabilitation, such as BCI-controlled robots designed for motor assistance, which help paralyzed patients improve upper and lower limb mobility [8]. Different algorithms are used to classify brain signals, and channel selection is a key topic in BCI. Imagined hand movement is a frequently used component of the learning data set for these algorithms. For example, Miladinović et al. [9] used a sequence of 70 tasks alternating between imagining a right-hand movement and resting. S. Soman and B. K. Murthy [10] created a BCI-based design for generating synthesized speech that operates on eye blinks detected from the user's electroencephalogram signals. Khan et al. [11] developed a broad overview of the applications of BCI interfaces in the context of the upper extremity. Gubert et al. [12] analyzed left- and right-hand motor imagery. They used publicly available databases and the CSP (Common Spatial Patterns) algorithm. Hernández-Del-Toro et al. [13] used the Emotiv EPOC interface. As a test sequence, they used a set of imagined words spoken in Spanish (up, down, left, right, choice), repeated randomly 100 times each by 27 individuals. Fourteen EEG channels were used, and the sampling rate was 128 Hz. The discrete wavelet transform (DWT) and fractal methods, among others, were used to analyze the signals.

The nearest neighbor method, the decision tree method, and the support vector machine (SVM) were used for classification, among other tools. Task-irrelevant and redundant channels used in BCI can lead to low classification accuracy, high computational complexity, and application inconvenience. By choosing optimal channels, the performance of a BCI can improve significantly. B. Shi et al. [14] proposed a novel binary harmony search (BHS) to select optimal channel sets and to optimize the accuracy of the BCI system. BHS is run on learning datasets to select optimal channels, and test datasets are used to evaluate the classification performance on the selected channels. The authors proposed the BHS method for selecting optimal channels in MI-based BCI, and their results validate the BHS algorithm as a channel selection method for motor imagery data. The BHS method, costing less computation time, gives better average test accuracy than steady-state genetic algorithms. The proposed method can improve the practicality and convenience of the BCI system. F. M. Noori et al. [11] proposed a new technique for determining optimal feature combinations and obtaining maximum classification performance for BCI based on functional near-infrared spectroscopy (fNIRS). The results of the proposed hybrid GA-SVM technique, which selects the optimal feature combinations for fNIRS-based BCI, provide opportunities to enhance classification performance. Janani et al. [15] applied a deep learning neural network algorithm to classify motor imagery based on infrared signals. Functional near-infrared spectroscopy (fNIRS) was used, in which infrared light passes through the hemodynamic system; the change in the absorption of infrared radiation depending on its wavelength is exploited. The principle of operation is similar to that of a blood oxygen saturation meter. BCI will also find application in neuroprosthetics. Neuroprosthetics is a combination of neuroscience and biomedical engineering. Implantable devices can significantly improve quality of life due to their unique performance. The combination of biomedical engineering and neuroprosthetics has led to the development of new hybrid biomaterials that meet the needs of ideal neuroprosthetics. The site of implantation of the prosthesis determines the type of material and the method of fabrication. P. Zarrintaj et al. [16] described the types of biomaterials used for bionic neuroprostheses. The diversity of neuroprosthetics necessitates the use of a wide range of materials, from organic to inorganic. However, using only metals, due to their incompatibility with soft tissues, can cause inflammation. Metal-polymer hybrids can reduce the mismatch between soft tissues and electrodes, where the polymer part can regulate the modulus of the metal. Moreover, different types of electrodes should be selected for different types of signal recording. Therefore, the selection of biomaterials for neuroprostheses is crucial and requires knowledge of the electrode implantation site and the material characteristics.


2. Examples of Implementation Concept in the Field of Artificial Hand

This article describes the concept of a proprietary BCI-controlled hand prosthesis. A hand prosthesis controlled by a signal from the brain can help people without a hand, people after amputations, and people with damaged innervation at the stump site. This solution uses a non-invasive method, so people who are not entirely convinced of this method can test whether it suits them without interfering with their body. The main goals are to select an EEG device, to design and construct a prototype of a hand prosthesis, and to select and program an appropriate control system. A prosthesis is a tool that replaces the missing limb and supports an amputee in carrying out daily tasks. Instead of passive devices that are purely aesthetic, current devices have improved functionality based on robotic technology. M. A. Abu Kasim et al. [17] presented a conceptual idea of using a non-invasive Emotiv headset to control a prosthetic hand using LabVIEW. The design targets a cost-effective upper limb prosthesis controlled by signal artifacts, namely facial expressions. The device can be used and controlled by paralyzed persons with limited communication skills via a graphical user interface (GUI). It is worth noting that the non-invasive BCI method was used in the project. The GUI is created with LabVIEW software connected to the Arduino board via a serial USB data connection. The use of body-powered prostheses can be tiring and lead to further compliance and prosthetic problems. BCI makes it possible to control prostheses for patients who are otherwise unable to operate such devices due to physical limitations. The problem with BCIs is that they usually require invasive recording methods, for which surgery needs to be performed. G. Lange et al. [18] presented a study testing the ability to control the movement of an upper limb prosthetic terminal device by classifying electroencephalogram data from actual grasping and releasing movements. They developed a novel EMG-assisted approach to classifying EEG data from hand movements. This demonstrates the possibility of more intuitive control of an upper limb prosthetic end device with a low-cost BCI, without the risk of invasive measurement. R. Alazrai, H. Alwanni, and M. I. Daoud [19] described a new EEG-based BCI system used to decode the movements of each finger within the same hand. It is based on the analysis of EEG signals using a quadratic time-frequency distribution (QTFD), specifically the Choi-Williams distribution (CWD). In particular, the CWD is used to characterize the time-varying spectral components of the EEG signals and to extract features that capture movement-related information. The extracted CWD-based features are used to create a two-tier classification structure that decodes the finger movements within the same hand. J. E. Downey, J. Brooks, and S. J. Bensmaia [20] described technologies designed to sense the state of the hand and its contact with objects and to connect with the peripheral and central nervous systems. The skillful manipulation of objects is based not only on a



sophisticated motor system that moves the arm and hand, but also on the accumulation of sensory signals that convey information about the consequences of these movements. The development of a skillful bionic hand therefore requires the restoration of both control and sensory signals. It is important that the bionic hand is well constructed and allows freedom of movement; to achieve this, the sensors need to be properly attached. Research aims to create artificial sensory feedback through electrical nerve stimulation in amputees or electrical brain stimulation in tetraplegic patients. While artificial sensory feedback, still in its early stages, is already giving bionic hands more dexterity, ongoing research to make artificial limbs more natural offers hope for further improvements. Guger et al. [21] presented a system that uses EEG for hand prosthesis control. Digital input/output channels are used to drive a remote control that is connected to a microcontroller controlling the prosthesis. The microcontroller receives commands from the remote control and regulates the speed of the grip. The technique of imagining movement was used to control the hand. After the appropriate beep was heard, the user had to imagine the movement of the left or right hand, depending on the arrow displayed on the monitor. Each trial took a few seconds, after which the EEG signal was classified and used to control the prosthesis. One session required as many as 160 trials, and the authors performed three sessions. The operation of the system is based on the BCI software and hardware. Matlab Simulink is used to calculate various parameters that describe the current EEG state in real time; Matlab also supports data acquisition, synchronization, and presentation of the experimental paradigm. G. R. Müller-Putz and G. Pfurtscheller [22] presented a prototype of a two-axis electric hand prosthesis control, which uses an asynchronous four-class BCI based on steady-state visual evoked potentials (SSVEP). The authors constructed a stimulation device. For the experiment with the prosthetic device, they modified the hand prosthesis in such a way that, in addition to the gripping function (opening and closing the fingers), it was also possible to rotate the wrist (left and right). Four red LEDs are mounted at specific locations on the prosthesis. The authors recruited four healthy participants. They performed four sessions of 40 trials, and the participants had to follow the instructions given to them by a beep. Users also had to focus on the appropriate flashing lights attached to the prosthesis to trigger the corresponding prosthetic action. The LED lights were not attached randomly; each was attached precisely to produce the right movement: one LED on the index finger to turn right, and one LED on the fifth finger to turn left. Two more LEDs were attached to the forearm: the first was used to open the hand, and the second to close it. The authors proved that an SSVEP-based BCI, operating in asynchronous mode, is feasible for the control of neuroprosthetic devices. T. Beyrouthy et al. [23] presented a preliminary design of a mind-controlled, intelligent,


3D-printed prosthetic arm. The arm is controlled by brain commands received from the headset via EEG. The arm is equipped with a network of intelligent sensors and actuators. This smart network provides the arm with normal hand functionality and smooth movements. The arm has different types of sensors, including temperature sensors, pressure sensors, ultrasonic proximity sensors, accelerometers, potentiometers, strain gauges, and gyroscopes. EEG signals are recorded using the Emotiv EPOC wireless headset. The EEG signals provided by the input unit are sampled and processed by the processing unit. The arm is equipped with a special servo and an Arduino microcontroller, which provides an appropriate interface between the mechanical and processing units. The multiple sensors allow the arm to interact with and adapt to the surrounding environment and to provide feedback to the patient. Constantine et al. [24] used a comprehensive model structure, from feature construction to classification, based on a neural network. The proposed architecture is complemented by the design and implementation of a hand prosthesis with multiple degrees of freedom (DOF). It incorporates a Field-Programmable Gate Array (FPGA) that converts electroencephalographic (EEG) signals into prosthetic movement. They also proposed a new channel selection and grouping technique matched to the subject's motor intentions. The model implemented with the proposed architecture achieved a classification accuracy of 93.7% and a short classification time on the FPGA. Their implementation shows that BCI techniques can be applied in FPGA practice. In their article, J. W. Sensinger, W. Hill, and M. Sybring [25] explore the many aspects that influence the ability of an upper limb prosthesis to affect a person's daily life. They argue that these influences can be categorized into four domains: aspects intrinsic to the person; factors focused on the design, control, and sensory feedback of the prosthesis; facets external to the person; and outcome measures used to evaluate devices, activities, or quality of life. The purpose of a prosthetic device is to improve a person's quality of life.

3. Materials and Methods

The methodology has three stages: acquisition of EEG data from the selected BCI device, design and printing of a 3D-printed prosthetic hand, and programming of the control system. The EEG signal acquisition device is the Emotiv EPOC+ NeuroHeadset, which has the following specifications [26]:
- 14 recording electrodes and 2 reference electrodes, offering optimal positioning for accurate spatial resolution;
- channel names based on the international 10-20 electrode location system: AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4, with CMS/DRL references at locations P3/P4;
- sequential sampling with a single ADC at 256 samples per second (2048 Hz internal);
- 14-16 bit resolution per channel with a frequency response of 0.16-43 Hz;
- Bluetooth Smart 4.0 LE connectivity;
- typical operating time from a full charge of 12 hours.

The control system consists of a microcontroller (Arduino UNO) and a servo (Feetech FT6335M standard servo). Arduino is an open-source electronics platform based on easy-to-use hardware and software. The Arduino UNO is a microcontroller board based on the ATmega328P. It has 14 digital input/output pins (6 of which can be used as PWM outputs), 6 analog inputs, a 16 MHz ceramic resonator (CSTCE16M0V53-R0), a USB connector, a power jack, an ICSP header, and a reset button. The ATmega328P is a low-power, 8-bit CMOS microcontroller based on the AVR® enhanced RISC architecture. By executing instructions in a single clock cycle, the device achieves a throughput approaching one million instructions per second per megahertz, balancing power consumption against processing speed. The Arduino UNO board operates at 5 V, has 32 kB of Flash memory and 2 kB of RAM, and offers popular communication interfaces [27]. The prosthetic hand model was designed in Blender, a free and open-source 3D modeling program. Blender was originally developed by NeoGeo and has been maintained by the Blender Foundation since 2002; from the beginning, Blender's main programmer was Ton Roosendaal. It is available for various hardware and software platforms, including Microsoft Windows and macOS, among others. The program caters to the needs of 3D graphic designers: it can model, animate, simulate, render, compose, track motion, edit video, and create 2D and 3D animation [28].

4. Results

It is reasonable to assume that, as a result of the loss of the hand, the brachial plexus is not functioning or may be damaged. This is a bundle of nerve fibers running from the spine all the way to the hand. It is important for the patient to be able to control the prosthetic hand independently with the help of EEG signals. The task of such a prosthesis is therefore to execute commands in correlation with the Emotiv EPOC+ NeuroHeadset device. This solution allows the patient to fully control the hand even if the nerves in the amputated limb are not fully functional.


4.1. EEG Signal Acquisition Device

In the global market, there are many companies producing brain-computer interface devices. However, two companies play a key role: Emotiv Systems and NeuroSky. The device that we chose to acquire EEG signals is the Emotiv EPOC+ NeuroHeadset. It allows communication with a computer based on brain activity, facial muscle tension, and emotions. It has 14 recording electrodes and 2 reference electrodes, which is sufficient in this case. It connects wirelessly to the computer and to mobile devices and has 9-axis motion sensors. It stands out for its long working time (up to 12 hours), and the device sets up quickly. It is also important to remember to properly moisten the reference sensors with saline solution so that signal reception occurs properly. The box of the EPOC+ NeuroHeadset (Fig. 1) contains:
- the brain-computer interface with a built-in lithium battery,
- a universal USB receiver,
- a humidifier pack,
- saline solution,
- a USB charger with a Mini-B connector,
- a quick start guide.

Fig. 1. Basic components of the Emotiv EPOC+ NeuroHeadset

4.2. Expressiv Suite Functions

The Expressiv Suite app in the Emotiv Control Panel features an avatar that mimics facial expressions and shows teeth clenching, left and right eye movements (Fig. 2), eye blinking, left or right eye blinking, eyebrow raising, and smiling.

Fig. 2. Screenshot of the Expressiv Suite application while looking right



In this app, there is a control panel next to the avatar that allows the user to adjust the sensitivity of each facial expression with sliders and to check the effectiveness of each expression. If the Expressiv Suite app does not respond easily to a particular facial expression, the slider can be used to increase the sensitivity; if the stimulus is triggered too easily, causing an unwanted result, the slider can be used to decrease it. The sensitivity is increased or decreased by moving the slider to the right or left, respectively. Each of the seven types of facial expressions can also be assigned an action in the form of any combination of keystrokes or mouse buttons. This makes it possible to operate applications, play games, or control a device such as a wheelchair or prosthesis using facial expressions. The EmoKey tool is used for this (Fig. 3). Next to each slider is a key button, which is used to configure that facial expression for EmoKey. EmoKey links Emotiv's detection technology with applications by converting detected events into any combination of keystrokes. EmoKey runs in the background and allows the user to create mappings. EmoKey mappings are relatively simple, for example linking the detection of teeth clenching to a mouse button press; the app then captures the moment when the user clenches their teeth. To configure a facial expression for EmoKey, the appropriate expression is selected and the Key button next to its description (for example, clench teeth) is clicked, which brings up a configuration dialog. The facial expression can also be set to act continuously by selecting Hold in the key box. There are further configuration options, such as the key hold time and the key trigger delay; key presses are sent only to the active application window. Some expressions have the condition "occurs," and others have "is equal to," "is greater than," or "is less than." For example, typing "0.3" in the condition field causes the clench teeth action to trigger only when a clench greater than 30% of full scale is detected. EmoKey mappings can also be managed and saved using the EmoKey menu at the top of the Control Panel window; mappings can be loaded, saved, or suspended.
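To make the mapping logic concrete, the sketch below imitates an EmoKey-style rule set in Python. It is purely illustrative: EmoKey itself is configured through the Control Panel GUI, and none of the names, thresholds, or actions below come from the Emotiv software.

```python
# Illustrative only: a tiny EmoKey-style rule table mapping a detected facial
# expression and its intensity (0..1 of full scale) to an action string.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    expression: str    # e.g. "clench" or "raise_brow"
    threshold: float   # the "is greater than" condition
    action: str        # keystroke or mouse action sent to the active window

RULES = [
    Rule(expression="clench", threshold=0.30, action="left_mouse_click"),
    Rule(expression="raise_brow", threshold=0.30, action="space"),
]

def action_for(expression: str, intensity: float) -> Optional[str]:
    """Return the action to emit for one detection sample, or None."""
    for rule in RULES:
        if rule.expression == expression and intensity > rule.threshold:
            return rule.action
    return None

print(action_for("clench", 0.45))   # -> "left_mouse_click"
```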

4.3. Methods of Prosthesis Design

The hand prosthesis was modeled in Blender. Many features of the software were used to create the hand, including scaling, the Extrude function, and the Bevel function. Two solids were used to create the prosthesis: a cylinder and a cube. The Rotate and Move tools were used to give the hand the right proportions. The joints were created using cylinders, which were placed and scaled accordingly, and the holes in the joints were made using the Boolean modifier. The fingers of the hand have two joints each and resemble hinge joints. Each finger consists of three parts, but the last part belongs to the metacarpus (Fig. 4).


Fig. 3. Screenshot of the Expressiv Suite application with EmoKey configured for teeth clenching

Fig. 4. Index finger design – side and top view (the red circle marks the hooks to which thin lines resembling tendons are attached)

The saddle joint of the thumb is too complicated, so it was replaced by a hinge joint in the hand model. The thumb consists of two parts (Fig. 5); the last part of the thumb connects directly to the metacarpus, as in the other fingers. In addition, the thumb has been placed at an angle so that it can replicate the behavior of the human hand. The largest part of the hand and of the prosthesis is the metacarpus (Fig. 6). It has a special depression at the bottom, and at the top are parts that are supposed to reflect the tendons. The hand prosthesis resembles a human hand in appearance. However, its mobility is much lower, as it has only 11 degrees of freedom and includes 9 movable joints. For the purpose of this project, however, this is sufficient. The final design of the hand prosthesis is shown in Fig. 7.

4.4. Final Model of Hand Prosthesis

The entire model consists of 10 parts that were printed on a 3D printer using PLA filament. PrusaSlicer software was used to prepare the prints, and the parts were printed in two stages on a Creality Ender 3 printer. The parts of the fingers were printed together with proper spacing, and the metacarpal parts were printed separately.


Fig. 5. Thumb design

Fig. 6. Metacarpus design

Fig. 8. 3D printed hand prosthesis – before assembly

Fig. 7. Final design of hand prosthesis

The metacarpal part took the longest time to print: 10 hours. PLA filament in black and gray was used for printing. The prototype hand prosthesis consists of 10 parts, which were sanded after printing so that they fit together well and were connected using 3 mm diameter screws. Figures 8 and 9 show the printed hand before and after assembly.

4.5. The Signal Transmission to the Prosthesis Hand

When performing a movement, the user does not need to move a muscle of the hand directly, but simply clenches the teeth, blinks an eye, or raises the eyebrows. In creating an effective activity matrix, it is important to differentiate between the facial expressions and to assign the appropriate movement to each of them. This makes it possible to properly classify the user's intentions and thus build the executive system. Using Emotiv's EPOC+ device, the EEG signal is acquired from the surface of the patient's head using the electrodes placed on the device.

Fig. 9. 3D printed hand prosthesis – after assembly

Using the Expressiv Suite app included with the Emotiv EPOC+ NeuroHeadset, it is possible to identify the facial expressions of the user wearing the device. The application uses EmoKey to assign the appropriate keyboard keys (i.e., the servo rotation time in µs) to a specific facial expression. The serial port monitor, a tool available in the Arduino software, is used to control the servo. The minimum and maximum servo rotation times in µs are stored for a particular



Fig. 10. Diagram of signal transmission to the prosthesis

facial expression. Depending on the facial expression performed, the servo rotation time previously assigned to it is entered on the serial port monitor. This causes the servo to rotate, for example by its maximum angle, which produces a movement of the hand. The acquisition of the EEG signal from the user's head by the Expressiv Suite application is based on wireless communication using a Bluetooth connection. The transfer of the signal from the computer to the control system is, for the time being, done by wire. A schematic of signal acquisition and transmission to the prosthetic hand is shown in Figure 10.

4.6. Communication

In the Expressiv Suite app, the user selects the facial expressions to which specific numbers are assigned using EmoKey. These numbers are the corresponding rotation times of the servo. Two servo positions are used in the project: a 0-degree position and a 180-degree position. The 0-degree position corresponds to a time of 0 µs, and the 180-degree position corresponds to a time of 2400 µs. Table 1 shows an example relationship between facial expressions and the finger movement of the prosthetic hand: a time of 0 µs was assigned to the clench teeth expression and a time of 2400 µs was assigned to the raise brow expression. When the user performs a given facial expression, the corresponding servo time is output on the serial port monitor, causing the servo to rotate and move the prosthetic hand. Facial expressions can be customized to the user's liking, i.e., instead of clenching the teeth, there can be a blink of the eye or a smile.

Tab. 1. Relationship of facial expression to hand finger movement

Facial expression | Servo rotation time | Movement
clench teeth      | 0 µs                | finger bends
raise brow        | 2400 µs             | finger straightens
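As a sketch of the communication path just described, the host-side logic can be pictured as writing the servo time for the detected expression to the Arduino's serial port. In the actual setup EmoKey types the number into the serial monitor, so the script, the port name, and the newline-terminated protocol below are assumptions for illustration only.

```python
# Hypothetical host-side helper: send the servo rotation time (in microseconds)
# assigned to a facial expression over the Arduino's serial port.
import serial  # pyserial

SERVO_TIME_US = {          # mapping from Table 1
    "clench_teeth": 0,     # finger bends
    "raise_brow": 2400,    # finger straightens
}

def send_servo_time(port: serial.Serial, expression: str) -> None:
    """Write the microsecond value for the given expression, newline-terminated."""
    value = SERVO_TIME_US[expression]
    port.write(f"{value}\n".encode("ascii"))

if __name__ == "__main__":
    # The port name is an assumption; on Windows it would be e.g. "COM3".
    with serial.Serial("/dev/ttyACM0", 9600, timeout=1) as arduino:
        send_servo_time(arduino, "clench_teeth")
```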

4.7. Tests

The use of the hand prosthesis prototype was tested for a selected finger and for selected facial expressions. Cables were attached to the prototype and to the servo. For the test, the hand prosthesis was placed in such a position that it could be moved only through brain waves, and the servo was positioned appropriately. Then the program dedicated to the microcontroller was turned


on, along with the necessary tool (the serial port monitor) used to control the servo. The next step was to properly prepare the Emotiv EPOC+ NeuroHeadset. After preparing the device on the computer, the Expressiv Suite app was selected, and appropriate servo rotation times were assigned to the given facial expressions using EmoKey: lifting the eyebrows was assigned "2400" and clenching the teeth was assigned "0". The finger bends when the teeth are clenched, and straightens when the eyebrows are lifted. Facial expressions can be adjusted depending on the user's preferences; therefore, performance tests were also carried out for sideways movement of the eyeballs and for blinking. It is important to concentrate properly when performing a given facial expression. The user can track the facial expressions by looking at the avatar in the Expressiv Suite app. It is also advisable that, before attempting to move a prosthesis controlled by facial expressions, the user should practice the given facial expressions using the Expressiv Suite application itself. Fig. 11 shows the user during the prototype performance test.

4.8. Artifacts

The most common BCIs are based on EEG signals, and there are a number of interferences during an electroencephalographic test. Artifacts can be divided into technical and biological. The sources of interference are artifacts introduced by physiological processes (e.g., muscle activity, facial expressions, heart rate) and by technical factors, such as the power grid. Therefore, the signal must be significantly amplified, and the voltages generated at the skin-electrode interface must also be taken into account. After filtering out mains-frequency interference and performing filtering and feature extraction, the signal should be clean and have the expected properties. During diagnostic EEG testing, artifacts are eliminated as much as possible; a normal EEG signal (without artifacts) is shown in Fig. 12. Facial expressions, as already mentioned, are also among the artifacts, but for the purposes of the Expressiv Suite in the Emotiv Control Panel these artifacts are actually desirable. The application therefore has a built-in algorithm for the detection of artifacts, i.e., signal interference. Figures 13-16 show artifacts during different facial expressions.
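The kind of pre-processing described above can be illustrated with a short filtering sketch. The notch and band-pass design below is only an example consistent with the headset's 0.16-43 Hz response and 128 Hz sampling rate; it is not part of the Emotiv software.

```python
# Illustrative pre-processing: remove mains interference with a notch filter,
# then keep the 0.16-43 Hz band covered by the headset's frequency response.
import numpy as np
from scipy.signal import iirnotch, butter, filtfilt

FS = 128.0  # EPOC+ effective sampling rate [Hz]

def clean_eeg(channel: np.ndarray, mains_hz: float = 50.0) -> np.ndarray:
    b_notch, a_notch = iirnotch(mains_hz, Q=30.0, fs=FS)            # power-grid artifact
    b_band, a_band = butter(4, [0.16, 43.0], btype="bandpass", fs=FS)
    return filtfilt(b_band, a_band, filtfilt(b_notch, a_notch, channel))

# Example: one second of synthetic data on a single channel
raw = np.random.randn(int(FS))
print(clean_eeg(raw).shape)
```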

5. Discussion

Some difficulties were encountered during the prototype development. The control problem was that, in the initial concept, the Cognitiv Suite app in the Emotiv Control Panel was to be used for control. However, this application requires a very high level of concentration and trained senses to be used freely. Therefore, it was concluded that it would be better to control the prosthesis using the Expressiv Suite app,


which is more intuitive and simpler for the user; control by facial expressions is therefore significant for this solution. When modeling the prosthetic hand, instead of using 14 joints as in


the human hand, it was decided that 9 moving parts would be sufficient for the purpose of this prototype, and the thumb saddle joint, which has too complicated a structure, was replaced by a hinge joint in the hand model.

Fig. 11. Test of using hand prosthesis

Fig. 12. EEG signal – no artifacts

Fig. 13. Artifact – EEG signal during clenching teeth

Fig. 14. Artifact – EEG signal during raising brows

Fig. 15. Artifact – EEG signal during smiling

Fig. 16. Artifact – EEG signal during blinking eyes



6. Conclusion

This article shows one of the many proposed solutions for improving the functioning of people without an upper limb. This proposal is to control a prosthetic arm using brain waves. The use of such a prosthesis is very important for disabled people. Such a hand prosthesis controlled by facial expressions can help amputees and people who have damaged innervation in the stump area. This solution uses a non-invasive method, so people who are not fully convinced by this method can test it for themselves without interfering with their bodies.

AUTHORS

Julia Żaba – Faculty of Electrical Engineering, Automatic Control and Informatics, Opole University of Technology, Opole, 45-758, Poland, E-mail: j.zaba.rzedow@gmail.com.

Szczepan Paszkiel* – Faculty of Electrical Engineering, Automatic Control and Informatics, Opole University of Technology, Opole, 45-758, Poland, E-mail: s.paszkiel@po.edu.pl. *Corresponding author

References

[1] Ramadan, R.A.; Vasilakos, A.V. Brain computer interface: control signals review. Neurocomputing, vol. 223, 2017, 26–44, doi:10.1016/J.NEUCOM.2016.10.024.
[2] Bernal, S.L.; Celdrán, A.H.; Pérez, G.M. Neuronal Jamming cyberattack over invasive BCIs affecting the resolution of tasks requiring visual capabilities. Comput. Secur., vol. 112, 2022, doi:10.1016/J.COSE.2021.102534.
[3] Shivwanshi, R.R.; Nirala, N. Concept of AI for acquisition and modeling of noninvasive modalities for BCI. Artif. Intell. Brain-Computer Interface, 2022, 121–144, doi:10.1016/B978-0-323-91197-9.00007-2.
[4] Dagdevir, E.; Tokmakci, M. Optimization of preprocessing stage in EEG based BCI systems in terms of accuracy and timing cost. Biomed. Signal Process. Control, vol. 67, 2021, doi:10.1016/j.bspc.2021.102548.
[5] Bassi, P.R.A.S.; Rampazzo, W.; Attux, R. Transfer learning and SpecAugment applied to SSVEP based BCI classification. arXiv, 2020, doi:10.1016/j.bspc.2021.102542.
[6] Vilela, M.; Hochberg, L.R. Applications of brain-computer interfaces to the control of robotic and prosthetic arms. In Handbook of Clinical Neurology; Elsevier B.V., 2020; vol. 168, pp. 87–99.
[7] Na, R.; Hu, C.; Sun, Y.; Wang, S.; Zhang, S.; Han, M.; Yin, W.; Zhang, J.; Chen, X.; Zheng, D. An embedded lightweight SSVEP-BCI electric wheelchair with hybrid stimulator. Digit. Signal Process., vol. 116, 2021, 103101, doi:10.1016/J.DSP.2021.103101.
[8] Robinson, N.; Mane, R.; Chouhan, T.; Guan, C. Emerging trends in BCI-robotics for motor control and rehabilitation. Curr. Opin. Biomed. Eng., vol. 20, 2021, 100354, doi:10.1016/J.COBME.2021.100354.
[9] Miladinović, A.; Ajčević, M.; Jarmolowska, J.; Marusic, U.; Colussi, M.; Silveri, G.; Battaglini, P.P.; Accardo, A. Effect of power feature covariance shift on BCI spatial-filtering techniques: A comparative study. Comput. Methods Programs Biomed., vol. 198, 2021, doi:10.1016/j.cmpb.2020.105808.
[10] Soman, S.; Murthy, B.K. Using Brain Computer Interface for synthesized speech communication for the physically disabled. In Procedia Computer Science; Elsevier B.V., vol. 46, 2015; 292–298.
[11] Noori, F.M.; Naseer, N.; Qureshi, N.K.; Nazeer, H.; Khan, R.A. Optimal feature selection from fNIRS signals using genetic algorithms for BCI. Neurosci. Lett., vol. 647, 2017, 61–66, doi:10.1016/j.neulet.2017.03.013.
[12] Gubert, P.H.; Costa, M.H.; Silva, C.D.; Trofino-Neto, A. The performance impact of data augmentation in CSP-based motor-imagery systems for BCI applications. Biomed. Signal Process. Control, vol. 62, 2020, doi:10.1016/j.bspc.2020.102152.
[13] Hernández-Del-Toro, T.; Reyes-García, C.A.; Villaseñor-Pineda, L. Toward asynchronous EEG-based BCI: Detecting imagined words segments in continuous EEG signals. Biomed. Signal Process. Control, vol. 65, 2021, doi:10.1016/j.bspc.2020.102351.
[14] Shi, B.; Wang, Q.; Yin, S.; Yue, Z.; Huai, Y.; Wang, J. A binary harmony search algorithm as channel selection method for motor imagery-based BCI. Neurocomputing, vol. 443, 2021, 12–25, doi:10.1016/j.neucom.2021.02.051.
[15] Janani, A.; Sasikala, M.; Chhabra, H.; Shajil, N.; Venkatasubramanian, G. Investigation of deep convolutional neural network for classification of motor imagery fNIRS signals for BCI applications. Biomed. Signal Process. Control, vol. 62, 2020, 102133, doi:10.1016/j.bspc.2020.102133.
[16] Zarrintaj, P.; Saeb, M.R.; Ramakrishna, S.; Mozafari, M. Biomaterials selection for neuroprosthetics. Curr. Opin. Biomed. Eng., vol. 6, 2018, 99–109.
[17] Kasim, M.A.A.; Low, C.Y.; Ayub, M.A.; Zakaria, N.A.C.; Salleh, M.H.M.; Johar, K.; Hamli, H. User-Friendly LabVIEW GUI for Prosthetic Hand Control Using Emotiv EEG Headset. In Procedia Computer Science; Elsevier B.V., vol. 105, 2017; 276–281.
[18] Lange, G.; Low, C.Y.; Johar, K.; Hanapiah, F.A.; Kamaruzaman, F. Classification of Electroencephalogram Data from Hand Grasp and Release Movements for BCI Controlled Prosthesis. Procedia Technol., vol. 26, 2016, 374–381, doi:10.1016/j.protcy.2016.08.048.
[19] Alazrai, R.; Alwanni, H.; Daoud, M.I. EEG-based BCI system for decoding finger movements within the same hand. Neurosci. Lett., vol. 698, 2019, 113–120, doi:10.1016/j.neulet.2018.12.045.
[20] Downey, J.E.; Brooks, J.; Bensmaia, S.J. Artificial sensory feedback for bionic hands. In Intelligent Biomechatronics in Neurorehabilitation; Elsevier, 2019; pp. 131–145. ISBN 9780128149423.
[21] Guger, C.; Harkam, W.; Hertnaes, C.; Pfurtscheller, G. Prosthetic Control by an EEG-based Brain-Computer Interface (BCI).
[22] Müller-Putz, G.R.; Pfurtscheller, G. Control of an electrical prosthesis with an SSVEP-based BCI. IEEE Trans. Biomed. Eng., vol. 55, 2008, 361–364, doi:10.1109/TBME.2007.897815.
[23] Beyrouthy, T.; Al Kork, S.K.; Korbane, J.A.; Abdulmonem, A. EEG Mind controlled Smart Prosthetic Arm. In Proceedings of the 2016 IEEE International Conference on Emerging Technologies and Innovative Business Practices for the Transformation of Societies (EmergiTech); IEEE, 2016; pp. 404–409.
[24] Constantine, A.; Asanza, V.; Loayza, F.R.; Peláez, E.; Peluffo-Ordóñez, D. BCI System using a Novel Processing Technique Based on Electrodes Selection for Hand Prosthesis Control. IFAC-PapersOnLine, vol. 54, 2021, 364–369, doi:10.1016/J.IFACOL.2021.10.283.
[25] Sensinger, J.W.; Hill, W.; Sybring, M. Prostheses—Assistive Technology—Upper. Encycl. Biomed. Eng., vols. 2-3, 2019, 632–644, doi:10.1016/B978-0-12-801238-3.99912-4.
[26] EMOTIV. Website online: www.emotiv.com (accessed on August 2022).
[27] ARDUINO. Website online: https://store.arduino.cc/ (accessed on August 2022).
[28] BLENDER. Website online: www.blender.org (accessed on August 2022).



NON‐LINEAR MODEL‐BASED PREDICTIVE CONTROL FOR TRAJECTORY TRACKING AND CONTROL EFFORT MINIMIZATION IN A SMARTPHONE‐BASED QUADROTOR Submitted: 15th June 2021; accepted: 6th September 2022

Luis García, Esteban Rosero

DOI: 10.14313/JAMRIS/4-2022/28

Abstract: In this paper, the design and implementation of a non-linear model-based predictive controller (NMPC) for predefined trajectory tracking and control effort minimization in a smartphone-based quadrotor are developed. The optimal control actions are calculated in each iteration by means of an optimal control algorithm based on the non-linear model of the quadrotor, considering some aerodynamic effects. Control algorithm implementation and simulation tests are executed on a smartphone using the CasADi framework. In addition, a technique for estimating the energy consumed based on the control signals is presented. NMPC controller performance was compared with other works developed towards the control of quadrotors, based on an H∞ controller and an LQI controller, using three predefined trajectories; the NMPC average tracking error was around 50% lower, and the average estimated power and energy consumption slightly higher, with respect to the H∞ and LQI controllers.

Keywords: Quadrotor, Model-based Predictive Control, Smartphone, Trajectory Tracking, Energy Consumption.

© 2022 García and Rosero. This is an open access article licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license (https://creativecommons.org/licenses/by-nc-nd/4.0).

1. Introduction

Interest in unmanned aerial vehicle (UAV) control has grown significantly in recent years in research areas focused on both military and civil applications (e.g., intelligence, reconnaissance, surveillance, exploration of dangerous environments). One type of UAV that has caught the attention of the scientific community is the quadrotor, which, because of its size, mechanical simplicity, low cost, maneuverability, flight autonomy, and wide range of applications, is increasingly deployed and able to replace humans in difficult and risky tasks. To develop a quadrotor's control system, it is necessary to measure the position and orientation of the vehicle, which requires a flight control card and various sensors. These features can be found in systems embedded in devices such as smartphones, whose use has grown rapidly in recent decades. Hardware features such as processing power, memory, sensors, and communication technologies, as well as a programmable architecture, combine to provide many potential benefits in mobile application development, so that a smartphone can execute complex computational tasks and make use of its peripherals [1].

Because of these features, smartphones are increasingly used in robots and unmanned vehicles as part of closed-loop control systems. In [2], a mobile application is developed to measure and control the angular position of a test-bed on which a smartphone is located, feeding back its position using the embedded sensors and calculating the control signal with a PD controller. In [3], an autonomous robot controlled via an Android-based smartphone is developed, using odometry derived from the robot's wheels and image recognition using the smartphone's camera to obtain a stereo image describing three-dimensional objects; this allows the robot to reach a certain position in a room and avoid obstacles placed in its way. Taking advantage of the inherent benefits of using an embedded system such as a smartphone, the work developed in [4] tested the jOptimizer framework (based on Java) and the CasADi framework (based on C) to solve an MPC optimization problem for the control of a smartphone-based quadrotor for trajectory tracking; CasADi enables the solving of non-linear programming (NLP) problems, achieving shorter processing times and lower tracking error.

Aerodynamic effects have a significant impact on energy optimization and trajectory tracking, helping researchers to get closer to real-life quadrotor performance so that more appropriate control signals can be calculated. In [5], the quadrotor's energy consumption is studied for three types of trajectories (minimum acceleration, jerk, and snap), contributing to trajectory planning based on aerodynamic effects to improve energy efficiency using an optimization algorithm. In [6], position error reduction is achieved by incorporating aerodynamic drag into the dynamic model of a quadrotor and performing drag and thrust compensation in single-direction displacement tests, with the control actions computed on an ODROID-C2 card. In [7], the mathematical model of a DC motor (including aerodynamic drag losses) and the dynamic model of a quadrotor are considered, and two optimization problems are formulated: trajectories with minimum energy consumption subject to system constraints, and trajectories with minimal computing time and fixed energy consumption, in order to identify an energy efficiency function that quantifies the energy saved in a mission.

Developments have been made around optimal control for trajectory tracking and energy consumption optimization in quadrotors. One of the most widely used optimal controllers is Model-based Predictive Control (MPC), which aims to find the optimal control signals by minimizing a cost function subject to the system's dynamics, inputs, and state variables, among other constraints. For this purpose, an optimal control problem (OCP) is solved at each iteration along a prediction horizon. In [8], a linear MPC controller is used to move a quadrotor in one direction (round trip) using information provided by a disturbance estimation model to suppress its effects. In [9], a particle filter MPC (PF-MPC) control is presented, which has the advantages of conventional MPC and adds measured noise and unmeasured disturbance effects to follow a 2D trajectory and minimize disturbances in a quadrotor. In [10], a method is presented to generate trajectories through 3D terrain for a quadrotor flight using an MPC with linear acceleration, position, and jerk constraints, as well as a terrain map cost, solving a convex optimal control problem using the CVX package. In [11], a hierarchical MPC control is applied to a fleet of quadrotors, which consists of a linear time-varying MPC (LTV-MPC) at the top level to generate trajectories and avoid obstacles, and a linear time-invariant MPC (LTI-MPC) at the lower level to stabilize each quadrotor.

In the reviewed studies, no research and development around MPC controller implementation using a quaternion-based non-linear quadrotor model was found. In addition, no online non-linear MPC (NMPC) has been implemented using a smartphone to reduce trajectory tracking error or energy consumption. For these reasons, this paper presents a novel design and implementation of NMPC controllers based on the dynamic model of a quadrotor, considering aerodynamic effects, for predefined trajectory tracking and energy consumption minimization. The optimal control problem is solved using the CasADi framework for the control algorithm.

This paper is structured as follows: a dynamic model of the quadrotor in X configuration, including the aerodynamic effects and the estimation of energy consumption, is presented in Section 2; the MPC optimal control problem is defined in Section 3; in Section 4, the implementation of the designed MPC controllers and trajectory tracking tests are presented; and conclusions are presented in Section 5.

2. Quadrotor Model

A quaternion-based quadrotor dynamic model is used [12–15], which has the advantage of eliminating effects such as gimbal lock and the discontinuities of the Euler-angle-based model [13]. This model includes the aerodynamic effects due to translational drag, rotational drag, and gyroscopic torque. The system dynamics can be expressed in relation to an inertial reference frame, used to measure the quadrotor position, and a body-fixed frame, used to measure the quadrotor rotation, as shown in Figure 1.

Fig. 1. Quadrotor scheme in "X" configuration.

The quadrotor model is based on the state vector, which is defined as:

$$X = \begin{bmatrix} \xi^T & \dot{\xi}^T & q^T & \eta^T \end{bmatrix}^T, \qquad (1)$$

where $\xi = [x\ y\ z]^T$ and $\dot{\xi} = [\dot{x}\ \dot{y}\ \dot{z}]^T$ indicate the position and velocity, respectively, in the inertial reference frame, $q = [q_0\ q_1\ q_2\ q_3]^T$ is the orientation quaternion, and $\eta = [\omega_x\ \omega_y\ \omega_z]^T$ represents the angular velocity of the quadrotor, both in the body-fixed frame. Furthermore, the input vector $U = [F_{th}\ \tau^T]^T = [F_{th}\ \tau_x\ \tau_y\ \tau_z]^T$ is defined as:

$$U = \begin{bmatrix} F_{th} \\ \tau_x \\ \tau_y \\ \tau_z \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ -L & L & L & -L \\ -L & -L & L & L \\ -k_M & k_M & -k_M & k_M \end{bmatrix} \begin{bmatrix} F_1 \\ F_2 \\ F_3 \\ F_4 \end{bmatrix}, \qquad (2)$$

where $F_{th}$, $\tau_x$, $\tau_y$, and $\tau_z$ indicate the thrust on the z axis and the torques about the x, y, and z axes in the body-fixed frame, and $F_i = k_T \omega_i^2$ ($i = 1, 2, 3, 4$) is the thrust applied by the i-th motor. The quaternion-based quadrotor model is defined as:

$$f(X, U) = \begin{bmatrix} \ddot{\xi} \\ \dot{q} \\ \dot{\eta} \end{bmatrix} = \begin{bmatrix} q \otimes \begin{bmatrix} 0 \\ F_{th}/m \end{bmatrix} \otimes q^{*} + g - \frac{1}{m} D_v \\ \frac{1}{2}\, q \otimes \begin{bmatrix} 0 \\ \eta \end{bmatrix} \\ J_q^{-1}\left(\tau - \eta \times J_q \eta - \tau_{gyro} - D_\eta\right) \end{bmatrix}, \qquad (3)$$

where $m$ is the quadrotor mass, $g$ is the gravity vector, and $J_q = \mathrm{diag}(J_{xx}, J_{yy}, J_{zz})$ is the vehicle inertia matrix. $D_v$, $D_\eta$, and $\tau_{gyro}$ correspond to the aerodynamic losses due to translational drag, rotational drag, and gyroscopic torque, respectively, which are defined as follows:

$$D_v = K_v\, \mathrm{diag}(R_q \dot{\xi})\, R_q \dot{\xi}, \qquad (4)$$

$$\tau_{gyro} = J_r\, \eta \times \begin{bmatrix} 0 \\ 0 \\ \sum_{i=1}^{4} (-1)^i \omega_i \end{bmatrix}, \qquad (5)$$

$$D_\eta = K_\eta\, \mathrm{diag}(\eta)\, \eta, \qquad (6)$$
where $K_v$ and $K_\eta$ are the diagonal matrices that define the translational and rotational drag coefficients, respectively, $J_r$ is the rotor moment of inertia, and $R_q$ is the quaternion rotation matrix defined as:

$$R_q = \begin{bmatrix} 2\left(q_0^2 + q_1^2\right) - 1 & 2\left(q_1 q_2 - q_0 q_3\right) & 2\left(q_1 q_3 + q_0 q_2\right) \\ 2\left(q_0 q_3 + q_1 q_2\right) & 2\left(q_0^2 + q_2^2\right) - 1 & 2\left(q_2 q_3 - q_0 q_1\right) \\ 2\left(q_1 q_3 - q_0 q_2\right) & 2\left(q_0 q_1 + q_2 q_3\right) & 2\left(q_0^2 + q_3^2\right) - 1 \end{bmatrix}.$$

2.1. Rotor Speeds Estimation

Since there are no sensors to measure the motor speeds, an estimation is obtained using the relationship between the control inputs and the thrust forces of the motors. Furthermore, it is known that the thrust force of a motor is proportional to the square of its speed, such that:

$$F_i = k_T\, \omega_i^2 \ [\mathrm{N}], \quad i = 1, 2, 3, 4. \qquad (7)$$

By replacing equation 7 in equation 2, a relationship between the control signals and the motor speeds is obtained. Solving this relationship for the speeds, it is determined that:

$$\omega_1 = \sqrt{\frac{1}{4 k_T}\left(F_{th} - \frac{\tau_x}{L} - \frac{\tau_y}{L} - \frac{\tau_z}{k_M}\right)}, \quad \omega_2 = \sqrt{\frac{1}{4 k_T}\left(F_{th} + \frac{\tau_x}{L} - \frac{\tau_y}{L} + \frac{\tau_z}{k_M}\right)},$$

$$\omega_3 = \sqrt{\frac{1}{4 k_T}\left(F_{th} + \frac{\tau_x}{L} + \frac{\tau_y}{L} - \frac{\tau_z}{k_M}\right)}, \quad \omega_4 = \sqrt{\frac{1}{4 k_T}\left(F_{th} - \frac{\tau_x}{L} + \frac{\tau_y}{L} + \frac{\tau_z}{k_M}\right)}. \qquad (8)$$

These speeds are used to calculate the gyroscopic torque given in equation 5. To reduce the computational load caused by the square roots in the estimated motor speeds, a first-order Taylor approximation of these equations is obtained, based on the hovering inputs $\bar{U} = [mg\ 0\ 0\ 0]^T$:

$$\bar{\omega}_1 \approx \frac{1}{4}\sqrt{\frac{mg}{k_T}} + \frac{1}{4\sqrt{k_T\, mg}}\left(F_{th} - \frac{\tau_x}{L} - \frac{\tau_y}{L} - \frac{\tau_z}{k_M}\right),$$

$$\bar{\omega}_2 \approx \frac{1}{4}\sqrt{\frac{mg}{k_T}} + \frac{1}{4\sqrt{k_T\, mg}}\left(F_{th} + \frac{\tau_x}{L} - \frac{\tau_y}{L} + \frac{\tau_z}{k_M}\right),$$

$$\bar{\omega}_3 \approx \frac{1}{4}\sqrt{\frac{mg}{k_T}} + \frac{1}{4\sqrt{k_T\, mg}}\left(F_{th} + \frac{\tau_x}{L} + \frac{\tau_y}{L} - \frac{\tau_z}{k_M}\right),$$

$$\bar{\omega}_4 \approx \frac{1}{4}\sqrt{\frac{mg}{k_T}} + \frac{1}{4\sqrt{k_T\, mg}}\left(F_{th} - \frac{\tau_x}{L} + \frac{\tau_y}{L} + \frac{\tau_z}{k_M}\right).$$

Solving equation 5 with these speeds gives:

$$\tau_{gyro} = J_r \begin{bmatrix} \omega_y \\ -\omega_x \\ 0 \end{bmatrix} \left(-\omega_1 + \omega_2 - \omega_3 + \omega_4\right).$$

3. NMPC Controller NMPC controller design is based on the mathemat‑ ical non‑linear modelling of the system. An optimiza‑ tion problem is de ined to minimize tracking error and the control signal. For this, optimal control inputs are calculated by solving an optimization problem in each instant of time. The optimal control problem is repre‑ sented as [16, 17]: Np [ ∑

subject to

.

Fth +

τx τy τz + − L L kM τx τy τz + + L L kM

T

(ξk − rk ) H (ξk − rk ) + uTk Ruk

]

k=1

)

τy τz τx − + L L kM

Fth −

ωi3 .

T

u

,

Fth +

(

4 ∑

Furthermore, the energy consumed over time T is related to the power through: ∫ E= P.dt.

minimize

τy τz τx − − L L kM

(

Pi = k T

j=1

,

Fth − (

4 ∑

)

xk+1 = RK4 (f (xk , uk ) , Ts ) , k = 1, . . . , Np − 1, xmin ≤ xk ≤ xmax ,

k = 1, . . . , Np − 1,

≤ uk ≤ u

k = 1, . . . , Np − 1,

u

Speeds are used to calculate the gyroscopic torque given in equation 5. To reduce the computational load caused by the square roots that contain the estimated speeds of the motors, a Taylor series linear approxima‑ tion of these equations is obtained as follows, based on [ ]T the inputs Ū = mg 0 0 0 (hovering): ω̄1 ≈

i = 1, 2, 3, 4.

Therefore, the net power consumed by the rotors

P =

, i = 1, 2, 3, 4.

,

is:

2.1. Rotor Speeds Estimation

Fi = kT ωi2

2022

elements. The power consumed by each motor is cal‑ culated by: 

Since there are no sensors to measure motor speeds, an estimation is obtained using the relation‑ ship between the control inputs and the thrust forces of the motors. Furthermore, it is known that the thrust force of the motor is proportional to the square of its speed, such that:

N° 4

) , ) , ) , ) .

Solving equation 5:   ωy τgyro = Jr −ωx  (−ω1 + ω2 − ω3 + ω4 ) . 0 2.2. Energy Consumption Estimation A key focus of this study is to reduce the power con‑ sumption of the quadrotor. For this reason, the con‑ sumption from the quadrotor’s rotors is considered since it is much higher than that of the other electronic

min

max

∥qk ∥ = 1,

,

k = 1, . . . , Np ,

x0 = xest , (9) where RK4 is the fourth order Runge‑Kutta integrator applied to the model given by equation 3 and is evalu‑ ated at (xk , uk ). H and R are diagonal matrices which contain the reference tracking and control weights for the cost function. The estimated states vector xest is fed back at each iteration to compute the next opti‑ mal control signal. This algorithm was implemented in Android using C code generated through the CasADi framework.
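The optimal control problem in equation 9 can be posed, for illustration, with CasADi's Opti interface as in the following Python sketch. This is a minimal sketch rather than the authors' implementation: the dynamics f are left as a placeholder for equation 3, and the horizon length, weights, bounds, and sampling time are assumed values.

```python
# Minimal sketch of the NMPC problem in Eq. (9) using CasADi's Opti stack.
import casadi as ca

nx, nu, Np, Ts = 13, 4, 20, 0.05           # state/input sizes, horizon, sample time (assumed)

x = ca.MX.sym("x", nx)
u = ca.MX.sym("u", nu)
xdot = ca.MX.zeros(nx)                      # placeholder: replace with the dynamics of Eq. (3)
f = ca.Function("f", [x, u], [xdot])

def rk4(f, xk, uk, h):
    """Fourth-order Runge-Kutta step used as the discrete model in Eq. (9)."""
    k1 = f(xk, uk)
    k2 = f(xk + h / 2 * k1, uk)
    k3 = f(xk + h / 2 * k2, uk)
    k4 = f(xk + h * k3, uk)
    return xk + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

opti = ca.Opti()
X = opti.variable(nx, Np)                   # predicted states over the horizon
U = opti.variable(nu, Np - 1)               # control inputs over the horizon
x_est = opti.parameter(nx)                  # fed-back state estimate
r = opti.parameter(3, Np)                   # position reference over the horizon

H = ca.diag(ca.DM([10, 10, 10]))            # tracking weights (assumed values)
R = ca.diag(ca.DM([0.1, 0.1, 0.1, 0.1]))    # control weights (assumed values)

cost = 0
for k in range(Np):
    e = X[0:3, k] - r[:, k]                 # position error (xi_k - r_k)
    cost += ca.mtimes([e.T, H, e])
    if k < Np - 1:
        cost += ca.mtimes([U[:, k].T, R, U[:, k]])
        opti.subject_to(X[:, k + 1] == rk4(f, X[:, k], U[:, k], Ts))
        opti.subject_to(opti.bounded(-5, U[:, k], 5))   # input bounds (assumed)
    opti.subject_to(ca.sumsqr(X[6:10, k]) == 1)         # quaternion norm constraint
opti.subject_to(X[:, 0] == x_est)

opti.minimize(cost)
opti.solver("ipopt")
# opti.set_value(x_est, ...); opti.set_value(r, ...); sol = opti.solve()
```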

4. Implementation Results

Performance tests were developed using two C functions created in Matlab via the CasADi framework: one function calculates the optimal control signal uk, and the other feeds back the estimated states xk+1 by evaluating the quadrotor dynamic function using a fourth order Runge-Kutta integrator. The estimated states vector xest corresponds to the xk+1 state calculated in the previous iteration, which is used as the initial state for the optimal control problem. To implement the flight control algorithm, an LG Nexus 5X smartphone with a hexa-core Qualcomm Snapdragon 808 CPU (with a maximum clock speed of 1.8 GHz), 2 GB of RAM, and the Android 8.0 operating system was used.

In addition, three test trajectories were defined: the square trajectory is defined as a square with side lengths of 2 m, whose vertices are located at (0, 0, 2), (2, 0, 2), (2, 2, 2), and (0, 2, 2); the helical trajectory is defined as an ascending spiral with a radius of 2 m and origin at (0, 0, 3) that reaches a height of 8 m; and the crop trajectory is represented by a series of lines that simulate movement through a crop, so that the quadrotor starts its trip at position (0, 5, 2) and ends at position (7.5, 0, 2). A prediction horizon Np = 20 is used in the simulations.

In preliminary tests, a comparison was made between two types of MPC controllers: an MPC controller based on the non-linear model without aerodynamics and an MPC controller based on the non-linear model including the mentioned aerodynamic effects. This comparison was carried out to analyze the evaluation times, tracking errors, and estimated power and energy consumption. The results of the tests implemented on the smartphone using the CasADi framework are shown in Table 1, where tev is the evaluation time with mean evaluation time tmean and standard deviation σt, RMSE is the root mean square trajectory tracking error, and Pmean and Emean are the mean power and energy consumption.

Tab. 1. Comparison of MPC controllers implemented on Android.

Model             | Trajectory | tmean [ms] | σt [ms] | RMSE [m] | Pmean [kW] | Emean [kJ]
Non-linear        | Square     | 42.29      | 10.94   | 0.09     | 9.73       | 779.47
Non-linear        | Helical    | 59.47      | 31.82   | 0.49     | 9.74       | 584.91
Non-linear        | Crop       | 59.47      | 23.97   | 0.38     | 9.74       | 730.95
Non-linear + Aero | Square     | 84.69      | 39.55   | 0.07     | 9.73       | 779.47
Non-linear + Aero | Helical    | 86.06      | 45.48   | 0.49     | 9.74       | 584.91
Non-linear + Aero | Crop       | 73.15      | 19.40   | 0.38     | 9.74       | 730.94

The mean evaluation time for the non-linear MPC control without aerodynamics was 53.74 ms, while the average evaluation time for the non-linear MPC control with aerodynamics was 81.30 ms. The average trajectory tracking error for the non-linear MPC control without aerodynamics was 0.32 m, while for the non-linear MPC control with aerodynamics it was 0.31 m. The estimated average power consumption was 9.74 kW for the non-linear MPC control both with and without aerodynamics. The estimated average energy consumption for the non-linear MPC control without aerodynamics was 698.44 kJ, while for the non-linear MPC control with aerodynamics it was 698.45 kJ.

Subsequently, trajectory tracking tests were performed, where the performance of the non-linear MPC controller based on the model that includes the aerodynamic effects was compared with H∞ and LQI controllers, which were designed based on the linearized model of the quadrotor, as shown in [18]. Tests were performed using Matlab and Simulink, and results are shown in Table 2. Figure 2 shows square trajectory tracking, where the MPC control shows an RMS error of 0.04 m, which is much lower than the tracking error of the H∞ and LQI controllers. However, the estimated average power and energy consumption was slightly higher than the LQI control, but lower than the H∞ control. Figure 3 shows helical trajectory tracking, where the MPC control shows an RMS error of 0.22 m, which is much lower than the tracking error reached by the H∞ and LQI controllers, but the estimated average power and energy consumption was slightly higher than both the LQI and H∞ controllers. Figure 4 shows crop trajectory tracking, where the MPC controller shows an RMS error of 0.19 m, which is almost half that of the tracking error for the H∞ and LQI controllers, while the estimated average power and energy consumption was slightly higher than for the LQI and H∞ controllers.

Because flight conditions outside are not ideal for a quadrotor, a translation test is defined where pulses are applied to the position of the system to analyze MPC control disturbance rejection. These disturbances emulate wind flow that can affect the movement of the aircraft along the trajectory. As can be seen in Figure 5, the MPC controller stabilizes the quadrotor to track the reference trajectory, reaching a tracking error of 0.07 m.

Tab. 2. Comparison of MPC, H∞, and LQI controllers for trajectory tracking.

Trajectory | Control | RMSE [m] | Pmean [kW] | Emean [kJ]
Square     | MPC     | 0.04     | 9.91       | 647.52
Square     | H∞      | 0.07     | 9.76       | 634.73
Square     | LQI     | 0.08     | 9.75       | 634.31
Helical    | MPC     | 0.22     | 9.75       | 490.78
Helical    | H∞      | 0.95     | 9.75       | 487.83
Helical    | LQI     | 0.84     | 9.75       | 487.64
Crop       | MPC     | 0.19     | 9.74       | 636.73
Crop       | H∞      | 0.36     | 9.73       | 632.97
Crop       | LQI     | 0.40     | 9.73       | 633.01
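For illustration, the three reference trajectories defined in this section can be generated as in the sketch below; only the geometric way-points are given in the text, so the number of samples, the number of helix turns, and the row pattern of the crop path are assumptions.

```python
# Illustrative generation of the square, helical, and crop reference trajectories.
import numpy as np

def square_trajectory(n=400):
    """2 m square at z = 2 m through (0,0,2), (2,0,2), (2,2,2), (0,2,2)."""
    corners = np.array([[0, 0, 2], [2, 0, 2], [2, 2, 2], [0, 2, 2], [0, 0, 2]], float)
    sides = [np.linspace(corners[i], corners[i + 1], n // 4) for i in range(4)]
    return np.vstack(sides)

def helical_trajectory(n=400, turns=3.0):
    """Ascending spiral of radius 2 m centred at (0, 0, 3), climbing to 8 m."""
    t = np.linspace(0.0, 1.0, n)
    theta = 2 * np.pi * turns * t
    return np.column_stack([2 * np.cos(theta), 2 * np.sin(theta), 3 + 5 * t])

def crop_trajectory(n_per_row=50, rows=5):
    """Back-and-forth rows from (0, 5, 2) to (7.5, 0, 2) emulating flight over a crop."""
    xs = np.linspace(0.0, 7.5, rows)
    segments = []
    for i, x in enumerate(xs):
        ys = np.linspace(5.0, 0.0, n_per_row) if i % 2 == 0 else np.linspace(0.0, 5.0, n_per_row)
        segments.append(np.column_stack([np.full(n_per_row, x), ys, np.full(n_per_row, 2.0)]))
    return np.vstack(segments)
```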

Fig. 2. Comparison of controllers on square trajectory tracking.

Fig. 3. Comparison of controllers on helical trajectory tracking.

Fig. 4. Comparison of controllers on crop trajectory tracking.

Fig. 5. NMPC controller disturbance rejection.

5. Conclusion

In this work, non-linear model-based predictive control (NMPC) algorithms were implemented to control a quadrotor for trajectory tracking, where an optimization problem must be solved at each sampling instant. Controllers were designed via Matlab using the CasADi framework and exported as C language files, which were used as libraries to implement an Android-based smartphone application. The main challenge of this development was to execute MPC algorithms and emulate a quadrotor's behavior on a smartphone, taking advantage of its processing power, thus demonstrating the advantages of using smartphones in dynamic system control loops and their ability to handle an MPC controller with computationally intensive calculations. Also, these types of devices provide various sensors that can be used for state estimation in a real implementation. It is recommended to use a smartphone with high processing power, so that calculations can be executed without affecting the sampling time of the system.

Establishing the same optimal control problem for both simulation cases (non-linear MPC with and without aerodynamic effects), a significant reduction in tracking error and a slight reduction of estimated energy consumption was obtained using the NMPC controller which considers aerodynamic effects. However, a very similar performance can be achieved with the NMPC controller without aerodynamics, which requires less evaluation time to solve the optimal control problem. Nevertheless, an evaluation of the aerodynamic effects on the system must be considered according to the structure of the quadrotor used in tests.

A non-linear MPC (NMPC) controller was compared with H∞ and LQI controllers [18] in path tracking tests, where NMPC control reached 50% lower tracking error in the square trajectory, 80% less for the helical trajectory, and 40–50% less for the crop trajectory. Estimated energy consumption was about 2% higher than the H∞ and LQI controllers in the square trajectory and about 1% higher in the helical and crop trajectories, due to the control effort. Also, depending on the test trajectory, the H∞ and LQI controllers had to be adjusted to correct the tracking error, while the MPC control used only one design for all tests. It should be noted that the NMPC controller was based on the non-linear model of the quadrotor, while the H∞ and LQI controllers were based on the linearized model, so input thrust compensation may be required to hold the quadrotor at a desired height. It could be noted that the energy consumption was lower in the helical trajectory, which has a smooth shape compared to the square and crop trajectories. This is because abrupt changes in trajectory due to "corners" require strong movements from the quadrotor and, therefore, a greater control effort is required to follow the reference path.

Although a quadrotor is an inherently unstable system which has complex dynamics, a smooth transient response was achieved by implementing the NMPC controller using a smartphone. It was observed that the settling time obtained by using the NMPC controller was lower than with the H∞ and LQI controllers (about 80% less), which contributed to the reduction of trajectory tracking error. This shows that the NMPC control algorithm developed in this work can be implemented for real-life applications where the aircraft can be tested on demanding paths.

AUTHORS

Luis García∗ – School of Electrical and Electronic Engineering, Universidad del Valle, Ciudad Univer‑ sitaria Melendez, Calle 13 # 100‑00, Cali, Colom‑ bia, e‑mail: luis.linares@correounivalle.edu.co, www: https://gici.univalle.edu.co/. Esteban Rosero – School of Electrical and Electronic Engineering, Universidad del Valle, Ciudad Universi‑ taria Melendez, Calle 13 # 100‑00, Cali, Colombia, e‑mail: esteban.rosero@correounivalle.edu.co, www: https://gici.univalle.edu.co/. ∗

Corresponding author

REFERENCES

[1] A. Banerjee and A. Roychoudhury, “Future of mobile software for smartphones and drones: Energy and performance,” pp. 1–12, 2017.
[2] J. A. Frank, A. Brill, J. Bae, and V. Kapila, “Exploring the role of a smartphone as a motion sensing and control device in the wireless networked control of a motor test-bed,” in 2015 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO), 2015, pp. 328–335.
[3] C. Bodenstein, M. Tremer, J. Overhoff, and R. P. Würtz, “A smartphone-controlled autonomous robot,” in 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2015, pp. 2314–2321.
[4] L. Garcia, A. Astudillo, and E. Rosero, “Fast model predictive control on a smartphone-based flight controller,” in 2019 IEEE 4th Colombian Conference on Automatic Control (CCAC), 2019, pp. 1–6.
[5] N. Kreciglowa, K. Karydis, and V. Kumar, “Energy efficiency of trajectory generation methods for stop-and-go aerial robot navigation,” in 2017 International Conference on Unmanned Aircraft Systems (ICUAS), 2017, pp. 656–662.
[6] J. Svacha, K. Mohta, and V. Kumar, “Improving quadrotor trajectory tracking by compensating for aerodynamic effects,” in 2017 International Conference on Unmanned Aircraft Systems (ICUAS), 2017, pp. 860–866.
[7] F. Yacef, N. Rizoug, L. Degaa, O. Bouhali, and M. Hamerlain, “Trajectory optimisation for a quadrotor helicopter considering energy consumption,” in 2017 4th International Conference on Control, Decision and Information Technologies (CoDIT), 2017, pp. 1030–1035.

VOLUME 16,

N° 4

2022

[8] Z. Wang, K. Akiyama, K. Nonaka, and K. Sekiguchi, “Experimental verification of the model predictive control with disturbance rejection for quadrotors,” in 2015 54th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), 2015, pp. 778–783.
[9] K. Shimada and T. Nishida, “Particle filter-model predictive control of quadcopters,” in Proceedings of the 2014 International Conference on Advanced Mechatronic Systems, 2014, pp. 421–424.
[10] R. Singhal and P. B. Sujit, “3D trajectory tracking for a quadcopter using MPC on a 3D terrain,” in 2015 International Conference on Unmanned Aircraft Systems (ICUAS), 2015, pp. 1385–1390.
[11] A. Bemporad and C. Rocchi, “Decentralized linear time-varying model predictive control of a formation of unmanned aerial vehicles,” in 2011 50th IEEE Conference on Decision and Control and European Control Conference, 2011, pp. 7488–7493.
[12] M. E. Guerrero-Sanchez, H. Abaunza, P. Castillo, R. Lozano, and C. D. García-Beltrán, “Quadrotor energy-based control laws: A unit-quaternion approach,” Journal of Intelligent & Robotic Systems, no. 2, pp. 347–377, 2017.
[13] J. Cariño, H. Abaunza, and P. Castillo, “Quadrotor quaternion control,” in 2015 International Conference on Unmanned Aircraft Systems (ICUAS), 2015, pp. 825–831.
[14] W. Dong, G.-Y. Gu, X. Zhu, and H. Ding, “Modeling and control of a quadrotor UAV with aerodynamic concepts,” International Journal of Aerospace and Mechanical Engineering, no. 5, pp. 901–906, 2013.
[15] A. Chovancová, T. Fico, L. Chovanec, and P. Hubinský, “Mathematical modelling and parameter identification of quadrotor (a survey),” Procedia Engineering, pp. 172–181, 2014.
[16] T. T. Ribeiro, A. G. Conceição, I. Sa, and P. Corke, “Nonlinear model predictive formation control for quadcopters,” IFAC-PapersOnLine, no. 19, pp. 39–44, 2015.
[17] M. Neunert, C. De Crousaz, F. Furrer, M. Kamel, F. Farshidian, R. Siegwart, and J. Buchli, “Fast nonlinear model predictive control for unified trajectory optimization and tracking,” in 2016 IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 1398–1404.
[18] A. Astudillo, B. Bacca, and E. Rosero, “Optimal and robust controllers design for a smartphone-based quadrotor,” in 2017 IEEE 3rd Colombian Conference on Automatic Control (CCAC), 2017, pp. 1–6.



Amplitude-Energy Parameters of Acoustic Radiation with Composite Properties Changing and Mises Destruction Submitted: 10th June 2022; accepted 4th August 2022

Sergii Filonenko, Anzhelika Stakhova

DOI: 10.14313/JAMRIS/4-2022/29

Abstract: The main problem with using acoustic emission for the control and diagnostics of composite materials and products made from composite materials is the interpretation and identification of the information recorded during the processes developing in the material's structure. This is due to the high sensitivity of the acoustic emission method to various influencing factors and the practical absence of acoustic radiation models. To solve this problem, it is necessary to determine the influence of various factors on acoustic radiation parameters. In this study, based on the developed model of acoustic radiation, we simulate the influence of one parameter characterizing composite properties on acoustic emission energy parameters during composite material destruction by shear forces according to the von Mises criterion. Simulation of acoustic radiation under the given conditions makes it possible to determine the patterns of change of the acoustic emission signals' energy parameters and their sensitivity to changes of the influencing factor, as well as to obtain mathematical expressions describing the obtained patterns. The results of this case study can be useful for developing methods of control, monitoring, and diagnostics of composite materials and products made from composite materials.

Keywords: composite, destruction, acoustic emission, signal amplitude, signal energy, von Mises criterion

1. Introduction

Methods of the control, monitoring, and diagnostics of composite materials (CM) are aimed at ensuring quality at all stages of the product life cycle [1, 2]. One of the widely used methods is the method of acoustic emission (AE) [3]. The AE method is highly sensitive to sub-micro, micro, and macro processes occurring in the structure of various materials during their deformation, including CM. One research area is the study of CM destruction processes under shear force [4, 5]. In studies of the destruction of CM by transverse force, the concept of the destruction of CM represented as a bundle of fibers (FBM model) has become widespread [6, 7]. In studies of CM destruction by transverse force, the analysis of models and modeling of the CM destruction

process is carried out. At the same time, the complex dynamics of developing processes, and significant amounts of AE information recorded at various levels, lead to ambiguity in the patterns of AE parameter changes and their use for constructing methods for the control, monitoring, and diagnostics of CM. In some articles, we obtained analytical dependencies that describe acoustic radiation during CM destruction by shear force using the OR criterion and von Mises criterion. It was shown that the generated parameters of AE signals are influenced by various factors: the rate of composite deformation, the composite physical and mechanical characteristics, the dispersity properties of composite, and the area of composite destruction. Theoretical studies make it possible to obtain patterns of AE parameters change under the action of various factors. Such regularities provide the interpretation of AE information and can be the basis for the development methods of control, monitoring, and diagnostics of CM. At the same time, to increase the reliability of the methods of control, monitoring, and diagnostics of CM, it is important to determine the sensitivity of AE amplitude-energy parameters to the action of various factors.

2. Review of Publications

Conducted studies using the FBM concept [8, 9, 10, 11] are based on some assumptions. CM is represented as a bundle of discrete fibers or elements. If an external linearly increasing load is applied to a material, then its destruction is considered as a process of successive destruction fibers. The fibers have a linear elastic behavior before failure. Each fiber fractures brittle when its strength value is reached. The value of fiber fracture strength is a random variable with a certain probability density and distribution function. To study the process of CM fibers destruction, the rule for redistributing load on the remaining fibers is determined. These rules are the uniform distribution of load, that is, when the fiber is destroyed, the applied load is evenly redistributed to all remaining fibers; local distribution of load, that is, when the fiber is destroyed, the applied load is redistributed only to the nearest fibers. The main provisions of the FBM concept are used for the analysis of CM fracture processes both under tension [12, 13, 14] and under the action of shear

2022 ® Filonenko and Stakhova. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)


force [15, 16, 17]. However, in most of the articles, the process of CM destruction under tension conditions is considered. At the same time, using the general provisions of the FBM concept for the analysis of CM destruction processes, additional conditions are introduced: the introduction of two subsets of fibers, one of which is characterized by a probabilistic distribution tensile strength [18]; each fiber having thermal fluctuations of its energy characteristics in the form of white Gaussian noise, which add to the fiber fracture stresses [19]; varying dimension of the system from location fibers at an equal distance from each other along the line to location the fibers at the nodes a square lattice with side L [20]; restoration (sintering) of broken fibers and relaxation of load inhomogeneities (sintering compensates for damage by creating additional undamaged load-bearing fibers, which leads to increase the strength of bundle) [14]; the randomness of fiber configuration, that is, random distribution of fibers along the length [21]; strength distribution of elements according to Weibull [22]; the use of different distributions of threshold levels of destruction [23]; modeling of two materials with different mechanical properties that interact with each other [24]; and others. When modifying models, as a rule, studies are carried out on stresses change, the number of destroyed or remaining elements, time of composite full destruction change, distributions of destruction avalanches change (the number of fiber failures that occur due to the destruction of one fiber) and other characteristics. At AE analysis in studies[25, 26, 27], AE is considered during the destruction of a CM according to the FBM model. The studies were based on the fact that an AE event is formed when a fiber is destroyed. It was believed that the radiation energy in the event is proportional to the fracture stress, and the rate of destruction obeys a power law with an exponent of 2-5 (determined experimentally). When analyzing the destruction process of a CM, we consider not the process of generating an AE signal, but the process of releasing and accumulating the energy of acoustic radiation. The research results made it possible to obtain an expression for the AE energy release rate. In this case, at the moment of destruction, the functions have a discontinuity. In other articles [15, 16], the process of CM destruction under the action of shear force was studied for the following cases: independent CM fibers destruction by bending or tension or “or rule” (OR); destruction of fibers according to von Mises criterion; and failure only by tension. Analytical expressions for change of the equivalent stresses and the number of remaining elements during the development destruction process are obtained. Analysis was completed of the patterns of change in equivalent stresses and patterns of change in the number of destroyed CM fibers for different destruction modes, as well as destruction avalanches distributions for different fracture criteria. 20


Analytical expressions for the number of remaining fibers and AE generated signal in time during CM destruction by shear force using OR criterion and von Mises criterion are considered in articles by Filonenko et al.[28, 29]. These studies were based on the main provisions of the FBM works [15, 16]. It was believed that when a single CM element fails, a single perturbation pulse is formed, the amplitude of which is proportional to the fracture stress. In this case, the kinetics of the destruction process were taken into account, the rate of which, according to the kinetic theory, changes according to an exponential law. The conducted studies’ results have shown that with the development of the CM destruction process, dependencies of the number of remaining elements (fibers) change over time and have a continuous falling character. In this case, continuous pulsed AE signals are formed. It was also shown that expressions for the number of remaining fibers and AE-generated signal over time include parameters that affect the CM destruction and AE-generated signals. These parameters are composite deformation rate, composite physical and mechanical characteristics, composite dispersion properties, and composite destruction area. In one article by Filonenko et al. [30], the influence of composite properties on amplitude-time parameterAE during its destruction by shear force using the von Mises criterion was studied. It was shown that increasing the value of parameter characterizing the CM properties leads to increasing steepness of the fall of the change curves of the remaining elements number over time, decreasing of AE signals’ maximum amplitude and duration. The patterns of AE signal maximum amplitude and duration change are determined and described. It is also shown that decreasing AE generated signal maximum amplitude is ahead decreasing of AE signal duration with increasing parameters characterizing the CM properties. The obtained patterns can be used in the development of the methods of control, monitoring, and diagnostics of CM. However, to improve the reliability of methods it is important to determine the sensitivity of AE amplitude-energy parameters to changes in CM properties.
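As background to the FBM concept reviewed above, a generic equal-load-sharing fiber bundle can be simulated as in the following sketch; this is a textbook illustration with an assumed Weibull strength distribution, not the model of [28–30].

```python
# Illustrative fiber bundle model (FBM) with uniform (equal) load sharing:
# N fibers with random strength thresholds break under a quasi-statically
# increasing load, and the load of a broken fiber is shared evenly by the rest.
import numpy as np

def fbm_uniform_load_sharing(n_fibers=10000, seed=0):
    rng = np.random.default_rng(seed)
    thresholds = np.sort(rng.weibull(2.0, n_fibers))  # assumed Weibull-distributed strengths
    history = []                                       # (applied stress per fiber, surviving fraction)
    for i, sigma in enumerate(thresholds):
        # External stress at which the i-th weakest surviving fiber reaches its threshold
        applied = sigma * (n_fibers - i) / n_fibers
        history.append((applied, 1.0 - i / n_fibers))
    return np.array(history)
```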

3. Research Results

3.1. Simulation Conditions

The study of the influence of CM properties on acoustic radiation amplitude parameters in the previously mentioned article by Filonenko et al. [30] was carried out when modeling AE signals by an expression of the form

U(t) = U0 v0 [σ_m(t) − σ_0(t_0)] · e^{r[σ_m(t) − σ_0(t_0)]} · e^{−v0 ∫_{t_0}^{t} e^{r[σ_m(t) − σ_0(t_0)]} dt},    (1)

where σ_m(t) and σ_0(t_0) are, respectively, the equivalent stress change on CM elements in time and the threshold



stress corresponding to a time t0 of CM beginning destruction; U0 is the maximum possible displacement during the instantaneous CM destruction, consisting of N0 elements; v0, r are constants depending on CM physical and mechanical characteristics. Modeling of AE signals, according to expression 1, was carried out under the following conditions. The CM deformation rate α was taken equal to α = 10. The time t 0 of CM beginning destruction was taken equal to t0 =0.004. This time t0 corresponds to the threshold stress σ (t 0 ) of CM beginning destruction equal to σ0 = 0.03037385029676163. The value of the parameter r was taken equal to r = 10000. The value of the parameter v0, which characterizes the CM properties, changed in the range of values from v0 = 100000 to v0 = 500000 with an incremental step ∆v 0 = 100000. According to the calculation dependence of AE signals amplitudes change at v0 change, we will study the patterns of AE signal energy and AE signal total energy changes. We will calculate the energy of AE signals and total energy of AE signals using expressions of the form

E(t_i) = Δt_k U_i²,    (2)

E_sum = Δt_k ∑_i U_i²,    (3)

where i = 0, …, k is the index of the AE signal amplitude value calculated over the signal duration t; Δt_k is the time interval between the calculated AE signal amplitude values (Δt_k = constant). Modeling of the dependencies of the generated AE signal energy change in time, according to expression 2, taking into account (1) at v0 change, will be carried out in relative units under the conditions considered above. The time interval Δt_k between calculated AE signal amplitude values is Δt_k = 1·10⁻⁷.

3.2. Simulation Results

The results of the calculations of the dependencies of the change in the energy of the AE signals over time in relative units, according to expression 2, are shown in Fig. 1. The results of the calculations of the dependencies of the change in the total energy of AE signals over time in relative units, according to expression 3, are shown in Fig. 2. When plotting Fig. 1 and Fig. 2, the time is referred to the time t0 = 0.004 of the beginning of CM elements destruction.

Fig. 1. Graphs of AE signal energy changes in time in relative units during CM destruction by shear force with a change of a parameter v0. The value of parameter v0: 1 - 100000; 2 - 200000; 3 - 300000; 4 - 400000; 5 - 500000. Simulation parameters: α = 10, r = 10000, σ0 = 0.008897277688462064

Fig. 2. Graphs of changes in the total energy of AE signals in time in relative units during the destruction of the CM by a transverse force with a change in the parameter v0. The value of the parameter v0: 1 - 100000; 2 - 200000; 3 - 300000; 4 - 400000; 5 - 500000. Simulation parameters: α = 10, r = 10000, σ0 = 0.008897277688462064

The obtained data processing, in the form of the dependencies of the AE signals' maximum and total energy change in relative units with an increase of the value of the parameter v0, is shown in Fig. 3. Analysis of the dependencies (Fig. 3) showed that they are well described by a power function of the form

E_EA = a · v0^b,    (4)

where E_EA is the AE signals' maximum or total energy, and a and b are the coefficients of the approximating expression. The values of the coefficients of the approximating expression (4) are: for the AE signals' maximum energy (Fig. 3, a), a = 0.00001, b = -0.24322; for the AE signals' total energy (Fig. 3, b), a = 6.79966, b = -0.97332. In describing the dependence in Fig. 3, a, the determination coefficient R² was R² = 0.97435, and for the dependence in Fig. 3, b, R² = 0.99997. The residual dispersion SD² was: for the maximum energy of AE signals, SD² = 3.9944·10⁻¹⁶; for the total energy of AE signals, SD² = 3.1908·10⁻¹⁶. For describing the dependencies of the AE signals' maximum and total energy change with increasing v0, shown in Fig. 3, the criterion for choosing expression 4 was the minimum value of the residual dispersion.

To compare the sensitivity of the AE signals' amplitude and energy parameters to a change of the parameter v0, the decrease of the AE signals' maximum amplitude, maximum energy, and total energy with respect to their initial values at v0 = 100000 was processed as a percentage. The results of the performed processing are shown in Fig. 4.

Fig. 3. Dependencies of AE signals maximum energy change (a) and AE signals total energy change (b) on a parameter v0 value in relative units during CM destruction by shear force. Simulation parameters: α = 10, r = 10000, σ0 = 0.008897277688462064

Fig. 4. Dependencies of AE signals maximum amplitude (1), maximum energy (2) and total energy change as a percentage with the increasing parameter v0 value


In Fig. 4, the following notation is adopted: A, % is the analyzed parameter of AE signals – the maximum amplitude, maximum energy, or total energy of the AE signal.
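For illustration, the energy measures of expressions (2)–(4) can be computed from a sampled amplitude signal as in the sketch below; the amplitude array U is an assumed input obtained by evaluating expression (1), and Δt_k = 1·10⁻⁷ follows Section 3.1.

```python
# Sketch of the energy measures in expressions (2)-(4) for a sampled AE amplitude U_i.
import numpy as np

def ae_energy(U, dt_k=1e-7):
    """Per-sample energy E(t_i) = dt_k * U_i**2, expression (2)."""
    U = np.asarray(U, dtype=float)
    return dt_k * U ** 2

def ae_total_energy(U, dt_k=1e-7):
    """Accumulated (total) energy up to t_i, expression (3)."""
    return np.cumsum(ae_energy(U, dt_k))

def fit_power_law(v0_values, energies):
    """Least-squares fit of E = a * v0**b (expression (4)) in log-log coordinates."""
    b, log_a = np.polyfit(np.log(v0_values), np.log(energies), 1)
    return np.exp(log_a), b
```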

4. Discussion of Research Results

The study of the influence of various factors on AE is important from the point of view of control, monitoring, and diagnostics of the state of CM and CM products. In previous studies [30], the influence of the parameter characterizing the properties of the CM on the amplitude-time parameters of the AE was determined. It was shown that an increase in the value of the analyzed parameter leads to a decrease in the amplitude and duration of the generated AE signals. At the same time, the decrease in the maximum amplitude of the generated AE signals is ahead of the decrease in the duration of the AE signals, which can be used in the development of methods for controlling, monitoring, and diagnosing the state of CM and CM products. However, to improve the reliability of the methods, it is important to determine the sensitivity of the amplitude-energy parameters of the AE to changes in the properties of the CM. In this work, we studied the influence of the parameter characterizing the properties of the CM on the energy parameters of the AE, and also determined the sensitivity of the amplitude-energy parameters to its change.

The results of the conducted research show that at CM destruction by shear force according to the von Mises criterion, with increasing parameter v0, which characterizes CM properties, both the maximum energy (Fig. 1) and total energy (Fig. 2) of AE signals decrease. The dependencies of the AE signals' maximum energy and total energy change with increasing v0 have a non-linear nature of decrease (Fig. 3). Both the maximum amplitude of the AE signals and their duration have similar non-linear decreasing dependencies. Analysis of the simulation results shows that the patterns of AE signal maximum and total energy change are well described by power functions, as in the case of the maximum amplitude and duration of AE signals. Such a change in the parameters of AE signals requires determining their sensitivity to a change in the parameter characterizing the properties of the CM.

To compare the sensitivity of the amplitude-energy parameters of the AE signals to a change in the parameter v0, the decrease in the maximum amplitude, maximum energy, and total energy of the AE signals with respect to their initial values at v0 = 100000 was processed as a percentage. The calculation results showed (Fig. 4) that with increasing parameter v0 value, the decrease of the AE signals' maximum energy is ahead of the decrease of the AE signals' maximum amplitude. At the same time, the decrease of the AE signals' total energy is ahead of the decrease of the AE signals' maximum energy. Thus, at increasing v0 2 times (from v0 = 100000 to v0 = 200000), the AE signals' maximum amplitude decreases by 10.29%, and the AE signal maximum and total energy, respectively, decrease by 19.53% and by 48.78%. At increasing v0 4 times (from v0 = 100000 to v0 = 400000), the AE signals' maximum amplitude decreases by 15.95%, and the AE signal maximum and total energy, respectively, decrease by 29.35% and by 74.15%. At increasing v0 5 times (from v0 = 100000 to v0 = 500000), the AE signals' maximum amplitude decreases by 17.11%, and the AE signal maximum and total energy, respectively, decrease by 31.3% and by 79.29%.

It can be seen from the obtained results that the most sensitive parameter of acoustic radiation to a change of the parameter which characterizes CM properties is the registered AE signal total energy (accumulated energy) – its decrease is much ahead of the decrease of the AE signal maximum amplitude and maximum energy. The obtained pattern of changes in the total energy of AE signals with a change in the parameter v0 can be used in the development of methods for control, monitoring, and diagnostics of CM, as well as in predicting the destruction of products from CM when registering AE signals under the given conditions. At the same time, the developed methods should be based on monitoring the rate of change in the total energy of AE signals, the pattern of change of which is well described by a power function. According to expression 1, the AE parameters are also affected by several other factors: the dispersity of the properties of the CM, the area of the destroyed CM, and the rate of loading of the CM. Undoubtedly, the analysis of their influence on AE will make it possible to determine the most significant factors, the patterns of change in the amplitude-energy parameters of the AE, and their sensitivity. The choice of the analyzed parameters will increase the reliability of the developed methods for control, monitoring, and diagnostics of CM and CM products.

5. Conclusion

The paper presents the results of modeling AE signal energy during the destruction of a CM consisting of N0 elements by shear force using the von Mises criterion, depending on a parameter v0 which characterizes CM properties. It is determined that with an increase of parameter v0 there is a decrease in the acoustic radiation maximum energy and total energy. It is shown that the patterns of AE signals' maximum and total energy change have a non-linear nature of decrease. It is determined that the dependencies of AE signal maximum and total energy change are well described by power functions. The influence of parameter v0 on the acoustic radiation amplitude and energy characteristics is compared. It is shown that with an increasing value of the parameter v0, the decrease in the maximum energy is ahead of the decrease in the AE signals' maximum amplitude, and the decrease of the AE signals' total energy is ahead of the decrease of the AE signals' maximum energy. Thus, at increasing v0 2 times, the AE signals' maximum amplitude decreases by 10.29%, and the AE signal maximum and total energy, respectively, decrease by 19.53% and by 48.78%. At increasing v0 5 times, the AE signals' maximum amplitude decreases by 17.11%, and the AE signal maximum and total energy, respectively, decrease by 31.3% and by 79.29%. The obtained results showed that the most sensitive AE parameter to a change of parameter v0 is the registered AE signal total energy. Research data can be used in the development of methods of control, monitoring, and prediction of product destruction from CM at the stages of their production and operation. In the future, it may be interesting to study the influence of the dispersity of CM properties on AE signals' amplitude-energy parameters.

AUTHORS

Sergii Filonenko – Department of Computerized Electrical Systems and Technologies, National Aviation University, Liubomyra Huzara ave. 1, Kyiv, 03058, Ukraine, fils0101@gmail.com.

Anzhelika Stakhova* – Department of Computerized Electrical Systems and Technologies, National Aviation University, Liubomyra Huzara ave. 1, Kyiv, 03058, Ukraine, sap@nau.edu.ua.

*Corresponding author

References

[1] S. Clay et al., „Comparison of Diagnostic Techniques to Measure Damage Growth in a Stiffened Composite Panel,” Composites Part A: Applied Science and Manufacturing, vol. 137, 2020, 106030.
[2] B. Wang et al., „Non-Destructive Testing and Evaluation of Composite Materials/Structures: A State-of-the-Art Review,” Advances in Mechanical Engineering, vol. 12, no. 4, 2020, 1687814020913761.
[3] R. Gupta et al., „A Review of Sensing Technologies for Non-Destructive Evaluation of Structural Composite Materials,” Journal of Composites Science, vol. 5, no. 12, 2021, p. 319.
[4] Z. Fan, M.H. Santare, and S.G. Advani, „Interlaminar Shear Strength of Glass Fiber Reinforced Epoxy Composites Enhanced With Multi-Walled Carbon Nanotubes,” Composites Part A: Applied Science and Manufacturing, vol. 39, no. 3, 2008, pp. 540–554.
[5] Y. Liu et al., „Experimental Research on Shear Failure Monitoring of Composite Rocks Using Piezoelectric Active Sensing Approach,” Sensors, vol. 20, no. 5, 2020, 1376.
[6] B.D. Coleman, „Time Dependence of Mechanical Breakdown Phenomena,” Journal of Applied Physics, vol. 27, no. 8, 1956, pp. 862–866.


[7] A. Hansen, P.C. Hemmer, and S. Pradhan, The Fiber Bundle Model: Modeling Failure in Materials, John Wiley & Sons, 2015.
[8] F. Kun, S. Zapperi, and H.J. Herrmann, „Damage in Fiber Bundle Models,” The European Physical Journal B – Condensed Matter and Complex Systems, vol. 17, no. 2, 2000, pp. 269–279.
[9] Y. Moreno, J.B. Gómez, and A.F. Pacheco, „Self-Organized Criticality in a Fibre-Bundle-Type Model,” Physica A: Statistical Mechanics and its Applications, vol. 274, no. 3–4, 1999, pp. 400–409.

[10] P.C. Hemmer and A. Hansen, „The Distribution of Simultaneous Fiber Failures in Fiber Bundles,” ASME Journal of Applied Mechanics, vol. 59, no. 4, 1992, pp. 909–914.
[11] W.I. Newman and S.L. Phoenix, „Time-Dependent Fiber Bundles with Local Load Sharing,” Physical Review E, vol. 63, no. 2, 2001, 021507.
[12] S. Pradhan, A. Hansen, and B.K. Chakrabarti, „Failure Processes in Elastic Fiber Bundles,” Reviews of Modern Physics, vol. 82, no. 1, 2010, p. 499.

[13] A. Hader et al., „Failure Kinetic and Scaling Behavior of the Composite Materials: Fiber Bundle Model with the Local Load-Sharing Rule (LLS),” Optical Materials, vol. 36, no. 1, 2013, pp. 3–7.
[14] A. Capelli et al., „Fiber-Bundle Model with Time-Dependent Healing Mechanisms to Simulate Progressive Failure of Snow,” Physical Review E, vol. 98, no. 2, 2018, 023002.

[15] F. Raischel, F. Kun, and H.J. Herrmann, „Simple Beam Model for the Shear Failure of Interfaces,” Physical Review E, vol. 72, no. 4, 2005, 046126.

[16] F. Raischel, F. Kun, and H.J. Herrmann, „Local Load Sharing Fiber Bundles with a Lower Cutoff of Strength Disorder,” Physical Review E, vol. 74, no. 3, 2006, 035104.
[17] G. Michlmayr, D. Or, and D. Cohen, „Fiber Bundle Models for Stress Release and Energy Bursts During Granular Shearing,” Physical Review E, vol. 86, no. 6, 2012, 061307.
[18] K. Kovács et al., „Brittle-to-Ductile Transition in a Fiber Bundle with Strong Heterogeneity,” Physical Review E, vol. 87, no. 4, 2013, 042816.
[19] S.G. Abaimov, „Non-Equilibrium Annealed Damage Phenomena: A Path Integral Approach,” Frontiers in Physics, 2017, p. 6.


[20] Z. Danku, G. Ódor, and F. Kun, „Avalanche Dynamics in Higher-Dimensional Fiber Bundle Models,” Physical Review E, vol. 98, no. 4, 2018, 042126.
[21] Y. Yamada and Y. Yamazaki, „Avalanche Distribution of Fiber Bundle Model with Random Displacement,” Journal of the Physical Society of Japan, vol. 88, no. 2, 2019, 023002.

[22] A.R. Oskouei and M. Ahmadi, „Fracture Strength Distribution in E-Glass Fiber Using Acoustic Emission,” Journal of Composite Materials, vol. 44, no. 6, 2010, pp. 693–705.
[23] S. Pradhan, J.T. Kjellstadli, and A. Hansen, „Variation of Elastic Energy Shows Reliable Signal of Upcoming Catastrophic Failure,” Frontiers in Physics, vol. 7, 2019, p. 106.

[24] M. Monterrubio-Velasco et al., „A Stochastic Rupture Earthquake Code Based on the Fiber Bundle Model (TREMOL v0.1): Application to Mexican Subduction Earthquakes,” Geoscientific Model Development, vol. 12, no. 5, 2019, pp. 1809–1831.

[25] R. Shcherbakov, On Modeling of Geophysical Problems, PhD dissertation, Cornell University, 2002.
[26] D.L. Turcotte, W.I. Newman, and R. Shcherbakov, „Micro and Macroscopic Models of Rock Fracture,” Geophysical Journal International, vol. 152, no. 3, 2003, pp. 718–728.
[27] F. Bosia et al., „Mesoscopic Modeling of Acoustic Emission Through an Energetic Approach,” International Journal of Solids and Structures, vol. 45, no. 22–23, 2008, pp. 5856–5866.

[28] S. Filonenko, V. Kalita, and A. Kosmach, „Destruction of Composite Material by Shear Load and Formation of Acoustic Radiation,” Aviation, vol. 16, no. 1, 2012, pp. 1–9.
[29] S. Filonenko and V. Stadychenko, „Influence of Loading Speed on Acoustic Emission During Destruction of a Composite by Von Mises Criterion,” American Journal of Mechanical and Materials Engineering, vol. 4, no. 3, 2020, pp. 54–59.

[30] S. Filonenko and A. Stakhova, „Acoustic Emission at Properties Change of Composite Destructed by von Mises Criterion,” Electronics and Control Systems, vol. 1, no. 67, 2021, pp. 54–60.



Hybrid Navigation of an Autonomous Mobile Robot to Depress an Elevator Button Submitted: 4th July 2021; accepted: 29th September 2022

Pan-Long Wu, Zhe-Ming Zhang, Chuin Jiat Liew, Jin-Siang Shaw DOI: 10.14313/JAMRIS/4-2022/30 Abstract: The development of an autonomous mobile robot (AMR) with an eye-in-hand robot arm atop for depressing elevator button is proposed. The AMR can construct maps and perform localization using the ORB-SLAM algorithm (the Oriented FAST [Features from Accelerated Segment Test] and Rotated BRIEF [Binary Robust Independent Elementary Features] feature detector-Simultaneous Localization and Mapping). It is also capable of real-time obstacle avoidance using information from 2D-LiDAR sensors. The AMR, robot manipulator, cameras, and sensors are all integrated under a robot operating system (ROS). In experimental investigation to dispatch the AMR to depress an elevator button, AMR navigation initiating from the laboratory is divided into three parts. First, the AMR initiated navigation using ORB-SLAM for most of the journey to a waypoint nearby the elevator. The resulting mean absolute error (MAE) is 8.5 cm on the x-axis, 10.8 cm on the y-axis, 9.2-degree rotation angle about the z-axis, and the linear displacement from the reference point is 15.1 cm. Next, the ORB-SLAM is replaced by an odometry-based 2D-SLAM method for further navigating the AMR from waypoint to a point facing the elevator between 1.5 to 3 meter distance, where the ORB-SLAM is ineffective due to sparse feature points for localization and where the elevator can be clearly detected by an eye-in-hand machine vision onboard the AMR. Finally, the machine vision identifies the position in space of the elevator and again the odometry-based 2D-SLAM method is employed for navigating the AMR to the front of the elevator between 0.3 to 0.5 meter distance. Only at this stage can the small elevator button be detected and reached by the robot arm on the AMR. An average 60% successful rate of button depressing by the AMR starting at the laboratory is obtained in the experiments. Improvements for successful elevator button depressing rate are also pointed out. Keywords: ROS, AMR, ORB-SLAM, Robot Manipulator, Machine Vision.

1. Introduction With the advent of smart manufacturing in a wide range of industries, unmanned factories and automation have become the future trend. Therefore, automated

machines such as AMRs and robot manipulators have been utilized to perform multiple tasks within the production line, such as the transportation of cargo and highly repeated workloads to replace labor resources, and even reduce cost. Commonly used AMRs can be categorized into two different types by guiding methods, namely, rail-guided and trackless automated guided. A rail-guided mobile platform utilizes special tracks that are tiled to the floor, which generate electromagnetic fields to guide movement. A trackless automated guided mobile platform is normally based on laser range-finder and camera as the sensor’s data input to create a surrounding map and to determine a possible route within that map. The designated AMR in this report utilizes a trackless automated guided mobile platform to address the limitations associated with track leading, but has additional freedom of movement to perform any possible route. To navigate an AMR, a map is required to appropriately define its localization. However, there is considerable causality between map construction and localization. For instance, Smith et al. [1] proposed that the presentation and calculation of uncertain spatial data requires an unbiased map, but such a map requires accurate location estimation to build. DurrantWhyte and Bailey [2,3] proposed a simultaneous localization and mapping algorithm (SLAM), which has become a core technology in the field of mobile robots. SLAM is a solution for mobile robots to facilitate motion in an unknown environment. The SLAM method has been considered in two dimensions in which scanning data is acquired using a 2D laser range-finder, and in three dimensions wherein point cloud information is acquired via 3D laser range-finder or cameras. For 2D-SLAM, commonly used mapping algorithms include GMapping SLAM [4], Hector SLAM [5], and Cartographer SLAM [6]. The 3D SLAM method is also popular and is utilized in MonoSLAM [7], parallel tracking and mapping (PTAM) [8], and ORB-SLAM [9]. Each mentioned SLAM algorithm has its own specific properties. Whenever an AMR is successfully located in a prepared map, it should be navigated and guided along an assigned route when a valid destination is set as a goal on the map. Hart et al. [10] proposed a heuristic search algorithm, Bostel and Saigar [11] proposed the A* algorithm, Koeing and Likhachev [12] proposed the D*lite algorithm, and Fox et al. [13] proposed a dynamic window approach (DWA) to achieve these objectives.

2022 ® Wu et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)


In previously published works, it has been shown that a robot manipulator integrated with AMR has improved utility and working efficiency in the production line. For instance, Kousi et al. [14] simulated SLAM and navigation to a mobile dual-arm robot under ROS framework for a safe and collision-free path during the execution of the different assembly task. Apart from the AMR and the robot manipulator, machine vision is also one of the key technologies used for automation, which can replace the vision of workers in highly repeated detection tasks by exploiting different image processing techniques. Machine vision is reliable because the methods and algorithms that are used yield consistent and accurate results. Therefore, a robot manipulator integrated with machine vision is a widely implemented technology. For instance, Sangeetha et al. [15] developed a low-cost eye-in-hand system that included a robot arm and a set of stereo cameras. The object position was detected by the stereo camera based on image processing. The inverse kinematics of the robot manipulator were solved to perform a pick-and-place task. Similar eye-in-hand robotic manipulator architecture has been proposed by Shaw and Chi [16] for image-based visual servoing (IBVS) for fetching moving objects in a production line. Hosseininia et al. [17] also employed machine vision to guide a robot arm for porcelain edge polishing. Moreover, a mobile platform can also be integrated with machine vision. Laganowska [18] investigated the detection of road lines as aids in navigation control. As previously indicated, this investigation will be based on a trackless automated guided vehicle as the mobile platform to obviate the need for presetups such as specific wiring configurations and limitations associated with a particular movement. In our case, both the laser range finder and camera are primarily used to obtain environmental information, wherein the laser scanning data is accurate and the camera can provide 3D spatial information. The experimental field in this study is the corridor in the building basement with approximate size 45m × 20m. The goal is to navigate the AMR from the laboratory to an elevator and depress the button (as if it is going to take the elevator). However for a corridor with this spacious size, the 2D mapping methods such as GMapping, Hector, and Cartographer SLAM, might cause several types of errors in the selected regions. In this work, the ORB-SLAM algorithm, a 3D SLAM method based on an RGB-D camera that acquires feature points from each frame to facilitate positioning is utilized. Nevertheless, calibration of the camera should be performed as a precaution since the cameras might exhibit radial and tangential distortion in the image. Therefore, calibration can modify the camera parameters such as the focal length, center of the image, and the distortion coefficients [19]. Given that the experimental field is vast and the displacement error associated with SLAM might lead to navigation failure, a machine vision system is hence built to lead the AMR to a precise position at the final stage for depressing the elevator button. In this respect, Zhou and Liu

VOLUME 16,

N° 4

2022

[20] installed a camera on a mobile robot to scan a 2D barcode and successfully localized the position based on the acquired image. The vision-based movement control allowed the mobile robot to precisely move to an assigned position. Schueftan et al. [21] used the KUKA robot arm mobile platform to perform autonomous navigation after the creation of a map using LiDAR. Several Vicom cameras were installed around the field to observe and determine the moving deviation, and the positioning accuracy was hence secured.

2. System Design

2.1 AMR Construction The AMR presented in this study was designed and built using 30 cm × 30 cm aluminum frames to reduce cost and improve strength. The final size of the constructed AMR was 60 cm × 60 cm × 90 cm (in length × width × height). A laptop computer was used on the AMR to operate the ROS system. The AMR was designed based on a differential drive control, in which two sets of 12V DC motors were utilized and controlled using pulse width modulation (PWM) signals. Encoders attached to the DC motors were used for motor PID speed control and for measuring the movement traveled by the AMR as well. Several Arduino microcontrollers were connected to the computer via USB ports and used primarily for motor control and for I/O signals. A SICK LiDAR with better accuracy and longer range was attached to the front end of the AMR for scanning the environment and RBPF (RaoBlackwellized particle filter) in GMapping SLAM [4] was applied for constructing a 2D map for navigation. Two sets of Hokuyo LiDAR were also installed at the front left and rear right of the cart to facilitate 360° object detection. They were connected to the computer via USB ports to receive laser data for obstacle avoidance during navigation. In order to compensate the positioning errors with SLAM, particularly in the orientation angle, an inertial measurement unit (IMU) sensor was also employed. Kinect v2, an RGB-D camera, was used to obtain 3D point cloud information [22]. A Niryo robot served as the 6-degrees of freedom robot manipulator and was mounted on the top layer of the AMR. It was connected to the computer via Ethernet and programmed based on Python code. An IDS-XS industrial camera was mounted at the end effector of a Niryo robot arm, forming an eye-in-hand system to acquire an RGB image. Image processing was performed using OpenCV libraries for detecting the elevator. An NVIDIA Jetson TX2 module was also connected to the laptop computer via Ethernet to share the calculation burden during image processing. Figure 1 shows the constructed AMR and the associated main components used in the study.
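As an illustration of the dead-reckoning odometry that the wheel encoders on such a differential-drive base make possible, the following sketch integrates a pose from incremental encoder counts; the wheel radius, encoder resolution, and track width are hypothetical values, not parameters of the robot described above.

```python
# Sketch of differential-drive odometry from incremental encoder counts.
import math

WHEEL_RADIUS = 0.075      # wheel radius [m], assumed
TICKS_PER_REV = 2048      # encoder counts per wheel revolution, assumed
TRACK_WIDTH = 0.50        # distance between drive wheels [m], assumed

def update_pose(x, y, theta, d_ticks_left, d_ticks_right):
    """Integrate one odometry step of a differential-drive base."""
    dl = 2 * math.pi * WHEEL_RADIUS * d_ticks_left / TICKS_PER_REV
    dr = 2 * math.pi * WHEEL_RADIUS * d_ticks_right / TICKS_PER_REV
    ds = (dl + dr) / 2.0                  # distance travelled by the base centre
    dtheta = (dr - dl) / TRACK_WIDTH      # change in heading
    x += ds * math.cos(theta + dtheta / 2.0)
    y += ds * math.sin(theta + dtheta / 2.0)
    theta = (theta + dtheta + math.pi) % (2 * math.pi) - math.pi
    return x, y, theta
```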

2.2 ORB-SLAM Method

ORB-SLAM is a 3D SLAM method, which is classified as an indirect and sparse mapping procedure based on oriented FAST and rotated BRIEF algorithms [9]. This approach was developed based on PTAM architecture [8] by applying the ideas of place recognition,



scale-aware loop closing, and covisibility graph, and redesigned into a new methodology. The main body includes map initialization and closed-loop detection functions, optimization of key frame selection, and map construction methods, which results in excellent processing speed, tracking effect, and map accuracy. The features selected by the algorithm are mainly based on the FAST feature extraction method, but with rotation invariance. It is then converted into a binary form using the BRIEF algorithm, which is very efficient for both building and matching. ORB-SLAM was originally developed for monocular cameras but later expanded for application to stereo and RGB-D cameras. In this study, the Kinect V2 was utilized as an RGB-D camera to generate color images and provide depth information to identify valid point clouds for feature matching. Figure 2 shows the image processing result obtained for the ORB-SLAM algorithm and the subsequent constructed mapping and localization.

Fig. 1. Equipment used in the study: (a) AMR; (b) Niryo robot arm; (c) SICK LiDAR; (d) IDS-XS camera; (e) NVIDIA TX2
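The ORB front end described above (oriented FAST keypoints with binary rotated-BRIEF descriptors) can be reproduced in a few lines with OpenCV, as in the following sketch; the image file names are placeholders.

```python
# Sketch of ORB feature extraction and binary matching between two frames.
import cv2

img1 = cv2.imread("frame_prev.png", cv2.IMREAD_GRAYSCALE)   # placeholder images
img2 = cv2.imread("frame_curr.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance is appropriate for the binary BRIEF descriptors
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} putative correspondences")
```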

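For readers unfamiliar with the ORB features underlying ORB-SLAM, the minimal OpenCV sketch below extracts oriented-FAST/rotated-BRIEF keypoints from two frames and matches their binary descriptors with the Hamming distance. The synthetic input frames are placeholders; this is an illustration, not the ORB-SLAM implementation itself.

```python
import cv2
import numpy as np

# Synthetic stand-ins for two consecutive camera frames (the second is a shifted
# copy); in practice these would be grayscale frames from the RGB-D stream.
rng = np.random.default_rng(0)
img1 = rng.integers(0, 256, (240, 320), dtype=np.uint8)
img2 = np.roll(img1, 5, axis=1)

orb = cv2.ORB_create(nfeatures=500)           # oriented FAST keypoints + rotated BRIEF descriptors
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Binary descriptors are matched efficiently with the Hamming distance
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} matches; best Hamming distance = {matches[0].distance}")
```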
2.3 Mapping and Navigation Methods

In this investigation, both ORB-SLAM and GMapping SLAM are used because of their complementary advantages in constructing maps. ORB-SLAM acquires featured key points from each image frame to generate a 3D map, which is mainly used for localizing the AMR position and is more appropriate in such a long corridor environment. However, a 2D map is still needed. The GMapping algorithm generates a map from the SICK LiDAR scanning data, which is imported as a static map for the path planner to produce a navigation route and for real-time obstacle avoidance as well. GMapping was developed based on the RBPF to obtain a correct map. The RBPF uses the known initial pose of the mobile platform $x_0$ and the map environment data collected by the sensor $m_0$; the map environment information is represented using the Markov process as $z_{1:t} = z_1, z_2, \ldots, z_t$, the mobile platform motion information as $u_{1:t} = u_1, u_2, \ldots, u_t$, the estimated motion trajectory as $x_{1:t} = x_1, x_2, \ldots, x_t$, and the posterior probability as $p$. According to Bayes' theorem, the posterior probability of SLAM at time $t$ can be expressed as a recursive function of the posterior probability, environmental state, and motion state at time $t-1$, as shown in (1) [23]:

$p(x_{1:t}, m \mid z_{1:t}, u_{1:t-1}) = p(x_{1:t} \mid z_{1:t}, u_{1:t-1})\, p(m \mid x_{1:t}, z_{1:t})$   (1)

Fig. 2. ORB-SLAM algorithm: (a) result of image processing; (b) constructed mapping and localization

In (1), $x_{1:t}$ is the mobile platform state, $z_{1:t}$ is the observed state, and $u_{1:t}$ is the mobile platform input. Therefore, from (1), the mobile platform pose $x$ can be obtained using the observed state $z$ and the input signal $u$. Subsequently, the map $m$ can be approximated based on the mobile platform position $x$ and the observed state $z$. The map is then imported as static map information for the navigation stack to deliver a global path. The global path planner in the navigation stack relies mainly on the static map to plan an effective path to a set target point. Commonly used algorithms such as A* [11] can search for the shortest path. The A* algorithm uses the evaluation function shown in (2):

$F(n) = G(n) + H(n)$   (2)

where F(n) is the evaluation score for reaching the endpoint, G(n) represents the actual distance from the starting point to the current node, and H(n) is the estimated distance from the current node to the endpoint. The search direction of all nodes points to

the target point, which removes redundant and unnecessary paths. Therefore, the search takes less time and the resulting path is more accurate. The route generated from the static map can only be applied in the same environment; obstacles may appear after the map is established and are therefore not considered in the global path planning. The dynamic window approach (DWA) [13] is hence adopted for local path planning in the obstacle layer to provide real-time dynamic obstacle avoidance. Finally, the inflation map layer is constructed to prevent the mobile platform from getting too close to obstacles.

2.4 Image Processing Method

Using an eye-in-hand architecture, an IDS industrial camera is installed at the end effector of the Niryo robot arm, and machine vision is utilized to lead the mobile platform and the robot manipulator close to the elevator and the elevator button, respectively. First, a 7 × 5 calibration chessboard is used for camera calibration. Next, an image of the elevator door is acquired and a series of image processing operations is applied to it, including color space conversion from RGB to HSV, filtering using a median filter and morphology, thresholding, an image AND operation, contour detection, area computation, and 3D reconstruction [24]. The purpose of the image processing is to identify the spatial position of the elevator and its button so that the mobile platform and the robot manipulator can be guided forward to depress the button.
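A minimal sketch of the image-processing chain described above (RGB to HSV conversion, median filtering, thresholding, morphology, AND operation, contour detection, area computation) is given below with OpenCV; the HSV bounds and the synthetic test image are illustrative assumptions, not the calibrated values used in the paper.

```python
import cv2
import numpy as np

# Synthetic test frame with a yellow patch standing in for the acquired image
frame = np.zeros((240, 320, 3), np.uint8)
cv2.rectangle(frame, (140, 100), (180, 140), (0, 255, 255), -1)  # yellow region (BGR)

hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)            # color space conversion
hsv = cv2.medianBlur(hsv, 5)                            # median filtering

lower, upper = np.array([20, 80, 80]), np.array([35, 255, 255])  # assumed yellow range
mask = cv2.inRange(hsv, lower, upper)                   # thresholding

kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # morphology (closing)
target = cv2.bitwise_and(frame, frame, mask=mask)       # image AND operation

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)            # contour detection
print("detected region area:", cv2.contourArea(largest))  # area computation
```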

2.5 System Architecture

This study is based on the ROS framework to integrate the hardware and firmware of the mobile robot. Figure 3 shows the system architecture, with a laptop computer running the core system. A GPU module (NVIDIA TX2) receives the image from the IDS camera and performs the image processing for detecting the elevator. In addition, an Arduino Mega2560 microcontroller controls the speeds of the two driving motors of the mobile platform with a PID controller based on the encoder signals, upon receiving velocity commands from the main computer. During navigation, the main computer drives the mobile platform based on signals from the Kinect v2 for 3D localization over most of the journey, and later from the motor encoders and IMU for 2D localization when approaching the elevator, with the Hokuyo LiDAR sensors providing dynamic obstacle avoidance. When the mobile platform arrives at a point facing the elevator, 0.3 to 0.5 m away, where the elevator button is reachable by the robot arm, the last stage of SLAM navigation terminates and the IDS camera guides the robot arm to depress the button.

Fig. 3. The system architecture

3. Experimental Results

3.1 AMR SLAM Navigation to Elevator

The goal in this study is to navigate the AMR from the laboratory in the building basement to an elevator and to depress the button. The basement corridor in which the AMR can navigate is about 45 m × 20 m. First, a 2D map depicting the floor layout is generated using GMapping SLAM with the SICK LiDAR sensor, as shown in Figure 4. In this constructed map, point 'S' in front of the laboratory is the starting position for AMR navigation. The AMR can head either for the west elevator (26.3 m away), faced from point 'A', or the east elevator (40 m away), faced from point 'B'. The 2D map, together with the well-known and commonly used 2D localization method AMCL (Adaptive Monte Carlo Localization) [25], was tested for navigation, and the AMR failed to reach either elevator, mainly because of the long-corridor problem: at some locations no distinguishable features were available on either side of the AMR. Consequently, we resorted to the 3D localization method of ORB-SLAM. However, due to the limitations of the Kinect v2, such as its detection range and field of view, the AMR cannot be guided too close to the elevator, where ORB-SLAM becomes ineffective owing to sparse feature points for localization. Therefore, points 'a' and 'b' are the two waypoints chosen for the AMR to navigate to the west and east elevators, respectively. At waypoints 'a' and 'b', the ORB localization method terminates. For AMR navigation and obstacle avoidance from point 'S' to waypoint 'a' (or 'b'), the move_base package in the ROS navigation stack is employed. Figure 5 shows the navigation architecture, where ORB localization is adopted and the A* and DWA algorithms are used as the global and local path planners, respectively. The A* algorithm plans the route from the current location to the goal, whereas the DWA algorithm updates the route in real time to bypass obstacles. Note that the 2D map in Figure 4 is used here for both path planners, data from the Hokuyo LiDAR sensors are used by the DWA algorithm for dynamic obstacle avoidance, and 3D point clouds from the Kinect v2 are employed for ORB localization. Navigation experiments were carried out from point 'S' to waypoints 'a' and 'b', with 30 runs on each path. The resulting positioning errors in the x-y plane for the AMR navigation to waypoints 'a' and 'b' are shown in Figure 6. The mean absolute error (MAE) is 8.5 cm on the x-axis, 10.8 cm on the y-axis, and 9.2° in the rotation angle about the z-axis, and the linear displacement from waypoints 'a' and 'b' is 15.1 cm. Owing to the longer distance and non-uniform lighting conditions, including an area open to the sky, navigation to point 'b' has a larger positioning error than to point 'a'.
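To make the global planner's evaluation function F(n) = G(n) + H(n) concrete, the sketch below runs A* on a toy occupancy grid with a Manhattan-distance heuristic. The grid, start, and goal cells are invented for the example and do not correspond to the experimental map or the actual ROS global planner implementation.

```python
import heapq

def astar(grid, start, goal):
    """grid: 2D list, 0 = free, 1 = occupied; start/goal: (row, col) cells."""
    h = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])    # H(n): Manhattan distance
    open_set = [(h(start), 0, start, [start])]                  # entries: (F, G, node, path)
    visited = set()
    while open_set:
        f, g, node, path = heapq.heappop(open_set)              # expand lowest F(n) = G(n) + H(n)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = node[0] + dr, node[1] + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                heapq.heappush(open_set, (g + 1 + h((r, c)), g + 1, (r, c), path + [(r, c)]))
    return None  # no path found

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))
```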

Fig. 4. The constructed 2D map of experiment field

Fig. 5. The AMR navigation architecture


Fig. 6. Positioning errors at waypoint ‘a’ and ‘b’ of AMR navigation

Fig. 7. AMR in front of elevator: (a) AMR location illustration; (b) elevator seen at point 'A' or 'B'

For the next, continuous journey from waypoint 'a' to point 'A' (and from waypoint 'b' to point 'B'), the 3D ORB localization in Figure 5 is automatically replaced by an odometry-based 2D localization method using signals from the two motor encoders and an IMU. Points 'A' and 'B' are selected facing the two elevators at a distance of 1.5 to 3 m, at which the AMR can see and detect the elevator without difficulty, as shown in Figure 7. Experimental results of navigation from waypoint 'a' to point 'A' (and 'b' to 'B') show an MAE of 5.25° in the rotation angle about the z-axis and a linear displacement of 50.0 cm from the target point 'A' or 'B'. This 2D localization stage therefore has a larger positioning error (50.0 cm) than the first, 3D ORB-SLAM navigation (15.1 cm). This is mainly due to errors accumulated from the waypoints and from integrating the motor speeds to obtain displacement information during this trip. Nonetheless, for all 60 runs from the start point 'S' to points 'A' and 'B' with such positioning errors, the AMR was indeed able to detect the elevator without difficulty, so the third AMR journey, moving closer to the elevator to depress the button, was possible, as will be seen in the next section.
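A rough sketch of the odometry-based 2D localization used in this stage is given below: wheel-encoder increments provide the traveled distance while the IMU supplies the heading. The encoder resolution, wheel radius, and function names are assumptions for illustration only, not values from the paper.

```python
import math

TICKS_PER_REV = 2048          # assumed encoder resolution [ticks per wheel revolution]
WHEEL_RADIUS = 0.05           # assumed wheel radius [m]

def odom_update(x, y, d_ticks_left, d_ticks_right, imu_yaw):
    """Advance the planar pose (x, y) using encoder increments; the heading
    is taken directly from the IMU yaw to limit orientation drift."""
    d_left = 2 * math.pi * WHEEL_RADIUS * d_ticks_left / TICKS_PER_REV
    d_right = 2 * math.pi * WHEEL_RADIUS * d_ticks_right / TICKS_PER_REV
    d_center = (d_left + d_right) / 2.0          # distance traveled by the platform center
    x += d_center * math.cos(imu_yaw)
    y += d_center * math.sin(imu_yaw)
    return x, y, imu_yaw

# Example: 100/120 ticks on the left/right wheels while the IMU reports 0.1 rad yaw
print(odom_update(0.0, 0.0, 100, 120, 0.1))
```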

3.2 Machine Vision Guided SLAM Navigation


For the third (and last) stage of the AMR journey, the AMR has to determine its own navigation target, unlike the first two journeys, for which the target points are preset on the 2D map by operators. As the AMR arrives at the first location (namely point 'A' or 'B') in Figure 7(a), the IDS camera is used to detect and determine the true elevator position relative to the AMR. Once this is done, a target point (the second location in Figure 7(a), 0.3 to 0.5 m from the elevator) can be issued to the AMR for the last navigation. Note that at the second location the robot arm atop the AMR, with its 500 mm working radius, can reach and depress the elevator button. Consequently, two steps are involved in this stage: the IDS camera detects the elevator, and the AMR moves to the second location. The success rate for each step is summarized in Table 1. As shown in Figure 7(b), the elevator up button (D) is not detected directly because it is too small, and thus unrecognizable, in the image acquired when the AMR is at the first location. There is, however, a yellow sticker below the up button, which can be detected as a valid target. A series of image processing algorithms is applied to the acquired image containing the yellow sticker, including RGB to HSV model transformation (see Figure 8(a) for an example), yellow color thresholding (Figure 8(b)), noise removal by a median filter and by a morphology transformation such as closing (Figure 8(c)), and finally contour finding and the Perspective-n-Point (PnP) transform [26]


Tab. 1. The success rate for each task

From position | Yellow sticker detection | Moving to second location | Button detection | Button depressed | Combined success rate
'B' | 100% (30/30) | 86.7% (26/30) | 80.8% (21/26) | 85.7% (18/21) | 60.0% (18/30)
'A' | 100% (30/30) | 90.0% (27/30) | 59.3% (16/27) | 87.5% (14/16) | 46.7% (14/30)

Fig. 8. Image processing for the yellow sticker detection: (a) HSV image; (b) color thresholding; (c) noise filtering; (d) position and rotation of the yellow sticker identified

(Figure 8(d)). The spatial coordinates of the yellow sticker (and hence the elevator), including the position (x, y, z) and rotation (rx, ry, rz) relative to the camera frame, are thereby determined, as clearly seen in Figure 8(d). For all 30 experiment runs from point 'S' to point 'A' and another 30 runs to point 'B', the IDS camera succeeded in detecting the yellow sticker location every time at these locations, with approximately 30 mm displacement error in distance. Once the elevator position and rotation relative to the camera frame, $T_{vision}^{sticker}$, are determined, the target point for the AMR to navigate to can be obtained from

$T_{AMR}^{sticker} = T_{AMR}^{0}\, T_{0}^{6}\, T_{6}^{vision}\, T_{vision}^{sticker}$   (3)

where $T_{AMR}^{0}$, $T_{0}^{6}$, and $T_{6}^{vision}$ are, respectively, the homogeneous transformation matrices from the robot arm base frame to the AMR frame, from the robot manipulator flange frame to the robot arm base frame, and from the camera frame to the robot manipulator flange frame.
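The sketch below illustrates how the chain of homogeneous transforms in equation (3) can be evaluated numerically; all matrices are placeholders rather than the paper's calibration results, and in practice the sticker pose in the camera frame would come from cv2.solvePnP.

```python
import numpy as np

def to_homogeneous(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(t, dtype=float).ravel()
    return T

# Placeholder transforms (identity rotations, made-up offsets):
T_AMR_0 = to_homogeneous(np.eye(3), [0.0, 0.0, 0.60])      # arm base frame -> AMR frame
T_0_6 = to_homogeneous(np.eye(3), [0.10, 0.0, 0.30])       # flange frame -> arm base frame
T_6_vision = to_homogeneous(np.eye(3), [0.0, 0.0, 0.05])   # camera frame -> flange frame

# Sticker pose in the camera frame (in practice obtained from cv2.solvePnP)
T_vision_sticker = to_homogeneous(np.eye(3), [0.02, -0.01, 0.40])

# Equation (3): chain the transforms to express the sticker pose in the AMR frame
T_AMR_sticker = T_AMR_0 @ T_0_6 @ T_6_vision @ T_vision_sticker
print("sticker position in the AMR frame:", T_AMR_sticker[:3, 3])
```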

3.3 Elevator Button Detection and Depressing

As the AMR successfully moves forward to the second location, 0.3 to 0.5 m from the elevator, the IDS camera can see the button clearly (see, for example, Figure 9(a)) and can hence guide the robot arm to depress it. Similar to the previous image processing for detecting the yellow sticker, Figure 9 depicts the series of image operations applied to the source image of the elevator button and shows some resulting samples along the process. It is found that as long as the button is inside the captured image, its spatial coordinates can be correctly determined, as shown in Figure 9(f). However, because the AMR is driven by a differential wheel system, it is difficult to correct the displacement error along the y-axis, especially over such a short travel distance from point 'A' or point 'B' to the second location (about 2 m from point 'A' and 3 m from point 'B'). Therefore, success rates of 59.3% (16/27) and 80.8% (21/26) are recorded for the elevator button lying within the camera FOV for path 'A' and path 'B', respectively, when the AMR arrives at the second location. Because the travel distance for path 'A' is shorter, it is harder to correct a shift in the y-axis displacement while also requiring the AMR to face the elevator directly at the end; consequently, the up button lies outside the camera FOV more often for path 'A' than for path 'B'. In Figure 4, an automobile is parked at the lower left of point 'A', while point 'B' has nothing around it. This explains why point 'A' is set at a shorter distance to the elevator, for safety reasons. Finally, the robot arm is ready to depress the button, whose coordinates with respect to the robot base frame can be calculated as

$T_{0}^{button} = T_{0}^{6}\, T_{6}^{vision}\, T_{vision}^{button}$   (4)

Fig. 9. Image processing for the button detection: (top) flow chart; (bottom) sample results

Fig. 10. The elevator button depressing: (a) AMR navigation to the second location and button detection; (b) depressing the button successfully

The reader is referred to Figure 9(f) for an example of a detected button position (robot_x, robot_y, robot_z) and rotation (robot_rx, robot_ry, robot_rz) with respect to the robot base frame. The robot gripper is then controlled to depress the target button, as shown in Figure 10. In this final task of button depressing by the robot arm, success rates of 87.5% (14/16) and 85.7% (18/21) are achieved for paths 'A' and 'B', respectively. The rare depressing failures (five out of 37) were all caused by the limited working space of the robot arm (R500 mm), which could not reach the button because of the AMR positioning error. As a summary of the performance of the developed AMR in depressing the button at the west elevator (26.3 m away) or the east elevator (40 m away), Table 1 lists the success rate for each task and the overall combined success rate from beginning to end. For path A to the west elevator, 14 of 30 navigations beginning at the laboratory finished with the button depressed, a 46.7% success rate, while 18 of 30 navigations finished the job for path B to the east elevator, a 60% success rate. Note that all 60 trials in which the AMR had to arrive at points 'A' and 'B' and detect the elevator position were successful. In the ensuing tasks, however, failures occurred: the AMR bumped into the elevator, the small up button was not within the camera FOV, or the robot arm was not long enough to touch the button. To improve the success rate and enhance the system's robustness, several possible approaches are proposed: the positioning accuracy from the first location to the second location could be increased by multiple SLAM navigations instead of a single journey; the differential drive system of the AMR could be replaced by an omni-directional drive, such as Mecanum wheels, for easier maneuvering over short travel distances while meeting the final AMR orientation requirement; the robot arm could scan horizontally for the up button when it is out of the camera FOV; and, lastly, a longer robot arm with a larger working space could be used to reach the button, which would also help prevent the AMR from bumping into the elevator.

4. Conclusion

A hardware system for a mobile AMR was developed, including an aluminum mobile platform, a robot manipulator with an eye-in-hand IDS industrial camera, an NVIDIA TX2 GPU module, embedded Arduino microcontroller units, and several sensors such as a Kinect v2 RGB-D camera, LiDARs, encoders, and an IMU. The software framework was based on the ROS architecture installed on a laptop running Ubuntu 16.04, and the required functions were implemented in both C++ and Python. The aim of the study was to navigate the AMR towards an elevator and to summon it by depressing the button. Moving the AMR to the front of the elevator for button depressing was made possible by three consecutive SLAM navigations. A 3D map of the experiment field for localization purposes, built with the Kinect v2, and a 2D map for both localization and path planning, built with the SICK LiDAR, were constructed, and the corresponding 3D ORB localization and 2D odometry-based localization methods were employed in the navigation stacks. In addition, real-time dynamic obstacle avoidance via DWA was implemented using two Hokuyo LiDARs. With the developed hardware and software systems integrated in the AMR, button-depressing success rates of 46.7% and 60.0% were obtained in the experiments for navigations starting at the laboratory towards the west and east elevators, respectively. Improvements to the elevator button depressing rate are also pointed out for future work.

Funding

This study was supported by the National Taipei University of Technology – Nanjing University of Science and Technology Joint Research Program (NTUT-NUST-108-01) and by the Ministry of Science and Technology, Taiwan (MOST 111-2218-E-027-001).

AUTHORS

Zhe-Ming Zhang – National Taipei University of Technology, Taiwan, e-mail: qaz9517532846@gmail.com.

Jin-Siang Shaw* – National Taipei University of Technology, Taiwan, e-mail: jshaw@ntut.edu.tw.

Pan-Long Wu – Nanjing University of Science and Technology, China, e-mail: plwu@njust.edu.cn.

Chuin Jiat Liew – National Taipei University of Technology, Taiwan, e-mail: jianjiat@gmail.com.

*Corresponding author

References

[1] R. Smith, M. Self and P. Cheeseman, "Estimating uncertain spatial relationships in robotics". In: IEEE International Conference on Robotics and Automation, Raleigh, NC, USA, 31 March–3 April 1987, DOI: 10.1109/ROBOT.1987.1087846.
[2] H. Durrant-Whyte and T. Bailey, "Simultaneous localization and mapping (SLAM): Part I", IEEE Robotics & Automation Magazine, vol. 13, no. 2, 2006, 99–110, DOI: 10.1109/MRA.2006.1638022.
[3] T. Bailey and H. Durrant-Whyte, "Simultaneous localization and mapping (SLAM): Part II", IEEE Robotics & Automation Magazine, vol. 13, no. 3, 2006, 108–117, DOI: 10.1109/MRA.2006.1678144.
[4] G. Grisetti, C. Stachniss and W. Burgard, "Improved techniques for grid mapping with Rao-Blackwellized particle filters", IEEE Transactions on Robotics, vol. 23, no. 1, 2007, 34–46, DOI: 10.1109/TRO.2006.889486.
[5] S. Kohlbrecher, J. Meyer, T. Graber, K. Petersen, U. Klingauf and O. von Stryk, "Hector open source modules for autonomous mapping and navigation with rescue robots", Robot Soccer World Cup, 2013, 624–631, DOI: 10.1007/978-3-662-44468-9_58.
[6] W. Hess, D. Kohler, H. Rapp and D. Andor, "Real-time loop closure in 2D LIDAR SLAM". In: IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–20 May 2016, 1271–1278, DOI: 10.1109/ICRA.2016.7487258.
[7] A. J. Davison, I. D. Reid, N. D. Molton and O. Stasse, "MonoSLAM: real-time single camera SLAM", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, 2007, 1052–1067, DOI: 10.1109/TPAMI.2007.1049.
[8] G. Klein and D. Murray, "Parallel tracking and mapping for small AR workspaces". In: Proceedings of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, Nara, Japan, 13–16 November 2007, 225–234, DOI: 10.1109/ISMAR.2007.4538852.
[9] R. Mur-Artal, J. M. M. Montiel and J. D. Tardós, "ORB-SLAM: a versatile and accurate monocular SLAM system", IEEE Transactions on Robotics, vol. 31, no. 5, 2015, 1147–1163, DOI: 10.1109/TRO.2015.2463671.
[10] P. E. Hart, N. J. Nilsson and B. Raphael, "A formal basis for the heuristic determination of minimum cost paths", IEEE Transactions on Systems Science and Cybernetics, vol. 4, no. 2, 1968, 100–107, DOI: 10.1109/TSSC.1968.300136.
[11] A. J. Bostel and V. K. Saigar, "Dynamic control systems for AGVs", Computing & Control Engineering, vol. 7, no. 4, 1996, 169–176, DOI: 10.1049/cce:19960403.
[12] S. Koenig and M. Likhachev, "Fast replanning for navigation in unknown terrain", IEEE Transactions on Robotics, vol. 21, no. 3, 2005, 354–363, DOI: 10.1109/TRO.2004.838026.
[13] D. Fox, W. Burgard and S. Thrun, "The dynamic window approach to collision avoidance", IEEE Robotics & Automation Magazine, vol. 4, no. 1, 1997, 23–33, DOI: 10.1109/100.580977.
[14] N. Kousi, C. Gkournelos, S. Aivaliotis, et al., "Digital twin for adaptation of robots' behavior in flexible robotic assembly lines", Procedia Manufacturing, vol. 28, 2019, 121–126.
[15] G. R. Sangeetha, N. Kumar, P. R. Hari and S. Sasikumar, "Implementation of a stereo vision based system for visual feedback control of robotic arm for space manipulations", Procedia Computer Science, vol. 133, 2018, 1066–1073, DOI: 10.1016/j.procs.2018.07.031.
[16] J. Shaw and W. L. Chi, "Automatic classification of moving objects on an unknown speed production line with an eye-in-hand robot manipulator", Journal of Marine Science and Technology, vol. 26, no. 3, 2018, 387–396, DOI: 10.6119/JMST.2018.06_(3).0010.
[17] S. J. Hosseininia, K. Khalili and S. M. Emam, "Flexible automation in porcelain edge polishing using machine vision", Procedia Technology, vol. 22, 2016, 562–569, DOI: 10.1016/j.protcy.2016.01.117.
[18] M. Laganowska, "Application of vision systems to the navigation of the mobile robots using markers", Transportation Research Procedia, vol. 40, 2019, 1449–1452, DOI: 10.1016/j.trpro.2019.07.200.
[19] Y. M. Wang, Y. Li and J. B. Zheng, "A camera calibration technique based on OpenCV". In: The 3rd International Conference on Information Sciences and Interaction Sciences, Chengdu, China, 23–25 June 2010, DOI: 10.1109/ICICIS.2010.5534797.
[20] C. Zhou and X. Liu, "The study of applying the AGV navigation system based on two dimensional bar code". In: International Conference on Industrial Informatics – Computing Technology, Intelligent Technology, Industrial Information Integration (ICIICII), Wuhan, China, 3–4 Dec. 2016, 206–209, DOI: 10.1109/ICIICII.2016.0057.
[21] D. S. Schueftan, M. J. Colorado and I. F. M. Bernal, "Indoor mapping using SLAM for applications in flexible manufacturing systems". In: IEEE 2nd Colombian Conference on Automatic Control (CCAC), Manizales, Colombia, 14–16 Oct. 2015, DOI: 10.1109/CCAC.2015.7345226.
[22] A. S. Sabale, "Accuracy measurement of depth using Kinect sensor". In: Conference on Advances in Signal Processing (CASP), Pune, India, 9–11 June 2016, DOI: 10.1109/CASP.2016.7746156.
[23] J. P. M. dos Santos, "SmokeNav – simultaneous localization and mapping in reduced visibility scenarios", MSc thesis, Department of Electrical and Computer Engineering, University of Coimbra, Coimbra, Portugal, September 2013. http://hdl.handle.net/10316/26963
[24] M. Rahchamani, "Developing and evaluating a low-cost tracking method based on a single camera and a large marker". In: 25th National and 3rd International Iranian Conference on Biomedical Engineering (ICBME), Tehran, Iran, 29–30 November 2018, DOI: 10.1109/ICBME.2018.8703592.
[25] B. P. Gerkey, AMCL package, http://wiki.ros.org/amcl.
[26] Perspective-n-Point (PnP) transform, https://docs.opencv.org/master/dc/d2c/tutorial_real_time_pose.html.



Automatic Detection of Brain Tumors Using Genetic Algorithms With Multiple Stages in Magnetic Resonance Images Submitted: 1st September 2021; accepted 22nd August 2022

Karthik Annam, Sunil Kumar G, Ashok Babu P, Narsaiah Domala

DOI: 10.14313/JAMRIS/4-2022/31

Abstract: The field of biomedicine is still working on a solution to the challenge of diagnosing brain tumors, which is now one of the most significant challenges facing the profession. The possibility of an early diagnosis of brain cancer depends on the development of new technologies and instruments. Automated processing is made possible by classifying different types of brain tumors from patient brain images. In addition, the proposed novel approach may be used to differentiate between different types of brain disorders and tumors. The input image must first undergo pre-processing before the tumor and other brain regions can be separated. Following this step, the images are separated into their respective colors and levels, and then the Gray Level Co-occurrence Matrix and SURF extraction methods are used to determine which aspects of the images contain the most significant information. Through the use of genetic optimization, the extracted features are reduced in size. The reduced features are used, in conjunction with an advanced learning approach, for training and evaluating the tumor categorization. Alongside the conventional approach, the accuracy, error, sensitivity, and specificity of the methodology under consideration are all assessed. The approach offers an accuracy rate greater than 90%, with an error rate of less than 2%, for every kind of cancer. Finally, the specificity and sensitivity of each kind are higher than 90% and 50%, respectively. Supporting the approach with a genetic algorithm is more efficient than the other methods, since it achieves greater accuracy as well as higher specificity.

Keywords: MRI brain tumor, GLCM, SURF, genetic optimization, advanced machine learning

1. Introduction


The brain and spinal column are components of the central nervous system (CNS). The CNS is responsible for controlling all of the body's energizing processes, including cognition, speech, vision, breathing, and movement. When aberrant cells form in the CNS, it can have an effect on a person's thoughts as well as the way their body moves. The spinal cord extends all the

way from the bottom of the brain down to the middle of the lower back. The spinal cord is the pathway via which messages go to and from the brain and the rest of the body. There are between 50 and 100 billion neurons in the brain, which is a very significant number of cells. The functions of the brain's individual cells may be broken down into categories. It is highly difficult to identify a brain tumor in its early stage since the brain is covered by the skull. Additionally, brain tumors do not display distinct clinical signs, making it even more difficult to diagnose. In most cases, the diagnosis of brain tumors is based on the presence of three symptoms [1]. Because of an increase in cranial pressure, the first symptom is a headache, along with vomiting and altered states of consciousness [2–4]. The second symptom is that the affected individual may experience changes in personality or emotions. This is caused by disorder in the brain. The final sign to look out for is irritability, which can also manifest as absences, weariness, or convulsions. However, brain tumors are not the only possible cause of these symptoms. Therefore, imaging techniques are the primary method utilized in the diagnosis of brain tumors. The features of the tumor, as well as its origin, location, and size, are taken into consideration when classifying brain tumors. The identification of brain cancer in its earlier stages is one of the most important issues in the field. According to the World Health Organization (WHO), there are 120 different forms of brain tumors. In addition, the WHO has rated the tumors from grade I to grade IV [5]. The doctor is able to provide the therapy necessary to preserve the patient's life depending on the grade level of the patient [6]. In most cases, there are two types of brain tumors: primary and secondary. Primary brain tumors are those in which the tumor first developed in the brain itself. The initial brain tumor might be classed as either benign or malignant depending on the degree to which it has spread. A brain tumor is considered benign if it does not spread to other regions of the body and begins there. A non-cancerous tumor is another name for this particular kind of growth. The malignant kind of tumor begins in the brain, but it can migrate to other regions of the body, such as the spine. Because the development of additional cells is confined to the periphery of the brain, benign brain tumors are far simpler to cure than their malignant counterparts. Surgery is the only treatment necessary for benign brain tumors; radiotherapy, chemotherapy, and other

treatments are not necessary. Because a malignant brain tumor spreads quickly and may reappear even after surgery, the only way to treat it effectively is with a combination of chemotherapy and radiotherapy. In the next sections, the various methods that may be used to detect brain tumors are explained, and a hybrid algorithm for detecting tumors in the brain is presented.

2. Related Works

In this part, a general review of medical image analysis pertaining to brain tumors is provided. When attempting to diagnose cancerous tissues in a human body, medical technology employs a variety of diagnostic approaches. The cancer cells are diagnosed by a surgeon based on the patient's family history as well as the diagnostic report from the patient's physical examination, which includes diagnostic procedures such as magnetic resonance imaging (MRI), computed axial tomography (CAT), biopsy, brain angiogram, magnetic resonance angiogram (MRA), and electroencephalogram. Discovering the tumor at an earlier stage improves the patient's chance of survival [7]. It is necessary to perform brain image analysis in order to arrive at an accurate prognosis. The picture of the brain is examined by a doctor using effective segmentation algorithms, which allows the doctor to plan therapy. Radiologists face a very laborious and time-consuming task when segmenting tumors. At the moment, surgeons are making use of cutting-edge, non-invasive imaging tools in order to conduct cancer tissue analysis. The MRI test is the gold-standard non-invasive method for diagnosing brain tumors. However, a single MRI scan is not adequate for categorizing and segmenting the tissues for the purpose of detecting tumors; as a result, employing numerous MRI sequences is required [8]. In order to make the analysis of the picture simpler, many image segmentation methods are employed. Intensity, threshold level, edge detection, watershed segmentation, and the Markov Random Field model are just a few of the several segmentation approaches that may be used [9]. These days, computer-aided diagnosis (CAD) is used to find out whether there are any abnormalities in the patient's brain [10–16]. The location of the tumor may be determined from CAD by using an algorithm called k-means clustering [17]. It was discovered that using this procedure prevented the formation of the misclustered area that occurs in the MRI technique. However, this strategy produces quite diverse findings depending on which cluster is examined. The MRI method of diagnosing brain tumors inspired the development of the CAD system [18]. The performance of CAD may be improved, to better study the location of the tumor, by making use of dynamic contour models. These employ a number of techniques, including Distance Regularized Level Set Evolution (DRLSE) for medical image

segmentation and fuzzy clustering using Level Set Method (LSM) in order to segment the images [19]. An approach that is only semi-automatic was described in Sauwen et al. for evaluating dead cells found in the brain [20]. The approach known as semi-automatic requires participation from the user as well as software stages. A surgeon requires a limited number of input parameters and then has to visualize the data. This procedure is efficient in terms of computing when it comes to dividing up the brain tumor. In Joshi and Channe [21], it is suggested that structural MRI might be utilized to identify the structure of the brain in order to investigate the proliferation of the cells in the brain. In order to determine the classification based on the segmentation of an image, machine learning techniques such as support vector machines and the random forest algorithm are utilized. An overview of brain tumor detection and cascaded architecture is provided in Havaei et al. [22], which made use of deep learning. An approach to segmentation based on deep learning was suggested in Akkus et al. [23]. In the article, supervised learning was employed to detect brain tumors. They require a vast quantity of data in order to provide accurate results. Using MRI imaging and CAD, a hybrid abnormality detection technique was published in Devasena and Hemalatha [24]. This approach is utilized to locate dysfunctional cells in the image data. Goswami and Bhaiya demonstrated a categorization of pictures obtained from an MRI using artificial neural networks that had a self-organizing map [25]. After doing some pre-processing on the photos, such as histogram equalization, filtering, and edge detection, the images are then retrieved.

3. Methodology

3.1 Datasets and Method

The diagnosis of a brain tumor directly increases the patient's chance of survival. A multi-phase automated brain tumor identification from MRI, which utilizes a hybrid approach based on a genetic algorithm and deep learning techniques, is offered as a solution to this problem. The performance of the proposed technique is confirmed using publicly accessible datasets, specifically the Open Access Series of Imaging Studies (OASIS) [27] and the Brain Tumor Segmentation dataset [26]. Nine hundred and eighty-five MRI scans are included in each of the datasets; the MRI scans were taken from 255 different patients. Both of these datasets contain images of the skull taken from a variety of perspectives. The network is trained on the MRI scans as well as their angles through the use of modified deep learning and genetic algorithms. In the approach that has been presented, there are 1970 images in total, of which 394 are utilized for validation while the remaining images are used for testing. The sample datasets used in the detection of brain tumors using genetic algorithms and deep learning are presented in Figure 1.



Fig. 1. Sample dataset

Fig. 2. Methodology of brain tumor detection

3.2 Image Pre-processing

Figure 2 illustrates the presented methodology. The sample MRI scans of the dataset do not produce a clear image due to noise and reduced intensity. The clarity of an image is crucial for analyzing it to detect disease, and contrast is a crucial characteristic for telling one object from another in an image. Equation (1) is the mathematical expression for automatically adjusting the brightness of a scene, and equation (2) is the corresponding power-law transformation used to adjust the contrast:

$s(x, y) = r(x, y) + k$   (1)

$s = k\, r^{\gamma}$   (2)

In these equations, $s$ and $r$ represent the gray levels of the pixels in the output and input images, respectively, and $k$ is a constant. Filtration and segmentation are two forms of pre-processing that should be applied to the sample data in order to obtain a higher overall picture quality. The primary purpose of this procedure is to improve the picture quality so that the surgeon can pinpoint the precise site of the tumor and determine its grade. The sample data have to be enlarged in order to obtain a higher level of precision in the image. After being scaled, the picture is passed through a filter to remove noise. Following an analysis of the various filtering methods, it was determined that the median filter plays a significant part in picture pre-processing; this filter eliminates noise from an image without affecting its other properties. A clustering technique is then applied to the noise-free picture to separate dysfunctional or abnormally functioning cells from the background.

3.3 Extraction of Feature

The color and the texture of the tumor are taken into consideration in this procedure, which is utilized to determine the clinical characteristics of the tumor. The color variation serves as an accurate reflection of the severity of the tumor's grade. Figure 3 shows a collection of color variations, each representing a distinct grade level of brain tumor; from level I up to level IV, the degree of color variation is the primary focus of attention. Quantitative research has shown that the first-, second-, third-, and fourth-order distinct color variations are caused by differences in hue and saturation level. Calculations of the image intensities T1, T2, TLC, and FLAIR are performed on the basis of the variations in color shown in Figure 3. Therefore, the hue and saturation levels corresponding to the first and second grades may be derived from equations (3) and (4):

$S_i = \frac{1}{N}\sum_{j=1}^{N} R_{ij}$   (3)

$\sigma_i = \frac{1}{N}\sum_{j=1}^{N} \left( R_{ij} - S_i \right)^2$   (4)
3.4 Texture Feature Extraction

In addition to the color features, the extraction of texture features is an essential aspect of the analysis of the images. The Gray Level Co-occurrence Matrix (GLCM) [28] and Speeded Up Robust Features (SURF) [29] are both used in the texture extraction process to provide the desired results, and the two algorithms are integrated in order to cut down on the number of overlapping characteristics. Compared with previous methods for texture extraction, this approach produces results that are more accurate.


Fig. 3. Identification of grade level of the brain tumor (HSV color variation versus tumor grade level, grades I–IV)

This integrated algorithm identifies all of the traits that are the same across photos of people with and without brain tumors. Image convolution is the method that the SURF extraction algorithm uses in order to locate the points in two pictures that are identical to one another. The Haar wavelet matrix is initially utilized in the calculation of SURF by this approach, and the H matrix is used to compute the circular area surrounding the key points, which is then used to determine the orientation of the pictures relative to one another. Finding picture angles and pixel distances is accomplished with the help of the GLCM algorithm in equations (5) and (6), and the GLCM method is used to determine the primary four properties of a texture, such as distance, direction, and gray value, in equations (7) and (8). This suggested work includes a total of 161 characteristics, the majority of which center on tumor form.

$\text{Energy} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} R(i, j; d, \theta)^{2}$   (5)

$\text{Entropy} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} R(i, j; d, \theta)\, \log R(i, j; d, \theta)$   (6)

$\text{Moment of Inertia} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} (i - j)^{2}\, R(i, j; d, \theta)$   (7)

$\text{Correlation} = \dfrac{\sum_{i=0}^{L-1}\sum_{j=0}^{L-1} ij\, R(i, j; d, \theta) - \mu_x \mu_y}{\sigma_x^{2}\, \sigma_y^{2}}$   (8)

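To make the texture-feature step concrete, the sketch below computes a GLCM over the four directions and derives energy, contrast (moment of inertia), correlation, and an entropy value. It uses scikit-image (assuming a version ≥ 0.19, where the functions are named graycomatrix/graycoprops) rather than the authors' own implementation, and the input image is a random placeholder.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

image = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # placeholder tumor ROI

# R(i, j; d, theta) for distance d = 1 and the four directions 0, 45, 90, 135 degrees
glcm = graycomatrix(image, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)

energy = graycoprops(glcm, "energy")            # related to Eq. (5); skimage returns sqrt of ASM
contrast = graycoprops(glcm, "contrast")        # Eq. (7), moment of inertia
correlation = graycoprops(glcm, "correlation")  # Eq. (8)

p = glcm[:, :, 0, :]                            # normalized co-occurrence matrices per angle
entropy = -np.sum(p * np.log2(p + 1e-12), axis=(0, 1))  # entropy in the spirit of Eq. (6)
print(energy.shape, contrast.shape, correlation.shape, entropy.shape)
```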
3.5 Image Optimization Using Genetic Algorithm

The advanced learning machine (ALM) training model uses a single hidden layer instead of a number of hidden layers. In this training model, the threshold values of the hidden-layer neurons and the connection weights between the input layer and the hidden layer are randomly generated, without any adjustment, in contrast to traditional feed-forward network training models. The model is described by the numbers of neurons in the input, hidden, and output layers; $a(\cdot)$ denotes the excitation function of the hidden-layer neurons and $b_i$ the threshold value of the $i$-th hidden neuron. The training model of the ALM can be expressed as in equation (9):

$\sum_{i=1}^{m} \alpha_i\, a\!\left( W_i x_j + b_i \right) = O_j, \qquad j = 1, 2, 3, \ldots, N$   (9)

where $W_i = \left[ W_{1i}, W_{2i}, W_{3i}, \ldots, W_{mi} \right]$ is the weight vector between the input and hidden layers, $\alpha_i = \left[ \alpha_{i1}, \alpha_{i2}, \alpha_{i3}, \ldots, \alpha_{im} \right]^{T}$ is the weight vector between the hidden and output layers, and $O_j = \left[ O_{j1}, O_{j2}, O_{j3}, \ldots, O_{jm} \right]^{T}$ denotes the network output value.

As part of the simulation, the tumor's size and location are analyzed using the 1970 images. The tumor's texture is analyzed in terms of its hue, saturation, and value (HSV) colors to identify its grade using an integrated technique termed SURF + GLCM. Because of the textured nature of the images, four separate directions are available; by collecting data in all four directions simultaneously, a holistic strategy is adopted, and the results are then compared against 394 separate validation datasets. This strategy adopts a consistent methodology. There are two primary considerations and four distinct methods to implement to obtain the best data:

Step 1: Choose the number of hidden-layer neurons to develop the ALM brain tumor identification model.

Step 2: Initialize the input-layer weights and the hidden-layer thresholds of the ALM model to obtain the optimal solution.

Step 3: Use derivative-free optimization [30] to evaluate the output error of the ALM, and apply crossover and selection to find the next comparison point using a heuristic search based on empirical rules and fitting the objective function with samples.

Step 4: Check whether the maximum number of iterations has been reached and whether there is no better substitute among the next samples.

Step 5: Stop the algorithm to obtain an optimized image.
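The sketch below illustrates, under stated assumptions, the spirit of Steps 1–5: a single-hidden-layer, ELM-style model whose hidden-layer weights and thresholds are refined by a simple genetic-style search (selection, crossover, mutation), while the output weights are solved by least squares. The data, layer sizes, and search settings are placeholders, not the authors' configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                    # placeholder feature vectors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # placeholder labels
hidden = 20                                       # Step 1: chosen hidden-layer size

def fitness(genome):
    """Split a genome into (W, b), solve the output weights, return training MSE."""
    W = genome[: 16 * hidden].reshape(16, hidden)
    b = genome[16 * hidden:]
    H = np.tanh(X @ W + b)                        # hidden activations a(W x + b)
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights (alpha in Eq. (9))
    return np.mean((H @ beta - y) ** 2)           # Step 3: output error of the model

# Step 2: initialize a population of random hidden-layer weights and thresholds
population = [rng.normal(size=16 * hidden + hidden) for _ in range(20)]

for generation in range(30):                      # Step 4: fixed iteration budget
    population.sort(key=fitness)
    parents = population[:5]                      # selection of the best genomes
    children = []
    while len(children) < 15:
        i, j = rng.choice(5, size=2, replace=False)
        mask = rng.random(parents[0].size) < 0.5  # uniform crossover
        child = np.where(mask, parents[i], parents[j])
        child = child + 0.05 * rng.normal(size=child.size)  # mutation
        children.append(child)
    population = parents + children

best = min(population, key=fitness)               # Step 5: keep the optimized model
print("best training MSE:", fitness(best))
```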

4. Results and Discussion

In the following section, the type of tumor, the location of the tumor, the grade level, and the sensitivity and specificity of the tumors are analyzed with the help of the ALM technique.

4.1 Performance of Accuracy and Error

During the simulation, the 1970 pictures are analyzed to determine the tumor's grade and location. The hue, saturation, and value (HSV) color feature is

used to detect the tumor grade, and an integrated algorithm (SURF + GLCM) is used to identify the texture feature. Since this is a textured feature, it will provide four unique orientations. This holistic approach gathers data from all four axes, then compares it to 394 different validation sets. Using the vector feature formed by the two color and four texture qualities, we can locate the tumor and determine how far along its progression is. The algorithms used to determine the disease type from a collection of samples are compared and contrasted. The provided hybrid learning technique outperforms alternative learning algorithms in terms of performance. Figures 4a and 4b show the contrasting outcomes in terms of accuracy and error when applying different learning algorithms to each disease. In Figure 5, we compare the effectiveness of the improved ELM algorithm, the RF method, and the SVM algorithm to that of the suggested ALM technique.

Fig. 4a. Average recognition error

Fig. 4b. Average recognition accuracy performance analysis of the size of the tumor


Fig. 5. Performance analysis of the size of the tumor

Fig. 6. Sensitivity

Fig. 7. Specificity




The recommended algorithm possesses higher performance metrics than the other algorithms. The performance analyses for the proposed ALM module show that meningiomas, gliomas, and pituitary tumors have respective values of 0.78, 0.59, and 0.49. The sensitivity comparison of the suggested ALM training module is depicted in Figure 6. Compared to the currently used learning modules, the suggested approach shows increased sensitivity for all three types of brain tumors: sensitivity is 0.68 for detecting meningiomas, 0.59 for gliomas, and 0.44 for pituitary tumors. Meningioma is the only form of tumor for which the recommended ALM training technique obtains a specificity of 0.96. The improved ELM approach has the same type-tumor specificity for meningiomas as it does for pituitary tumors; this is because both types of tumors originate in the pituitary gland. The technique that has been suggested is thus more specific than the typical one.

5. Conclusion

In order to identify brain tumors mechanically, the authors of this study believe that MRI should be employed instead of conventional approaches. In order to estimate the angles and distances that exist between each pixel in a picture, either the GLCM or the SURF technique can be used. Through the use of the GLCM technique, the top four attributes of a texture are found: distance, orientation, gradient, and grayscale value. The kind of brain tumor illness, the tumor grade, and the tumor location may all be identified with the use of the ALM, the genetic algorithm, and an optimization method. Through computational modelling, we show that the suggested method outperforms the current standard of care for cancer detection in terms of both sensitivity and specificity.

AUTHORS

Karthik Annam* – Electronics and Communications Engineering, Institute of Aeronautical Engineering, Hyderabad, 500100, India, Email: karthik011190@gmail.com.

Sunil Kumar G – Department of E&TC, Sharadchandra Pawar College of Engineering, Dumberwadi (Otur), Pune, India, Email: gsunilmtech@gmail.com.

Ashok Babu P – Electronics and Communications Engineering, Institute of Aeronautical Engineering, Hyderabad, 500100, India, Email: ashokbabup2@gmail.com.

Narsaiah Domala – Electronics and Communications Engineering, Lords Institute of Engineering and Technology, Himayathsagar, Near TSPA, Hyderabad, 500091, India, Email: narsiurs@gmail.com.

*Corresponding author

References

[1] C. Buckner, P.D. Brown, B.P. O'Neill, F.B. Meyer, "Central Nervous System Tumors", Symposium on Solid Tumors, Mayo Foundation for Medical Education and Research, vol. 82, no. 10, 2007, 1271–1286.
[2] K.P. Sridhar, S. Baskar, P.M. Shakeel, V.R.S. Dhulipala, "Developing brain abnormality recognize system using multi-objective pattern producing neural network", J Ambient Intell Humaniz Comput, vol. 10, no. 4, 2018, 1–8.
[3] R. Anitha and D.S.S. Raja, "Development of computer-aided approach for brain tumor detection using random forest classifier", Int J Imaging Syst Technol, vol. 28, 2018, 48–53.
[4] R. Grant, "Medical management of adult glioma", in: Management of Adult Glioma in Nursing Practice. London, UK: Springer, 2019, 61–80.
[5] D.R. Johnson, J.B. Guerin, C. Giannini, J.M. Morris, L.J. Eckel, and T.J. Kaufmann, "2016 updates to the WHO brain tumor classification system: what the radiologist needs to know", Radiographics, vol. 37, 2019, 2164–2180.
[6] G. Kalyani, B. Janakiramaiah, L.V.N. Prasad, et al., "Efficient crowd counting model using feature pyramid network and ResNeXt", Soft Comput, vol. 25, 2021, 10497–10507. https://doi.org/10.1007/s00500-021-05993-x
[7] S. Banerjee, S. Mitra, F. Masulli, and S. Rovetta, "Deep radiomics for brain tumor detection and classification from multi-sequence MRI", arXiv preprint arXiv:1903.09240, 2019.
[8] N. Nida, M. Sharif, M.U.G. Khan, M. Yasmin, S.L. Fernandes, "A framework for automatic colorization of medical imaging", IIOAB J, vol. 7, supp. 1, 2019, 202–209.
[9] J. Amin, M. Sharif, Y. Mussarat, T. Saba, M. Raza, "Use of machine intelligence to conduct analysis of human brain data for detection of abnormalities in its cognitive functions", Multimed Tools Appl, vol. 79, no. 3, 2019, 1–19.
[10] S. Naqi, M. Sharif, M. Yasmin, S.L. Fernandes, "Lung nodule detection using polygon approximation and hybrid features from CT images", Curr Med Imaging Rev, vol. 14, no. 1, 2018, 108–117.
[11] A. Liaqat, M.A. Khan, J.H. Shah, M. Sharif, Y. Mussarat, S.L. Fernandes, "Automated ulcer and bleeding classification from WCE images using multiple features fusion and selection", J Mech Med Biol, vol. 18, no. 4, 2018, 1850038.



[12] M. Sharif, M.A. Khan, M. Faisal, Y. Mussarat, S.L. Fernandes, “A framework for offline signature verification system: best features selection approach”, Pattern Recognit Lett, vol. 139, 2018.

[13] Ramu, G. A secure cloud framework to share EHRs using modified CP-ABE and the attribute bloom filter. Educ Inf Technol 23, 2213–2233 (2018). https://doi.org/10.1007/s10639-0189713-7 [14] M. Raza, M. Sharif, M. Yasmin, M.A. Khan, T. Saba, S.L. Fernandes, “Appearance based pedestrians’ gender recognition by employing stacked auto encoders in deep learning”, Future Gener Comput Syst, vol. 88, 2018, 28–39.

[15] G.J. Ansari, J.H. Shah, Y. Mussart, M. Sharif, S.L. Fernandes, “A novel machine learning approach for scene text extraction”, Future Gener Comput Syst, vol. 87, no. 10, 2018, 328–340. [16] M. Sharif, M. Raza, J.H. Shah, M. Yasmin, S.L. Fernandes, “An overview of biometrics methods”, in: Handbook of Multimedia Information Security: Techniques and Applications. London, UK: Springer, 2019, 15–35. [17] R.P. Joseph and C.S. Singh, “Brain tumor MRI image segmentation and detection in image processing", Int J Res Eng Technol, vol. 3, no. 13, 2014, 1–5.

[18] Kalyani, G., Janakiramaiah, B., Karuna, A. et al. Diabetic retinopathy detection and classification using capsule networks. Complex Intell. Syst. (2021). https://doi.org/10.1007/s40747-02100318-9 [19] Solomon C. and Breckon T., Fundamental of digital image processing: a practical approach with examples in Matlab, Wiley Blackwell: Chichester, West Sussex, 2011.

[20] N. Sauwen, M. Acou, D.M. Sima, J. Veraart, F. Maes, U. Himmelreich, et al., “Semi-automated brain tumor segmentation on multi-parametric MRI using regularized non-negative matrix factorization”, BMC Med Imaging, vol. 17, no. 1, 2017, 1–14. [21] D. Joshi and H. Channe, “A survey on brain tumor detection based on structural mri using machine learning and deep learning techniques”, Int J Sci Technol Res, vol. 9, no. 4, 2020.


[22] M. Havaei, N. Guizard, H. Larochelle, P.M. Jodoin, “Deep learning trends for focal brain pathology segmentation in MRI”, in: Lecture Notes in Computer Science. London, UK: Springer, 2016, 125–148. [23] B. Padmaja, P. V. Narasimha Rao, M. Madhu Bala and E. K. Rao Patro, "A Novel Design of Autonomous Cars using IoT and Visual Features," 2018 2nd International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 2018 2nd International Conference on, Palladam, India, 2018, pp. 18-21, doi: 10.1109/I-SMAC.2018.8653736. [24] C.L. Devasena and M. Hemalatha, “Efficient computer aided diagnosis of abnormal parts detection in magnetic resonance images using hybrid abnormality detection algorithm”. Cent Eur J Comput Sci, vol. 3, no. 3, 2013, 117–128. [25] S. Goswami and L.K.P. Bhaiya, “Brain tumor detection using unsupervised learning based neural network”, 2013 International Conference on Communication Systems and Network Technologies, Gwalior, 2013, 573–577. [26] S. Bakas, M. Reyes, A. Jakab, S. Bauer, M. Rempfler, A. Crimi, et al. “Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the brats challenge”, arXiv Prepeint.arXiv:1811.02629, 2018.

[27] D.S. Marcus, A.F. Fotenos, J.G. Csernansky, J.C. Morris, and R.L. Buckner, “Open access series of imaging studies: longitudinal MRI data in nondemented and demented older adults”, J Cogn Neurosci, vol. 22, 2010, 2677–2684. DOI: 10.1162/ jocn.2009.21407 [28] Dash, S.C.B., Mishra, S.R., Srujan Raju, K. et al. Human action recognition using a hybrid deep learning heuristic. Soft Comput 25, 13079– 13092 (2021). https://doi.org/10.1007/ s00500-021-06149-7 [29] H. Bay, T. Tuytelaars, and L. Van Gool, “SURF: Speeded Up Robust Features”, European Conference on Computer Vision, vol. 3951, 2006, 404–417. [30] A.S. Berahas, R.H. Byrd, and J. Nocedal, “Derivative-free optimization of noisy functions via quasi-newton methods,” SIAM J Optimiz, vol. 29, 2019, 965–993.




Firefly Algorithm Optimization of Manipulator Robotic Control Based on Fast Terminal Sliding Mode Submitted: 18 June 2022; accepted: 08 August 2022

Mallem Ali, Douak Fouzi, Benaziza Walid, Bounouara Asma

DOI: 10.14313/JAMRIS/4-2022/32

Abstract: In this paper, a new optimization algorithm in the field of manipulator robotic control is presented. The proposed control approach is based on fast terminal sliding mode control (FTSMC), in order to guarantee the convergence of the joint position errors to zero in finite time without chattering, and on the firefly algorithm, in order to generate the optimal parameters that ensure minimum reaching time and mean square error and achieve better performance. The asymptotic stability of the system is ensured using a Lyapunov candidate in the presence of disturbances. The simulations are applied to a two-link robotic manipulator with different tracking references using Matlab/Simulink. The results show the efficiency and confirm the robustness of the proposed control strategy.

Keywords: manipulator robotic, fast terminal sliding mode, firefly algorithm, Lyapunov stability

1. Introduction


Trajectory tracking control is one of the important research topics in manipulator robotics [1], [2]. To improve the performance of robotic systems, several control approaches must be applied and implemented in order to obtain a robust and efficient control system. To solve the tracking control problems in this area, several approaches have been deployed, such as PID tracking control [3], [4], computed torque control [5], adaptive control [6], [7], sliding mode control [8], [9], adaptive backstepping trajectory tracking [10], [11], fuzzy logic control [12], and neural network strategies [13]. Sliding mode control is one of the recent approaches that has shown robustness against uncertainties and external disturbances [14], but conventional sliding mode control is known to suffer from the chattering phenomenon, and eliminating it requires replacing the sign function. Fast terminal sliding mode control, however, addresses the chattering problem while guaranteeing the convergence of the errors in finite time. In recent research on tracking control, finite-time tracking controllers have been deployed [15], [16]. A global finite-time tracking controller for manipulator robots based on inverse dynamics is presented by Su and Zheng [17], and finite-time stability has been analyzed through the governing differential equations.

Optimization is a very important technique in the field of control, where the control laws depend on gains and coefficients. The question that arises is how these coefficients should be chosen; the main objective is therefore a search process that maximizes or minimizes a cost function. Recently, optimization techniques have attracted the attention of researchers dealing with the control of manipulator robots. PSO-based PID and sliding mode controllers for manipulators were implemented to optimize and tune the PID gains and to improve the dynamic design parameters of the sliding mode controller [18], [19]. A genetic algorithm combined with sliding mode control with a sliding perturbation observer provides the optimal gains needed to obtain a robust routine [20]. Another work used optimal sliding mode control based on a multi-objective genetic algorithm for manipulator tracking control, aiming not only to minimize the chattering phenomenon but also to increase the performance of the system [21]. A sliding mode controller (SMC) with a PID surface was introduced for the tracking control of a robot manipulator using the antlion optimization algorithm (ALO) and compared with the grey wolf optimizer (GWO) [22]. A novel sliding mode controller (NSMC) combined with PID and based on the extended grey wolf optimizer (EGWO) was applied to a manipulator robot to optimize the NSMC control parameters [23].

In this paper, a new meta-heuristic optimization method is introduced, namely the firefly algorithm (FA), which is inspired by the behavior of real fireflies. The firefly algorithm was developed in late 2007 and early 2008 by Xin-She Yang [24]. FA has been demonstrated to be very efficient in solving global, multimodal, and nonlinear optimization problems. This paper performs optimal control of a two-link manipulator robot based on this optimization method by minimizing an objective function defined by the root mean square error (RMSE). Building on previous works using conventional sliding mode control, this paper proposes a fast terminal sliding mode control (FTSMC) that avoids the chattering phenomenon, ensures the convergence of the sliding surfaces in finite time, and also guarantees the stability of the system. For this purpose, a combined firefly algorithm with FTSMC is suggested in order to demonstrate the efficiency of the proposed control strategy.

The paper is organized as follows: Section 2 presents the dynamic modeling of a two-link manipulator robot. The principal concepts of the firefly algorithm are summarized in Section 3. The control strategy based on FA-FTSMC is presented in Section 4. The simulation and analysis of the improved control strategy are presented in Section 5. Finally, conclusions are drawn in Section 6.

2022 © Ali et al. This is an open access article licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)

2. Dynamic Modeling of Two-Link Manipulator Robot

Consider the two-link manipulator robot shown in Figure 1. The robot has two joints whose variables are defined by q1 and q2. The dynamic equation of the manipulator robot is given as follows:

$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) + \tau_d = \tau \qquad (1)$$

where $M(q) \in \mathbb{R}^{2\times 2}$ is the positive definite inertia matrix, $C(q,\dot{q}) \in \mathbb{R}^{2\times 2}$ is the matrix of centrifugal and Coriolis effects, $G(q) \in \mathbb{R}^{2\times 1}$ is the gravitational vector, $\tau_d \in \mathbb{R}^{2\times 1}$ is the external disturbance vector, $\tau \in \mathbb{R}^{2\times 1}$ is the vector of joint torques, and $q$, $\dot{q}$, $\ddot{q}$ are the angular positions, velocities, and accelerations of the joints. The matrices and vectors of the dynamic equation are defined as follows:

$$M(q) = \begin{bmatrix} m_2 a_2^2 + 2 m_2 a_1 a_2 \cos(q_2) + (m_1+m_2) a_1^2 & m_2 a_2^2 + m_2 a_1 a_2 \cos(q_2) \\ m_2 a_2^2 + m_2 a_1 a_2 \cos(q_2) & m_2 a_2^2 \end{bmatrix}$$

$$C(q,\dot{q}) = \begin{bmatrix} -2 m_2 a_1 a_2 \sin(q_2)\,\dot{q}_2 & -m_2 a_1 a_2 \sin(q_2)\,\dot{q}_2 \\ m_2 a_1 a_2 \sin(q_2)\,\dot{q}_1 & 0 \end{bmatrix}$$

$$G(q) = \begin{bmatrix} m_2 a_2 g \cos(q_1+q_2) + (m_1+m_2) a_1 g \cos(q_1) \\ m_2 a_2 g \cos(q_1+q_2) \end{bmatrix}$$

where m1, m2 are the link masses, a1 and a2 are the link lengths, and g is the gravitational acceleration.

Fig. 1. Geometry of two-link manipulator robot
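To make the model concrete, the sketch below evaluates the matrices M(q), C(q, q̇), and G(q) of equation (1) in Python/NumPy for the two-link arm. The numerical parameters are the ones used later in Section 5; the function name and the usage example are illustrative assumptions, not the authors' Matlab/Simulink implementation.

```python
import numpy as np

# Link masses [kg], lengths [m], and gravity [m/s^2] (values assumed from Section 5)
m1, m2 = 0.5, 0.5
a1, a2 = 1.0, 1.0
g = 9.8

def dynamics(q, dq):
    """Return M(q), C(q, dq), G(q) of the two-link manipulator, eq. (1)."""
    q1, q2 = q
    dq1, dq2 = dq
    c2, s2 = np.cos(q2), np.sin(q2)

    M = np.array([
        [m2*a2**2 + 2*m2*a1*a2*c2 + (m1 + m2)*a1**2, m2*a2**2 + m2*a1*a2*c2],
        [m2*a2**2 + m2*a1*a2*c2,                     m2*a2**2],
    ])
    C = np.array([
        [-2*m2*a1*a2*s2*dq2, -m2*a1*a2*s2*dq2],
        [ m2*a1*a2*s2*dq1,    0.0],
    ])
    G = np.array([
        m2*a2*g*np.cos(q1 + q2) + (m1 + m2)*a1*g*np.cos(q1),
        m2*a2*g*np.cos(q1 + q2),
    ])
    return M, C, G

# Example: forward dynamics  q_ddot = M^-1 (tau - C dq - G)  with tau_d = 0
q, dq = np.array([0.1, -0.2]), np.zeros(2)
M, C, G = dynamics(q, dq)
tau = np.array([1.0, 0.5])
ddq = np.linalg.solve(M, tau - C @ dq - G)
```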

3. Firefly Algorithm

The firefly algorithm (FA) was developed by Xin-She Yang in 2008 and has in recent years become one of the important tools for solving optimization problems. FA attempts to imitate the flashing pattern and attraction behavior of fireflies. These flashing lights serve two purposes:
• to attract mating partners;
• to warn potential predators.
Obviously, these light signals and their intensity are subject to physical laws. FA is based on the following three rules [24]:
1. Fireflies are unisex, so a firefly will be attracted to other fireflies regardless of gender.
2. Attractiveness is proportional to brightness, and both decrease as the distance between fireflies increases. Thus, for two flashing fireflies, the less bright one moves towards the brighter one; if no firefly is brighter than a particular firefly, it moves randomly.
3. The brightness of a firefly is defined by the landscape of the objective function, so as to obtain efficient optimal solutions.
The variation of attractiveness λ with the distance r is defined by the following equation:

$$\lambda = \lambda_0 e^{-\gamma r^2} \qquad (2)$$

where r is the distance between two fireflies, γ is the absorption coefficient, and λ0 is the attractiveness at r = 0. The movement of a firefly i attracted to another, brighter firefly j is defined by:

$$X_i^{t+1} = X_i^t + \lambda_0 e^{-\gamma r^2}\,(X_j^t - X_i^t) + \mu^t \varepsilon_i^t \qquad (3)$$

The term $\lambda_0 e^{-\gamma r^2}(X_j^t - X_i^t)$ represents the attraction, and the term $\mu^t \varepsilon_i^t$ represents the randomization, where μ is the randomization parameter, $\varepsilon_i^t$ is a vector of random numbers drawn from a Gaussian distribution at step t, and γ controls the scaling. The randomness should be gradually reduced to ensure that the algorithm converges correctly; one way to achieve this is:

$$\mu^t = \mu_0 \theta^t, \quad \theta \in (0,1) \qquad (4)$$

where μ0 is the initial randomness factor and t is the generation/iteration index.



Journal of Automation, Mobile Robotics and Intelligent Systems

VOLUME 16,

N° 4

2022

The FA can be summarized in the following algorithm:

Begin algorithm
  Initialize the parameters;
  Define the objective function f(X);
  Generate the initial population of fireflies Xi (i = 1, 2, ..., n), whose positions are updated by equation (3);
  Define the light intensity Ii at Xi by f(Xi);
  Determine the light absorption coefficient γ;
  Repeat
    For i = 1 to n
      For j = 1 to n
        If (Ij > Ii)
          Move firefly i towards j;
          Attractiveness varies with the distance r;
        End if
        Evaluate new solutions and update the light intensity;
      End for j
    End for i
    Classify the fireflies and define the current best;
  Until (t > MaxGeneration)
  Postprocess the results and visualization;
End algorithm.
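As an illustration only, the following Python sketch implements the update rules (2)–(4) and the loop above for a generic objective function to be minimized. The population size, decay factor, stopping rule, and variable names are assumptions made for the example and are not taken from the paper's implementation.

```python
import numpy as np

def firefly_optimize(objective, lower, upper, n=25, n_gen=60,
                     lambda0=2.0, gamma=1.0, mu0=0.2, theta=0.95, seed=0):
    """Minimize `objective` with the firefly algorithm, eqs. (2)-(4)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    X = rng.uniform(lower, upper, size=(n, dim))        # initial fireflies
    cost = np.array([objective(x) for x in X])          # lower cost = brighter firefly

    for t in range(n_gen):
        mu = mu0 * theta**t                              # eq. (4): decaying randomness
        for i in range(n):
            for j in range(n):
                if cost[j] < cost[i]:                    # j is brighter than i
                    r2 = np.sum((X[i] - X[j])**2)
                    attract = lambda0 * np.exp(-gamma * r2)          # eq. (2)
                    eps = rng.normal(size=dim)
                    X[i] = X[i] + attract * (X[j] - X[i]) + mu * eps  # eq. (3)
                    X[i] = np.clip(X[i], lower, upper)
                    cost[i] = objective(X[i])
    best = int(np.argmin(cost))
    return X[best], cost[best]

# Example usage on a simple quadratic cost
x_best, f_best = firefly_optimize(lambda x: np.sum((x - 3.0)**2),
                                  lower=[-10, -10], upper=[10, 10])
```

With the tracking MSE of Section 4.2 as the objective and bounds on (α, β, k, p, c), such a routine would play the role of the tuning block in Figure 2.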

4. Control Strategy

In this work, the control strategy is based on the fast terminal sliding mode controller, where the optimal parameters of the controller are delivered by the firefly algorithm in order to ensure fast convergence, avoid the chattering phenomenon, and obtain a minimal MSE of the objective function. The controller parameters are considered as free parameters to be adjusted. The objective of this strategy is to make the robot joints follow their reference trajectories in the presence of disturbances. Figure 2 summarizes this control strategy. The control signal is the torque vector $\tau = [\tau_1\ \ \tau_2]$, where the control laws are defined through the fast terminal sliding mode controller presented in the next section. The joint error is an input to the controller, together with the optimized parameters α, β, k, p, c. The input signal is defined by the desired joint variables $q_d = [q_{d1}\ \ q_{d2}]$. The control loop presented in Figure 2 is repeated until the number of iterations is reached and the optimal parameter values are obtained.

Fig. 2. Control strategy of manipulator robot

4.1 Fast Terminal Sliding Mode Control

A new global fast terminal sliding surface proposed by Park et al. [25] is given as follows:

$$S = \dot{s} + \alpha s + \beta s^{k/p} = 0 \qquad (5)$$

where α, β > 0 and k, p (k < p) are positive odd numbers. The reaching time of the sliding surface to zero is determined by the following equation:

$$t_s = \frac{p}{\alpha(p-k)} \ln\frac{\alpha\, s(0)^{(p-k)/p} + \beta}{\beta} \qquad (6)$$

In this work, the sliding surface is based on the conventional sliding mode and is defined as follows:

$$s = \dot{e} + c\,e \qquad (7)$$

where c is positive, e is the joint error of the robot, and $\dot{e}$ is the derivative of the error, defined as follows:

$$e = q - q_d \qquad (8)$$

$$\dot{e} = \dot{q} - \dot{q}_d \qquad (9)$$

The second derivative of the joint error is obtained as:

$$\ddot{e} = \ddot{q} - \ddot{q}_d \qquad (10)$$

The tracking error signal is defined as follows:

$$\dot{q}_r = \dot{q}_d - c\,e \qquad (11)$$

By replacing equation (11) in (7), equation (7) can be rewritten as follows:

$$s = \dot{q} - \dot{q}_d + c\,e = \dot{q} - \dot{q}_r \qquad (12)$$

The derivative of the sliding surface can then be obtained as:

$$\dot{s} = \ddot{q} - \ddot{q}_r \qquad (13)$$

To obtain the control law, we start from equation (13); multiplying it by the inertia matrix M(q) and using equation (1), the control law can be written as follows:

$$\tau = M(q)\dot{s} + C(q,\dot{q})\dot{q} + G(q) + M(q)\ddot{q}_r \qquad (14)$$

According to equation (5):

$$\dot{s} = -\alpha s - \beta s^{k/p} \qquad (15)$$

Replacing (15) in (14), the control law becomes:

$$\tau = M(q)\left(-\alpha s - \beta s^{k/p}\right) + C(q,\dot{q})\dot{q} + G(q) + M(q)\ddot{q}_r \qquad (16)$$
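For illustration, a minimal Python sketch of the control law (16) is given below. The helper takes the dynamics terms of equation (1) as inputs; the derivative of the tracking error signal and the sign-preserving fractional power are assumptions made for this example and are not taken from the authors' Matlab/Simulink implementation.

```python
import numpy as np

def ftsmc_torque(M, C, G, q, dq, qd, dqd, ddqd, alpha, beta, k, p, c):
    """Fast terminal sliding mode control law, equations (7)-(16).

    M, C, G        : dynamics terms of eq. (1) evaluated at the current state
    q, dq          : joint positions and velocities
    qd, dqd, ddqd  : desired position, velocity, and acceleration
    """
    e = q - qd                        # eq. (8)
    de = dq - dqd                     # eq. (9)
    s = de + c * e                    # eq. (7)
    ddqr = ddqd - c * de              # time derivative of eq. (11), needed in eq. (14)
    # s^(k/p) taken sign-preservingly so fractional powers of negative s stay real (assumption)
    s_pow = np.sign(s) * np.abs(s) ** (k / p)
    ds_des = -alpha * s - beta * s_pow            # eq. (15)
    tau = M @ ds_des + C @ dq + G + M @ ddqr      # eq. (16)
    return tau
```

In a closed-loop simulation, this torque would be applied to the forward dynamics of equation (1) at every integration step.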



Proof: to ensure the stability of the system, the Lyapunov candidate is chosen as:

$$v = \frac{1}{2} s^{T} s \qquad (17)$$

Its derivative is obtained through the following steps:

$$\dot{v} = s^{T}\dot{s} = s^{T}(\ddot{q} - \ddot{q}_r) \qquad (18)$$

with

$$\ddot{q} = M^{-1}(q)\left(\tau - C(q,\dot{q})\dot{q} - G(q)\right) \qquad (19)$$

According to equations (19) and (16), equation (18) can be rewritten as:

$$\dot{v} = s^{T}\left(M^{-1}(q)\left(\tau - C(q,\dot{q})\dot{q} - G(q)\right) - \ddot{q}_r\right) = s^{T}\left(-\alpha s - \beta s^{k/p}\right) \qquad (20)$$

Since α, β, k, p are defined positive, $\dot{v}$ is negative definite.

4.2 Objective Function Definition

The main objective of the proposed FA-FTSM controller is to tune the FTSM controller parameters optimally and as fast as possible by minimizing an objective function, which can be formed from different performance specifications such as the root mean square error (RMSE). In this paper, the objective function is defined as the mean square tracking error:

$$f(X) = \frac{1}{N}\sum_{i=1}^{N}\left(e_1^2(i) + e_2^2(i)\right) \qquad (21)$$

where $X = [\alpha, \beta, k, p, c]$ is the parameter set of the FTSM controller, N is the number of data samples, and e1 and e2 are the joint tracking errors. According to the FA, the optimizer determines the unknown free FTSMC parameters by updating the candidate solutions based on this objective function.

5. Results and Discussion

In this section, MATLAB/Simulink is used to simulate the proposed control method and to verify its effectiveness. We evaluate, through computer simulation, the ability of the proposed FTSM controller to handle the controller tuning and to make the manipulator follow the trajectory generated by the input signal, i.e., the desired joint variables. The optimization performance is evaluated using the RMSE criterion. In these simulations, the two-link manipulator robot shown in Figure 1 is considered, with the dynamic equation given in equation (1). The desired sinusoidal position commands of the two joints are $q_d = [0.1\sin(t)\ \ 0.1\sin(t)]$, the disturbance is selected as $\tau_d = [0.2\sin(t)\ \ 0.2\sin(t)]$, and for simulation purposes the robot parameters are taken as m1 = m2 = 0.5 kg, a1 = a2 = 1 m, and g = 9.8 m/s².


The FA parameter values are summarized in Table 1.

Tab. 1. Parameter values of the firefly algorithm

Parameter | Designation | Value
n | Number of fireflies | 50
μ0 | Randomness | 0.2
λ | Initial attractiveness | 2
γ | Absorption coefficient | 1
ng | Number of generations | 60

In the first step of the simulations, an offline optimization is carried out in order to define the upper and lower bound of each control parameter (α, β, k, p) with different criteria of the proposed optimization algorithm; in this step, the objective function is the reaching time of the sliding surface given in equation (6). The objective of the proposed work is therefore to minimize the reaching time and to obtain the minimum MSE with the optimal solutions. Once the parameters are defined, we introduce them into the control loop strategy of Figure 2, with the objective function given in equation (21). Figure 3 shows the iterations of the firefly algorithm with the mean square error as objective function; the RMSE decreases until the optimal parameter values are reached, with a best fitness of 1.5665e-04, which proves the performance of the optimization process. Figure 4 shows the position tracking of the two joints, where the proposed controller achieves good tracking and rapid convergence towards the reference joint positions. Moreover, Figure 5 shows that the joint velocities attain their reference velocities in finite time. Figure 6 presents the control input torques, which guarantee the convergence of the joint tracking errors to zero and ensure the stability of the system in the presence of disturbances. The results shown in the previous figures confirm the effectiveness, robustness, and fast convergence of the proposed method. Table 2 gives the optimal RMSE values and the corresponding best estimated parameters of the proposed controller. In the next step of the simulation, a conventional sliding mode controller tuned by the firefly algorithm

Fig. 3. Objective function evolution




Fig. 4. Position tracking of joints 1 and 2

Fig. 5. Velocity tracking of joints 1 and 2

Fig. 6. Control inputs of joints 1 and 2




(FA-CSMC) is introduced to validate the proposed approach in terms of robustness and stability, and a comparative study of the proposed approaches is presented. In this case, the control law given in equation (16) is rewritten as follows:

$$\tau = C(q,\dot{q})\dot{q} + G(q) + M(q)\ddot{q}_r - c\,M(q)\,\mathrm{sign}(s) \qquad (22)$$

Tab. 2. Simulation results using FA-FTSMC

Results | Number of fireflies n=25 | Number of fireflies n=50
MSE | 1.5921e-04 | 1.5665e-04
α | 8.4480 | 7.631
β | 20.00 | 20.00
k | 0.1000 | 0.500
p | 2.7033 | 10.00
c | 30.00 | 29.5262


The results presented in Figure 7 and Figure 8 show good tracking and convergence towards the reference joint positions. Figure 9 presents the control input torques, in which the chattering phenomenon appears. It can be noticed that the control input torque response in the case of FA-FTSMC is more stable and the chattering is eliminated. Table 3 summarizes the comparative study of the presented approaches, where the comparison criteria are the mean square error (MSE) and the reaching time (ts) defined in equation (6). Based on this comparative study, we confirm that the FA-FTSMC approach ensures more effectiveness and robustness than the other approaches in terms of fast convergence and lower RMSE. The results shown in Figure 10 and Figure 11 confirm the efficiency of FA-FTSMC reported in Table 3.

Fig. 7. Position tracking of joints 1 and 2 (FA-CSMC)

Fig. 8. Velocity tracking of joints 1 and 2 (FA-CSMC)




Fig. 9. Control inputs of joints 1 and 2 (FA-CSMC)

6. Conclusion

In this paper, a FA-FTSM control is proposed to ensure the optimal tracking of a two-link manipulator robot, taking into account the dynamics of the robot. The proposed controller makes the system converge to the reference in finite time without the chattering phenomenon, even in the presence of disturbances. The firefly algorithm is introduced for tuning the FTSM controller parameters in order to obtain the lowest RMSE and reaching time, which permits fast convergence. The simulation results show that the proposed approach performs an efficient search for the optimal FTSM controller parameters. They also demonstrate the robustness and effectiveness of the proposed approach, which is confirmed by a comparative study with a combined firefly algorithm and conventional sliding mode (FA-CSM) controller, showing that the optimization in the FA-FTSM case is faster than in the FA-CSM case.

Fig. 10. Tracking errors of joint 1

Fig. 11. Tracking errors of joint 2

Tab. 3. Comparative study between the proposed approaches

Results | FTSMC | FA-FTSMC | FA-CSMC
Reaching time (ts) | 0.6065 | 0.0533 | 0.6931
RMSE | 0.0020 | 1.5665e-04 | 2.448e-04
α | 4 | 7.631 | /
β | 2 | 20.00 | /
k | 5 | 0.500 | /
p | 7 | 10.00 | /
c | 20 | 29.5262 | 25.38

AUTHORS

Ali Mallem* – Industrial Engineering Department, University Abbès Laghrour, Khenchela, and Advanced Electronic Laboratory, University of Batna 2, Algeria, E-mail: ali_mallem@hotmail.fr.
Fouzi Douak – Industrial Engineering Department, University Abbès Laghrour, Khenchela, Algeria, E-mail: douak.fouzi@gmail.com.
Walid Benaziza – Advanced Electronic Laboratory, University of Batna 2, Batna, Algeria, E-mail: w.benaziza@univ-batna2.dz.

Asma Bounouara – Electronics Department, University of Batna 2, Batna, Algeria, E-mail: asma.bounouara@gmail.com.

*Corresponding author



References

[1] Yin, Meng, et al., "Mechanism and position tracking control of a robotic manipulator actuated by the tendon-sheath," Journal of Intelligent & Robotic Systems, vol. 100, no. 3, 2020, pp. 849-862. DOI: 10.1007/s10846-020-01245-6

[2] B. Xiao, S. Yin, and O. Kaynak, "Tracking control of robotic manipulators with uncertain kinematics and dynamics," IEEE Transactions on Industrial Electronics, vol. 63, no. 10, 2016, pp. 6439-6449. DOI: 10.1109/TIE.2016.2569068
[3] I. Cervantes and J. Alvarez-Ramirez, "On the PID tracking control of robot manipulators," Syst. Control Lett., vol. 42, no. 1, 2001, pp. 37-46. DOI: 10.1016/S0167-6911(00)00077-3
[4] Y. Su, P. C. Müller and C. Zheng, "Global asymptotic saturated PID control for robot manipulators," IEEE Trans. Control Syst. Technol., vol. 18, no. 6, 2010, pp. 1280-1288. DOI: 10.1109/TCST.2009.2035924
[5] A. Codourey, "Dynamic modeling of parallel robots for computed-torque control implementation," Int. J. Robot. Res., vol. 17, no. 12, 1998, pp. 1325-1336. DOI: 10.1177/027836499801701205
[6] J.-J. E. Slotine and W. Li, "On the adaptive control of robot manipulators," Int. J. Robot. Res., vol. 6, no. 3, 1987, pp. 49-59. DOI: 10.1177/027836498700600303
[7] G. Tao, Adaptive Control Design and Analysis, Hoboken, NJ, USA: Wiley, vol. 37, 2003.
[8] K. D. Young and Ü. Özgüner, Variable Structure Systems, Sliding Mode and Nonlinear Control, London, U.K.: Springer, vol. 247, 1999. DOI: 10.1007/BFb0109967
[9] C. Edwards, E. F. Colet, and L. M. Fridman, Advances in Variable Structure and Sliding Mode Control, Berlin, Germany: Springer, vol. 334, 2006. DOI: 10.1007/11612735

[10] Hu, Qinglei, Liang Xu, and Aihua Zhang, “Adaptive backstepping trajectory tracking control of robot manipulator,” Journal of the Franklin Institute, vol. 349, no. 3, 2012, pp. 1087-1105. DOI: 10.1016/j.jfranklin.2012.01.001

[11] Park, S. H., and Han, S. I., "Robust-tracking control for robot manipulator with deadzone and friction using backstepping and RFNN controller," IET Control Theory & Applications, vol. 5, no. 12, 2011, pp. 1397-1417. DOI: 10.1049/iet-cta.2010.0460
[12] Ho, H. F., Yiu-Kwong Wong, and Ahmad B. Rad, "Robust fuzzy tracking control for robotic manipulators," Simulation Modelling Practice and Theory, vol. 15, no. 7, 2007, pp. 801-816. DOI: 10.1016/j.simpat.2007.04.008
[13] Wai, Rong-Jong, "Tracking control based on neural network strategy for robot manipulator," Neurocomputing, vol. 51, 2003, pp. 425-445. DOI: 10.1016/S0925-2312(02)00626-4
[14] Kanayama, Y., Kimura, Y., Miyazaki, F., and Noguchi, T., "A Stable Tracking Control Method for an Autonomous Mobile Robot," IEEE Conf. Robotics and Automation, Cincinnati, 1990, pp. 384-389. DOI: 10.1109/ROBOT.1990.126006
[15] Bao, Jialei, Huanqing Wang, and Peter Xiaoping Liu, "Adaptive finite-time tracking control for robotic manipulators with funnel boundary," International Journal of Adaptive Control and Signal Processing, vol. 34, no. 5, 2020, pp. 575-589. DOI: 10.1002/acs.3102

[16] Razmjooei, Hamid, et al., "A novel robust finite-time tracking control of uncertain robotic manipulators with disturbances," Journal of Vibration and Control, vol. 28, no. 5-6, 2022, pp. 719-731. DOI: 10.1177/1077546320982449
[17] Su, Yuxin, and Chunhong Zheng, "Global finite-time inverse tracking control of robot manipulators," Robotics and Computer-Integrated Manufacturing, vol. 27, no. 3, 2011, pp. 550-557. DOI: 10.1016/j.rcim.2010.09.010
[18] Akkar, Hanan A. R., and Suhad Qasim G. Haddad, "Design Stable Controller for PUMA 560 Robot with PID and Sliding Mode Controller Based on PSO Algorithm," International Journal of Intelligent Engineering and Systems, vol. 13, no. 6, 2020, pp. 487-499. DOI: 10.22266/ijies2020.1231.43
[19] Hashem Zadeh, Seyed Mohammad, et al., "Optimal sliding mode control of a robot manipulator under uncertainty using PSO," Nonlinear Dynamics, vol. 84, no. 4, 2016, pp. 2227-2239. DOI: 10.1007/s11071-016-2641-4
[20] You, Ki Sung, Min Cheol Lee, and Wan Suk Yoo, "Sliding mode controller with sliding perturbation observer based on gain optimization using genetic algorithm," KSME International Journal, vol. 18, no. 4, 2004, pp. 630-639. DOI: 10.1007/BF02983647

[21] Boukadida, Wafa, Anouar Benamor, and Hassani Messaoud, "Multi-objective design of optimal sliding mode control for trajectory tracking of SCARA robot based on genetic algorithm," Journal of Dynamic Systems, Measurement, and Control, vol. 141, no. 3, 2019. DOI: 10.1115/1.4041852


[22] Loucif, Fatiha, and Sihem Kechida, "Sliding mode control with PID surface for robot manipulator optimized by evolutionary algorithms," Recent Advances in Engineering Mathematics and Physics, Springer, Cham, 2020, pp. 19-32. DOI: 10.1007/978-3-030-39847-7_2
[23] Rahmani, Mehran, Hossein Komijani, and Mohammad Habibur Rahman, "New sliding mode control of 2-DOF robot manipulator based on extended grey wolf optimizer," International Journal of Control, Automation and Systems, vol. 18, no. 6, 2020, pp. 1572-1580. DOI: 10.1007/s12555-019-0154-x

[24] Yang, Xin-She, Nature-inspired metaheuristic algorithms. Luniver press, 2010.

[25] Park, Kang-Bark, and Teruo Tsuji, "Terminal sliding mode control of second-order nonlinear uncertain systems," International Journal of Robust and Nonlinear Control, vol. 9, no. 11, 1999, pp. 769-780. DOI: 10.1002/(SICI)1099-1239



AI-Based Yolo V4 Intelligent Traffic Light Control System

Submitted: 26 April 2022; accepted: 28 July 2022

Boppuru Rudra Prathap, Kukatlapalli Pradeep Kumar, Cherukuri Ravindranath Chowdary, Javid Hussain

DOI: 10.14313/JAMRIS/4-2022/33

Abstract: With the growing number of city vehicles, traffic management is becoming a persistent challenge. Traffic bottlenecks cause significant disturbances in our everyday lives and raise stress levels, negatively impacting the environment by increasing carbon emissions. Due to population growth, megacities experience severe challenges and significant delays in their day-to-day activities related to transportation. An intelligent traffic management system is required to assess traffic density regularly and take appropriate action. Even though separate lanes are available for various vehicle types, wait times for commuters at traffic signal points are not reduced. The proposed methodology employs artificial intelligence to collect live images from signals to address this issue in the current system. This approach calculates traffic density utilizing the image processing technique YOLOv4 for effective traffic congestion management. The YOLOv4 algorithm produces better accuracy in the detection of multiple vehicles. Intelligent monitoring technology uses a signal-switching algorithm at signal intersections to coordinate the time distribution and alleviate traffic congestion, resulting in shorter vehicle waiting times.

Keywords: traffic jams, traffic light system, traffic management, intelligent monitoring, signal switching algorithm, artificial intelligence

1. Introduction

As the quantity and volume of vehicles populating the roads, especially in metropolitan cities, are increasing, intra-city roads are facing issues related to capacity, congestion, and control. The current traffic management system requires a great deal of effort and manpower to avoid and prevent accidents, and it imposes long waiting queues at the crossings. A more sophisticated system and infrastructure are required for better traffic management, and intelligent transportation systems are the need of the hour. WSNs (wireless sensor networks) can be used at single and multiple intersections for controlling vehicle movement flow sequences [27]. Traditional traffic control systems involved manual operation of control systems, which required a good

amount of manpower: congestion was managed by traffic police using signboards, a sign light, and a whistle. Sensors and timers play a crucial role in managing vehicle movement at a traffic signal. Electronic sensors: installing loop detectors or proximity sensors in the lane is another, more sophisticated option; these sensors collect information on the flow of traffic on the route, and the sensor data is used to control the traffic lights. Traditional timer-controlled traffic lights: timers are used to drive these signals; the timer is set to a specific numerical value and the lights alternate between red and green based on the timer value. Following a comprehensive literature review, we identified various techniques for detecting vehicle density and acting on it. As a result, we decided to create an adaptive traffic control system that recognizes objects in images and adjusts the timing of traffic signals as required. Traditional techniques have many disadvantages. Setting up a manual control system requires substantial time and work, and due to labor shortages we cannot have traffic officers manually managing traffic in all regions of a city or town, so a more effective traffic control system is required. Static traffic management uses a signal with a fixed timer for each period and does not respond to on-road traffic. Because high-quality data gathering often relies on complex and expensive equipment, and the number of facilities may be restricted owing to budget constraints, accuracy and coverage are frequently at odds when utilizing electronic sensors such as proximity or loop detectors [28]. Furthermore, because most sensors have a limited effective range, a network of facilities normally requires many sensors to offer comprehensive coverage. In the proposed approach, live images from CCTV (closed-circuit television) cameras at traffic junctions are used to calculate real-time traffic density by recognizing the number of vehicles at the signal and appropriately modifying the green signal time. To estimate the green signal period precisely, the vehicles are classified as car, bike, bus, truck, human, and bicycle. The YOLO ("you only look once") approach is used to count the vehicles, and the traffic signal timer is adjusted in the proper direction based on the vehicle density in the detected images. As a result, green signal intervals are optimized and traffic is cleared much more efficiently than in a static system,

2022 ® Prathap et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)




resulting in fewer unnecessary delays, less congestion, and shorter waiting times. Hardcoded traffic signal allocation systems have no knowledge of traffic density; they involve long waiting times, especially in densely populated cities, resulting in heavy traffic jams and congestion. Sensor-based traffic management reduces the problem to some extent but requires regular maintenance. Hence, there is still scope for improving the signal time allocation, which would reduce traffic jams and congestion to a large extent. We therefore propose an automated traffic management framework based on artificial intelligence that has a broad understanding of traffic density. The rest of the paper is organized as follows: this introduction is followed by a literature review of the associated research areas in Section 2; the methodology of the proposed approach is discussed in Section 3; and the results of the work, with their analysis, are presented in Section 4, followed by the conclusion.

2. Literature Review

RFID (radio-frequency identification) tags use radio waves to transmit data about an object to an antenna/reader combination. P. Manikonda, A. K. Yerrapragada, et al. [11] developed a method for detecting vehicle speed that included an RFID tag and an RFID reader. The average speed of the vehicles seen by a specified number (N) of readers was utilized to calculate the average time at particular crossings. This, however, necessitates constant communication between tags and readers and the installation of RFID tags in every vehicle. A. Kanungo, A. Sharma, et al. [7] used no hardware and computed vehicle density using a background separation technique at 30 frames per second, adding picture matrices; the next step was to divide the result by a constant C (C = camera height × number of rows × number of columns × 30). The signal time at four-way junctions is then determined by vehicle density, with the green time bounded between a minimum of 10 seconds and a maximum of 60 seconds. The concept was good; however, the study failed to consider vehicle start-up and counting, both of which are essential aspects, and the computation procedure had to be rapid because the vehicle movement remained constant. Qingyan Wang et al. [20] showed that their improved YOLOv4 algorithm had better prediction capacity. In the detection trial, the area under the PR curve (AUC) of the authors' algorithm was 97.58 percent, compared to 90 percent in the Vision for Intelligent Vehicles and Applications Challenge Competition. In the recognition experiment, the mean average precision of the Improved YOLOv4 method was 2.86 percent higher than that of the original YOLOv4 technique. The Improved YOLOv4 algorithm is a reliable and effective real-time traffic light signal detection and recognition solution. Dave, Pritul, et al. [21] proposed a two-step YOLOv4 and XGBoost traffic light system. The first step is to count the vehicles in each class, which is done using YOLOv4 object detection. It employs an


ensemble algorithm called XGBoost to predict the optimum green light window time (eXtreme gradient boosting). The suggested method is also compared with other YOLO implementations and prediction systems. The XGBoost algorithm produced the most efficient YOLOv4 results in accuracy and inference time. The proposed method might reduce traffic delays by 32.3 percent on average. A. Zaid, Y. Suhweil, et al. [1] developed an electronic and dynamic system that includes the measurement of Green Light Phase Time (GLPT) and an electronic system equipped with a faster C-language-based algorithm to provide a visualization of roads for remote control from the head office. For the vehicle detection algorithm, there are 12 LEDs for traffic light status and four limit switches (GLPT: EGT = FR x C, PGT = EGT + S - Y, RGT = EGT - PGT ). However, the headquarters must dictate the orders, which necessitates human involvement to avoid this. K. Sangeetha, Ms. Kavibharathi, et al. [12] proposed a Prewitt Edge Detection Mechanism to enhance the captured multiple images to reduce the light. These images were compared with images stored in the database to calculate the mean of matching percentage (0-10)% -90sec, (10-50)%- 60 sec,(50-70)%-30sec,(70-90)%-20sec). Here the Constant gamma values are used, which leads to more intensity reduction. The Internet of Things (IoT) and sensor-based technologies have already shown considerable progress in various fields. So J.M.S. Ferdous, T. Osman, et al. [13] tried it out by capturing images with a VCO7O6 camera and sending them to a microcontroller embedded in an Arduino. They can be sent to a server, where a background subtraction algorithm is used to detect vehicles. The server adjusts the green signal time and red signal time in a four-way junction server, and the traffic junction connection is maintained using the HTTP protocol. When vehicle density is poor, photos of the background are taken every 6 hours, which is entirely carried out in Indonesia and requires an essential factor of continuous background light monitoring. Jinyang Li, Yuanrui Zhang, et al. [9] proposed SATL (self-adaptive control traffic light systems) to improve the self-adaptive traffic signal timer by considering vehicle speed. Vehicles have a data-gathering module and a sending module that sends data to traffic light recipients. For an algorithm to assign time and increase time in case of collision or injuries, signals using a ZigBee sender with a range of 100 meters and Vmax = 40km/hr and Vmin = 20km/hr are considered, and increasing time in case of collision or accidents. However, vehicles slow down when making a turn, which is also essential at intersections, so S. Vignesh, K. S. Naresh, et al. [15] built a system of infrared sensors mounted on either side of the road that detects vehicle motion and sends the information to a Raspberry Pi connected to signals. This Raspberry Pi changes the signs with a more extended green time allocation when the number of vehicles detected by the IR sensor is high. Before projecting into the road, a traffic analysis allows us to take alternate lanes and avoid delays.



P. Rizwan, K. Suresh, et al. [16] used vehicle detection sensors, whose data is used to compute the traffic light timing adjustments. Input from camcorders and sensors is used to build a mobile application that includes images and timings to find shortcut routes to the destination, considering the four-way junctions. Khushi [8] proposed a method in which pictures collected at intersections are sent to Matlab code for morphological image generation, and traffic density is measured using Matlab duration functions; the lengths of the green and red lights associated with a specific junction are calculated and sent to an Arduino (Tdur = Tmax − k·Tmax) for the East-West-North-South directions. According to Muhammad Fachrie's research [10], many issues have been resolved due to recent advances in ANNs, which has been a key driver for the establishment of numerous firms. An artificial neural network with a sliding mechanism and predefined bounding boxes was developed to determine the number of vehicles in the normalized picture. Additionally, a fuzzy-based traffic signal controller was proposed, dividing the traffic range into low, little low, medium, little high, and high, and adjusting the signal timings accordingly. To improve detection accuracy, A. Chattaraj, S. Bansal, et al. [3] modified the existing YOLO algorithm into OYOLO and OYOLO+R-FCN by combining both algorithms, varying the learning rate over the epochs, and running the algorithms based on the required accuracy. According to P. Adarsh, P. Rathi, et al. [2], model-based tracking employs an intra-frame matching method based on a parameterized vehicle model; an image segmentation component first recognizes possible moving vehicles by detecting moving characteristics in the picture. H. F. Chong and D. W. K. Ng [4] utilized a technique known as region-based tracking to detect and monitor related areas in a picture associated with each vehicle. A background subtraction approach is widely used in this strategy; when automobiles partly obstruct each other in congested traffic, the technique fails because the algorithm recognizes the vehicles as a single huge blob in the foreground picture, lowering the accuracy of the process. A Haar-like feature detector with a high level of accuracy was used to find cars: S. Choudhury, S. P. Chattopadhyay, et al. [5] computed Haar-like features by evaluating neighboring rectangular sections, adding the pixel intensities in each area, and subtracting the sums, and used them to categorize different parts of the image. Pranav Shinde, Srinand Yadav, et al. [14] proposed a system that uses a Raspberry Pi USB camera to record video of vehicles when the light is red; the video is then sent to a cloud system, where a YOLO algorithm measures the number of vehicles and their lane preferences to turn on the green light. A difference in light intensity almost always causes foreground separation, and this is a distinctive idea for extracting moving objects using a context subtraction process. Asma Ait Ouallane et al. [22] initially covered routing systems, traffic light solutions, and ways


for controlling network traffic. Following that, the authors explore potential options and finally suggest many fresh possibilities for future urban highway traffic management research. AI-based techniques can reduce the challenges associated with effective road traffic management, particularly at junctions, which are a key cause of road congestion. B. Ali Almansoori et al. [23] developed an AI-powered system for identifying left-approaching automobiles at a roundabout. The system controller captures video data from roundabouts and uses a trained neural network to identify vehicles; when a vehicle is detected within a specific distance, the signaling lights flash red to indicate a stop and green to indicate a pass. The proposed method reduces automobile accidents, roundabout delays, fuel consumption, and loss of public property. D. Y. Huang, Chao-Ho Chen, et al. [6] developed a distinctive concept for extracting moving objects using a context subtraction process; swinging tree leaves and raindrops are removed using two back-to-back filters, and a shadow removal process is combined with a flexible background deletion approach to exclude moving vehicles from the background photos. Zhang X., Li X., et al. [17] propose a hybrid algorithm in which tiny-YOLOv3 and Fast R-CNN are trained with images of various dimensions (320, 416, 608) and compared, with the results presented as graphs and the mean average precision calculated. J. Hussain et al. [25] worked with YOLOv4, which has better inference methods: YOLOv4 is 12 percent faster than its predecessor YOLOv3 and twice as fast as the EfficientDet method on a Tesla V100 GPU, but the algorithm did not work well on traditional computers and single-board devices; that study examines how well inference works in several different frameworks and then suggests a framework that needs less than 30% of the hardware other frameworks need. In order to reduce traffic congestion, Muhammad Saleem et al. [26] proposed a fusion-based intelligent traffic congestion management system for VNs (FITCCS-VN) that gathers traffic data and guides traffic on available routes in smart cities. To avoid traffic congestion, the proposed system provides drivers with unique features such as a distant view of traffic flow and the number of cars on the route; the technique both boosts traffic flow and reduces congestion, with a success rate of 95% and a failure rate of 5%, which is better than existing methods. People spend more and more time commuting to work, school, shopping, and social events, and navigating traffic lights. Signal allocation is now typically dependent on a timer in many cities worldwide. The timer solution has the disadvantage that even though there is less traffic on a route, a green signal is still assigned until its timer value decreases to 0, while traffic on another, extremely busy road receives a red signal at that point, causing congestion and time loss for citizens. Most current applications are not automated and are susceptible to human error.



Current research introduces an innovative and intelligent control system to solve such issues. An intelligent traffic light system should be implemented to sense the presence and absence of vehicles and react accordingly. The following objectives have been developed based on current research and literature:
1. Create a computer vision-based traffic signal controller that can adapt to the current traffic scenario.
2. Assess traffic density by recognizing vehicles at traffic lights using real-time photos from CCTV cameras at traffic intersections.
3. Calculate the green signal time for vehicles such as cars, buses, motorcycles, trucks, and people.

3. Methodology

This section describes the methodology followed in this research work, employing the YOLO approach for an effective traffic management system for vehicle movement. YOLO is a state-of-the-art object detection algorithm that is incredibly fast and accurate. YOLO is a convolutional neural network (CNN) for recognizing multiple objects. The algorithm divides the image into regions and predicts bounding boxes and probabilities for each region using a single neural network applied to the entire image. These bounding boxes are weighted by the predicted probabilities. It can produce a large amount of data that contains all of the information included inside the image. A single CNN forecasts multiple bounding boxes and class probabilities based on the YOLO algorithm, and YOLO increases its detection performance through training on the captured set of photographs. The backbone CNN used in YOLO may be tuned further to improve processing performance. Darknet is a C- and CUDA-based open-source neural network framework. It is fast to set up and supports both CPU and GPU computing. On ImageNet, YOLO achieves a top-1 accuracy of 72.9 percent and a top-5 accuracy of 91.2 percent using Darknet. Darknet primarily employs 3x3 filters for feature extraction and 1x1 filters for output channel reduction, and it also uses global average pooling to make predictions [18].

3.1 Object Detection


The model was created using the Python programming language and developed entirely in Google Colab, which provided the packages and frameworks needed to execute the previously mentioned algorithm. The model uses OpenCV functionality to run the YOLOv4 algorithm with its 53-convolution-layer backbone. blobFromImage was used to change the image's scale and dimensions before passing it into the network, and cv2.dnn.readNetFromDarknet was used to read the weights and the configuration file and to load the CNN layers of the algorithm. The multiple boxes generated in each grid cell for each object are filtered by taking a confidence threshold into account, and finally non-maximum suppression is used to find the best box for each object. These values are used to mark the objects (car, truck, motorbike, person, bus), and cv2.rectangle draws rectangular boxes (x, y, w, h) around them; different colors are used per class when rendering the object names of the vehicles. The average time required for a vehicle to cross the lane is then calculated based on the vehicle type, as sketched below.
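The following sketch illustrates this detection pipeline with OpenCV's DNN module; the file names, the confidence and NMS thresholds, and the per-class crossing times are placeholders chosen for the example, not values taken from the paper.

```python
import cv2
import numpy as np

# Hypothetical paths to the YOLOv4 configuration, weights, and class-name files
net = cv2.dnn.readNetFromDarknet("yolov4.cfg", "yolov4.weights")
layer_names = net.getUnconnectedOutLayersNames()
classes = open("coco.names").read().splitlines()

def detect(image, conf_thr=0.5, nms_thr=0.4):
    """Run YOLOv4 on one frame, draw boxes, and return the detected class labels."""
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes, confidences, class_ids = [], [], []
    for output in net.forward(layer_names):
        for det in output:
            scores = det[5:]
            cid = int(np.argmax(scores))
            conf = float(scores[cid])
            if conf > conf_thr:
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                confidences.append(conf)
                class_ids.append(cid)
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_thr, nms_thr)  # non-max suppression
    labels = []
    for i in np.array(keep).flatten():
        x, y, bw, bh = boxes[i]
        label = classes[class_ids[i]]
        cv2.rectangle(image, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
        cv2.putText(image, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
        labels.append(label)
    return labels

detected = detect(cv2.imread("junction.jpg"))   # e.g. ['car', 'car', 'bus', ...]
```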

3.2 Simulation Environment

Pygame is a set of cross-platform Python modules for making video games and simulations. It consists of computer graphics and sound libraries designed to work with the Python programming language, building on the SDL library, which enables users to write full-featured games and multimedia applications in Python. Pygame is very compact and runs on almost every platform and operating system. We use Pygame's functionality to create vehicles at random and monitor their motion by updating their coordinates regularly, and we use threading, a technique for performing several tasks at once, to keep the traffic signals up to date. Time functions are used to keep track of the time in seconds. Vehicle images are loaded with the image-loading functions and drawn on the screen with blit.
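A minimal sketch of such a simulation loop is shown below; the window size, vehicle speeds, lane layout, and signal period are illustrative assumptions and are not the simulation parameters used by the authors.

```python
import random
import threading
import time
import pygame

pygame.init()
screen = pygame.display.set_mode((800, 600))
clock = pygame.time.Clock()

car_img = pygame.Surface((40, 20))      # stand-in for an image loaded with pygame.image.load
car_img.fill((200, 200, 0))

green_lane = 0                          # lane currently holding the green signal
vehicles = [{"x": random.randint(-200, 0), "y": 100 + 60 * i, "lane": i,
             "speed": random.randint(2, 4)} for i in range(4)]

def switch_signals(period=15):
    """Background thread: rotate the green signal every `period` seconds."""
    global green_lane
    while True:
        time.sleep(period)
        green_lane = (green_lane + 1) % 4

threading.Thread(target=switch_signals, daemon=True).start()

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
    screen.fill((30, 30, 30))
    for v in vehicles:
        if v["lane"] == green_lane:     # only vehicles on the green lane advance
            v["x"] += v["speed"]
        screen.blit(car_img, (v["x"], v["y"]))
    pygame.display.flip()
    clock.tick(60)
pygame.quit()
```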

3.3 Approach Adopted in YOLO

The advantage of using YOLOv4 over other versions and algorithms is that it addresses the drawbacks of long image processing times and heavy GPU utilization. Compared to other versions of YOLO, its detection accuracy is high; compared to YOLOv3, AP (average precision) and FPS (frames per second) increased by 10% and 12%, respectively. The intelligent traffic detection model built in this study is a traffic management extension of the YOLO algorithm, and the objects it detects are used to allocate the current signal timer. Detecting objects in multiple images using R-CNN algorithms takes a long time and has a high computational complexity, which is why YOLO was later developed, after much research in image processing, to detect multiple objects effectively in a short period. The algorithm's accuracy has been demonstrated using a simulation in which different vehicle density levels at crossings are randomly dispersed in all directions at random intervals. Compared to the default vehicle passage using the hardcoded approach, the signal switching algorithm increases the number of vehicles that can pass through in a given amount of time, aiding the efficient and continuous movement of vehicles. Each time the algorithm captures images and identifies the vehicles in them, it takes about 15 seconds, so we make efficient use of this time to minimize traffic congestion. Figure 1 shows the sequence of steps of the proposed model: it starts from CCTV footage with refined images and ends at a direction-wise traffic signal aided by a timer.

Fig. 1. Proposed System Model

4. Results and Discussion

This section elicits a brief description of results obtained and studied for the purpose of providing an



effective traffic light management system powered by the YOLO algorithm. Appropriate photographs, respective graphs, and tables are shown in this section for better visibility of the work carried out. Equation (4.1) gives the Green Signal Time (GST) based on vehicle density:

$$GST = \frac{\sum_{\text{class}} \text{NoOfVehiclesOfClass} \times \text{AverageTimeOfClass}}{\text{NoOfLanes}} \qquad (4.1)$$


where GST is the green signal time, 'NoOfVehiclesOfClass' is the number of vehicles of each type detected by the vehicle detection module, 'AverageTimeOfClass' is the average time it takes for vehicles of that class to cross the intersection, and 'NoOfLanes' is the number of lanes at the intersection. The average crossing time for each vehicle type can be adjusted to the surrounding environment; depending on the intersection's features, this can be done region-wise, by city, or even by neighborhood, and data from the respective transportation authorities can be used for this purpose. The signals are switched cyclically rather than in order from densest to least dense; this is consistent with the existing system, in which drivers can plan their routes because the signals turn green in a fixed pattern. The order of the signals therefore remains unchanged, and the yellow signals are also taken into account. A sketch of this green-time computation and cyclic switching is given below. Figures 2, 3, and 4 show the object detection analysis for traffic at a crossroad in the daytime using the YOLOv4 algorithm: Figure 2 demonstrates the identification of objects with respect to the type of vehicle, Figure 3 depicts the traffic density at traffic signals, and Figure 4 displays the identification of moving objects at traffic signals.
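A minimal sketch of equation (4.1) and of the cyclic switching step is given below; the per-class average crossing times, the lane count, and the bounds on the green time are assumptions for illustration and would in practice come from the local transportation authority.

```python
# Assumed average crossing time per vehicle class, in seconds (illustrative only)
AVG_TIME = {"car": 2.0, "bus": 3.5, "truck": 3.5, "motorbike": 1.5, "person": 4.0}
NO_OF_LANES = 2
MIN_GREEN, MAX_GREEN = 10, 60            # assumed bounds on the green phase

def green_signal_time(counts):
    """Equation (4.1): green time from the per-class vehicle counts."""
    total = sum(counts.get(cls, 0) * t for cls, t in AVG_TIME.items())
    gst = total / NO_OF_LANES
    return int(min(max(gst, MIN_GREEN), MAX_GREEN))

def next_phase(current_direction, counts_per_direction, n_directions=4):
    """Cyclic switching: the next direction in the fixed order receives the green,
    with its duration computed from the vehicle counts detected for that approach."""
    nxt = (current_direction + 1) % n_directions
    return nxt, green_signal_time(counts_per_direction[nxt])

# Example: counts returned by the detector for each of the four approaches
counts = [{"car": 8, "bus": 1}, {"car": 3, "motorbike": 5}, {"car": 12}, {"person": 2}]
direction, green = next_phase(0, counts)
```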

4.1 Analysis and Discussion

This section elicits an analysis of results obtained and studied for the purpose of providing an effective traffic light management system powered by the YOLO algorithm.

Fig. 2. Image captured from city traffic



Fig. 5. Graphical representation of Bangalore traffic

Fig. 3. Image captured from the near bus station

Fig. 6. Graphical representation of Bangalore traffic near Bus station

Fig. 4. Image captured from signal traffic


Figure 5 depicts the analysis of vehicles identified at traffic signals in Bangalore (a metropolitan city in India); a good number of cars are identified using this algorithm. Figure 6 depicts the study of cars and pedestrians observed at a Bangalore bus terminal, indicating that both types of objects are identified. Figure 7 presents a vehicle detection analysis for Mumbai (a metropolitan city in India) traffic; as a result of the huge traffic in the Mumbai region, the number of vehicles present at the crossroads has grown strongly. According to the experiments carried out in this research, Figures 5 and 6 show that Bangalore traffic is a mix of vehicles and people, whereas Mumbai traffic is dominated by vehicles rather than pedestrians. As a result, the research calls for the use of intelligent traffic systems that need less traffic signal time. Table 1 illustrates the green signal time allocated in the Bangalore and Mumbai experiments, respectively. Figure 8 demonstrates a considerable improvement in the number of vehicles passing in less time, demonstrating the validity of the suggested technique as an intelligent traffic system. Tables 2, 3, and 4 show the number of vehicles that crossed each lane with the existing timer, the number with the proposed system, and the total green signal time allocation (in seconds) for different elapsed times at the respective lanes. From Table 3 it can be inferred that as the number of vehicles increases, the intelligent timer is optimized according to the number of vehicles passing the signals. Table 4 shows that the green time allocated at 400 s of elapsed time differs from that at 160 s because of the optimization of the time intervals. This research aims to reduce traffic congestion and pollution and to increase vehicle passage per day, especially at busy intersections. The YOLOv4 algorithm produces better accuracy in the detection of the multiple vehicles depicted in an image, with a detection accuracy of 80-85 percent depending on the vehicle density in the image. We also obtained a substantial improvement in the passage of vehicles per unit of time in simulation compared to the default hard-coded scheme, which increased the total flow of vehicles in all lanes. The built model is limited to four-way intersections and can also be used in high-vehicle-density areas. The project is primarily concerned with traffic, especially in metropolitan cities with large populations.



Fig. 8. Existing system vs. intelligent traffic control system

Fig. 7. Graphical representation of Mumbai traffic

Tab. 1. Green signal time allocation

Figure Number | Location | Allocated Time in seconds
1 | Bangalore | 43 seconds
3 | Mumbai | 25 seconds
2 | Bangalore | 24 seconds

Tab. 2. Number of vehicles crossed at respective lanes of the existing timer

Time elapsed | Lane1 | Lane2 | Lane3 | Lane4 | Unit time
160 | 19 | 34 | 16 | 22 | 0.568
250 | 96 | 69 | 13 | 20 | 0.792
300 | 88 | 87 | 28 | 30 | 0.766
400 | 121 | 108 | 41 | 40 | 0.787

Tab. 3. Number of vehicles crossed at respective lanes of the intelligent timer

Time elapsed | Lane1 | Lane2 | Lane3 | Lane4 | Unit time
160 | 40 | 55 | 15 | 16 | 0.787
250 | 77 | 86 | 34 | 24 | 0.884
300 | 106 | 91 | 23 | 37 | 0.854
400 | 135 | 138 | 27 | 40 | 0.925

Tab. 4. Total green signal time allocation (seconds) for different phases of time at respective lanes

Time elapsed | Lane1 | Lane2 | Lane3 | Lane4
160 | 46 | 33 | 22 | 20
250 | 72 | 75 | 36 | 20
300 | 117 | 71 | 33 | 33
400 | 124 | 112 | 36 | 43

Conclusion

The proposed YOLOv4 algorithm recognizes numerous vehicles in a picture with an accuracy of 85 percent depending on vehicle density, and we obtained a considerable increase in the number of vehicles passing per unit of time in simulation, compared to the default hard-coded technique. There was greater traffic flow on all lanes. The developed model may be used in high-vehicle-density areas and at four-way intersections. The research focuses on urban transportation. The proposed method also alters the green signal time automatically based on traffic intensity at the light, guaranteeing that the direction with more traffic




obtains a longer green period than the one with less traffic. This would prevent annoying delays while also reducing traffic and waiting times, resulting in decreased fuel usage and emissions. The simulation findings show that the system would be a major improvement over the existing method in terms of the number of vehicles crossing the junction. The system may be improved further with additional calibration and the use of real-world CCTV data to train the model.

VOLUME 16,

[6] [7]

AUTHORS

Boppuru Rudra Prathap* – Department of Computer Science and Engineering, CHRIST University, Bangalore-560074, Karnataka, India, e- mail: boppuru. prathap@christuniversity.in.

2022

Engineering Conference (IEMECON). doi:10.1109/iemecon.2017.8079571

D.Y. Huang, Chao-Ho chen ,Wu-chin hu “Reliable moving vehicle detection based on the filtering of swinging tree leaves and raindrops.” doi:10.1109/ student.2019.6089337 Kanungo A, Sharma A, Singla C. (2014). “Smart traffic lights switching and traffic density calculation using video processing.” 2014 Recent Advances in Engineering and Computational Sciences (RAECS). doi:10.1109/raecs.2014.6799542 Khushi. (2017). “Smart Control of Traffic Light System Using Image Processing.” 2017 International Conference on Current Trends in Computer, Electrical, Electronics and Communication (CTCEEC). doi:10.1109/ctceec.2017.8454966

Kukatlapalli Pradeep Kumar – Department of Computer Science and Engineering, CHRIST University, Bangalore-560074, Karnataka, India.


Javid Hussain – Department of Computer Science and Engineering, CHRIST University, Bangalore-560074, Karnataka, India.


Cherukuri Ravindranath Chowdary – Department of Computer Science and Engineering, CHRIST University, Bangalore-560074, Karnataka, India.

*Corresponding author




A Distributed Big Data Analytics Model for Traffic Accidents Classification and Recognition based on SparkMlLib Cores

Submitted: 21st June 2022; accepted: 2nd August 2022

Imad El Mallahi, Jamal Riffi, Hamid Tairi, Abderrahamane Ez-Zahout, Mohamed Adnane Mahraz

DOI: 10.14313/JAMRIS/4-2022/34

Abstract: This paper focuses on big data analytics for traffic accident prediction based on SparkMllib cores; Spark's Machine Learning Pipelines provide a helpful and suitable API for creating and tuning classification and prediction models that support decision-making concerning traffic accidents. Data scientists have recently focused on classification and prediction techniques for traffic accidents, and data analytics techniques for feature extraction have also continued to evolve. Analysis of a huge volume of received data requires considerable processing time, and the implementation of such processes in real-time systems requires high computation speed. Processing speed plays an important role in traffic accident recognition in real-time systems: it requires modern technologies and fast algorithms that accelerate the extraction of feature parameters from traffic accidents. Problems with overclocking during the digital processing of traffic accidents have yet to be completely resolved. Our proposed model is based on advanced processing by the Spark MlLib core. We call on the real-time data streaming API in Spark to continuously gather real-time data from multiple external data sources in the form of data streams. Secondly, the data streams are treated as unbounded tables. After this, we call the random forest algorithm continuously to extract the feature parameters from a traffic accident. The proposed method makes it possible to increase the speed factor on processors. Experimental results showed that the proposed method successfully extracts the accident features and achieves seamless classification performance compared to other conventional traffic accident recognition algorithms. Finally, we share all detected accidents, with details, through online applications with other users.

Keywords: Big data, machine learning, traffic accident, severity prediction, convolutional neural network

1. Introduction

Creating communication approaches between vehicle, radar, and computer technology is one of the most critical tasks in modern artificial intelligence. One of the easiest ways for a user to enter information is through traffic accident data. Therefore, data analysis processing technology and its processing tools have become a necessary part of the information society. In addition, traffic accident recognition is an essential research aspect of data processing and a vital vehicle–radar–computer interaction technique. Traffic accident data contain semantic and personal characteristics as well as environmental information. Recently, the issue of severity prediction for traffic accidents has become a major concern [1-5]. The study presented here aims to provide a prediction tool for the severity problem, which yields important information for emergency logistics [6-7]. The biggest challenge is the lack of real-time data from road safety departments for traffic accidents. The proposed work performed statistical significance testing on the impact of applying a multi-class neural network and a multi-class random forest to a traffic accident data set [8-12]. Some machine learning algorithms can help in complicated decision-support system solutions [13-22]; some authors also discuss the issue of traffic light control as a challenging problem in modern societies [23-27, 36]. This paper presents an efficient solution: using data for severity prediction by addressing the severity prediction issue for traffic accidents. In this paper, two machine learning techniques were proposed for the detection of severity prediction for traffic accidents. The multi-class neural network proved to have better accuracy, with 93.64% accurate severity prediction; this is more than the multi-class random forest, which achieved 87.71% accuracy. The random forest algorithm combines the output of multiple (randomly created) decision trees to generate the final output. In conclusion, applying machine learning algorithms to severity prediction data can help severity prediction providers and individuals pay attention to traffic accident risks and changes in traffic accident status, improving the quality of life. The proposed system was applied to a traffic accident data set, and the experimental results proved that using the multi-class neural network method can increase diagnostic accuracy. We use Apache Spark, which supports DL, and other big data analytic platforms that support machine learning, such as Hadoop, AzureML, and BigML (Fig. 1).

2022 ® El Mallahi et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)



DL is a branch of machine learning that can solve classification, prediction, and clustering problems in Internet of Vehicles environments.

Big data storage: In order to store the incoming data in real time, we use a dedicated cluster such as HDFS (Hadoop Distributed File System) or any other NoSQL database. We apply the data locality principle to our treatments by calling, from the external requesting system, any LamBig Data Analytic function to realize the next step (see Fig. 1).

Processing and advanced processing: We can directly apply MapReduce by translating all classification or recognition processing into mapper tasks and reducer tasks, and we call directly on the pre-implemented advanced functions and APIs from the Apache Spark core SparkMlLib.

Visualization: The eye is drawn more readily to visuals than to written content. There are several ways to carry out this step (see Fig. 2): we can plot the trajectories of vehicles, or we can plot histograms.
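A minimal PySpark sketch of the streaming pipeline described above is given below: incoming accident records are read as an unbounded table, assembled into feature vectors, and scored continuously with a pre-trained random forest model. The HDFS paths, column names, and schema are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch: Spark Structured Streaming + MLlib random forest scoring.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DoubleType
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassificationModel

spark = SparkSession.builder.appName("AccidentSeverityStream").getOrCreate()

# Hypothetical schema of the incoming records (e.g., CSV files landing in HDFS).
schema = StructType([
    StructField("speed", DoubleType()),
    StructField("num_vehicles", DoubleType()),
    StructField("weather_code", DoubleType()),
])

# Treat the stream of files as an unbounded table.
stream = (spark.readStream
          .schema(schema)
          .csv("hdfs:///traffic/incoming/"))        # assumed HDFS landing folder

features = VectorAssembler(
    inputCols=["speed", "num_vehicles", "weather_code"],
    outputCol="features").transform(stream)

# Random forest model trained offline and saved beforehand (see the training sketch below).
model = RandomForestClassificationModel.load("hdfs:///models/accident_rf")

query = (model.transform(features)
         .select("features", "prediction")
         .writeStream
         .outputMode("append")
         .format("console")
         .start())
query.awaitTermination()
```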

2. Related Work

During the last ten decades, the issue of road traffic safety has been of interest for economic and social development around the world. Various intelligent-system solutions have been employed for traffic accident classification. Earlier methods used manually defined features, mostly based on combinations of images of accidents on the road and statistical information [28]. This year (2022), speed cameras were installed on roads in Morocco to monitor traffic and transmit images and videos in real time in the event of traffic violations or accidents, which can help in decision-making about road accidents. Manually defined feature descriptors are fed to traditional machine learning models such as the SVM [29]. A comprehensive survey of computer vision approaches for traffic accident recognition can be found in Sharma et al. [30]. With the advances in hardware for speed cameras, especially the incorporation of GPUs, deep neural networks (DNNs) have set new standards on many research frontiers. The main advantage of DNNs is that they do not require manual feature selection: the features are learned within the DNN framework. However, DNNs require a large amount of training data, which is not always available. For small or moderately sized datasets, transfer learning can help to overcome dataset size limitations [31]. Delen and Sharda [32] identified the significant predictors of injury severity in traffic accidents using a series of artificial neural networks. Alikhani et al. and Lee [33-34] used a clustering-classification heuristic method to improve accuracy in classifying the severity of road accidents.

3. Background

The processing of the large-scale data generated in the Internet of Vehicles environment from various sources, such as cameras and sensors, is required. DL can be used for processing Internet of Vehicles big data, and Big Data Analytic platforms that support DL are required for the analysis of Internet of Vehicles data. In this section, we present Apache Spark, which supports ML, and other Big Data Analytic platforms that support machine learning, such as Hadoop, AzureML, and BigML. DL is a branch of machine learning that can solve classification, prediction, and clustering problems in Internet of Vehicles environments.

3.1. Apache Spark

Fig. 1. Big data analytic platforms for machine learning

Spark is a big data processing framework based on streaming, machine learning, and graph processing [36]. It is an open-source framework and was developed to overcome some of the limitations of Hadoop MapReduce. Spark uses memory-based processing of large amounts of data and is faster in terms of data processing than the MapReduce framework; the data are stored in memory using resilient distributed datasets. Moreover, Spark supports real-time analysis. Chiroma et al. [36] presented Spark's open-source distributed machine learning library, MLlib. Several learning settings exist in MLlib to improve functionality efficiently, such as optimization, linear algebra primitives, and underlying statistical methods. Moreover, MLlib provides a high-level API in several languages that leverages Spark's rich ecosystem to simplify the development of end-to-end machine learning pipelines. Chiroma et al. [36] also discussed DL over Apache Spark for mobile big data analytics: the authors showed how Spark can perform distributed DL on MapReduce, where each partition of the deep model is learned by a Spark worker over the entire mobile big data, and the master deep model is then obtained by averaging the parameters of all partial models.

Fig. 2. Big Data processing

3.2. Hadoop

Hadoop has emerged as an important framework for “distributed processing of large datasets across clusters of machines” [36]. Many Hadoop-related projects have been developed over the years to support the framework, such as Hive, Pig, Tez, Zookeeper, and Mahout. Mahout is one of the distributed linear algebra frameworks for scalable machine learning.

3.3. AzureML

AzureML is a collaborative machine learning platform for predictive analytics on big data, which allows easy development of predictive models and APIs. Numerous distinctive features, such as easy operationalization, versioning, collaboration, and integration of user code, are provided by AzureML. Chiroma et al. [36] describe a technique for cloud-based AzureML named Generalized Flow, which accepts binary and multiclass datasets and processes them to maximize the overall classification accuracy. The performance of the technique was tested on datasets using the optimized classification model; the authors used three public datasets and a local dataset to evaluate the proposed flow, and the public datasets yielded an accuracy of 97.5%. The concept has become indispensable in big data technologies; for example, AzureML supports neural networks for regression, two-class classification, and multiclass classification.

3.4. BigML


BigML provides highly scalable ML and predictive analysis services in the cloud. The goal of BigML is to assist in developing a set of services that are easy to use and seamless to integrate. BigML has been used in many studies for predictive analytics and DL because of its robustness and simplicity in providing a user-friendly interface. For example, a study on the distinguishing features of human footprint images offers a deep analysis using BigML: the idea is to exploit the concept of the human footprint for personal identification using many fuzzy rules for predictive analysis. The verification of 440 footprint images was conducted for data quality, and GPUs were applied to speed up the performance. Moreover, Chiroma et al. [36] presented a predictive analysis of the most popular places for dengue in Malaysia, built on the BigML platform and based on a decision tree model, to provide early warning and awareness to people. Chiroma et al. [36] also analyzed game features and acquisition, retention, and monetization strategies as primary drivers of mobile game application success.

4. Proposed Method

4.1 Dataset Employed

In this paper, we propose a solution based on big data and machine learning models for the development of an intelligent system for traffic accident prediction using different data sources. Figure 3 represents the networks of wireless access technology involving vehicles and the Internet, as well as the heterogeneous network commonly referred to as the Internet of Vehicles. The figure shows the representation of the Internet of Vehicles in a large-scale distributed environment in terms of wireless communication of various devices. The model of the Internet of Vehicles is integrated into the cloud, equipped with a high-performance computing server with multiple GPUs, large-scale ML models, and Apache Spark. In the first pillar, processing is done by capturing in real time all incoming datasets, which are stored in the Hadoop Distributed File System cluster; we then call on the SparkMlLib core to use the pre-implemented LamBig Data Analytic functions. In the second pillar, an analytical study is performed to group the important features. In the third pillar, we perform the feature selection itself and generate new data sets. Fourth, machine learning algorithms are used to estimate accident rates. Finally, the crash rates are sent to the vehicles. This paper focuses on the second stage of the scheme. The data sources for this system are police records, traffic conditions, automatic radar, vehicle data, fixed or moving cameras, driver data, weather, and other external factors; each source can be integrated into the proposed system. The aim of this paper is threefold. First, we introduce the TRAFFIC ACCIDENTS_2019_LEEDS dataset. Second, we analyze the quality of this dataset for the traffic accident classification task. Third, we extend the study, using ANN, SVM, and random forest models for the traffic accident classification task by exploring a larger number of machine learning models.
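The sketch below illustrates the offline training stage assumed by the streaming example earlier: a Spark MLlib random forest is fitted on labelled accident records and saved for later scoring. File paths, column names, and hyperparameters are illustrative assumptions, not the authors' exact settings.

```python
# Hedged sketch: training and persisting the random forest used by the streaming job.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler, StringIndexer
from pyspark.ml.classification import RandomForestClassifier

spark = SparkSession.builder.appName("AccidentSeverityTrain").getOrCreate()

# Labelled historical data, e.g., the Leeds 2019 accident records exported as CSV.
df = spark.read.csv("hdfs:///traffic/leeds_2019.csv", header=True, inferSchema=True)

# Index the three casualty classes (pedestrian, driver/rider, vehicle/pillion passenger).
df = StringIndexer(inputCol="casualty_class", outputCol="label").fit(df).transform(df)
df = VectorAssembler(inputCols=["speed", "num_vehicles", "weather_code"],
                     outputCol="features").transform(df)

rf = RandomForestClassifier(labelCol="label", featuresCol="features", numTrees=100)
model = rf.fit(df)
model.write().overwrite().save("hdfs:///models/accident_rf")   # loaded by the streaming job
```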



Fig. 3. Model of the Internet of Vehicles integrated into the cloud, equipped with a high-performance computing server with multiple GPUs, large-scale ML models, and Apache Spark

Tab. 1. Complete dataset details

Type                            Number of records
Pedestrian                      1152
Driver or rider                 405
Vehicle or pillion passenger    350
Total                           1907

Fig. 4. Distribution of casualty class for the severity prediction for traffic accidents

In this study, we use the TRAFFIC ACCIDENTS_2019_LEEDS data from the Office of Road Safety of the Department of Transport. The classification labels represent each of the data sets. In this database, 1152 accidents were classified as pedestrian, 405 as driver or rider, and the remaining 350 as vehicle or pillion passenger (see Fig. 4).

Table 1 presents the dataset details and the number of records for the pedestrian, vehicle or pillion passenger, and driver or rider classes.

4.2 Balancing the Database

As shown in Fig. 4, the database is unbalanced, because the number of samples in each class is quite different (1 is pedestrian, 2 is vehicle or pillion passenger, 3 is driver or rider). To balance the database, there are two possibilities: up-sampling, i.e., resampling the minority classes so that their counts equal that of the class with the highest count; or down-sampling, i.e., picking n samples from each class, where n is the number of samples in the class with the lowest count.




Fig. 5. An overview of random forest

Tab. 2. Dataset details after augmentation

Type                            Number of records
Pedestrian                      1152
Vehicle or pillion passenger    1152
Driver or rider                 1152
Total                           3456

In this study, we chose to up-sample the database. We obtained 1152 records for each class, for a total of 3456 records after augmentation (see Table 2). We then divided the database into two parts, a training set and a test set, using 80% of the database for training and 20% for testing: i.e., 2764 records covering the pedestrian, vehicle or pillion passenger, and driver or rider classes for the training set, and 692 records for the test set (Table 2). This processing was done by capturing in real time all incoming datasets stored in the Hadoop Distributed File System cluster; we then called the SparkMlLib core to use the pre-implemented LamBig Data Analytic functions. We used an ANN consisting of an input layer, hidden layers, and an output layer. Fig. 5 shows the random forest algorithm, which combines the output of multiple (randomly created) decision trees to generate the final output.
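A short scikit-learn sketch of the class balancing and 80/20 split described above is shown below; the DataFrame and column names are assumptions, and the random seeds are arbitrary.

```python
# Hedged sketch: up-sample every class to the size of the largest class, then split 80/20.
import pandas as pd
from sklearn.utils import resample
from sklearn.model_selection import train_test_split

def upsample_classes(df: pd.DataFrame, label_col: str = "casualty_class") -> pd.DataFrame:
    """Up-sample every class to the size of the largest class (here 1152 records)."""
    target = df[label_col].value_counts().max()
    parts = [resample(group, replace=True, n_samples=target, random_state=42)
             for _, group in df.groupby(label_col)]
    return pd.concat(parts).sample(frac=1.0, random_state=42)   # shuffle

# balanced = upsample_classes(accidents_df)
# train, test = train_test_split(balanced, test_size=0.20,
#                                stratify=balanced["casualty_class"], random_state=42)
```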

5. Experimental Results and Discussion

5.1 Evaluation Metrics


Accuracy is one criterion for evaluating classification models. Informally, accuracy refers to the percentage of our model's predictions that are correct. The formal definition of accuracy is:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$

where the determinants are true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

Precision is the percentage of successfully detected positives in relation to all predicted positives:

$$\text{Precision} = \frac{TP}{TP + FP} \quad (2)$$

where TP denotes true positives (the number of correct positive predictions) and FP denotes false positives (the number of misclassified positive predictions).

Recall is the proportion of positive samples that are correctly predicted as positive:

$$\text{Recall} = \frac{TP}{TP + FN} \quad (3)$$

where FN denotes false negatives (the number of incorrect negative predictions).

The F1 score combines precision and recall. For unbalanced data, the F1 score is a better performance statistic than the accuracy metric [37]:

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (4)$$



5.2 Experimental Setting


We compared the performance of our model to that of the ANN, SVM, and RF approaches. This processing is done by capturing in real time all incoming datasets, which are then stored in the Hadoop Distributed File System cluster; after that, we called upon the SparkMlLib core to use the pre-implemented LamBig Data Analytic functions. We preserved the class ratio between pedestrian, vehicle or pillion passenger, and driver or rider when the datasets were randomly split into training and test data. Each of the models we tested was trained on the training data, while performance was evaluated on the test data. To ensure that the models were consistent, we ran 10-fold cross-validation on each of them. To compare results with our system, we used the ANN, SVM, and RF classifiers on the TRAFFIC ACCIDENTS_2019_LEEDS dataset. The algorithms were created using the Python scikit-learn toolkit and the hyperparameter settings provided. Fig. 6 shows the confusion matrix for pedestrian, vehicle or pillion passenger, or driver or rider classification using the ANN model. The performance of the ANN model on the test dataset was evaluated after the completion of the training phase and compared using several performance measures: precision (PPV), sensitivity or recall, specificity, area under the curve (AUC), and F1 score. Fig. 7 presents the confusion matrices for pedestrian, vehicle or pillion passenger, or driver or rider classification using the SVM and random forest models. Fig. 8 shows the training and validation accuracy curves.
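A minimal sketch of the comparison protocol just described (10-fold cross-validation of ANN, SVM, and random forest classifiers with scikit-learn) is given below; X, y, and the hyperparameters are assumptions, not the authors' exact settings.

```python
# Hedged sketch: 10-fold cross-validation of the three classifiers compared above.
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

models = {
    "ANN": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=42),
    "SVM": SVC(kernel="rbf", C=1.0, random_state=42),
    "RF":  RandomForestClassifier(n_estimators=100, random_state=42),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
# X: feature matrix, y: casualty-class labels of the balanced training set (assumed).
# for name, clf in models.items():
#     scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
#     print(name, scores.mean())
```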

6. Discussion

To predict the severity of traffic accidents, we proposed a solution based on big data and machine learning models. This processing was done by capturing in real time all incoming datasets stored in the Hadoop Distributed File System cluster; we then called on the SparkMlLib core to use the pre-implemented LamBig Data Analytic functions. We used the ANN, SVM, and RF classifiers. The results of the models in terms of accuracy, precision, recall, and F1 score are calculated from the confusion matrices, and Table 3 reports these results on the dataset. For this dataset, the Random Forest classifier outperforms the other models in terms of precision, accuracy, and F1 score. Although the ANN classifier has the best recall, it performs poorly on the other performance criteria for this dataset.

$$CM = \begin{pmatrix} TP & FP \\ FN & TN \end{pmatrix} \quad (5)$$

Fig. 6. Confusion matrix for Pedestrian, Vehicle or pillion passenger, or Driver or rider classification using ANN model.

Fig. 7. Confusion matrices for Pedestrian, Vehicle or pillion passenger, or Driver or rider classification: (a) SVM model and (b) Random Forest model.



Fig. 8. (a) Training and validation accuracy curve, (b) training and validation loss curve.

Tab. 3. Values obtained for the different metrics

Model           Accuracy   Precision   Recall    F1 score
KNN             0.62       0.27        0.38      0.28
Random Forest   0.9364     0.9364      0.9382    0.9355
SVM             0.8251     0.8251      0.8223    0.8232
ANN             0.8772     0.8772      0.8789    0.8770

As shown in Table 3, the best-performing model for detecting the fatal state is the Random Forest model, which gave the best accuracy values (93.64% for training accuracy and 93.82% for test accuracy).

Fig. 9. Classifier ROC curve for Random Forest Classifier.


Compared to RF, the ANN and SVM classifiers perform admirably, although they are less accurate. However, compared to the other approaches, SVM fails to achieve a satisfactory F1 score and recall score, even though its precision score is comparable to those of the RF and ANN classifiers. Finally, Fig. 9 represents the receiver operating characteristic (ROC) curves for each class in the random forest model, showing the true positive rate versus the false positive rate as the classification threshold is varied between 0 and 1. The ROC curve for each model is an average of 10 curves from the ten-fold cross-validation, determined by the trapezoid rule.

7. Conclusion

In this work, we propose a solution based on big data and machine learning models for the prediction of traffic accidents. This processing is done by capturing in real time all incoming datasets, which are stored in the Hadoop Distributed File System cluster; we then called upon the SparkMlLib core to use the pre-implemented LamBig Data Analytic functions. We focused on severity prediction for traffic accidents, which is a major step in road accident management and provides important information for emergency logistical transportation. To evaluate the severity of road accidents, we assessed the potential impact of the accident and outlined effective accident management procedures. In this study, we implemented several algorithms to classify the severity of traffic accidents and presented the confusion matrices for the pedestrian, vehicle or pillion passenger, and driver or rider classes using Random Forest, Support Vector Machine, and Artificial Neural Network classifiers. To validate this experimentation, the TRAFFIC ACCIDENTS_2019_LEEDS dataset was used to classify the severity prediction for traffic accidents into three classes: pedestrian, vehicle or pillion passenger, or driver or rider. In future work, it will be possible to use more features and to find the best features for classification on real data from our city. We can also extract the selected features from the program file and implement the cost of predicting the gravity of traffic accidents. The most important benefit of using the big data paradigm is that it improves the processing of data and establishes a good basis for road safety through the classification and recognition of traffic accidents. We called directly, in real-time mode, on the pre-implemented machine learning functions for classifying and predicting traffic accidents in real time. The new aim and challenge of this work is that we processed very large data streams in real-time mode, which makes possible the effective use of very advanced libraries and a faster system. The obtained results have been tested for accident prevention in different types of areas and roads.

AUTHORS

Imad El Mallahi* – PhD student in big data analytics, traffic accidents, and artificial intelligence, LISAC Laboratory, Department of Computer Sciences, Faculty of Science, Sidi Mohamed Ben Abdellah University of Fez, Fez, Morocco, e-mail: imade.elmallahi@usmba.ac.ma.

Jamal Riffi, Hamid Tairi, Mohamed Adnane Mahraz – Sidi Mohammed Ben Abdellah University, Faculty of Sciences Dhar El Mahraz, LISAC Laboratory, Fez, Morocco.

Abderahamane Ez-Zahout – Mohamed V University, Faculty of Sciences, Intelligent Processing Systems & Security Team (IPSS), Computer Science Department, Rabat, Morocco.

*Corresponding author


References

[1] Road Traffic Injuries. Accessed: Jul. 18, 2018. [Online]. Available: http://www.who.int/newsroom/fact-sheets/detail/road-traffic-injuries
[2] F. Zong, H. Xu, and H. Zhang. "Prediction for traffic accident severity: Comparing the Bayesian network and regression models." Math. Problems Eng., no. 23, 2013, Art. no. 475194.
[3] Shiran, G.; Imaninasab, R.; Khayamim, R. "Crash Severity Analysis of Highways Based on Multinomial Logistic Regression Model, Decision Tree Techniques, and Artificial Neural Network: A Modeling Comparison." Sustainability, vol. 13, 2021, 5670.
[4] Abdel-Aty, M. "Analysis of driver injury severity levels at multiple locations using ordered probit models." J. Saf. Res., vol. 34, 2003, 597–603.
[5] Sze, N.N.; Wong, S.C. "Diagnostic analysis of the logistic model for pedestrian injury severity in traffic crashes." Accid. Anal. Prev., vol. 39, 2007, 1267–1278.
[6] Savolainen, P.T.; Mannering, F.; Lord, D.; Quddus, M.A. "The statistical analysis of highway crash-injury severities: A review and assessment of methodological alternatives." Accid. Anal. Prev., vol. 43, 2011, 1666–1676.
[7] Moghaddam, F.R.; Afandizadeh, S.; Ziyadi, M. "Prediction of accident severity using artificial neural networks." Int. J. Civ. Eng., vol. 9, 2011, 41–48.
[8] Taamneh, M.; Alkheder, S.; Taamneh, S. "Data-mining techniques for traffic accident modeling and prediction in the United Arab Emirates." J. Transp. Saf. Secur., vol. 9, 2017, 146–166.
[9] Zheng, M.; Li, T.; Zhu, R.; Chen, J.; Ma, Z.F.; Tang, M.J.; Cui, Z.Q.; Wang, Z. "Traffic Accident's Severity Prediction: A Deep-Learning Approach-Based CNN Network." IEEE Access, vol. 7, 2019, 39897–39910.
[10] Breiman, L. "Random forests." Mach. Learn., vol. 45, 2001, 5–32.
[11] Lu, Z.; Long, Z.; Xia, J.; An, C. "A Random Forest Model for Travel Mode Identification Based on Mobile Phone Signaling Data." Sustainability, vol. 11, 2019, 5950.
[12] Evans, J.; Waterson, B.; Hamilton, A. "Forecasting road traffic conditions using a context-based random forest algorithm." Transp. Plan. Technol., vol. 42, 2019, 554–572.
[13] Hamad, K.; Al-Ruzouq, R.; Zeiada, W.; Abu Dabous, S.; Khalil, M.A. "Predicting incident duration using random forests." Transp. A: Transp. Sci., vol. 16, 2020, 1269–1293.
[14] Macioszek, E. "Roundabout Entry Capacity Calculation—A Case Study Based on Roundabouts in Tokyo, Japan, and Tokyo Surroundings." Sustainability, vol. 12, 2020, 1533.
[15] Severino, A.; Pappalardo, G.; Curto, S.; Trubia, S.; Olayode, I.O. "Safety Evaluation of Flower Roundabout Considering Autonomous Vehicles Operation." Sustainability, vol. 13, 2021, 10120.
[16] Macioszek, E. "The Comparison of Models for Critical Headways Estimation at Roundabouts." In Proceedings of the 13th Scientific and Technical Conference on Transport Systems: Theory and Practice (TSTP), Katowice, Poland, 19–21 September 2016.
[17] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. "Large-scale video classification with convolutional neural networks." In Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, 1725–1732.
[18] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back. "Face recognition: A convolutional neural-network approach." IEEE Trans. Neural Netw., vol. 8, no. 1, 1997, 98–113.
[19] M. G. Karlaftis and E. I. Vlahogianni. "Statistical methods versus neural networks in transportation research: Differences, similarities and some insights." Transp. Res. C, vol. 19, no. 3, 2011, 387–399.
[20] S. Al-Ghamdi. "Using logistic regression to estimate the influence of accident factors on accident severity." Accident Anal. Prevention, vol. 34, no. 6, 2002, 729–741.
[21] M. Bédard, G. H. Guyatt, M. J. Stones, and J. P. Hirdes. "The independent contribution of driver, crash, and vehicle characteristics to driver fatalities." Accid. Anal. Prevention, vol. 34, no. 6, 2002, 717–727.
[22] K. M. Kockelman and Y. J. Kweon. "Driver injury severity: An application of ordered probit models." Accident Anal. Prevention, vol. 34, no. 3, 2002, 313–321.
[23] Mohamed AbdElAziz Khamis, Ahmed El-Mahdy, Kholoud Osama Shata, Walid Gomaa. "System and Method for During Crash Accident Detection and Notification." Patent Number: 2020/771, Filing Date: 9 June 2020, Filing Place: Egypt.
[24] Mohamed AbdElAziz Khamis, Ahmed El-Mahdy, Kholoud Osama Shata. "An In-Vehicle System and Method for During Accident Detection without being Fixed to Vehicle." Patent Number: 2020/769, Filing Date: 9 June 2020, Filing Place: Egypt.
[25] Mohamed A. Khamis, Walid Gomaa. "Adaptive multi-objective reinforcement learning with hybrid exploration for traffic signal control based on cooperative multi-agent framework." Engineering Applications of Artificial Intelligence, vol. 29, March 2014, 134–151. https://doi.org/10.1016/j.engappai.2014.01.007
[26] Mohamed AbdElAziz Khamis, Walid Gomaa. "Enhanced Multiagent Multi-Objective Reinforcement Learning for Urban Traffic Light Control." Proc. of the 11th IEEE International Conference on Machine Learning and Applications (ICMLA 2012), Boca Raton, Florida, USA, 12–15 Dec. 2012, pp. 586–591.
[27] Mohamed A. Khamis, Walid Gomaa, Hisham El-Shishiny. "Multi-objective traffic light control system based on Bayesian probability interpretation." Proc. of the 15th IEEE Intelligent Transportation Systems Conference (ITSC 2012), Anchorage, Alaska, USA, 16–19 Sept. 2012, pp. 995–1000.
[28] Zhang XG. "Introduction to statistical learning theory and support vector machines." Acta Automat. Sinica, vol. 26, 2000, 32–41.
[29] Yuan F and Cheu RL. "Incident detection using support vector machines." Transp. Res. Part C Emerg. Technol., vol. 11, 2003, 309–328.
[30] Sharma B, Katiyar VK and Kumar K. "Traffic accident prediction model using support vector machines with Gaussian kernel." In: Pant M, Deep K, Bansal JC, et al. (eds), Proceedings of the Fifth International Conference on Soft Computing for Problem Solving, vol. 437. Singapore: Springer, 2016, pp. 1–10.
[31] Flores MJ, Armingol JM and de la Escalera A. "Real-time warning system for driver drowsiness detection using visual information." J. Intell. Robot. Syst., vol. 59, 2010, 103–125.
[32] Delen D, Sharda R and Bessonov M. "Identifying significant predictors of injury severity in traffic accidents using a series of artificial neural networks." Acc. Anal. Prevent., vol. 38, 2006, 434–444.
[33] Alikhani M, Nedaie A and Ahmadvand A. "Presentation of clustering-classification heuristic method for improvement accuracy in classification of severity of road accidents in Iran." Safe Sci., vol. 60, 2013, 142–150.
[34] Lee S-L. "Predicting traffic accident severity using classification techniques." Adv. Sci. Lett., vol. 21, 2015, 3128–3131.
[35] Ez-zahout A. "A distributed big data analytics model for people re-identification based dimensionality reduction." International Journal of High Performance Systems Architecture, vol. 10, no. 2, 2021, 57–63.
[36] Haruna Chiroma, Shafi'i M. Abdulhamid, Ibrahim A. T. Hashem, Kayode S. Adewole, Absalom E. Ezugwu, Saidu Abubakar, and Liyana Shuib. "Deep Learning-Based Big Data Analytics for Internet of Vehicles: Taxonomy, Challenges, and Research Directions." Mathematical Problems in Engineering, vol. 2021, Article ID 9022558, 20 pages. https://doi.org/10.1155/2021/9022558
[37] Asmae Rhanizar, Zineb El Akkaoui. "A Predictive Framework of Speed Camera Locations for Road Safety." Computer and Information Science, vol. 12, no. 3, 2019. https://doi.org/10.5539/jmr.v12n3p92
[38] Salma Bouaich, Mahraz, M.A., Riffi, J. et al. "Vehicle Counting Based on Road Lines." Pattern Recognit. Image Anal., vol. 31, 2021, 739–748. https://doi.org/10.1134/S1054661821040076
[39] Bouti, A., Mahraz, M.A., Riffi, J. et al. "A robust system for road sign detection and classification using LeNet architecture based on convolutional neural network." Soft Comput., vol. 24, 2020, 6721. https://doi.org/10.1007/s00500-019-04307-6
[40] Bouaich, S., Mahraz, M.A., Riffi, J., Tairi, H. "Vehicle Detection using Road Lines." 3rd International Conference on Intelligent Computing in Data Sciences (ICDS), 2019, 8942305.
[41] Bouaich, S., Mahraz, M.A., Riffi, J., Tairi, H. "Vehicle counting system in real-time." 2018 International Conference on Intelligent Systems and Computer Vision (ISCV 2018), May 2018, pp. 1–4.




Design of A Vision-Based Autonomous Turret

Submitted: 10th January 2023; accepted: 10th February 2023

Rabah Louali, Djilali Negadi, Rabah Hamadouche, Abdelkrim Nemra

DOI: 10.14313/JAMRIS/4-2022/35

Abstract: This article describes the hardware and software design of a vision-based autonomous turret system. A two-degree-of-freedom (2 DOF) turret platform is designed to carry a cannon equipped with an embedded camera and actuated by stepper motors or direct current motors. The turret system includes a central calculator running a visual detection and tracking solution, and a microcontroller responsible for actuator control. The Tracking-Learning-Detection (TLD) algorithm is implemented for target detection and tracking. Furthermore, a Kalman filter algorithm is implemented to continue the tracking in case of occlusion. The performance of the designed turret, regarding response time, accuracy, and the execution time of its main tasks, is evaluated. In addition, an experimental scenario was performed for real-time autonomous detection and tracking of a moving target.

Keywords: Autonomous turret, Stepper motor, DC motor, Vision-based control, Tracking-Learning-Detection (TLD) algorithm, Kalman-based visual tracking

1. Introduction


Autonomous weapon systems (AWSs) have become decisive on the battlefield because they can effectively carry out various missions such as surveillance, intelligence, reconnaissance, and armed operations, without engaging human lives [2]. Turrets and sentries are widely used on battlefields. Making these systems autonomous will allow surveillance, detection, identification, tracking, and even destruction of potential targets without human intervention. Aware of their interest, the defense sector and the military industry have paid particular attention to autonomous turret systems, which has resulted in the development, commercialization, and extensive use of these systems. The best-known example of commercial autonomous turret systems is the Samsung SGR-A1 [2], an autonomous surveillance gun developed by Samsung Techwin to assist South Korean troops in the Korean demilitarized zone. This system has capabilities for surveillance, detection, tracking, and firing, as well as voice recognition [2].

In contrast, a review of the academic literature shows that the number of works carried out on the development of autonomous turret systems remains limited. P. Demski et al. [?] proposed a remote-controlled turret system with video transmission. M. Tsourma and M. Dasygenis [6] described a system that supports motion detection, tracking, and face recognition. S. Kuswadi et al. [5] designed and realized an automatic turret system that includes a camera and a PID controller to drive the pitch and yaw motion of the turret. R. R. Alcala et al. [1] designed and implemented a body-wearable device to control a sentry gun turret. This paper describes the full design of an autonomous turret system controlled using a visual servoing solution. This system can operate in manual or automatic mode. In manual mode, the operator controls the pan and tilt of the turret from a remote interface; the camera embedded on the turret allows surveillance of an area of interest in real time. In automatic mode, a Tracking-Learning-Detection (TLD) algorithm [3] is implemented for target detection and recovery. In this case, the turret detects and tracks autonomously the target selected by the operator. A Kalman filter [4] is implemented to predict and track the target position in real time using the TLD observation. The remainder of the paper is organized as follows. Sections 2 and 3 describe, respectively, the hardware and software design of the turret system. Section 4 is devoted to the evaluation of the designed turret performance, regarding dynamic and real-time metrics.

2. Hardware Design

The hardware architecture of the vision-based autonomous turret is shown in Fig. 1. The central calculator manages the manual as well as the vision-based automatic control. A manual control interface (joystick, keyboard, or mouse) can be connected to this computer via USB. This computer also executes automatic target detection and tracking using the video stream received from the camera embedded on the gun. It then computes and sends the high-level commands to the microcontroller via a USB interface. The microcontroller is responsible for the low-level control of the turret actuators by outputting the appropriate PWM signals to control the power stage. The latter generates the power signals that drive the pan and tilt actuators.

2022 ® Louali et al. This is an open access article licensed under the Creative Commons Attribution-Attribution 4.0 International (CC BY 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0)



Fig. 1. Hardware architecture of the vision-based autonomous turret

In the following subsections, we will explain each component of the turret system.

2.1. The turret platform

Fig. 2 shows the mechanical structure of the turret platform, which can perform pan and tilt rotations using two motors. The working volume of this turret is a half-sphere, since the pan deflection can vary from 0° to 360°, while the tilt can take any value from 0° to 180°. The motion transmission is performed using parallel gears, which allow speed reduction and torque multiplication. The pan and tilt reduction ratios are 1/10 and 1/5, respectively. The pan rotation is supported by a tapered roller bearing, which can withstand relatively large radial and axial loads.

2.2. Actuators and power stages

We built two versions of the turret platform: the first is driven by stepper motors, while the second is driven by DC motors. Characteristics of the actuators used are given in Table 1. On the one hand, the stepper motors do not require a position control loop; however, besides having low speed and low torque compared to DC motors, their torque also decreases rapidly as the speed increases. On the other hand, DC motors deliver higher speed and higher torque but require a position control loop. We used a POLOLU 70:1 DC motor that integrates an incremental encoder, which allows the implementation of a position control loop. To drive the stepper motors, we realized the power stage shown in Fig. 3, which is based on the L297 and L298 integrated circuits. To drive the DC motors, we designed a power stage using mainly the L293 integrated circuit, as shown in Fig. 4. To protect the microcontroller, galvanic isolation is introduced using the ULN2803A integrated circuit. Furthermore, an amplification stage based on two BDW93C transistors is used to supply the DC motors.

2.3. Microcontroller

To perform the low‑level control of the actuators, we used the MBED NXP LPC1768 microcontroller, which is a rapid prototyping module based on an ARM7 processor.

Fig. 2. The turret platform

Tab. 1. Actuators of the turret platform

                             Version 1                               Version 2
Actuators                    Stepper motors: SUPERIOR ELECTRONIC     DC motors: POLOLU 70:1
                             M061-LS02
Actuator characteristics     Current: 1 A; Voltage: 5 V;             Current: 5 A; Voltage: 12 V;
                             Torque: 0.53 N.m; Resolution: 1.8°      Torque: 1.37 N.m; Encoder resolution: 0.16°

Fig. 3. Power stage of stepper motors

2.4. Embedded camera

A video camera connected to the central computer via a USB link is mounted on the cannon (the gun). Aligning the center of the image with the center of the target allows aiming at the object of interest. The size of the image is 640 × 480 pixels, which means the center of the image is located at X0 = [U0 = 320, V0 = 240]^T pixels.




Fig. 4. Power stage of DC motors

Fig. 5. TLD algorithm

3. Software Design

In addition to manual control, the designed system implements a vision-based autonomous mode for target detection and tracking. For detection, we used the Tracking-Learning-Detection (TLD) algorithm [3]. In addition, we designed a Kalman-based vision algorithm for target tracking to ensure an optimal estimation of the target position, even when it becomes completely occluded [4]. Furthermore, we proposed control laws that drive the actuators of the turret for real-time tracking of the target.

3.1. Detection method based on TLD algorithm

The Tracking-Learning-Detection (TLD) method, proposed by Zdenek Kalal, performs tracking of unknown objects in a video stream [3]. The main steps of the TLD method are shown in Fig. 5. Initially, the object of interest is selected manually, and then the TLD method tracks the target by learning its appearance. The recursive tracker and the detector run in parallel and their results are merged; validation and learning steps are then performed to improve detection performance by identifying errors and updating the detector in each frame. The described detection method locates the target at a given point in time, without predicting the target motion. The time taken to transmit the target coordinates to the turret is significant, which is particularly worrying for fast dynamic targets. To ensure real-time tracking, a prediction step using an algorithm based on a Kalman filter was added to the design.
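For readers who want to reproduce the select-then-track behaviour quickly, OpenCV's contrib package ships a TLD tracker implementation. The sketch below is only an illustration of that workflow, not the authors' implementation; the video source and initial region of interest are assumptions, and opencv-contrib-python is assumed to be installed.

```python
# Hedged illustration of TLD-style tracking using OpenCV's contrib implementation.
import cv2

cap = cv2.VideoCapture("target_video.avi")     # assumed video source (or a camera index)
ok, frame = cap.read()

bbox = cv2.selectROI("Select target", frame, showCrosshair=True)  # manual initial selection
tracker = cv2.legacy.TrackerTLD_create()
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, bbox = tracker.update(frame)        # TLD: track, detect, learn on each frame
    if found:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("TLD tracking", frame)
    if cv2.waitKey(1) & 0xFF == 27:             # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```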

3.2. Tracking


To ensure target tracking, we implemented an algorithm based on the Kalman filter [4], as represented by the flow chart in Fig. 6. The tracking problem is modeled by (1), which includes the target dynamics and the observation equation, where X = [x, y]^T is the target position; Δ = [δx, δy]^T is the target velocity; Z = [U, V]^T is the observed position of the target obtained by the detection algorithm; A, B, and C are, respectively, the evolution, control, and observation matrices; Te is the sample period; and W and V are, respectively, the state and measurement noises. It should be noticed that the target is assumed to evolve at a constant speed (δx, δy = const).

Fig. 6. Tracking algorithm

$$\begin{cases} X(k) = A \cdot X(k-1) + B \cdot \Delta + W(k) \\ Z(k) = C \cdot X(k) + V(k) \end{cases} \quad (1)$$

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \quad C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \quad B = \begin{bmatrix} T_e & 0 \\ 0 & T_e \end{bmatrix}$$

The Kalman filter prediction step is performed according to (2), where $P_p$ and $P$ are the covariance matrices of the prediction and measurement-update errors, and $Q$ is the covariance matrix of the process noise.

$$\begin{cases} X_p(k) = A \cdot X(k-1) + B \cdot \Delta \\ P_p(k) = A \cdot P(k-1) \cdot A^T + B \cdot Q \cdot B^T \end{cases} \quad (2)$$

The update step is performed according to (3), where $R$ is the measurement noise covariance matrix and $K$ is the Kalman gain.

$$\begin{cases} K(k) = P_p(k) \cdot C^T \cdot \left( C \cdot P_p(k) \cdot C^T + R \right)^{-1} \\ P(k) = P_p(k) - K(k) \cdot C \cdot P_p(k) \\ X(k) = X_p(k) + K(k) \cdot \left( Z(k) - C \cdot X_p(k) \right) \end{cases} \quad (3)$$
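A minimal NumPy sketch of Eqs. (1)-(3) is given below: the target's pixel position is propagated with the constant-velocity model and corrected with the TLD observation. The noise covariances, sample period, and example values are assumed tuning choices, not the values used by the authors.

```python
# Hedged sketch of the Kalman predict/update cycle of Eqs. (1)-(3).
import numpy as np

Te = 1.0 / 6.0                      # assumed sample period (~6 frames/s, cf. Table 3)
A = np.eye(2)                       # evolution matrix
B = Te * np.eye(2)                  # control matrix (velocity integration)
C = np.eye(2)                       # observation matrix
Q = 1e-2 * np.eye(2)                # assumed process noise covariance
R = 4.0 * np.eye(2)                 # assumed measurement noise covariance (pixels^2)

def predict(x, P, delta):
    """Eq. (2): propagate state and covariance with the constant target velocity."""
    x_p = A @ x + B @ delta
    P_p = A @ P @ A.T + B @ Q @ B.T
    return x_p, P_p

def update(x_p, P_p, z):
    """Eq. (3): correct the prediction with the observed target position z."""
    K = P_p @ C.T @ np.linalg.inv(C @ P_p @ C.T + R)
    x = x_p + K @ (z - C @ x_p)
    P = P_p - K @ C @ P_p
    return x, P

# Example: start at the image centre with an assumed constant velocity estimate.
x, P = np.array([320.0, 240.0]), np.eye(2)
delta = np.array([5.0, 0.0])                       # assumed target velocity (pixels/step)
x_p, P_p = predict(x, P, delta)
x, P = update(x_p, P_p, np.array([326.0, 241.0]))  # TLD observation for this frame
```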

To validate the tracking system, the target is hidden for an interval of time. The tracking results are shown in Fig. 7. When the target disappears from the camera's field of view, the detection algorithm provides the coordinates of the center of the image, i.e., 320 pixels for the horizontal position and 240 pixels for the vertical position. In this case, the Kalman filter ensures tracking continuity through the prediction model.

Fig. 7. Target tracking by the Kalman filter

Fig. 8. Stepper motors control flowchart

3.3. Control

The control law implemented on the microcontroller aims to line up the center of the image X0 = [U0, V0]^T with the center of the target X = [U, V]^T by controlling the actuators of the turret platform. To avoid the phenomenon of pumping (oscillation), the center of the image is not a single pixel but an area defined by X0 = [U0 ± δu, V0 ± δv]^T. In our case, we set δu = δv = 32 pixels. Fig. 8 and Fig. 9 show the control flowcharts for the stepper and DC motors, respectively. Note that the stepper motors' control law is of the "all-or-none" type, while the DC motors' control law is of the "proportional" type. In Fig. 8, Fu and Fv are the frequencies supplied to the turret pan and tilt stepper motors, while in Fig. 9, Rateu and Ratev are the duty cycles supplied to the turret pan and tilt DC motors.
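A minimal sketch of the proportional law for the DC-motor version is shown below: the pixel error between the detected target and the image centre is mapped to pan/tilt duty cycles, with the ±32-pixel dead zone described above. The gains, saturation limit, and function names are assumptions, not the authors' implementation.

```python
# Hedged sketch: proportional pan/tilt command from the pixel error, with dead zone.
U0, V0 = 320, 240            # image centre (pixels)
DEAD_ZONE = 32               # +/- dead zone to avoid oscillation (pixels)
K_PAN, K_TILT = 0.3, 0.3     # assumed proportional gains (% duty cycle per pixel)
MAX_DUTY = 100.0             # duty-cycle saturation (%)

def clamp(value: float, limit: float) -> float:
    return max(-limit, min(limit, value))

def control(u: float, v: float) -> tuple[float, float]:
    """Return (rate_u, rate_v) duty cycles from the detected target centre (u, v)."""
    err_u, err_v = u - U0, v - V0
    rate_u = clamp(K_PAN * err_u, MAX_DUTY) if abs(err_u) > DEAD_ZONE else 0.0
    rate_v = clamp(K_TILT * err_v, MAX_DUTY) if abs(err_v) > DEAD_ZONE else 0.0
    return rate_u, rate_v

# Example: target detected 120 pixels right of centre, 10 pixels below -> (36.0, 0.0).
print(control(440, 250))
```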

3.4. Graphical User Interface

We designed a graphical user interface (GUI) for the turret system according to the state machine diagram shown in Fig. 10, and implemented it using the MATLAB GUI editor. The GUI allows the operator to visualize the scene captured by the camera and to manually control the turret in order to look for a possible target. After selecting a target, it is possible to start the automatic tracking.

Fig. 9. DC motors control flowchart

4. Results and Discussion

4.1. Dynamic Performances Evaluation

We studied the dynamic performance, in terms of speed and accuracy, of the two versions of the turret (based on stepper and DC motors). We designed the experimental setup depicted in Fig. 11. It consists in tracking a red virtual target performing defined trajectories. This target is created and animated virtually using the MATLAB simulation software and a data-show projector. The defined trajectories of the target are: i) horizontal and vertical rectilinear motion (see Fig. 12a); ii) horizontal and vertical sinusoidal motion (see Fig. 12b); and iii) coupled horizontal-vertical motion tracing the infinite shape (see Fig. 12c and Fig. 12d). The different tests provided the results given in Table 2.



Tab. 2. Dynamic performance evaluation of the turret

                  Stepper-motor-based turret       DC-motor-based turret
                  Pan axis     Tilt axis           Pan axis     Tilt axis
Response time     1.880 s      0.826 s             0.59 s       0.472 s
Static error      8 pixels     22 pixels           4 pixels     3 pixels

Fig. 10. Graphical User Interface

Fig. 11. Experimental setup to evaluate the turret dynamic performances

Fig. 12. Target trajectories to evaluate the turret dynamic performances: (a) rectilinear motion, (b) sinusoidal motion, (c) infinite-shape motion of the target, (d) infinite-shape motion of the turret

These results show that the response time of the horizontal axis is greater than that of the vertical axis because the horizontal axis requires more torque. The results also show that the DC motor is faster than the stepper motor. However, it should be kept in mind that the DC motor requires a position sensor and a control loop, while the stepper motor is controlled in an open loop. In terms of accuracy, both versions of the turret have satisfactory accuracy, because the measured static errors are within the dead zone [δu, δv] = [32, 32] pixels.

Fig. 13. Real-time tracking of a moving target

4.2. Functional validation

The functional testing of the system is performed through real-time tracking of a moving target. The user manually controls the turret to search for a potential target. As shown in picture 2 of Fig. 13, selecting the mobile robot as a target starts the automatic tracking. This is successfully accomplished despite a partial occlusion of the target, as shown in picture 4 of Fig. 13.

4.3. Real-time performances evaluation

To evaluate the real-time performance of the system, we measured the execution time of its main tasks over a number of iterations. We used as a central calculator a Fujitsu Lifebook laptop with a 2.67 GHz Core i5 processor. The results of this study are summarized in Table 3. The Worst-Case Execution Time (WCET) is 147.1 ms, which is equivalent to the processing of more than 6 frames/second. The computing time is consumed mainly by the TLD and display tasks. The execution-time standard deviation is less than 1 ms, which quantifies the temporal stability of the system.



Tab. 3. Temporal analysis of the system's tasks

Tasks vs. execution time [ms]                   Max      Min     Mean     std*
TLD + display                                   145.1    63.0    77.70    0.216
Sending coordinates to the microcontroller      1.7      0.6     0.99     0.003
Computing the target barycenter coordinates     0.3      0.1     0.13     0.002
Total                                           147.1    63.7    78.82    0.221

*Standard deviation

5. Conclusion

This paper presents the hardware and software design of a vision-based autonomous turret system. Two versions of the turret platform were built: the first is actuated by stepper motors, while the second is actuated by Direct Current (DC) motors.

For real-time target detection, the TLD algorithm was implemented. To improve target tracking, a prediction solution based on the Kalman filter was added to the software architecture. The low-level control of the turret platform is ensured by a microcontroller, which implements control laws adapted to the actuators used.

Furthermore, the functional and real-time performances of the system were studied. In addition, we validated the designed turret system on the case of tracking a moving target.

It can be concluded that the designed system, based on the TLD method, performs robust, real-time tracking. Moreover, the system does not need prior knowledge about the shape or the color of the target to accomplish the tracking. As future work, we will consider the case of multiple-target classification and tracking.

AUTHORS

Rabah Louali* – Ecole Militaire Polytechnique, Algiers, Algeria, e‑mail: rabah.louali@emp.mdn.dz, rabah.louali@gmail.com.

Djilali Negadi – Ecole Militaire Polytechnique, Algeria.

Rabah Hamadouche – Ecole Militaire Polytechnique, Algeria.


Abdelkrim Nemra – Ecole Militaire Polytechnique, Algeria, e-mail: abdelkrim.nemra@emp.mdn.dz, karim_nemra@yahoo.fr.

*Corresponding author

References

[1] R. R. Alcala, Z. G. Arceo, J. N. Baterisna, J. O. Morada, J. O. D. Ramirez, and R. E. Tolentino, "Design and implementation of body wearable device and oculus rift controlled panning and aiming sentry gun turret", 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184), 2020, pp. 871–876, 10.1109/ICOEI48184.2020.9143032.

[2] A. Guersenzvaig, "Autonomous weapon systems: Failing the principle of discrimination", IEEE Technology and Society Magazine, vol. 37, no. 1, 2018, pp. 55–61, 10.1109/MTS.2018.2795119.

[3] Z. Kalal, K. Mikolajczyk, and J. Matas, "Tracking-learning-detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 7, 2012, pp. 1409–1422.

[4] R. E. Kalman, "A New Approach to Linear Filtering and Prediction Problems", Journal of Basic Engineering, vol. 82, no. 1, 1960, pp. 35–45, 10.1115/1.3662552.

[5] S. Kuswadi, M. N. Tamara, and D. N. H.W., "Gun turret automatic weapon control system design and realization", 2016 International Symposium on Electronics and Smart Devices (ISESD), 2016, pp. 30–34.

[6] M. Tsourma and M. Dasygenis, "Development of a hybrid defensive embedded system with face recognition", 2016 IEEE Intl Conference on Computational Science and Engineering (CSE) and IEEE Intl Conference on Embedded and Ubiquitous Computing (EUC) and 15th Intl Symposium on Distributed Computing and Applications for Business Engineering (DCABES), 2016, pp. 154–157.


