Journal of Automation, Mobile Robotics & Intelligent Systems
pISSN 1897-8649 (PRINT) / eISSN 2080-2145 (ONLINE)
VOLUME 12, N° 1, 2018
www.jamris.org
Publisher: Industrial Research Institute for Automation and Measurements PIAP
Indexed in SCOPUS
Editor-in-Chief:
Janusz Kacprzyk (Polish Academy of Sciences, PIAP, Poland)
Advisory Board:
Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA)
Kaoru Hirota (Japan Society for the Promotion of Science, Beijing Office)
Witold Pedrycz (ECERF, University of Alberta, Canada)
Co-Editors:
Roman Szewczyk (PIAP, Warsaw University of Technology)
Oscar Castillo (Tijuana Institute of Technology, Mexico)
Marek Zaremba (University of Quebec, Canada)
Typesetting: Ewa Markowska, PIAP
Webmaster: Piotr Ryszawa, PIAP
Editorial Office: Industrial Research Institute for Automation and Measurements PIAP, Al. Jerozolimskie 202, 02-486 Warsaw, POLAND, Tel. +48-22-8740109, office@jamris.org
Copyright and reprint permissions: Executive Editor
The reference version of the journal is e-version. Printed in 300 copies.
Executive Editor: Anna Ładan aladan@piap.pl
Associate Editor: Piotr Skrzypczynski (Poznan University of Technology, Poland)
Statistical Editor: Małgorzata Kaliczynska (PIAP, Poland)
Editorial Board: Chairman - Janusz Kacprzyk (Polish Academy of Sciences, PIAP, Poland) Plamen Angelov (Lancaster University, UK) Adam Borkowski (Polish Academy of Sciences, Poland) Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany) Bice Cavallo (University of Naples Federico II, Napoli, Italy) Chin Chen Chang (Feng Chia University, Taiwan) Jorge Manuel Miranda Dias (University of Coimbra, Portugal) Andries Engelbrecht (University of Pretoria, Republic of South Africa) Pablo Estévez (University of Chile) Bogdan Gabrys (Bournemouth University, UK) Fernando Gomide (University of Campinas, São Paulo, Brazil) Aboul Ella Hassanien (Cairo University, Egypt) Joachim Hertzberg (Osnabrück University, Germany) Evangelos V. Hristoforou (National Technical University of Athens, Greece) Ryszard Jachowicz (Warsaw University of Technology, Poland) Tadeusz Kaczorek (Bialystok University of Technology, Poland) Nikola Kasabov (Auckland University of Technology, New Zealand) Marian P. Kazmierkowski (Warsaw University of Technology, Poland) Laszlo T. Kóczy (Szechenyi Istvan University, Gyor and Budapest University of Technology and Economics, Hungary) Józef Korbicz (University of Zielona Góra, Poland) Krzysztof Kozłowski (Poznan University of Technology, Poland) Eckart Kramer (Fachhochschule Eberswalde, Germany) Rudolf Kruse (Otto-von-Guericke-Universität, Magdeburg, Germany) Ching-Teng Lin (National Chiao-Tung University, Taiwan) Piotr Kulczycki (AGH University of Science and Technology, Cracow, Poland) Andrew Kusiak (University of Iowa, USA)
The title receives financial support from the Minister of Science and Higher Education of Poland under agreement 857/P-DUN/2016 for the tasks: 1) implementing procedures to safeguard the originality of scientific publications, and 2) the creation of English-language versions of publications.
Mark Last (Ben-Gurion University, Israel) Anthony Maciejewski (Colorado State University, USA) Krzysztof Malinowski (Warsaw University of Technology, Poland) Andrzej Masłowski (Warsaw University of Technology, Poland) Patricia Melin (Tijuana Institute of Technology, Mexico) Fazel Naghdy (University of Wollongong, Australia) Zbigniew Nahorski (Polish Academy of Sciences, Poland) Nadia Nedjah (State University of Rio de Janeiro, Brazil) Dmitry A. Novikov (Institute of Control Sciences, Russian Academy of Sciences, Moscow, Russia) Duc Truong Pham (Birmingham University, UK) Lech Polkowski (Polish-Japanese Institute of Information Technology, Poland) Alain Pruski (University of Metz, France) Rita Ribeiro (UNINOVA, Instituto de Desenvolvimento de Novas Tecnologias, Caparica, Portugal) Imre Rudas (Óbuda University, Hungary) Leszek Rutkowski (Czestochowa University of Technology, Poland) Alessandro Saffiotti (Örebro University, Sweden) Klaus Schilling (Julius-Maximilians-University Wuerzburg, Germany) Vassil Sgurev (Bulgarian Academy of Sciences, Department of Intelligent Systems, Bulgaria) Helena Szczerbicka (Leibniz Universität, Hannover, Germany) Ryszard Tadeusiewicz (AGH University of Science and Technology in Cracow, Poland) Stanisław Tarasiewicz (University of Laval, Canada) Piotr Tatjewski (Warsaw University of Technology, Poland) Rene Wamkeue (University of Quebec, Canada) Janusz Zalewski (Florida Gulf Coast University, USA) Teresa Zielinska (Warsaw University of Technology, Poland)
Publisher: Industrial Research Institute for Automation and Measurements PIAP
If in doubt about the proper edition of contributions, please contact the Executive Editor. Articles are reviewed, excluding advertisements and descriptions of products. All rights reserved ©
JOURNAL of AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS
VOLUME 12, N° 1, 2018
DOI: 10.14313/JAMRIS_1-2018
CONTENTS
Editorial
Piotr A. Kowalski, Szymon Łukasik
Neural Network Structure Optimization Algorithm
Grzegorz Nowakowski, Yaroslaw Dorogyy, Olena Doroga-Ivaniuk
DOI: 10.14313/JAMRIS_1-2018/1
Experiment with JavaScript on Client and Server Side of the Virtual Laboratory and Visualized in Mixed Reality Using Microsoft HoloLens
Erich Stark, Erik Kučera, Pavol Bisták, Oto Haffner, Olena Doroga-Ivaniuk
DOI: 10.14313/JAMRIS_1-2018/2
The North Sea Bicycle Race ECG Project: Time-Domain Analysis
Dominika Długosz, Trygve Eftestøl, Aleksandra Królak, Tomasz Wiktorski, Stein Ørn
DOI: 10.14313/JAMRIS_1-2018/3
Virtual Tour for Smart House Developed in Unity 3D Engine and Connected with Microcontroller
Erik Kučera, Oto Haffner, Erich Stark
DOI: 10.14313/JAMRIS_1-2018/4
Ensembling a Linear Regression Model with an Error Mitigation Component
Artur Nowosielski, Piotr A. Kowalski, Piotr Kulczycki
DOI: 10.14313/JAMRIS_1-2018/5
Optimization of Membership Function Parameters for Fuzzy Controllers of an Autonomous Mobile Robot Using the Flower Pollination Algorithm
Oscar R. Carvajal, Oscar Castillo, José Soria
DOI: 10.14313/JAMRIS_1-2018/6
Block-Structured Models Composed of Nonlinear Fuzzy Dynamic and Static Parts – a Case Study
Piotr Bazydło, Piotr Marusak
DOI: 10.14313/JAMRIS_1-2018/7
Special section on Recent Advances in Information Technology
This issue of the Journal of Automation, Mobile Robotics and Intelligent Systems is devoted to selected aspects of current studies in the area of Information Technology, presented by young talented contributors working in this field of research. Among the included papers, one can find contributions dealing with neural networks, time-domain analysis, regression modelling, visualisation in mixed reality and programming in the Unity 3D engine in a microcontroller environment.

The idea of creating this special issue was born as a result of broad and interesting discussions during the Fourth Doctoral Symposium on Recent Advances in Information Technology (DS-RAIT 2017), which was held in Prague (Czech Republic) on September 3-6, 2017 as a satellite event of the Federated Conference on Computer Science and Information Systems (FedCSIS 2017). The aim of this meeting was to provide a platform for the exchange of ideas between early-stage researchers in Computer Science, PhD students in particular. Furthermore, the Symposium was to provide all participants with an opportunity to obtain feedback on their ideas and explorations from the vastly experienced members of the IT research community who had been invited to chair all DS-RAIT thematic sessions. Therefore, submission of research proposals with limited preliminary results was strongly encouraged.

This issue contains the following DS-RAIT papers in their special, extended versions.

The first paper, entitled Neural Network Structure Optimization Algorithm and authored by Grzegorz Nowakowski, Yaroslaw Dorogyy and Olena Doroga-Ivaniuk, presents a deep analysis of current literature on the problems of the optimization of neural network parameters and their structure. The analysis includes a discussion of the basic disadvantages that are present in the observed algorithms and methods. The outcome is a new algorithm for neural network structure optimization which is free of the major shortcomings of other algorithms. The paper provides a detailed description of this algorithm, its implementation and its application in recognition problems.

Erich Stark, Erik Kučera, Pavol Bisták and Oto Haffner, in their work entitled Experiment with JavaScript on Client and Server Side of the Virtual Laboratory and Visualized in Mixed Reality Using Microsoft HoloLens, investigate aspects of the remote control of a test experiment within a virtual laboratory. This is a common problem, but the authors provide an alternative way to solve it. The paper also compares several currently existing virtual laboratories along with their possible shortcomings. To develop their new solution, JavaScript technology was applied on both the client and the server side, using the Node.js runtime. A modern aspect of this approach is the visualization of received data in mixed reality using Microsoft HoloLens or any device compatible with the Windows Mixed Reality platform.

The paper entitled The North Sea Bicycle Race ECG Project: Time-Domain Analysis was written by the team consisting of Dominika Długosz, Trygve Eftestøl, Aleksandra Królak, Tomasz Wiktorski and Stein Ørn. The North Sea Bicycle Race is an annual endurance cycling competition in Norway, and the examination of ECG recordings collected from participants of this race may allow defining and evaluating the relationship between physical endurance exercises and heart electrophysiology. The study also identifies the parameters reflecting potentially alarming deviations in health. In so doing, this paper presents the results of a time-domain analysis of ECG data collected in 2014, implementing K-Means clustering. To handle this and similar data, the authors propose a double-stage analysis strategy aimed at producing hierarchical clusters. The first stage allows rough separation of data. The second stage is applied to reveal the internal structure of the majority clusters. In both steps, the authors note that discrepancies driving the separation could stem from three sources. Firstly, they could be signs of abnormalities in the electrical activity of the heart. Secondly, they may allow discriminating between natural groups of participants according to sex, age and physical fitness. Finally, some deviations could result from faults in data extraction, therefore serving in the evaluation of the parameters. After applying their strategy, the authors noted that the clusters were defined predominantly by combinations of features: heartbeat signals correlation, P-wave shape, and RR intervals; none of the features alone was discriminative for all the clusters.

The work entitled Virtual Tour for Smart House Developed in Unity 3D Engine and Connected with Microcontroller, written by Erik Kučera, Oto Haffner and Erich Stark, describes the topic of a virtual tour of a potential construction. This concept is very popular, as many people would like to see a virtual house before acquiring the real counterpart. The paper demonstrates the creation of a virtual smart house tour developed in the Unity engine. This virtual tour is controlled via an Arduino family microcontroller which has several sensors and actuators attached. These electronic devices react to the events in the virtual tour and vice versa.

Artur Nowosielski, Piotr A. Kowalski and Piotr Kulczycki authored the paper Ensembling a Linear Regression Model with an Error Mitigation Component. In this paper, the authors delve into a proposed model error mitigation technique based on the error distribution analysis of the original model. They then create an additional model that tempers the error impact in particular domain areas identified as being most sensitive. Both models are then combined
into a single ensemble model. The idea is demonstrated by way of the trivial two-dimensional linear regression model.

We would like to thank all those who participated in and contributed to the Symposium program, as well as all the authors who submitted their papers. We also wish to thank all our colleagues, the members of the Program Committee, both for their hard work during the review process and for their cordiality and outstanding local organization of the Conference.

Editors:
Piotr A. Kowalski
Systems Research Institute, Polish Academy of Sciences and Faculty of Physics and Applied Computer Science, AGH University of Science and Technology
Szymon Łukasik
Systems Research Institute, Polish Academy of Sciences and Faculty of Physics and Applied Computer Science, AGH University of Science and Technology
Neural Network Structure Optimization Algorithm
Submitted: 17th December 2017; accepted: 20th March 2018
Grzegorz Nowakowski, Yaroslaw Dorogyy, Olena Doroga-Ivaniuk
DOI: 10.14313/JAMRIS_1-2018/1
Abstract: This paper presents a deep analysis of the literature on the optimization of the parameters and structure of neural networks, and of the basic disadvantages present in the observed algorithms and methods. As a result, a new algorithm for neural network structure optimization is proposed, which is free of the major shortcomings of other algorithms. The paper gives a detailed description of the algorithm, its implementation and its application to recognition problems.
Keywords: structure optimization, neural network, ReLU, SGD
1. Introduction
Neural networks are widely used to solve various problems, including recognition tasks. A method for automatically searching for the optimal structure of a neural network (NN) would make it possible to obtain, much faster, a structure that better suits the subject area and the available input data [1].

Since there are no well-defined procedures for selecting the parameters of a NN and its structure for a given application, finding the best parameters can be a case of trial and error. There are many papers, [2–4] for example, in which the authors arbitrarily choose the number of hidden layer neurons, the activation function, and the number of hidden layers. In [5], networks were trained with 3 to 12 hidden neurons, and it was found that 9 was optimal for that specific problem. The genetic algorithm (GA) had to be run 10 times, once for each of the network architectures.

Since selecting NN parameters is more of an art than a science, it is an ideal problem for the GA. The GA has been used in numerous different ways to select the architecture of neural networks, and to prune and train them. In [6], a simple encoding scheme was used to optimize a multi-layer NN. The encoding scheme consisted of the number of neurons per layer, which is a key parameter of a neural network. Having too few neurons does not allow the neural network to reach an acceptably low error, while having too many neurons limits the NN’s ability to generalize.

Another important design consideration is deciding how many connections should exist between network layers. In [7], a genetic algorithm was used to determine the ideal amount of connectivity in a feed-forward network. The three choices were 30%, 70%, or 100% (fully-connected).

In general, it is beneficial to minimize the size of a NN to decrease learning time and allow for better generalization. A common process known as pruning is applied to neural networks after they have already been trained. Pruning a NN involves removing any unnecessary weighted synapses. In [8], a GA was used to prune a trained network. The genome consisted of one bit for each of the synapses in the network, with a ‘1’ representing keeping the synapse and a ‘0’ representing removing it. Each individual in the population represented a version of the original trained network with some of the synapses pruned (the ones with a gene of ‘0’). The GA was run to find a pruned version of the trained network that had an acceptable error. Even though pruning reduces the size of a network, it requires a previously trained network. The algorithm developed in this research optimizes for size and error at the same time, finding a solution with minimum error and a minimum number of neurons.

Another critical design decision, which is application-specific, is the selection of the activation function. Depending on the problem at hand, the selection of the correct activation function allows for faster learning and potentially a more accurate NN. In [9], a GA was used to determine which of several activation functions (linear, logsig, and tansig) was ideal for a breast cancer diagnosis application.

Another common use of the GA is to find the optimal initial weights of back-propagation and other types of neural networks. As mentioned in [10], genetic algorithms are good for global optimization, while neural networks are good for local optimization.
Using genetic algorithms to determine the initial weights and back-propagation learning to further lower the error takes advantage of both strengths and has been shown to avoid local minima in the error space of a given problem. Examining the specifics of the GA used in [2] shows the general way in which many other research papers use the GA to determine initial weights. In [2], this technique was used to train a NN to perform image restoration. The researchers used fitness-based selection on a population of 100, with each gene representing one weight in the network, stored as a floating point number in the range from -1 to 1. Dictated by the specifics of the problem, the structure of the neural network was fixed at nine input nodes and one output node. The researchers arbitrarily chose five neurons for the only hidden layer in the network. To determine the fitness of an individual, the initial weights dictated by the genes are applied to a network which is trained using back-propagation learning for a fixed number of epochs. Individuals with lower error were assigned a higher fitness value.

In [10–11] this technique was used to train a sonar array azimuth control system and to monitor the wear of a cutting tool, respectively. In both cases, this approach was shown to produce better results than when using back-propagation exclusively. In [12] the performance of two back-propagation neural networks was compared: one with GA-optimized initial weights and one without. The numbers of input, hidden, and output neurons were fixed at 6, 25, and 4, respectively. Other parameters such as the learning rate and activation functions were also fixed, so that the only difference between the two networks was the initial weights.

In [2, 11–13] each of the synaptic weights was encoded into the genome as a floating point number (at least 16 bits), making the genome very large. The algorithm developed in this research only encodes a random number seed, which decreases the search space by many orders of magnitude.

Determining the initial values using the GA has improved the performance of non-back-propagation networks as well. In [14] a GA was used to initialize the weights of a Wavelet Neural Network (WNN) to diagnose faulty piston compressors. WNNs have an input layer, a hidden layer with the wavelet activation function, and an output layer. Instead of using back-propagation learning, these networks use the gradient descent learning algorithm. The structure of the network was fixed, with one gene for each weight and wavelet parameter. Using the GA was shown to produce lower error and escape local minima in the error space. Neural networks with feedback loops have also been improved with GA-generated initial weights.
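The seed-based encoding mentioned above can be illustrated with a minimal sketch. It assumes, which the text does not state, that the seed drives a pseudo-random generator from which all initial weights are drawn; the function name `weights_from_seed`, the uniform distribution and the 9-5-1 layer sizes (borrowed from the description of [2]) are illustrative choices, not the authors' implementation.

```python
import random


def weights_from_seed(seed, layer_sizes, low=-1.0, high=1.0):
    """Rebuild all weight matrices of a network from one integer seed.

    layer_sizes lists the neuron count of each layer, input layer first.
    One matrix (rows = source neurons, columns = target neurons) is
    produced for every pair of consecutive layers.
    """
    rng = random.Random(seed)  # private generator; global state is untouched
    matrices = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        matrices.append(
            [[rng.uniform(low, high) for _ in range(n_out)] for _ in range(n_in)]
        )
    return matrices


# The whole genome is a single number instead of one gene per weight.
genome = 42
net_a = weights_from_seed(genome, [9, 5, 1])
net_b = weights_from_seed(genome, [9, 5, 1])
```

Because the same seed always reproduces the same matrices, a GA only has to search over integers, while a per-weight encoding of this 9-5-1 network would need 50 genes.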
Genetic algorithms have also been used in the training process of neural networks, as an alternative to the back-propagation algorithm. In [15] and [16], genes represented encoded weight values, with one gene for each synapse in the neural network. It is shown in [17] that training a network using only the back-propagation algorithm takes more CPU cycles than training using only the GA, but in the long run back-propagation will reach a more precise solution. In [18], the Improved Genetic Algorithm (IGA) was used to train a NN and was shown to be superior to using a simple genetic algorithm to find the initial values of a back-propagation neural network. Each weight was encoded using a real number instead of a binary number, which avoided the lack of accuracy inherent in binary encoding. Crossover was only performed on a random number of genes instead of all of them, and mutation was performed on a random digit within a weight’s real number. Since the genes were not binary, the mutation performed a “reverse significance of 9” operation (for example 3 mutates to 6, 4 mutates to 5, and so on). The XOR problem was studied, and the IGA was shown to be both faster and to produce lower error. Similar to [3], this algorithm requires a large genome since all the weights are encoded.
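The "reverse significance of 9" mutation can be sketched as follows. The two examples in the text (3 mutates to 6, 4 mutates to 5) are consistent with replacing a digit d by 9 − d; the representation of a weight as a string of decimal digits and the function names are assumptions made only for this illustration.

```python
import random


def reverse_significance_of_9(digit):
    """Map a decimal digit d to 9 - d (3 -> 6, 4 -> 5, and so on)."""
    return 9 - digit


def mutate_weight(weight_digits, rng):
    """Mutate one randomly chosen digit of a weight.

    weight_digits is a string of decimal digits standing for the weight
    (a hypothetical encoding); rng is a random.Random instance.
    """
    pos = rng.randrange(len(weight_digits))
    new_digit = reverse_significance_of_9(int(weight_digits[pos]))
    return weight_digits[:pos] + str(new_digit) + weight_digits[pos + 1:]


mutated = mutate_weight("3333", random.Random(0))
```

Applying the operation twice to the same digit restores the original value, so a mutation can always be undone by a later one.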
Previously, genetic algorithms were used to optimize a one-layer network [19], which has too few layers to solve even moderately complex problems. Many other genetic algorithms were used to optimize neural networks with a set number of layers [2–3, 12, 14, 20–21]. The problem with this approach is that the GA would need to be run once for each different number of hidden layers. In [20], the Variable String Genetic Algorithm was used to determine both the initial weights of a feed-forward NN and the number of neurons in the hidden layer, in order to classify infrared aerial images. Even though the number of layers was fixed (input, hidden, and output), adjusting the number of neurons allowed the GA to search through networks of different sizes.

A wide range of algorithms is used to build the optimal neural network structure. The first of these algorithms is the tiled constructing algorithm [22]. The idea of the algorithm is to add new layers of neurons in such a way that input training vectors that have different respective initial values have a different internal representation in the algorithm. Another prominent representative is the fast superstructure algorithm [23]. According to this algorithm, new neurons are added between the output layers. The role of these neurons is to correct the error of the output neurons. In general, a neural network that is based on this algorithm has the form of a binary tree.

In summary, the papers mentioned above studied genetic algorithms that were lacking in several ways:
• They do not allow flexibility in the number of hidden layers and neurons.
• They do not optimize for size.
• They have very large genomes and therefore large search spaces.
The algorithm described in this article addresses all of these issues.

The main goal of this work is to analyze an algorithm that optimizes the structure of a neural network during its learning for pattern recognition tasks [24], and to implement the algorithm using software tools [1].
2. The Algorithm of Structural Optimization During Learning
The structural learning algorithm is used in multilayer feed-forward networks and is iterative: at each iteration it searches for a network structure that is better than the previous one. The search is performed by enumerating all possible mutations of the network and then selecting and combining the best ones (selection and crossing). Consider the basic parameters of the algorithm.
Learning parameters:
• learning rate;
• inertia coefficient;
• coefficient of weights damping;
• the probability of activation of the hidden layer neuron: ph;
• the probability of activation of the input layer neuron: pi.
Structured learning parameters:
• initial number of neurons in the hidden layer;
• activation function for the hidden layer;
• activation function in the output layer;
• maximum number of mutations in the crossing;
• the number of training epochs of the original network;
• the number of training epochs in the iteration;
• acceptable mutation types;
• part of the training sample used for training.
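As an illustration only, the two parameter groups above could be collected in a pair of configuration records. All class and field names below are hypothetical. The numeric values are the ones reported for the TwoSpirals experiment later in the paper, and the mutation names are the keywords that appear in Listing 1; the output activation and the training fraction are not given in the text and are placeholder assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningParams:
    learning_rate: float
    inertia: float            # inertia coefficient
    weight_damping: float     # coefficient of weights damping
    p_hidden: float = 1.0     # ph: activation probability, hidden layer neuron
    p_input: float = 1.0      # pi: activation probability, input layer neuron

    def __post_init__(self):
        for name in ("p_hidden", "p_input"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be a probability, got {value}")


@dataclass(frozen=True)
class StructuralParams:
    initial_hidden_neurons: int
    hidden_activation: str
    output_activation: str
    max_mutations: int          # maximum number of mutations in the crossing
    epochs_initial: int         # training epochs of the original network
    epochs_per_iteration: int   # training epochs in each iteration
    allowed_mutations: tuple
    train_fraction: float       # part of the training sample used for training

    def __post_init__(self):
        if not 0.0 < self.train_fraction <= 1.0:
            raise ValueError("train_fraction must be in (0, 1]")


learning = LearningParams(learning_rate=0.005, inertia=0.0, weight_damping=0.1)
structure = StructuralParams(
    initial_hidden_neurons=10,
    hidden_activation="relu",
    output_activation="softmax",   # assumption: not stated in the text
    max_mutations=20,
    epochs_initial=50,
    epochs_per_iteration=150,
    allowed_mutations=("add-edge", "del-edge", "add-node", "del-node", "add-layer"),
    train_fraction=0.8,            # assumption: not stated in the text
)
```

Freezing the records keeps a run's settings from being changed by accident once training has started.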
3. Elementary Structural Operations on a Neural Network
According to [25], the following basic structural operations on a network have been introduced [1]:
• adding a synapse between two randomly selected unconnected network nodes or neurons – operation SynADD;
• removing the synapse between two randomly selected connected network nodes or neurons – operation SynDEL;
• moving a synapse between two randomly selected unconnected network nodes or neurons – operation SynMOD;
• changing the activation function of a randomly selected neuron – operation AMOD;
• serialization of the node or the neuron – operations SerNODE and SerNR;
• parallelization of the node or the neuron – operations ParNODE and ParNR;
• adding a node or a neuron – operations AddNODE and AddNR;
• creating a new layer – operation LADD;
• removing a layer of the NN – operation LDEL.
Whether the described structural operations are used depends on the complexity of the task. For the recognition problems described in this article, the operations (mutations) described in [26] are used.
4. Algorithm Implementation
Internally, a neural network is represented as a sequence of numeric weight matrices, one for each layer except the input layer [1]. Fig. 1 shows the matrix sequence for a [2-3-2] network: a 2x3 matrix for the hidden layer and a 3x2 matrix for the output layer.
Fig. 1. [2-3-2] Network internal realization example
Each element aij of matrix Ak equals the weight of the connection between neurons i and j of the network. Mutations of different types are implemented as operations on these matrices. Adding a new neuron to a layer is implemented as a combination of adding a new matrix row and a new matrix column. Figs. 2, 3 and 4 show the addition of a neuron to the input, hidden and output layers, respectively.
Fig. 2. Neuron addition to the input layer
Fig. 3. Neuron addition to the hidden layer
Fig. 4. Neuron addition to the output layer
To remove neurons, the opposite operations are used. Fig. 5 shows the removal of the second neuron of the hidden layer.
Fig. 5. Hidden layer neuron extraction
When a new layer is added, a new weight matrix is inserted into the sequence. Since some operations change the structure of the matrices, combining them is not straightforward. For example, when the hidden layer neuron O3 is removed from a [2–3–2] network, neuron O4 shifts one position in the resulting network and becomes neuron O3; when a new hidden layer containing 4 neurons is added in front of the existing hidden layer, the next layer shifts one position. When different mutations are combined, they have to be executed step by step in a strict order, which depends on the type and parameters of each mutation. Listing 1 shows a code fragment implemented in Clojure [27] that executes a combined mutation. First, the mutations that do not change the matrix structure (addition and removal of connections) are executed; then new neurons are added and existing ones are removed; new layers are added at the end. Mutations that remove neurons are executed in decreasing order of neuron number; similarly, layers are added in decreasing order of the new layer index.
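The matrix operations and the ordering rule described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the authors' Clojure implementation: matrices are plain lists of rows, a hidden neuron corresponds to one column of the hidden layer matrix and one row of the output layer matrix, and the function names and the zero initial weight are hypothetical.

```python
def add_hidden_neuron(w_hidden, w_out, init=0.0):
    """Append a neuron to the hidden layer.

    w_hidden is n_in x n_hidden and gains a column;
    w_out is n_hidden x n_out and gains a row.
    """
    new_hidden = [row + [init] for row in w_hidden]
    new_out = [row[:] for row in w_out] + [[init] * len(w_out[0])]
    return new_hidden, new_out


def remove_hidden_neuron(w_hidden, w_out, idx):
    """Remove hidden neuron idx: drop its column and its row."""
    new_hidden = [row[:idx] + row[idx + 1:] for row in w_hidden]
    new_out = [row[:] for i, row in enumerate(w_out) if i != idx]
    return new_hidden, new_out


def remove_hidden_neurons(w_hidden, w_out, indices):
    """Remove several neurons, highest index first, so that the
    indices still waiting to be processed are not shifted."""
    for idx in sorted(indices, reverse=True):
        w_hidden, w_out = remove_hidden_neuron(w_hidden, w_out, idx)
    return w_hidden, w_out


# A [2-3-2] network: 2x3 hidden layer matrix, 3x2 output layer matrix.
w_hidden = [[1, 2, 3],
            [4, 5, 6]]
w_out = [[10, 11],
         [20, 21],
         [30, 31]]

grown_hidden, grown_out = add_hidden_neuron(w_hidden, w_out)
kept_hidden, kept_out = remove_hidden_neurons(w_hidden, w_out, [0, 2])
```

Removing neurons 0 and 2 in increasing order would go wrong here: after neuron 0 is gone, the neuron that used to be number 2 has become number 1, so the second removal would no longer address it.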
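(defmethod mutate ::combined
  [net {:keys [mutations]}]
  (let [;; group the elementary mutations by operation type
        grouped-ms (group-by :operation mutations)
        {add-node-ms ::add-node
         del-node-ms ::del-node
         layer-ms    ::add-layer} grouped-ms
        ;; mutations that do not change the matrix structure go first
        safe-ms (mapcat grouped-ms [::identity ::add-edge ::del-edge])
        ;; remove neurons in decreasing order of neuron number
        safe-del-node-ms (reverse (sort-by #(second (:deleted-node %)) del-node-ms))
        ;; add layers in decreasing order of layer position
        safe-layer-ms (reverse (sort-by :layer-pos layer-ms))
        ms (concat safe-ms add-node-ms safe-del-node-ms safe-layer-ms)]
    ;; apply the ordered mutations one by one
    (reduce mutate net ms)))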
Listing 1. Code fragment implemented in Clojure that executes a combined mutation
One of the benefits of Clojure [27] over other programming languages is its use of immutable data structures: collections and containers whose content cannot be changed. Instead, an attempt to add a new element to a collection creates a new instance of the collection that contains this element. The operation of creating a new collection is optimized so that both objects use the shared part of the collection. Fig. 6 shows the result of adding object 5 to the end of array [……]; v denotes the old collection object, and v2 denotes the newly created collection object.
Fig. 6. Principle of data structure work in Clojure
Programming with immutable data structures makes programs much easier to understand. It also has further advantages:
• simple program parallelization: immutable data can be used in parallel without any need to synchronize threads;
• no problems with memory leaks;
• simple caching;
• major memory savings in some cases.
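The idea of structural sharing can be shown with a deliberately simplified sketch. It uses a persistent singly linked list that grows at the front, whereas Fig. 6 describes appending to the end of a Clojure vector, which is implemented differently; the names `Node`, `conj` and `to_list` are illustrative only.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One immutable cell of a persistent singly linked list."""
    value: object
    rest: Optional["Node"]


def conj(coll, value):
    """Return a new list with value in front; coll itself is not modified."""
    return Node(value, coll)


def to_list(coll):
    """Copy the cells into a regular Python list for inspection."""
    out = []
    while coll is not None:
        out.append(coll.value)
        coll = coll.rest
    return out


v = conj(conj(conj(None, 3), 2), 1)   # old collection: 1, 2, 3
v2 = conj(v, 5)                        # new collection: 5, 1, 2, 3
```

Here `v2.rest` is the very same object as `v`, so creating `v2` costs one new cell, and `v` can still be read safely from another thread.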
Due to these characteristics of immutable structures, the main part of the algorithm’s work is done in parallel, making maximum use of the available computing resources. The developed system has a client-server architecture. A system deployment diagram is shown in Fig. 7. In general, the system consists of 2 parts:
• a server application, which trains the neural network and implements the structure optimization algorithm;
• a client application, which implements the GUI.
Fig. 7. System deployment diagram
Clojure has been used to implement the server application. The Java platform [28] has been used as the runtime environment. For the GUI implementation, ClojureScript, a Clojure dialect [27] that is executed as JavaScript, has been used.
5. Experimental Research
Example 1. MONK’s Problem. The MONK’s Problems [29] were among the first to be used to compare classification algorithms. Each training example of the sample contains 7 attributes, the last of which is the class number to which the example should be assigned:
1. a1 ∈ {1, 2, 3}
2. a2 ∈ {1, 2, 3}
3. a3 ∈ {1, 2, 3}
4. a4 ∈ {1, 2, 3}
5. a5 ∈ {1, 2, 3, 4}
6. a6 ∈ {1, 2}
7. a7 ∈ {0, 1}
The following tasks are defined:
• Problem M1: (a1 = a2) ∨ (a5 = 1)
• Problem M2: at least 2 of (a1 = 1, a2 = 1, a3 = 1, a4 = 1, a5 = 1, a6 = 1)
• Problem M3: ((a5 = 3) ∨ (a4 = 1)) ∨ ((a5 ≠ 4) ∧ (a2 ≠ 3))
Neural networks easily solve problems M1 and M2 and achieve 100% classification accuracy on the test sample. The training sample for problem M3 includes noise in the form of 5% incorrectly classified examples, so this problem is used for the research. We used the following training values and structural optimization settings:
• training speed: 0.001;
• inertia coefficient: 0;
• coefficient of weights damping: 0.5;
• the maximum number of mutations at crossing: M = 10;
• the number of training epochs of the original network: T0 = 100;
• the number of training epochs in an iteration: Ti = 20;
• allowable types of mutations: adding and removing weights;
• type of cost function: cross-entropy [30].
The obtained price (cost) values and the classification accuracy are presented in Figs. 8 and 9.
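For readers unfamiliar with the cross-entropy cost named in the settings above, the following sketch computes it for a single example and for a batch. It assumes that the network outputs class probabilities and that each example has one true class; the function names and the clipping constant `eps` are illustrative and not taken from the paper.

```python
import math


def cross_entropy(predicted, target, eps=1e-12):
    """Cost of one example.

    predicted holds the class probabilities produced by the network,
    target is the index of the true class. The probability is clipped
    at eps so that log(0) is never evaluated.
    """
    return -math.log(max(predicted[target], eps))


def mean_cross_entropy(batch_predicted, batch_targets):
    """Average cost over a batch of examples."""
    losses = [cross_entropy(p, t) for p, t in zip(batch_predicted, batch_targets)]
    return sum(losses) / len(losses)


# Two-class case, as in the MONK's problems (class 0 or class 1).
confident = cross_entropy([0.9, 0.1], 0)   # correct and confident
unsure = cross_entropy([0.5, 0.5], 0)      # undecided
wrong = cross_entropy([0.1, 0.9], 0)       # confident but wrong
```

The cost grows as the probability assigned to the true class falls, so confident mistakes are penalized far more than undecided outputs.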
Fig. 8. Price value of normal and optimized networks

Fig. 9. Classification accuracy of ordinary and optimized networks

The resulting classification accuracy is given in Table 1.

Table 1. The resulting classification accuracy for MONK's problems
Type NN | Training [%] | Testing [%]
Optimized | 98.36 | 96.75
Common | 97.54 | 96.99

Although the curves show no significant increase in classification accuracy, we can conclude that, owing to optimization of the structure during training, the network does not get stuck in local minima and learns twice as fast.

Example 2. TwoSpirals problem. This problem, proposed in [31], is a rather complicated classification task and a test of the generalization ability of many recognition algorithms. The sample consists of a set of points that form two-dimensional spirals. It is necessary to correctly classify points that are not included in the training set.

Sample selection. Each training sample consists of three elements: the x and y coordinates in the range 0…1, and the number of the curve to which the point belongs. The sample is shown in graphic form in Fig. 10.

Fig. 10. TwoSpirals sample in the graphic form

Network architecture. To solve the problem, a 2-layer network with one hidden layer containing 10 neurons with the rectified linear activation function was selected as the original network. Networks such as 2-10-10-2 and 2-5-10-2 with an odd activation function (bipolar sigmoid or hyperbolic tangent) cope best with this task. Instead, it was interesting to explore the possibility of solving the problem only by optimizing the structure, starting from a more common network topology.

Research of the algorithm. We used the following training and structural optimization settings:
• training speed: 0.005;
• inertia coefficient: 0;
• coefficient of weights damping: 0.1;
• the probability of activation of the hidden layer neuron: ph = 1;
• the probability of activation of the input layer neuron: pi = 1;
• the maximum number of mutations at crossing: M = 20;
• the number of training epochs of the original network: T0 = 50;
• the number of training epochs in an iteration: Ti = 150;
• permissible types of mutations: all;
• type of cost function: cross-entropy [30].
Figs. 11 and 12 show the price value and classification accuracy as functions of the number of completed training epochs.
Fig. 11. Ordinary and optimized network price value during training
Fig. 12. Classification accuracy of ordinary and optimized networks during training
After 7 iterations of the algorithm, we obtained a [2-9-9-7-7-2] network with 92.7% classification accuracy.
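For readers who want to reproduce a sample of the kind used in Example 2, the following sketch generates two interleaved spirals with coordinates scaled to the range 0…1 and a curve index as the third element. The number of points per spiral, the number of turns and the radius profile are illustrative assumptions; the section does not state the values used in the experiment.

```python
import math

def two_spirals(points_per_spiral=97, turns=3.0):
    """Return a list of (x, y, label) tuples with x, y in [0, 1].

    The second spiral is the first one rotated by 180 degrees about the
    centre (0.5, 0.5). All defaults are assumptions for illustration.
    """
    samples = []
    for k in range(points_per_spiral):
        t = k / (points_per_spiral - 1)          # 0 .. 1 along the curve
        angle = turns * 2.0 * math.pi * t
        radius = 0.05 + 0.45 * t                 # grows from 0.05 to 0.5
        for label, phase in ((0, 0.0), (1, math.pi)):
            x = 0.5 + radius * math.cos(angle + phase)
            y = 0.5 + radius * math.sin(angle + phase)
            samples.append((x, y, label))
    return samples
```

Points held out from such a set can then serve as the test sample on which classification accuracy is measured.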
Example 3. Human Recognition. The implemented software system was used to study human face recognition problems [1]. The Yale University face image database [32] was used as source data. Ten different persons were sampled, with 50 different images of each person. Each image was scaled to 26×26 pixels and encoded as a 676-dimensional vector, with pixel brightness values normalized to the 0…1 range. Each output class, representing a particular person, was encoded as a 10-element vector containing nine zeros and a single 1 at a class-specific index. The resulting 500 samples were randomly divided into training and testing sets in a 2:1 ratio. Fig. 13 shows the source images and the images used for neural network learning.
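The data preparation described above can be sketched as follows. This is a minimal illustration that assumes 8-bit grey-scale pixels (maximum value 255) and a seeded shuffle for the 2:1 split; the helper names are hypothetical and image scaling itself is left out.

```python
import random

def normalize_pixels(pixels, max_value=255):
    """Map raw brightness values to the 0..1 range (8-bit input assumed)."""
    return [p / max_value for p in pixels]

def one_hot(class_index, num_classes=10):
    """Encode a person index as a vector with a single 1 and zeros elsewhere."""
    vector = [0] * num_classes
    vector[class_index] = 1
    return vector

def split_two_to_one(samples, seed=0):
    """Randomly divide samples into training and testing sets in a 2:1 ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]
```

For 500 samples this split yields 333 training and 167 testing examples, each consisting of a 676-element input vector and a 10-element target vector.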
Fig. 13. Data set formation example

Architecture of source network. The network architecture shown in Fig. 14 was used to evaluate the algorithm.

Fig. 14. Image recognition network architecture

Research of the algorithm. The following training and structural optimization settings were used for SGD with weight decay regularization [33]:
• learning rate: 0.002;
• inertia coefficient: 0.1;
• coefficient of weights damping: 0.1;
• the probability of activation of the hidden layer neuron: ph = 1;
• the probability of activation of the input layer neuron: pi = 1.

Selected parameters of the algorithm:
• initial number of neurons in the hidden layer: 3;
• activation function for the hidden layer: ReLU [34];
• activation function in the output layer: softmax;
• maximum number of mutations in the crossing: M = 50;
• the number of training epochs of the original network: T0 = 100;
• the number of training epochs in an iteration: Ti = 5;
• acceptable mutation types: adding and removing synapses;
• part of the training sample used for training: 1;
• type of cost function: cross-entropy [30].

During 40 iterations of the algorithm, 300 removals and 128 additions of synapses were carried out. Figs. 15 and 16 show the classification accuracy and the price value as functions of the number of completed training epochs. The resulting values are given in Table 2. By optimizing the structure of connections, we lowered the false classification rate on the testing set to 4.2%. An experiment with Ti = 3 was also carried out; its results are shown in Figs. 17 and 18. During 100 iterations of the algorithm, 645 removals and 457 additions of synapses were carried out. The false recognition rate on the testing set was lowered from 7.8% to 6.0%. The result is shown in Table 3.

Table 2. The resulting accuracy of image classification for Ti = 5
Type NN | Training, % | Testing, %
Optimized | 98.19 | 95.80
Common | 97.59 | 93.41

Fig. 15. Image classification accuracy for Ti = 5

Fig. 16. Price value for image classification for Ti = 5
Table 3. The resulting accuracy of image classification for Ti = 3
Type NN | Training, % | Testing, %
Optimized | 99.09 | 94.01
Common | 98.79 | 92.21
Fig. 17. Image classification accuracy for Ti = 3
Fig. 18. Price value for image classification for Ti = 3

Example 4. Evaluation of critical IT-infrastructure functioning. In this example, we assess the quality of operation of a service using the estimation algorithm from [35]. Fig. 19 shows an example of a dependency tree that schematically represents the influence of critical IT-infrastructure elements (hereinafter CITIE) on one another.
Fig. 19. CITIE tree example

Here $O^{i}_{j}$, $i \in [1; K]$, $j \in [1; N_i]$, are CITIEs, and the arrows show the influence of the quality of functioning of one CITIE on the quality of functioning of another. Let us denote by $P^{i}_{j}$ the vector of parameters that affect the quality of CITIE $O^{i}_{j}$, and by $Q^{i}_{j}$ the qualitative assessments of the functioning of the CITIEs that affect $O^{i}_{j}$.

As an example of a CITIE whose quality of functioning has to be evaluated, we selected an average application server. We considered five parameters affecting the quality of its functioning, from which the sets P' and Q' are constructed:
• p1 – hard drive usage. This parameter is reduced to values between 0 and 1;
• p2 – CPU usage. This parameter is reduced to values between 0 and 1;
• p3 – load of the network that the server is connected to. This is the ratio of available network bandwidth to the nominal network bandwidth;
• p4 – used RAM of the server. This is the ratio of used RAM volume to the maximum available memory;
• q1 – quality of functioning of another CITIE (the DB server used by the selected application server).

To calculate the qualitative assessment of the functioning of this CITIE, we construct a classifier based on a neural network. For $O^{i}_{j}$, the input of such a neural network is the vector {P', Q'} and the output is the qualitative assessment of the functioning of $O^{i}_{j}$. In the absence of assumptions about the nature of the relationships between elements and the qualitative evaluations of the elements' parameters, it is advisable to apply approximate expert estimates based on the personal experience of administrators, IT managers, etc. Since the structure of the network is determined automatically, it is enough for a person to specify the quality of functioning of the element for different values of {P', Q'}. During the experiment, the values of the selected parameters were set artificially on the computers. Experts were then asked to rate the performance of the server on a scale from zero to one. After that, the type of neural network was defined automatically and the network was trained using the method described in the previous example.
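The construction of the input vector {P', Q'} for the application server can be sketched as below. The parameter names, the units of the raw measurements and the way hard drive usage is reduced to the 0…1 range (used space divided by total space) are assumptions made for illustration; only the ratios for p3 and p4 are stated in the text.

```python
def clamp01(value):
    """Keep a value inside the closed interval [0, 1]."""
    return max(0.0, min(1.0, value))

def citie_input(disk_used, disk_total, cpu_load,
                net_available, net_nominal,
                ram_used, ram_total, db_quality):
    """Build the network input [p1, p2, p3, p4, q1] for one observation."""
    p1 = clamp01(disk_used / disk_total)        # hard drive usage (assumed ratio)
    p2 = clamp01(cpu_load)                      # CPU usage, already in 0..1
    p3 = clamp01(net_available / net_nominal)   # available / nominal bandwidth
    p4 = clamp01(ram_used / ram_total)          # used RAM / maximum memory
    q1 = clamp01(db_quality)                    # quality of the DB server CITIE
    return [p1, p2, p3, p4, q1]
```

Each such vector, paired with the expert's rating on the zero-to-one scale, forms one training example for the classifier.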
The resulting neural network structure can be used to determine the quality of functioning of another, similar CITIE. In that case, the optimal network structure does not have to be determined again, and training time is reduced. This allows the service provider to retrain its existing models "on the fly" in a shorter period. During 50 iterations of the algorithm, 157 removals and 107 additions of synapses were carried out. The resulting values are presented in Table 4. By optimizing the structure of connections, we lowered the false classification rate on the testing set to 5.7%.

Table 4. The resulting accuracy of CITIE classification for Ti = 3
Type NN | Training, % | Testing, %
Optimized | 97.8 | 94.3
Common | 96.4 | 92.1
Conclusion

This article considered the implementation of a structural optimization algorithm and analyzed its possible application to image recognition and to the evaluation of critical IT-infrastructure functioning. For the human recognition task, optimizing the structure of connections lowered the false classification rate on the testing set to 4.2% for Ti = 5, and from 7.8% to 6.0% for Ti = 3. For the task of CITIE evaluation, the false classification rate was reduced to 5.7%. The proposed algorithm is flexible with respect to the number of hidden layers, neurons and links. The results obtained demonstrate the efficiency of the proposed algorithm for recognition problems.
ACKNOWLEDGEMENTS
Presented results of the research, which was carried out under the theme No. E-3/611/2017/DS, were funded by the subsidies on science granted by Polish Ministry of Science and Higher Education.
AUTHORS
Grzegorz Nowakowski* – Cracow University of Technology, ul. Warszawska 24, 31-155 Cracow, Poland. E-mail: gnowakowski@pk.edu.pl.
Yaroslaw Dorogyy – National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", av. Victory 37, Kyiv, Ukraine. E-mail: cisco.rna@gmail.com.
Olena Doroga-Ivaniuk – National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", av. Victory 37, Kyiv, Ukraine. E-mail: cisco.rna@gmail.com.
* Corresponding author
REFERENCES
[1] G. Nowakowski et al., "The Realisation of Neural Network Structural Optimization Algorithm". In: Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, 2017, 1365–1371. DOI: 10.15439/2017F448.
[2] Q. Xiao, W. Shi, X. Xian, X. Yan, "An image restoration method based on genetic algorithm BP neural network". In: Proceedings of the 7th World Congress on Intelligent Control and Automation, 2008, 7653–7656.
[3] W. Wu, W. Guozhi, Z. Yuanmin, W. Hongling, "Genetic Algorithm Optimizing Neural Network for Short-Term Load Forecasting". In: International Forum on Information Technology and Applications, 2009, 583–585. DOI: 10.1109/IFITA.2009.326.
[4] S. Zeng, J. Li, L. Cui, "Cell Status Diagnosis for the Aluminum Production on BP Neural Network with Genetic Algorithm", Communications in Computer and Information Science, vol. 175, 2011, 146–152. DOI: 10.1007/978-3-642-21783-8_24.
[5] W. Yinghua, X. Chang, "Using Genetic Artificial Neural Network to Model Dam Monitoring Data". In: Second International Conference on Computer Modeling and Simulation, 2010, 3–7. DOI: 10.1109/ICCMS.2010.80.
[6] R. Sulej, K. Zaremba, K. Kurek, R. Rondio, Application of the Neural Networks in Events Classification in the Measurement of the Spin Structure of the Deuteron, Warsaw University of Technology, Poland, 2007.
[7] S. A. Harp, T. Samad, "Genetic Synthesis of Neural Network Architecture", Handbook of Genetic Algorithms, 1991, 202–221.
[8] D. Whitley, T. Starkweather, C. Bogart, "Genetic Algorithms and Neural Networks: Optimizing Connections and Connectivity", Parallel Computing, vol. 14, no. 3, 1990, 347–361. DOI: 10.1016/0167-8191(90)90086-O.
[9] V. Bevilacqua, G. Mastronardi, F. Menolascina, P. Pannarale, A. Pedone, "A Novel Multi-Objective Genetic Algorithm Approach to Artificial Neural Network Topology Optimisation: The Breast Cancer Classification Problem". In: International Joint Conference on Neural Networks, 2006, 1958–1965.
[10] Y. Du, Y. Li, "Sonar array azimuth control system based on genetic neural network". In: Proceedings of the 7th World Congress on Intelligent Control and Automation, 2008, 6123–6127.
[11] S. Nie, B. Ye, "The Application of BP Neural Network Model of DNA-Based Genetic Algorithm to Monitor Cutting Tool Wear". In: International Conference on Measuring Technology and Mechatronics Automation, 2009, 338–341. DOI: 10.1109/ICMTMA.2009.160.
[12] C. Tang, Y. He, L. Yuan, "A Fault Diagnosis Method of Switch Current Based on Genetic Algorithm to Optimize the BP Neural Network". In: International Conference on Electric and Electronics, vol. 99, chapter 122, 2011, 943–950. DOI: 10.1007/978-3-642-21747-0_122.
[13] Y. Du, Y. Li, "Sonar array azimuth control system based on genetic neural network". In: Proceedings of the 7th World Congress on Intelligent Control and Automation, 2008, 6123–6127.
[14] L. Jinru, L. Yibing, Y. Keguo, "Fault diagnosis of piston compressor based on Wavelet Neural Network and Genetic Algorithm". In: Proceedings of the 7th World Congress on Intelligent Control and Automation, 2008, 6006–6010. DOI: 10.1109/WCICA.2008.4592852.
[15] D. Dasgupta, D. R. McGregor, "Designing Application-Specific Neural Networks using the Structured Genetic Algorithm". In: Proceedings of International Workshop on Combinations of Genetic Algorithms and Neural Networks, 1992, 87–96. DOI: 10.1109/COGANN.1992.273946.
[16] G. G. Yen, H. Lu, "Hierarchical Genetic Algorithm Based Neural Network Design". In: IEEE Symposium on Combinations of Evolutionary Computation and Neural Networks, 2000, 168–175. DOI: 10.1109/ECNN.2000.886232.
[17] P. Koehn, Combining Genetic Algorithms and Neural Networks: The Encoding Problem, University of Tennessee, Knoxville, 1994.
[18] Z. Chen, "Optimization of Neural Network Based on Improved Genetic Algorithm". In: International Conference on Computational Intelligence and Software Engineering, 2009, 1–3. DOI: 10.1109/CISE.2009.5365287.
[19] P. W. Munro, "Genetic Search for Optimal Representation in Neural Networks". In: Proceedings of the International Joint Conference on Neural Networks and Genetic Algorithms, chapter 91, 1993, 675–682. DOI: 10.1007/978-3-7091-7533-0_91.
[20] X. Fu, P.E.R. Dale, S. Zhang, "Evolving Neural Network Using Variable String Genetic Algorithms (VGA) for Color Infrared Aerial Image Classification", Chinese Geographical Science, vol. 18(2), 2008, 162–170.
[21] J. M. Bishop, M. J. Bushnell, "Genetic Optimization of Neural Network Architectures for Colour Recipe Prediction". In: Proceedings of the International Joint Conference on Neural Networks and Genetic Algorithms, 1993, 719–725.
[22] M. Mezard, J.P. Nadal, "Learning in feedforward layered networks: The Tiling algorithm", Journal of Physics, vol. A22, 1989, 2191–2203.
[23] M. Frean, "The Upstart Algorithm: A Method for Constructing and Training Feed-Forward Neural Networks", Tech. Rep. 89/469, Edinburgh University, 1989.
[24] B. D. Ripley, Pattern recognition and neural networks, Cambridge: Cambridge Univ. Press, 2009. DOI: 10.1017/CBO9780511812651.
[25] Y. Y. Dorogiy, "Accelerated learning algorithm of Convolutional neural networks", Visnik NTUU "KPI", Informatics, operation and computer science, vol. 57, 2012, 150–154.
[26] Y. Y. Dorohyy, "The algorithm of algorithmic optimization of the structural neural network is based on classification of data", Visnyk NTUU "KPI", Informatics, operation and computer science, vol. 62, 2015, 169–173.
[27] S. D. Halloway, Programming Clojure, The Pragmatic Bookshelf, 2 edition, 2012.
[28] B. Goetz, Java Concurrency in Practice, Addison-Wesley Professional, 1 edition, 2006.
[29] S. Thrun et al., The monk's problems: A performance comparison of different learning algorithms. Technical Report CMU-CS-91-197, Carnegie Mellon University, 1991.
[30] P. Sadowski, Notes on backpropagation, homepage: https://www.ics.uci.edu/~pjsadows/notes.pdf (online).
[31] K. J. Lang, M. Witbrock, "Learning to Tell Two Spirals Apart". In: Proceedings of 1988 Connectionists Models Summer School, Morgan Kaufmann, San Mateo CA, 1989, 52–59. DOI: 10.13140/2.1.3459.2329.
[32] Yale Face Database, homepage: http://vision.ucsd.edu/~iskwak/ExtYaleDatabase/Yale/Face/Database.htm (online).
[33] Y. Bengio, "Practical recommendations for gradient-based training of deep architectures", arXiv:1206.5533v2, 2012.
[34] M. Hüsken, Y. Jin, B. Sendhoff, Soft Computing (2005) 9: 21. DOI: 10.1007/s00500-003-0330-y.
[35] Y. Y. Dorogyy et al., "Qualitative evaluation method of IT-infrastructure elements functioning". In: IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom), 2014, 170–174.
���������� ���� J���S����� �� ������ ��� S����� S��� �� ��� ������� ���������� ��� ���������� �� M���� R������ ����� M�������� ��������
Submitted: 15th December 2018; accepted: 15th March 2018
Erich Stark, Erik Kučera, Pavol Bisták, Oto Haffner
DOI: 10.14313/JAMRIS_1-2018/2

Abstract: The paper demonstrates remote control of a test experiment in a virtual laboratory. This is a common problem, but it can always be solved in another way. The paper compares several existing virtual laboratories and their possible issues at present. To develop a new solution, JavaScript technology was used on both the client and the server side, using the Node.js runtime. The modern approach is the visualization of received data in mixed reality using Microsoft HoloLens or another device compatible with the Windows Mixed Reality platform.

Keywords: virtual experiment, virtual laboratory, javascript, Unity engine, virtual reality, mixed reality
1. Introduction

Practical exercises in the laboratory are an important part of training people with a technical background in general. The ancient Chinese philosopher Confucius once said: "Tell me, and I will forget. Show me, and I may remember. Involve me, and I will understand" [20]. We know from experience that people learn fastest when they try things several times and only then understand how they work. Unfortunately, it is not always possible to give researchers or students direct access to real devices to perform an experiment. There may be several issues: the high price of laboratory equipment, workplace safety (depending on the experiment), or a lack of qualified assistants. In recent years, the development of virtual machines has increased, mainly due to the technological evolution of software engineering. The progress of modern technology gives us better ways to solve new challenges when creating either virtual systems for online teaching or specific virtual laboratories where physical processes can be simulated. In experiments conducted in a virtual environment, the resources of this environment can be shared among several connected users who want to perform the same experiment, which would not be possible on our own computers. This makes a virtual laboratory a good complement to study or research, where different variations of the experiment can be tried without risk to health or destruction of the device. Later, experiments can be tested on real devices, if necessary.
2. Virtual Laboratories
At the time when the Internet was not yet in widespread use, experiments were done in real laboratories. It was important to comply with various safety regulations because of the possibility of personal injury or damage to equipment. Distance and lack of financial resources make real experiments difficult to perform, especially in cases where some advanced and sophisticated tools are necessary. Another problem encountered is the lack of good teachers. Although online courses that provide instructional videos already exist, they solve the problem only partially. Thanks to the Internet, experiments can be structured for remote visualization and control. Nowadays, a lot of equipment already provides an interface for connecting a computer and processing data from it. Experimenting over the Internet allows the use of resources, knowledge, software and data in ways that physical experiments cannot [15]. In this paper, we discuss the creation of a virtual laboratory (VL). Before we list the technologies used to create a VL, we must explain what we mean by VL. Generally, we can say that a VL is a computer program in which students interact with an experiment by computer via the Internet, as depicted in Fig. 1.
Fig. 1. The difference between face-to-face and remotely controlled experiments

A typical example is a simulation experiment, where the student interacts with a web/app interface. Another possibility is a remote-controlled experiment, where the student interacts with a real device via a computer interface, although he or she can be far away. This is the case when a virtual laboratory turns into a remote laboratory. When we exclude the second option, we have the following definition: "We call it a virtual laboratory where the student interacts with an experiment that is physically distant from him or her and does not demand any physical reality". Having explained what a VL is, let us look at the benefits it can bring. They are described in Table 1.

Tab. 1. Comparison of Real, Virtual and Remote Laboratories [7]
Laboratory Type | Advantages | Disadvantages
Real | real data, interaction with real experiment, collaborative work, interaction with supervisor | time and place restrictions, requires scheduling, expensive, supervision required
Virtual | good for concept explanation, no time and place restrictions, interactive medium, low cost | idealized data, lack of collaboration, no interaction with real equipment
Remote | interaction with real equipment, calibration, realistic data, no time and place restrictions, medium cost | only "virtual presence" in the lab
People often think that the main benefit of a virtual laboratory is to replace the real one. It is not. A VL cannot replace the experience of real work, although a VL is better than no experience at all. A VL should not be seen as providing the maximum possible interaction experience.

2.1. Existing Solutions
There are currently many different virtual and remote laboratories used by foreign universities for teaching or research. This paper briefly reviews frequently used laboratories that are accessible over the Internet. A comparison of their functionality and the technologies used can be seen in Table 2, which summarizes different virtual laboratories created around the world. Table 3 gives some examples from the Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava.

2.2. Disadvantages of Existing Solutions
At the beginning of the design of a virtual laboratory, it was appropriate to examine the possibilities of existing solutions, both to avoid various design issues and because some of the technologies used are already outdated. Nowadays, new technologies develop incredibly fast. We carried out such an analysis of existing solutions in the previous section. Our aim was to create a cross-platform solution using one programming language on both the client and the server side, which cannot be done with WCF or COM technology as in the previous solutions. JMI is suitable only for solutions where the Java platform is used. Nor can the server be built on LabVIEW technology or .NET (a multi-platform version, .NET Core, is already under development). Client solutions such as Flash, ActiveX and Java applets are no longer supported in browsers, so their use is not appropriate.

Tab. 2. Comparison of Virtual Laboratories Created Outside of FEI STU [12]
Name of VL | Client technology | Server technology | Simulation software
WeblabDEUSTO | AJAX, Flash, Java applets, LabVIEW, Remote panel | Web services, Python, LabVIEW, Java, .NET, C, C++ | Xilinx-VHDL, LabVIEW
NCSLab | AJAX, Flash | PHP | Matlab, Simulink
ACT | HTML, Java applets | PHP | Matlab, Simulink
LabShare Sahara | AJAX, Java applets | Web services, Java | Java
iLab | HTML, Active X, Java applets | Web services, .NET | LabVIEW
RECOLAB | HTML | PHP | Matlab, Simulink
SLD | AJAX, HTML | Web services, PHP | Matlab, Simulink

2.3. Components of Virtual Laboratory
There are plenty of existing laboratories, but it is usually not possible to guarantee compatibility between them, because there is no solid standard. Nevertheless, it is always possible to identify the basic components that virtual laboratories can use. Some of them can even be used several times. Components:
- The experiment itself
- A device that can be controlled and from which data can be acquired
- A laboratory server, which provides control, monitoring and data processing of the experiment
- A server providing the connection between remote users and the laboratory server, usually via the Internet
- A web camera connected to a server, which gives the remote user visual and audible feedback on the actual status of the experiment
- Tools enabling multi-user audio, video and chat communication
- Client software controlling and representing data of the experiment [5]

It is important to realize which of these components could be used, because not all of them are necessary to create a virtual laboratory. Alternatively, other components that are well suited to a given role can be used. For example, a database server is sometimes used if experiments are to be stored and processed later. It is also important to decide what type of VL we want to create. The design of a single-user VL will certainly differ from that of a multi-user VL, all the more so with multiple simultaneous experiments. One should bear in mind how to properly handle scalability, potential safety issues, multi-user access and other possible issues.

Tab. 3. Comparison of Virtual Laboratories Created at FEI STU [13]
Year | Author | Simulation software | Data flow | Client technology | Server technology
2011 | R. Farkas | Matlab, Simulink, Real device | JMI, Sockets | Java | Java
2014 | M. Kundrat | Matlab, Simulink, Real device | JMI, SOAP | HTML, JS | Tomcat, Java, JSF, EJB3, MySQL
2014 | T. Cerveny | Matlab, Simulink | JMI, HTTP | HTML, JS | Jetty, Java
2015 | S. Varga | Matlab, Simulink | COM, HTTP | HTML, JS | .NET, PHP
2012 | T. Borka | Matlab, Simulink | WCF | .NET, WPF | .NET

3. Architecture Proposal

Node.js was selected as the main component. It is the server that handles communication between the components of the VL. The parts of the architecture are explained with reference to Fig. 2. The data are fetched periodically from Simulink into the Matlab workspace. In the beginning, it was not certain whether multi-platform, soft real-time, Simulink-based simulations could be run at all, because the only solution found directly from MathWorks was Windows-based. For our solution, we used Real-Time Pacer [19], which slows the simulation down to soft real-time and allows us to run simulations in soft real-time even under MacOS or Linux. To communicate with the RESTful web service, Matlab R2015a uses the rarely used built-in functions webread and webwrite [14]. First, the simulation must be started through the web browser; after that, the data are transferred over a socket.io library channel. These data are shown in a graph in the web browser, and they can be saved to a MongoDB database for later processing (Fig. 2).

Fig. 2. Design of communication between components

3.1. Reference Simulation Model

For development purposes, we used a simulation of the dynamic system called projectile motion, implemented in Simulink, that runs through the web interface. This simulation needs two files to run. The purpose of the first is to initialize the variables needed to calculate the coordinates of the point. This experiment has three parameters. The first and second parameters are initial values for the simulation. The last parameter, userFromWeb, is not necessary for the simulation itself, but it is important for identifying the user who runs the simulation. This makes it possible to assign the simulation results to that user in later processing from the database.

3.2. Experiment Handler
The second Matlab file is the handler code that sends the data to Node.js. Because of its length, it is not possible to show the whole source code, so we describe only the key parts. During initialization, the URL path of the Express.js REST API to which Matlab will send the data is set. The model is preloaded using the Matlab function load_system('projectile_motion'). This function searches the current folder for the projectile_motion.mdl file and sets it as the top-level model. After these initial settings, the simulation must be started using the command set_param(model, 'SimulationCommand', 'Start'). In the next block of the Matlab code, an infinite while loop collects data from the simulation until it is complete. Inside the while loop, the function set_param(model, 'SimulationCommand', 'WriteDataLogs') is called, which acts on the current top-level simulation. In soft real-time, the calculated data are written to the Matlab workspace. Without that function, the data would be written only after the simulation ends.

Meanwhile, the data must be prepared in the format required by the web service. Thus, before sending them to the REST API, it is suitable to wrap the data in a JSON structure. We used the Matlab library JSONlab v1.2 [4]. A sequence of two commands is required to create the desired JSON format and send it to the Express.js API. The JSON is created with the command json = savejson('result', struct('user', userFromWeb, 'status', 'Running', 'data', struct('time', timeFinal, 'vy', vyFinal, 'y', yFinal, 'x', xFinal))) and transferred to the service with response = webwrite(url, json, options). The command get_param(model, 'SimulationStatus') is used to check the current status of the simulation. While the simulation is still running, the status is "running". As soon as the status is "stopped", the loop is terminated using the break keyword, and we know that all data have been transferred to Node.js.
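The shape of the message that the handler sends can be illustrated independently of Matlab. The sketch below builds and parses an equivalent JSON document in Python; the root name result and the fields user, status, data and time come from the text, while the keys vy and x and the sample values are assumptions, since the corresponding part of the original command is not fully legible.

```python
import json

def build_result_payload(user, status, time, vy, y, x):
    """Serialize one batch of simulation samples in the assumed layout."""
    return json.dumps({
        "result": {
            "user": user,        # identifies who started the simulation
            "status": status,    # e.g. "Running" while data are still coming
            "data": {"time": time, "vy": vy, "y": y, "x": x},
        }
    })

def is_finished(payload):
    """Receiver-side check: has the sender reported a stopped simulation?"""
    status = json.loads(payload)["result"]["status"]
    return status.lower() == "stopped"
```

A receiving service can use the user field to route the samples to the right client and the status field to decide when the complete data set may be stored.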
Fig. 3. Start Matlab in command line using shell.js library
4. Remote Control of Experiment 4.1. Web Client Created with Angular Framework Client application was created with the JavaScript framework Angular [6] (version 1.5.5). The role of the web client was to verify the functionality of the server that sends simulated data. The functionality has been veri�ied, and screens will be described speci�ically.
3.3. Communi�a�on �etween Component�
One of the aspects of the individual components of the laboratory is communication. Although in each component communication works differently, it is still based on the HTTP protocol. The sequence diagram in the Fig. 6 shows that communication starts from the web browser. The user inserts the parameters of simulation, which are sent to StarkLab via the REST web service. This service starts Matlab on the current operating system with the necessary �iles and simulation parameters. Meanwhile, the user waits until Matlab starts in the background. Simulation is immediately initialized and starts sending data to StarkLab, which sends them directly to the web client from where the simulation has been originated. All the received data will be re�lected in the chart, animation, and table in the web browser. This sequence is repeated until the condition contains SimulationStatus == ”running”. After stopping the simulation, the client sends a request to save data through StarkLab directly into the document database MongoDB. 3.4. Run Matlab from Command Line
In the beginning, it was not clear how to run the simulation. It was necessary to determine whether Node.js can execute operating system commands, that is, run programs. Originally, Matlab was opened manually, all the necessary initialization files were loaded, and then the simulation itself was started. This solution is not sufficient in terms of automation and autonomy. We found that Node.js can launch any software that can be run from the terminal. To simplify this workflow, the shell.js library [2], which provides this option, was used. The code sample in Fig. 3 shows how Matlab is started via the Node.js route http://localhost/matlab/run. This route is called immediately after the form with the initial parameters of the experiment is submitted from the web browser.
Fig. 4. Login to web application
Fig. 4 shows the login page of the web client application. Users are authenticated against the LDAP server of the Slovak University of Technology. The details of the LDAP login process are not relevant for this part of the paper. After a successful login, the dedicated page for the tested experiment is shown. Our experiment was projectile motion. Fig. 5 shows the form that takes the two parameters needed to run a simulation. After the form is submitted, the page is redirected to the http://localhost/matlab route, which leads to the dashboard page, where the user waits for data from the Node.js REST API until the Matlab simulation starts. When it starts, the user sees new data arriving in the graph, animation, and table in the web browser. This part could be accelerated by running Matlab on a powerful server. The received data are visualized with the Chart.js library (Fig. 7). Our chart implementation was created as an Angular directive named <ui-graph></ui-graph>. Because of this approach (the use of Angular components), it can be used multiple times with the same codebase. First, it is necessary to get an element from the DOM (Document Object Model) tree. The next step
Fig. 5. Parameters of simulation / initial velocity and angle in degrees
Fig. 6. Communication between components
is to obtain the canvas context and create the object with the initial data. The data plotted at the bottom of the picture are identical to the data in the graph; the difference is that they are rendered as an animation, created using HTML Canvas technology. The last section where the data can be seen is a table, to which data are added over time, as in the chart and the animation. In this table, Angular data binding [6] is used to render each received object as one row with its properties. As the simulation runs, Angular adds new rows to the table dynamically. The system serves not only for real-time rendering of data, but also for viewing and processing them later. On the simulations page, we can see all the entries for the currently logged-in user (Fig. 14). The list is obtained from MongoDB by calling the Angular $http.get(url, callback) function from the web client to our Node.js server, which has access to the database.
When one of the results is opened, the output looks the same as in Fig. 7, but it is possible to set the data sampling and the duration of the simulation. A second setting concerns rendering time, with two options: showing the data output immediately, or replaying it in soft real time as it was first run.
5. Visualisation of Virtual Laboratory in Mixed Reality
Modern forms of education are now built on the development of new ICT technologies (e.g. interactive applications made in a 3D engine [17], virtual reality or mixed reality). Visualising process modelling, identification and control of complex mechatronic systems, elements and drives in virtual and mixed reality allows students to understand the studied subject much better and more quickly than with conventional teaching methods.
Fig. 7. Graph and animation of projectile motion in (x, y) position
Fig. 8. Table data – time, x, y, velocity values of projectile motion experiment
Fig. 9. Table of saved simulations for the currently logged-in user
5.1. Differences Between Virtual, Augmented and Mixed Reality
Virtual reality (VR) replicates an environment that simulates a physical presence in places in the real world or an imagined world, allowing the user to interact in that world. Devices for virtual reality include Google Cardboard, HTC Vive, Oculus Rift, etc. Augmented reality (AR) is a live, direct or indirect view of a physical, real-world environment whose elements are augmented (or supplemented) by computer-generated sensory input such as sound, video, graphics or GPS data. Augmented reality is an overlay of content on the real world, but that content is not anchored to or part of it. The real-world content and the CG content are not able to respond to each other (Fig. 10 and Fig. 11).
Fig. 10. Example of augmented reality
Fig. 11. Example of augmented reality
Mixed reality (MR) is the merging of real and virtual worlds to produce new environments and visualisations where physical and digital objects co-exist and interact in real time. MR is an overlay of synthetic content on the real world that is anchored to and interacts with the real world. The key characteristic of MR is that the synthetic content and the real-world content are able to react to each other in real time. Technologies for mixed reality are Microsoft HoloLens (Windows Mixed Reality platform), Android ARCore and Apple ARKit.
Nowadays, interactive 3D applications and virtual reality are increasingly used at many prestigious universities. A very interesting project is a virtual clinic [9]. This project is supported by the University of Miami and Charles R. Drew University of Medicine and Science in Los Angeles. This interactive application offers students an insight into the actual functioning of a larger clinic, and they can also try to diagnose patients. Students are thus trained through a realistic experience with the health system, while this complex system is modelled and simulated in virtual reality. There are also interactive applications from Animech Technologies. This company offers many education modules such as Virtual Car, Virtual Truck or Virtual Gearbox [16]. Using these applications, students can understand how the mentioned devices function, look into their interior, and detach and examine their individual components in detail. An absolute novelty is Microsoft HoloLens [11], the arrival of which has led to the emergence of a completely new segment of mixed reality. Mixed reality has unquestionable advantages over virtual reality, as the user perceives the real world and a virtual world at the same time. This feature is of undisputed practical use, and it is assumed that mixed reality will become a new standard in many areas such as education, marketing, modelling of complex mechatronic systems, etc.
Fig. 12. Example of mixed reality – Microsoft HoloLens
Fig. 13. Example of mixed reality – Android ARCore and Apple ARKit
Fig. 14. Microsoft HoloLens – mixed reality application (Volvo) [18]
There are further educational applications for Microsoft HoloLens. The HoloTour application [3] provides 360-degree spatial video of historical places such as Rome or Peru. The application adds 3D models of important landmarks that have not been preserved, as well as supplementary holographic information about elements in the scene. The HoloAnatomy application [10] allows interactive teaching of the anatomy of the human body. Its advantage is that when the application is used by several users at the same time, everyone sees the same model of a part of the human body. This allows interaction between students, which significantly multiplies the educational effect. In the technical field, there is an application called HoloEngine [1]. This application helps users understand the complex 3D mechanical structures of a combustion engine. It allows the user to see the engine in the air, start it, look inside it and closely monitor the mutual interaction of its mechanical parts.
5.2. Virtual Laboratory in Mixed Reality
Diagram labels: Node.js virtual laboratory, UDP stream, Unity application in HoloLens
Fig. 15. Scheme – visualisation of virtual laboratory in mixed reality developed in Unity
The Unity engine was used for development. The proposed Unity application for Microsoft HoloLens visualises the results from the described Node.js virtual laboratory in mixed reality. With this application, students get a better insight into the results of the experiment. It was necessary to connect the Unity application with the Node.js laboratory. A free library for Unity called Socket.IO for Unity [8] was used for this purpose in the proposed application. Fig. 16 shows the results from the virtual laboratory in the Unity engine. The application was deployed on Microsoft HoloLens. The results in mixed reality are shown in Fig. 17.
Fig. 16. Results from virtual laboratory in Unity engine
Fig. 17. Results from virtual laboratory in mixed reality (Microsoft HoloLens)
6. Conclusion
Based on our experience with this kind of development, we conclude that creating the virtual laboratory platform on Node.js was easier thanks to the use of JavaScript on both the server and the client side. We expected that, owing to its single-threaded event loop, Node.js would handle more clients and simulations than a similar solution on a different platform. The problem was not the number of registered users; it appeared only when we ran multiple simulations in Matlab. On our test computer, a MacBook Pro, there was already a problem with two parallel simulations. This can be improved by using a powerful server for Matlab calculations. The work is not finished yet, and StarkLab can be extended with further functionality, such as a unified protocol for data interchange. Interfaces for other calculation and simulation software would also be suitable. Deploying Matlab on a separate server with its own domain would improve availability. Another interesting functionality would be uploading simulation and calculation scripts through a web interface. Future work will also focus on further development for the Windows Mixed Reality platform and the Microsoft HoloLens headset. The source code of the virtual laboratory is available as open source at https://github.com/erichstark/.
ACKNOWLEDGEMENTS
This work has been supported by the Cultural and Educational Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic, KEGA 030STU-4/2017, by the Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic under the grants VEGA 1/0733/16 and VEGA 1/0819/17, and by the Tatra banka Foundation within the grant programme Quality of Education, project No. 2017vs022 (Collaborative education applications in mixed reality for mechatronics).
AUTHORS
Erich Stark∗ – Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, Ilkovicova 3, Bratislava, Slovakia, e-mail: erich.stark@stuba.sk, www: www.uamt.fei.stuba.sk.
Erik Kučera∗ – Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, Ilkovicova 3, Bratislava, Slovakia, e-mail: erik.kucera@stuba.sk, www: www.uamt.fei.stuba.sk.
Pavol Bisták – Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, Ilkovicova 3, Bratislava, Slovakia, e-mail: pavol.bistak@stuba.sk, www: www.uamt.fei.stuba.sk.
Oto Haffner – Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, Ilkovicova 3, Bratislava, Slovakia, e-mail: oto.haffner@stuba.sk, www: www.uamt.fei.stuba.sk.
∗ Corresponding author
REFERENCES
[1] 360world Europe Kft. “Holoengine”, 2016.
[2] Contributors. “shell.js”, 2017.
[3] M. Corporation. “Holotour”, 2017.
[4] Q. Fang. “Jsonlab: a toolbox to encode/decode json files”, 2016.
[5] L. Gomes and S. Bogosyan, “Current trends in remote laboratories”, IEEE Transactions on Industrial Electronics, vol. 56, no. 12, 2009, 4744–4756, 10.1109/TIE.2009.2033293.
[6] M. Hevery and team. “Angular framework”, 2016.
[7] Z. Nedic, J. Machotka, and A. Nafalski, “Remote laboratories versus virtual and real laboratories”. In: 33rd Annual Frontiers in Education, 2003. FIE 2003., vol. 1, 2003, T3E–1–T3E–6 Vol.1, 10.1109/FIE.2003.1263343.
[8] F. Panettieri. “Socket.io for unity”, 2014.
[9] D. Parvati, W. L. Heinrichs, and Y. Patricia, “Clinispace: a multiperson 3d online immersive training environment accessible through a browser”, Medicine Meets Virtual Reality 18: NextMed, vol. 163, 2011, 173.
[10] S. Prajapati, E. Madrigal, and M. T. Friedman, “Acquisition, visualization and potential applications of 3d data in anatomic pathology”, DISCOVERIES, vol. 4, no. 4, 2016, 1, 10.15190/d.2016.15.
[11] P. Rauschnabel, A. Brem, and Y. Ro. “Augmented reality smart glasses: Definition, conceptual insights, and managerial importance”, 07 2015.
[12] I. Santana, M. Ferre, E. Izaguirre, R. Aracil, and L. Hernandez, “Remote laboratories for education and research purposes in automatic control systems”, IEEE Transactions on Industrial Informatics, vol. 9, no. 1, 2013, 547–556, 10.1109/TII.2011.2182518.
[13] E. Stark. “Virtual laboratory using javascript on the server side (in slovak)”. Master’s Thesis, 2016.
[14] M. team. “Web access”, September 2017.
[15] V. team. “The philosophy of virtual labs”, December 2017.
[16] A. Technologies. “Animech technologies showreel 2013”, 2013.
[17] A. Thomas. “Variant: Limits”, 2017.
[18] E. Uhlemann, “Connected-vehicles applications are emerging [connected vehicles]”, IEEE Vehicular Technology Magazine, vol. 11, no. 1, 2016, 25–96.
[19] G. Vallabha. “Real-time pacer for simulink”, September 2016.
[20] R. R. Wright, “Using 3 dimensional simulation in nursing education”. In: S. Gennaro, ed., 43rd Biennial Convention (07 November-11 November 2015), vol. 1, no. 1, Las Vegas, Nevada, USA, 2015, 1.
Journal of Automation, Mobile Robotics & Intelligent Systems
VOLUME 12,
N° 1
2018
The North Sea Bicycle Race ECG Project: Time-Domain Analysis
Submitted: 6th December 2017; accepted 5th March 2018
Dominika Długosz, Trygve Eftestøl, Aleksandra Królak, Tomasz Wiktorski, Stein Ørn
DOI: 10.14313/JAMRIS_1-2018/3
Abstract: Analysis of electrocardiogram and heart rate provides useful information about health condition of a patient. The North Sea Bicycle Race is an annual cycling competition in Norway. Examination of ECG recordings collected from participants of this race may allow defining and evaluating the relationship between physical endurance exercises and heart electrophysiology. Parameters reflecting potentially alarming deviations are to be identified in this study. This paper presents results of a time-domain analysis of ECG data collected in 2014, implementing K-Means clustering. A double stage analysis strategy, aimed at producing hierarchical clusters, is proposed. The first phase allows rough separation of data. Second stage is applied to reveal internal structure of the majority clusters. In both steps, discrepancies driving the separation could stem from three sources. Firstly, they could be signs of abnormalities in electrical activity of the heart. Secondly, they may allow discriminating between natural groups of participants – according to sex, age, physical fitness. Finally, some deviations could result from faults in data extraction, therefore serving in evaluation of the parameters. The clusters were defined predominantly by combinations of features: heartbeat signals correlation, P-wave shape, and RR intervals; none of the features alone was discriminative for all the clusters.
Keywords: ECG, Principal Component Analysis, silhouette analysis, clustering
1. Introduction
The North Sea Race (Nordsjørittet) is an international cycling competition organized annually in Rogaland, western Norway, between the cities of Egersund and Sandnes. It is open to a wide spectrum of competitors, from amateurs to professionals. In 2014, ECG data were collected from over a thousand participants on three days: the day of the race (14.06.2014) as well as the day before and the day after the race. The data set was collected as part of the North Sea Race Endurance Exercise Study (NEEDED). Continuation of this project with an extended set of recorded data is planned for the years 2017–2019. Additionally, long-term effects are to be studied for 20 years, until 2034.
Analysis of the electrocardiogram (ECG) is a valuable tool in monitoring and diagnosis of patients for various cardiac conditions. Automatic ECG signal analysis can be performed in the time domain or the frequency domain and is usually divided into two steps: feature extraction and classifier designation [1]. Various methods for feature extraction are reported in the literature. The aspects of Principal Component Analysis (PCA) related to ECG signal processing are discussed in [2], the application of a customized wavelet transform (WT) in ECG discriminant analysis is described in [3], and the use of the Hilbert transform for feature extraction from the ECG signal was examined in [4]. A comparison of the support vector machine (SVM) algorithm and the artificial neural network (ANN) approach for classification of arrhythmias in the ECG signal is presented in [5]. A deep learning method for active classification of electrocardiogram signals was applied in the research described in [6], while a clustering method for classification of QRS complexes was applied in [7].
Measurement of ECG and heart rate (HR) during daily activity is a potential tool for early diagnosis of cardiac diseases and may also provide individualized guidance for exercise and physical training. The aim of this project is to identify ECG and HR parameters useful for differentiating normal and abnormal patterns during prolonged, high-intensity endurance exercise. In this part of the study, concerning data from 2014, the objective consisted of three elements. First, it aimed at creating ECG processing algorithms that would form a basis for future analysis, with particular focus on time-domain approaches. Second, the influence of major physical effort on the electrical activity of the heart was studied. Finally, by means of data clustering algorithms, the project aimed at developing methods to detect individuals whose ECG parameters differ significantly from those of most participants.
2. The Dataset and Software
The database consisted of 3158 ECG recordings, each 10 s long, stored in .mat files. Since this project aimed at comparing data obtained at all three collection time points, participants for whom some of the recordings were missing were rejected. As a result, 996 complete sets of three recordings were obtained. The collection had to be further reduced owing to errors raised in a few cases at the stage of ECG segmentation. After removing these, further analysis was conducted for 989 participants (2967 ECG recordings).
Processing and analysis of the data were conducted using the Python programming language, in particular the packages BioSPPy [8], SciPy [9], and scikit-learn [10], [11].
3. Data Pre-processing and Feature Extraction
The dataset provided 8-channel ECG recordings, containing signals from leads I, II, and six precordial leads (V1 to V6). In this project, however, only lead-I signal was analyzed. After the channel of interest was extracted, it was subjected to pre-processing and measurements, described in detail in the following sections of this paper. The procedure aimed at visualization of changes in the ECG signal over the three days and extraction of features relevant for comparison of data obtained from different participants.
3.1. Data Pre-processing
In the initial stage of processing, the lead-I ECG signal was filtered to suppress high-frequency noise and remove baseline drift. This was done by applying a bandpass Finite Impulse Response (FIR) filter with cutoff frequencies of 3 and 45 Hz. The filtered signal was used to detect the locations of R-peaks with the Engelse-Zeelenberg approach modified by Lourenco et al. [12]. As a fallback, for the rare cases in which this method failed to reliably identify the peaks (fewer than 3 of them found in a ten-second recording), the detection was repeated using the method of Christov [13]. The identified R-peaks were used as reference points for extracting heartbeat templates, defined in a time window from 0.3 s before to 0.4 s after the peak. For both procedures, algorithms implemented in the BioSPPy package [8] were used. The pre-processing stage was finalized by averaging the heartbeat templates extracted from a single recording to improve the signal-to-noise ratio [14]. Additionally, parameters describing the heart rate (mean duration and standard deviation of R-to-R intervals) were derived.
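The last two pre-processing steps (template averaging and R-to-R statistics) can be illustrated with a minimal, dependency-free sketch. It assumes that R-peak positions are already available as sample indices; the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used. The study itself relied on BioSPPy for peak detection and template extraction.

```python
from statistics import mean, stdev


def rr_statistics(r_peaks, fs):
    """Mean and standard deviation (in seconds) of R-to-R intervals.

    r_peaks: sample indices of detected R-peaks; fs: sampling rate in Hz.
    """
    rr = [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]
    return mean(rr), (stdev(rr) if len(rr) > 1 else 0.0)


def average_template(signal, r_peaks, fs, before=0.3, after=0.4):
    """Average the beats cut from 0.3 s before to 0.4 s after each R-peak."""
    nb, na = round(before * fs), round(after * fs)
    # Skip beats whose window would run past either end of the recording.
    beats = [signal[r - nb:r + na] for r in r_peaks
             if r - nb >= 0 and r + na <= len(signal)]
    return [mean(column) for column in zip(*beats)]
```

Averaging sample by sample across beats suppresses noise that is uncorrelated with the R-peak position, which is the signal-to-noise argument made in the text.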
3.2. Heartbeat Template Measurements
Some of the features used in the further processing stage were defined on the basis of characteristic intervals and amplitudes of the waveforms present in a standard lead-I ECG signal. To measure them, methods for locating key points (peaks of the P, Q, R, S, and T waves, as well as onsets and endpoints of some of them) in the heartbeat templates were developed. The location of the R-peak in the heartbeat signal was fixed by the beat extraction procedure. The top of the P wave was defined as the maximum before the R-peak, excluding the 0.05 s directly preceding it. A similar, mirrored procedure was applied to determine the top of the T wave. The Q and S points were found as local minima within a fixed, short time window before and after the R-peak, respectively. The S wave endpoint, needed mainly for ST elevation measurements, was defined as the point where the positive slope after S falls below 90% of its value at S.
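A minimal sketch of the P, Q and S search described above is given below. The 0.05 s guard interval comes from the text; the width of the Q/S search window (qs_window) is a hypothetical value, because the source only calls it "fixed, short". For simplicity, the sketch takes the lowest sample in each window rather than testing for a strict local minimum.

```python
def find_p_q_s(template, r_idx, fs, guard=0.05, qs_window=0.06):
    """Locate the P-wave top and the Q and S points in an averaged template.

    template: list of samples; r_idx: index of the R-peak; fs: sampling rate in Hz.
    qs_window is an assumed value, not taken from the paper.
    """
    g, w = round(guard * fs), round(qs_window * fs)
    # P top: maximum before R, excluding the guard interval right before it.
    p_idx = max(range(0, r_idx - g), key=lambda i: template[i])
    # Q: lowest sample in a short window before R.
    q_idx = min(range(max(r_idx - w, 0), r_idx), key=lambda i: template[i])
    # S: lowest sample in a short window after R.
    s_end = min(r_idx + w, len(template) - 1)
    s_idx = min(range(r_idx + 1, s_end + 1), key=lambda i: template[i])
    return p_idx, q_idx, s_idx
```

The T-wave top could be found by mirroring the P search to the samples after the R-peak.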
The search for onsets and endpoints of the P and T waves followed the idea described by Laguna et al. [15]. In a specific time window preceding or following the wave peak of interest (for an onset or an endpoint of the wave, respectively), the point with the maximal slope is found. Moving further away from the peak, the algorithm searches for a point at which the slope drops to a threshold value. The threshold is defined as a percentage (e.g. 2%) of the maximum slope value before or after the peak. In the absence of such a point, the point with the minimal slope within the time window (counted from the maximal-slope point) is marked as the onset or endpoint. The threshold values and time window durations were adjusted empirically. Exemplary results of the ECG key point search are presented in Fig. 1. Each subplot presents an averaged heartbeat template for the respective day of measurements for the same participant. The detected ECG points are marked as red dots.
Fig. 1. ECG key points detection – exemplary results
The points were used to measure intervals and amplitudes of the ECG signals. For estimation of amplitudes, the level of Q was regarded as the baseline. ST elevation was defined as the difference in amplitude between the endpoint of the S wave and the onset of the T wave.
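The slope-threshold search for a wave onset, as described earlier in this section, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the slope is approximated by the absolute first difference, the window is given in samples, and the function name is hypothetical. An endpoint search would mirror the same logic to the samples after the peak.

```python
def find_onset(template, peak_idx, window, threshold=0.02):
    """Search backwards from a wave peak for its onset using a slope threshold.

    window: number of samples before the peak to search;
    threshold: fraction of the maximal slope (0.02 corresponds to 2%).
    """
    lo = max(peak_idx - window, 1)
    # Absolute first difference as a simple slope estimate at each sample.
    slope = {i: abs(template[i] - template[i - 1]) for i in range(lo, peak_idx + 1)}
    steepest = max(slope, key=slope.get)
    limit = threshold * slope[steepest]
    # Walk away from the peak, starting at the steepest point.
    for i in range(steepest, lo - 1, -1):
        if slope[i] <= limit:
            return i
    # Fallback: flattest point between the window start and the steepest point.
    return min(range(lo, steepest + 1), key=slope.get)
```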
3.3. Morphological Comparison of Heartbeat Templates
Another set of parameters was derived from a comparison of the morphology of the extracted heartbeats, either the full set of beats from one signal or the set of three averaged beats from the three days (for a given participant). To exclude correlation changes stemming from the heart rate changing between the days (which influences, among others, the duration of the ST interval), processing in this part was conducted only on the parts of the heartbeats corresponding to QRS complexes, whose shape did not exhibit any heart-rate dependency. A basic measure for comparing heartbeats is the Pearson r coefficient, also referred to as the Pearson product-moment correlation coefficient. Its value was computed for every pair of heartbeats within the analyzed set, creating a beat-to-beat correlation matrix. To ensure that only the shape of the beats is compared, with no influence of residual baseline drift, the coefficient was calculated using the first differences of the signals. The mean value of the correlation matrix was used as a feature in the further analysis.
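The beat-to-beat correlation feature can be illustrated with the sketch below. It computes the Pearson r coefficient on first differences for every pair of beats and averages the results. Whether the diagonal of the correlation matrix entered the mean is not stated in the text; the sketch assumes that only distinct pairs are averaged, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def first_difference(x):
    """First differences of a signal; removes any constant baseline offset."""
    return [b - a for a, b in zip(x, x[1:])]


def pearson_r(a, b):
    """Pearson product-moment correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def mean_beat_correlation(beats):
    """Mean Pearson r over all distinct pairs of beats, using first differences."""
    diffs = [first_difference(beat) for beat in beats]
    pairs = [pearson_r(a, b) for a, b in combinations(diffs, 2)]
    return sum(pairs) / len(pairs)
```

Because a constant offset vanishes in the first difference, two beats that differ only by a baseline shift yield a correlation of 1.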
Another aspect of beat contour analysis is morphological classification. It was developed at the University of Glasgow as part of their 12-lead ECG analysis algorithm [16], [17]. Following this approach, QRS complexes from the first day were iteratively compared with respect to their morphology (Pearson r coefficient). Similar complexes were grouped into a class; if a complex was not sufficiently similar to any existing class, a new class was created. Beats within each class were averaged to serve as templates for comparison with signals from the second and third day. Beats from these two days were assigned to the first-day class to which they were the most similar. If the Pearson coefficient between a beat and each of the class templates fell below a specific threshold, the beat was considered an outlier. The percentage of such morphological outliers for the given participant was another feature derived in this way.
3.4. Features Definition
ECG features were derived from the measurements using the approaches described above. Ten features aimed at comparing the data obtained on the three days were defined as described below. Abbreviations of the feature names, provided in parentheses, are used later in the figures presented in the results section.
• Shape coefficient of the P wave, defined as the ratio of the height of the wave to its width; the features used express the change in this value from day 1 to day 2 or 3 (P_shape_12 and P_shape_13, respectively).
• Difference in duration of the QT interval on day 2 or 3 with respect to day 1 (QT_12 and QT_13, respectively).
• Difference in duration of the RR interval on day 2 or 3 with respect to day 1 (RR_12 and RR_13, respectively).
• Change (difference) in mean correlation of heartbeat templates from the second or third recording with respect to the correlation on the first day (correlation_12 and correlation_13, respectively).
• Maximal ST elevation (max_ST_elev): the maximum of the values measured on the three days. The maximum from any of the days was chosen because ST elevation itself, not necessarily its change from day to day, should be regarded as an alarming ECG feature [18].
• Percentage of morphological outliers (morph_outliers): the percentage of beats from days 2 and 3 not matching any beat class defined on day 1 for the given participant (expressed relative to the total number of beats from the three days), as defined in the previous section.
Features based on differences between days are defined by subtracting the value on day 2 or 3 from the value on day 1. Therefore, positive values of these features indicate a decrease with respect to day 1 (shortening of intervals or decline in correlation).
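The sign convention of the difference features can be made concrete with a short sketch. The feature names follow the abbreviations above, while the per-day measurement keys (P_shape, QT, RR, correlation, ST_elev) and the function name are hypothetical.

```python
def comparison_features(day1, day2, day3, morph_outliers):
    """Assemble the ten features for one participant.

    Each day argument is a dict of per-day measurements (assumed layout).
    Difference features are value on day 1 minus value on day 2 or 3.
    """
    features = {}
    for name in ("P_shape", "QT", "RR", "correlation"):
        features[f"{name}_12"] = day1[name] - day2[name]
        features[f"{name}_13"] = day1[name] - day3[name]
    # ST elevation is taken as the maximum over the three days, not as a change.
    features["max_ST_elev"] = max(d["ST_elev"] for d in (day1, day2, day3))
    features["morph_outliers"] = morph_outliers
    return features
```

For example, an RR interval of 1.0 s on day 1 and 0.5 s on day 2 gives RR_12 = 0.5, a positive value indicating a shortened interval, that is, a faster heart rate.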
4. Feature Set Analysis
Analysis of the derived set of features was performed predominantly by unsupervised clustering. Since clustering on the entire dataset tended to yield one or more large clusters containing the majority of the points and a few 'far outliers' (points significantly separated from the majority group), a two-stage procedure was developed. After the first-pass analysis and clustering, the outlier clusters (each containing less than 10% of the total number of observations) are removed and the analysis is repeated to reveal the structure of the majority clusters. Each of the two stages consists of two main elements: principal component analysis (PCA) and K-means clustering combined with silhouette analysis, described in the following sections of this paper.
4.1. Principal Component Analysis
Principal Component Analysis is a statistical procedure aimed at reducing the dimensionality of the data to be clustered. It maps the observation matrix onto a new orthogonal space whose axes are referred to as principal components (PCs). The orientation of the new space is chosen such that the first principal component is aligned with the direction of the highest variance in the data; the same applies to each consecutive principal component, under the constraint that the new PC is orthogonal to all the previously defined ones. Consequently, each successive PC explains a smaller portion of the dataset variance, expressed by the eigenvalue of the component. It is then possible to reduce the dimensionality by discarding the less meaningful principal components and retaining only the first few, which together account for the majority (e.g. 80%) of the data variance [19]. PCA is frequently applied prior to K-means clustering. It not only reduces computational effort by decreasing the number of dimensions to be analyzed, but also suppresses the effect of possible correlation between the original features (which is referred to as whitening [20]). Furthermore, by inspecting the eigenvectors of the components it is possible to evaluate the contribution of each of the original features to the principal components, and hence their statistical significance. PCA was applied to the set of features at both main stages of the analysis, after data normalization. Six principal components, explaining about 80% of the data variance, were retained. The number of PCs was chosen to balance dimensionality reduction against the retained portion of the variance. The data mapped onto the PC space were passed to clustering and silhouette analysis.
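The rule of retaining the first few components that together explain a target share of the variance can be written down in a few lines. The sketch below assumes that the eigenvalues of the components are already known (for instance from a PCA routine); the function name and the example eigenvalues used to exercise it are hypothetical and are not the values obtained in the study.

```python
def components_for_variance(eigenvalues, target=0.80):
    """Smallest number of leading PCs whose cumulative explained variance reaches the target."""
    total = sum(eigenvalues)
    cumulative = 0.0
    # Components are ranked by decreasing eigenvalue, i.e. by explained variance.
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value / total
        if cumulative >= target:
            return k
    return len(eigenvalues)
```

With illustrative eigenvalues 4, 3, 2 and 1, the cumulative shares are 40%, 70% and 90%, so three components are needed to pass the 80% mark.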
4.2. Clustering with Silhouette Analysis
Since no prior assumptions about the structure of the data were made, and K-means clustering requires the number of clusters as an input, silhouette analysis was run on the dataset to determine the best number of clusters. Silhouette analysis validates the consistency of the computed clusters by comparing the cohesion of each sample (how well it fits the cluster it was assigned to) with its separation from the other clusters. The resulting silhouette score lies between -1 and 1. A high score represents good sample classification, whereas negative values indicate that the sample might have been assigned to an improper cluster. The average silhouette score of all the samples allows assessing the general consistency and validity of the clustering [21]. In this project, silhouette analysis on the PC-transformed data was performed for numbers of clusters (computed by the K-means algorithm) ranging from 2 to 7. The average silhouette scores were compared, and the number of clusters corresponding to the highest score (the best cluster separation) was chosen for further analysis. K-means clustering with the chosen number of clusters was applied to the dataset mapped onto the reduced principal component space. The result was presented and analyzed graphically in both the PC space and the original feature space.
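To make the selection criterion concrete, the sketch below computes the average silhouette score for a given labelling and picks the candidate number of clusters with the highest score. It is a plain-Python illustration with Euclidean distance, not the scikit-learn routine used in the study; the labelings argument (a mapping from a cluster count to the labels produced by some clustering run) and both function names are hypothetical. At least two clusters are assumed.

```python
from math import dist


def mean_silhouette(points, labels):
    """Average silhouette score of a labelling (requires at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # Cohesion: mean distance to the other members of the same cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # Separation: mean distance to the nearest other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_cluster_count(points, labelings):
    """Pick the cluster count whose labelling has the highest mean silhouette."""
    return max(labelings, key=lambda k: mean_silhouette(points, labelings[k]))
```

In the setting of this paper, labelings would hold the K-means results for 2 to 7 clusters computed on the PC-transformed data.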
4.3. Feature Set Analysis Framework
In this section, the methods of the feature set analysis are summarized and the detailed sequence of operations on the dataset is presented.
(a) The 10-dimensional set of features is first subjected to normalization.
(b) PCA is performed to map the set to a reduced, 6-dimensional space.
(c) The number of clusters is chosen by the silhouette analysis.
(d) K-means clustering is applied to the PC-transformed dataset.
(e) The results of clustering are presented in both feature spaces. Additionally, eigenvectors and eigenvalues are visualized to analyze the statistical significance of the original features.
(f) If any of the clusters contains less than 10% of all observations, the corresponding samples are removed from the original observations matrix (with all the 10 features retained).
(g) Steps a-e are repeated for the corrected observations set.
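Step (f) of the framework, the removal of small clusters before the second pass, can be sketched as follows. The function name is hypothetical; the 10% limit is the one stated above, and the labels are assumed to come from the first-stage clustering.

```python
from collections import Counter


def drop_outlier_clusters(observations, labels, min_fraction=0.10):
    """Remove observations belonging to clusters with less than min_fraction of all samples.

    Returns the retained observations and the set of removed cluster labels.
    """
    counts = Counter(labels)
    n = len(labels)
    small = {c for c, size in counts.items() if size < min_fraction * n}
    kept = [obs for obs, lab in zip(observations, labels) if lab not in small]
    return kept, small
```

The retained observations keep all ten original features, so that normalization, PCA and clustering can be repeated on them in the second stage.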
5. Results and Discussion
Results of the first-stage clustering analysis are presented in Fig. 2 and Fig. 3, and for the second stage – in Fig. 4 and Fig. 5.
Fig. 6 depicts the outcome of PCA. As can be seen in the figures, a 2D presentation of the results provides only a limited view, and it is necessary to look at different combinations of the dimensions to observe the separation between clusters. The results of clustering in the PC space and in the original feature space are presented as scatter plots of observations in two of the feature space dimensions (Fig. 2 to Fig. 5). Because the features have been normalized, the exact displayed values should not be interpreted directly. The results of PCA are shown as bar plots of the eigenvectors of the components (Fig. 6). Starting from the top, the subplots refer to consecutive principal components. The statistical significance of each component, defined as the portion of the dataset variance it explains, is added to the vertical label of each subplot (marked as ExpVar). The heights of the bars in the subplots correspond to the contributions of the original, normalized features (whose names are listed at the bottom of the plots) to the principal components. As shown in Fig. 2, the first stage of the analysis produced the expected unbalanced result: the majority (more than 90%) of observations were assigned to a single cluster (labeled 1), while the remaining two clusters are much smaller. As presented in Fig. 2a, cluster 2 is well separated from the other two with respect to the fourth and fifth principal components, which are defined predominantly by the percentage of morphological outliers and ST elevation (Fig. 6a). Indeed, this separation is explained predominantly by the first of these features: as presented in Fig. 3a, cluster 2 is composed of the observations with relatively high values of the morphological outliers percentage, while for most observations the values are equal or close to 0. On the other hand, cluster 0 in this projection partially overlaps with both clusters 1 and 2.
However, it is clearly separated when observed from principal components 1 and 3, both of which exhibit
Fig. 2. Result of clustering on the full dataset, in the principal component space; (a) projection on principal components 4 and 5; (b) projection on principal components 2 and 3
Fig. 3. Result of clustering on the full dataset, in the original feature space; (a) projection on QT interval difference (days 1 and 2) and percentage of morphological outliers; (b) projection on correlation difference-related features
Fig. 4. Result of clustering on the restricted dataset, in the principal component space; (a) projection on principal components 1 and 4; (b) projection on principal components 2 and 3; (c) projection on principal components 5 and 6
Fig. 5. Result of clustering on the restricted dataset, in the original feature space; (a) projection on RR interval difference (days 1 and 2) and maximal ST elevation; (b) projection on correlation difference between days 1 and 2 and percentage of morphological outliers; (c) projection on the correlation difference-related features; (d) projection on differences in QT and RR intervals between days 1 and 3; (e) projection on the P-shape-related features
high dependence on correlation-related features (see Fig. 3b). Furthermore, analysis of the projection of the dataset onto these two features leads to interesting observations. The majority of the points are concentrated around the (0,0) point, indicating little change in the intra-recording correlation of heartbeat templates from the first to the second or the third day. In some cases, the correlation is reduced with respect to the first day. However, for some participants, the correlation increased considerably, by approximately the same amount on both the second and the third day. The latter group constitutes cluster 0. Hence, for participants in this cluster, the correlation on the second and third day was at a comparable level, relatively high compared to day 1. This is typically not accompanied by an increased percentage of morphological outliers, since this feature always uses day 1 as a reference. Since clusters 0 and 2 encompassed a minor portion of the observations (1.2% and 4.8% respectively), they were excluded from further analysis, and the second stage of the procedure was conducted on the points originally assigned to cluster 1; the results are presented in Fig. 4 and Fig. 5. Due to the high number of clusters (7), proper visualization of the separation in just two dimensions is further obstructed. The three major clusters, labeled 0, 1, and 4, can be discriminated by looking, among others, at principal components 1 and 4 (Fig. 4a), which depend on maximal ST elevation and on features related to the QT and RR intervals (Fig. 5a). However, the separation cannot be clearly visualized in just two dimensions. Possibly, this division is of lesser significance than the other clusters distinguished in this set. Clusters 2, 3, and 6 can be distinguished by projection onto principal components 2 and 3, defined predominantly by features associated with the shape of the P wave, correlation, and the morphological outliers percentage (Fig. 6b and Fig. 4b).
Statistical significance of the latter was slightly lower than in the first stage of the analysis (considering its contribution to the first two principal components); however, it is still one of the main components differentiating cluster 2 from the others (as shown in Fig. 5b). This is particularly interesting when compared to the correlation representation of the clustering result (Fig. 5c). Cluster 2 consists of points for which decreased correlation was indeed observed, but predominantly on either day 2 or day 3, rarely on both days. On the other hand, cluster 6 exhibits improved correlation on both day 2 and day 3. As in the first stage of the analysis, this does not necessarily entail an increase in the percentage of morphological outliers. A closer look at the P shape, in turn, makes it possible to discriminate cluster 3 (as shown in Fig. 5e). For participants belonging to this cluster, the P wave was flattened (lower height-to-width ratio) on days 2 and 3 with respect to day 1. The change in shape was more prominent than that observed in the other groups. Finally, cluster 5, which appears to overlap with other clusters in most dimensions, is in fact distinctly separated with respect to principal components 5 and 6 (Fig. 4c). The fact that it was not reflected in any of the first, more important principal components
could be attributed to the relatively small size of this cluster (about 0.5% of all observations), which diminishes its impact on the total variance of the dataset. The original features that contribute the most to this component include those related to the QT and RR intervals. As presented in Fig. 5d, a decrease in the duration of the QT interval is in general correlated with an increase in the RR interval. For cluster 5, however, this trend does not apply. The values of the RR interval overlap with those of other clusters, but the QT interval on day 3 is shortened to a much greater extent. This effect is not present on day 2. Further detailed investigation of these cases is needed to determine whether the phenomenon results from improper key point localization or is a sign of a potential cardiac issue. Although the identified clusters are usually not distinctly separated from one another, they are defined by common trends in combinations of certain features. A summary of the clustering procedure and results is presented in Fig. 7 in the Appendix.
6. Conclusions
The NEEDED study focuses on the characterization of patterns associated with prolonged endurance exercise. One of its major goals is the identification of parameters related to ECG and heart rate that could be used to distinguish between regular and deviated performance of the heart. In this part of the research, several potentially discriminative features were recognized. Further investigation and validation with additional data is needed to verify which of them could serve as criteria in the detection of electrocardiophysiological abnormalities. In the first stage of the analysis, the crucial features were associated predominantly with the correlation between the beats. The impact of the correlation-related features was slightly diminished, but still considerable, during the second stage of the clustering. The other particularly meaningful features included P shape, RR interval and QT interval, the latter two exhibiting some correlation. Interestingly, the heart rate (described by the RR interval) was not always increased after the race; frequently, the direction of the change was the same on days 2 and 3 with respect to day 1. More valuable information could be extracted by comparing these trends with additional data, including participant details (sex, age, level of physical activity) as well as data on the time interval between finishing the race and collecting the recording of the individual participant. It should be noted that there was no universal feature or principal component that would provide separation between all the clusters globally. On the other hand, each cluster could be described by a combination of two to four features that made it distinguishable from the other clusters. Determination of the features defining the individual clusters was facilitated by analysis of the eigenvectors of the principal components. However, PCA is based only on variance, so a feature that is prominent in one of the first principal components is only a candidate for a cluster-determining property.
On the other hand, features truly significant for separation are always marked in the principal components’ eigenvectors.
It should be noted that the produced model of the analyzed dataset suits the expected structure of the test population well. Participants with deviating ECG parameters constitute a minority. Most of the observations fall into the normal ranges or exhibit only slight alterations of different types, reflecting physiological phenomena with ontogenetic variability. Future work includes a fusion of time-domain and frequency-domain analysis of the collected ECG data. Furthermore, the dataset will be supplemented with additional information, including, among others, the participants’ age, gender, race completion time, and an indication of cardiovascular system condition. This will make it possible to verify the results concerning the significance of the ECG features derived and investigated in this paper. What is more, the supplementary data will enable the introduction of supervised learning methods to the analysis. The algorithm will be trained to eventually differentiate between natural groups of participants and to report possible cases of alarming ECG parameters. Additional analysis will be performed for a set of competitors participating in more than one edition of the race, to study the long-term influence of endurance effort on cardiac physiology in professionals and amateurs.
The presented method produces a hierarchical structure of clusters from the dataset. This allows a two-level investigation of the data structure, with separate investigation of large discrepancies and of more subtle trends in the dataset. Furthermore, the hierarchy scheme is also followed in the analysis of features having a particular impact on the dataset partitioning. Combined with additional data, it could be used to differentiate between natural, physiological groups in the population and for early detection of certain cardiac abnormalities.
Fig. 6. Principal component analysis results: eigenvectors and explained variance portions of the six components; results of the first (a) and second (b) stage of the analysis
AUTHORS
Dominika Długosz – Łódź University of Technology, Institute of Electronics, ul. Wólczańska 211/215, 90-924 Łódź, Poland, e-mail: 195887@edu.p.lodz.pl
Trygve Eftestøl* – University of Stavanger, Faculty of Science and Technology, Department of Electrical and Computer Engineering, 4036 Stavanger, Norway, e-mail: trygve.eftestol@uis.no
Aleksandra Królak* – Łódź University of Technology, Institute of Electronics, ul. Wólczańska 211/215, 90-924 Łódź, Poland, e-mail: aleksandra.krolak@p.lodz.pl
Tomasz Wiktorski – University of Stavanger, Faculty of Science and Technology, Department of Electrical and Computer Engineering, 4036 Stavanger, Norway, e-mail: tomasz.wiktorski@uis.no
Stein Ørn – University of Stavanger, Faculty of Science and Technology, Department of Electrical and Computer Engineering, 4036 Stavanger, Norway, e-mail: stein.orn@uis.no
*Corresponding author
REFERENCES
[1] X. Dong, C. Wang, W. Si, “ECG beat classification via deterministic learning”, Neurocomputing, vol. 240, May 2017, 1–12. DOI: 10.1016/j.neucom.2017.02.056.
[2] F. Castells, P. Laguna, L. Sornmo, A. Bollmann, J. Roig, “Principal component analysis in ECG signal processing”, EURASIP J. Adv. Signal Process., 2007. DOI: 10.1155/2007/74580.
[3] A. Daamouche, L. Hamami, N. Alajlan, F. Melgani, “A wavelet optimization approach for ECG signal classification”, Biomed. Signal Process. Control, vol. 7, 342–349, Jul. 2012. DOI: 10.1016/j.bspc.2011.07.001.
[4] D. Benitez, P. Gaydecki, A. Zaidi, A. P. Fitzpatrick, “The use of the Hilbert transform in ECG signal analysis”, Comput. Biol. Med., vol. 31, no. 5, 399–406, 2001. DOI: 10.1016/S0010-4825(01)00009-9.
[5] M. Moavenian, H. Khorrami, “A qualitative comparison of Artificial Neural Networks and Support Vector Machines in ECG arrhythmias classification”, Expert Syst. Appl., vol. 37, no. 4, Apr. 2010, 3088–3093. DOI: 10.1016/j.eswa.2009.09.021.
[6] M. M. A. Rahhal, Y. Bazi, H. AlHichri, N. Alajlan, F. Melgani, R. Yager, “Deep learning approach for active classification of electrocardiogram signals”, Inf. Sci., vol. 345, Jun. 2016, 340–354. DOI: 10.1016/j.ins.2016.01.082.
[7] M. Lagerholm, C. Peterson, G. Braccini, L. Edenbrandt, L. Sornmo, “Clustering ECG complexes using Hermite functions and self-organizing maps”, IEEE Trans. Biomed. Eng., vol. 47, no. 7, 838–848, Jul. 2000. DOI: 10.1109/10.846677.
[8] “biosppy.signals — BioSPPy 0.2.2 documentation”. [Online]. Available: http://biosppy.readthedocs.io/en/stable/biosppy.signals.html#biosppy-signals-ecg. [Accessed: 16-Jul-2016].
[9] “Documentation — SciPy.org”. [Online]. Available: https://www.scipy.org/docs.html. [Accessed: 29-Apr-2017].
[10] “scikit-learn: machine learning in Python — scikit-learn 0.18.1 documentation”. [Online]. Available: http://scikit-learn.org/stable/. [Accessed: 29-Apr-2017].
[11] F. Pedregosa et al., “Scikit-learn: Machine learning in Python”, J. Mach. Learn. Res., vol. 12, Oct. 2011, 2825–2830. DOI: 10.1016/j.patcog.2011.04.006.
[12] A. Lourenço, H. Silva, P. Leite, R. Lourenco, A. Fred, “Real Time Electrocardiogram Segmentation for Finger based ECG Biometrics (PDF) – Semantic Scholar”. [Online]. Available: https://www.semanticscholar.org/paper/Real-Time-Electrocardiogram-Segmentation-for-Louren%C3%A7o-Silva/358eee4f2080303f1ad0c7df866b98fb89222d8d/pdf. [Accessed: 13-Aug-2016].
[13] I. I. Christov, “Real time electrocardiogram QRS detection using combined adaptive threshold”, Biomed. Eng. OnLine, vol. 3, 2004, p. 28. DOI: 10.1186/1475-925X-3-28.
[14] A. Gautam, Y. D. Lee, W. Y. Chung, “ECG Signal De-noising with Signal Averaging and Filtering Algorithm”. In: Third International Conference on Convergence and Hybrid Information Technology, 2008, vol. 1, 409–415. DOI: 10.1109/ICCIT.2008.393.
[15] P. Laguna, R. Jané, P. Caminal, “Automatic detection of wave boundaries in multilead ECG signals: Validation with the CSE database”, Comput. Biomed. Res., vol. 27, no. 1, 1994, 45–60. DOI: 10.1006/cbmr.1994.1006.
[16] P. W. Macfarlane, B. Devine, E. Clark, “The university of Glasgow (Uni-G) ECG analysis program”, Computers in Cardiology, 2005, Lyon, 2005, 451–454. DOI: 10.1109/CIC.2005.1588134.
[17] “Glasgow 12-lead Analysis Program – Physician’s Guide”, Physio Control. [Online]. Available: https://docs.google.com/viewerng/viewer?url=http://www.physio-control.com/uploadedFiles/learning/clinical-topics/Glasgow_PhysiciansGuide.pdf. [Accessed: 21-Jul-2016].
[18] K. Wang, R. W. Asinger, H. J. Marriott, “ST-segment elevation in conditions other than acute myocardial infarction”, N. Engl. J. Med., vol. 349, no. 22, 2003, 2128–2135. DOI: 10.1056/NEJMra022580.
[19] U. Demšar, P. Harris, C. Brunsdon, A. S. Fotheringham, S. McLoone, “Principal Component Analysis on Spatial Data: An Overview”, Ann. Assoc. Am. Geogr., vol. 103, no. 1, 106–128, Jan. 2013. DOI: 10.1080/00045608.2012.689236.
[20] B. Hariharan, J. Malik, D. Ramanan, “Discriminative Decorrelation for Clustering and Classification”. In: Computer Vision – ECCV 2012, 2012, 459–472. DOI: 10.1007/978-3-642-33765-9_33.
[21] P. J. Rousseeuw, “Silhouettes: a graphical aid to the interpretation and validation of cluster analysis”, J. Comput. Appl. Math., vol. 20, 53–65, 1987. DOI: 10.1016/0377-0427(87)90125-7.
7. Appendix
Fig. 7. An overview of the hierarchical cluster analysis results with feature combinations defining each of the clusters
Virtual Tour for Smart House Developed in Unity 3D Engine and Connected with Microcontroller
Submitted: 15th December 2017; accepted: 15th March 2018
Erik Kučera, Oto Haffner, Erich Stark
DOI: 10.14313/JAMRIS_1-2018/4
Abstract: Nowadays, virtual tours are very popular, and many people would like to see a virtual house before the acquisition of the real one. The paper demonstrates the creation of a virtual tour for a smart house developed in Unity engine. This virtual tour is connected with a microcontroller from the Arduino family to which several sensors and actuators are attached. These electronic devices react to the events in the virtual tour and vice versa.
Keywords: virtual tour, smart house, microcontroller, Unity engine, virtual reality, mixed reality
1. Introduction
Modern forms of visualisation are now realized on the basis of new ICT technologies (e.g. interactive applications made in a 3D game engine [7], virtual reality or mixed reality). Visualisation of process modelling, identification and control of complex mechatronic systems, elements and drives using virtual and mixed reality allows students to get a much better and quicker understanding of the studied subject compared to conventional teaching methods. Nowadays, there is a trend of using interactive 3D applications and virtual reality in virtual tours of houses, cars and other products. Many interactive 3D applications for education are also being developed. Toyota offers a modern virtual showroom [5] for its customers. This showroom was developed using Unreal Engine. There are also interactive applications from Animech Technologies. This company offers many educational modules such as Virtual Car, Virtual Truck or Virtual Gearbox [6]. Using these applications, students can understand the functionality of the mentioned devices, look into their interior and detach their individual components. A very interesting project is a virtual clinic [3]. This project is supported by the University of Miami and Charles R. Drew University of Medicine and Science in Los Angeles. This interactive application offers students an insight into the actual functioning of a larger clinic, and they can also try to diagnose patients. Students are thus trained through a real experience with the health system, while this complex system is modelled and simulated in virtual reality. An absolute novelty is Microsoft HoloLens [4], the arrival of which has led to the emergence of a completely new segment of mixed reality. Mixed reality has unquestionable advantages over virtual reality, as the user perceives the real world and a virtual world at the same time. The practical usefulness of this feature is undisputed, and it is assumed that mixed reality will become a new standard in many areas such as education, marketing, modeling of complex mechatronic systems, etc.
Fig. 1. Microsoft HoloLens – mixed reality application (Volvo) [8]
For Microsoft HoloLens there are further educational and virtual tour applications. The HoloTour application [2] provides 360-degree spatial video of historical places such as Rome or Peru. The application complements the video with 3D models of important landmarks that have not been preserved and with supplementary holographic information about elements in the scene.
2. Proposed Vision
The impulse for the project was the vision of intelligent house control in mixed reality. Mixed reality (MR) is the merging of real and virtual worlds to produce new environments and visualisations where physical and digital objects co-exist and interact in real time. MR is an overlay of synthetic content on the real world that is anchored to and interacts with the real world. The key characteristic of MR is that the synthetic content and the real-world content are able to react to each other in real time. The term was popularized by Microsoft with the development of Microsoft HoloLens. The vision is to connect a mixed reality device (Microsoft HoloLens) with hardware in a smart house, such as devices for the control of sun blinds. The user will then be able to control smart house devices using HoloLens. A useful feature of HoloLens is that it can recognize the room in which the user is located. This suits the proposed vision, as the application can offer control only for devices that are present in the same room as the user.
The question is how to connect Microsoft HoloLens with sensors and actuators. Applications for HoloLens can be developed only with Unity engine or Microsoft APIs, so the first step in the proposed vision was to connect Unity engine with hardware. For prototyping, an Arduino family microcontroller was chosen. This connection of Unity engine with sensors and actuators is described in this paper.
3. Main Aspects of Proposed Application
This paper describes an interactive 3D application that simulates a virtual tour of the smart house and its exterior. The application is implemented in Unity engine. Since it is an interactive application that responds to perceptions and changes from the environment, it must be connected with external hardware that captures signals from the environment and sends the data to the application. An Arduino family microcontroller was chosen as the best candidate to solve this problem. Arduino is connected to the computer via the USB port, and the connection is established through the serial port. Through this port, the data from the sensors is sent to the application. It is important to note that communication runs not only in one direction (from Arduino to the computer) but also from the computer to Arduino, so it is possible to control actuators connected to Arduino. The scheme can be seen in Fig. 2.
Fig. 2. A scheme of the proposed application
The proposed application has its own data storage, which can be used for statistical evaluations or retrieval of historical data. The database was created using the cloud service Microsoft Azure [1]. The application must meet these functional requirements:
- option to move in the house and in outdoor areas
- ability to view the current temperature
- user menu and ability to set the COM port for the Arduino microcontroller
- ability to turn the light in rooms on/off by loud sounds like clapping
- fan rotation on room ceilings when the temperature is higher than a certain value
- triggering of the fire alarm when the presence of fire is detected in the real environment
- drawing the curtains in the living room in low light conditions and vice versa
- option to turn a TV on/off using an IR controller when the user is at a sufficient distance from the TV
- option to view historical data about indoor and outdoor temperature
- alerting the user of the unfavourable state of the application
A use-case diagram is in Fig. 3.
[Use-case diagram. Actor: User. Use cases: Show user menu; Show interface of central control unit; Show indoor temperature; Show outdoor temperature; Show historical data; Turn on/off the TV; Turn on/off the light in the room; Free movement in the house.]
Fig. 3. Use-case diagram
4. Sensors and Actuators
The application relies on a number of sensors and actuators connected to the microcontroller:
- fire sensor
- sound sensor
- light sensor
- temperature sensor
- IR receiver
Each of these sensors is mapped to a certain functionality in the application. A few actuators are also used:
- LED
- buzzer
4.1. Fire Sensor
One of the basic sensors of the proposed system is a fire sensor (Fig. 4) that detects the presence of a flame. In principle, it detects infrared light with a wavelength in the range of 760 to 1100 nm. Its core parts include an infrared sensor, a potentiometer, an operational amplifier circuit and an LED. There are different types of these sensors, but the two best known are three-pin and four-pin sensors. Four-pin sensors have one pin for the analog connection.
Fig. 4. Fire sensor
4.2. Sound Sensor
The sound sensor (Fig. 5) is a small board with a microphone that detects sound from the environment. When it is connected to an analog pin, the intensity of the incoming sound can be measured.
Fig. 5. Sound sensor
4.3. Light Sensor
The light sensor (Fig. 6) is also called a photoelectric sensor because it converts light energy into electric signals. The more light falls on the surface of the light-sensitive part, the lower its resistance. The normal value ranges from 8 to 20 kΩ.
Fig. 6. Light sensor
4.4. Temperature Sensor
A temperature sensor (Fig. 7) of type TMP36 is used. It is a low-voltage thermal sensor that provides a voltage output that is proportional to the sensed temperature. This device is also very easy to use and requires no external calibration.
Fig. 7. Temperature sensor
4.5. IR Receiver
The last important sensor in the proposed project is the infrared receiver (Fig. 8). It also has a built-in infrared transmitter, but this is not used. A modern smartphone can be used as the infrared transmitter.
Fig. 8. IR receiver
5. Implementation of the Application
5.1. Processing Data from Sensors
As mentioned, the data from the sensors arrives from Arduino in a continuous stream. It is therefore important to define the data format so that both the sensor and the value it has captured can be easily recognized. On the Arduino side, an infinite, uninterrupted loop runs, and on the Unity side, parsing functions written in the C# programming language process the information and perform the necessary functionality. The data format is:
{sensor}+"_"+{type_of_sensor}+"_"+{measured_value}
The first part is a characteristic string that signals that the line carries data from the sensors. It is important to start with a particular string, because if the application also exchanged data from other sources, the data could get mixed up. This is a situation that should be avoided. The second part is a unique identifier of the individual sensor connected to Arduino. The last part is the value measured by that sensor. See details in Table 1.
Tab. 1. Identifiers and Possible Values for Sensors
Type of sensor       Identifier    Measured value
Light sensor         light         from 0 to 1024
Temperature sensor   temperature   from -40 to 120
Sound sensor         sound         from 0 to 2014
Fire sensor          flame         fire / calm
IR receiver          ir            signal
A very important part is the definition of the boundary values at which the system performs the function corresponding to the measured values. One of these values is the volume for the sound sensor that makes the system turn the lights in the room on or off. It is important to set this value so that it is sufficiently sensitive to clapping near the sensor but at the same time high enough to filter out any ambient noise. A few tests were made (Fig. 9), and it was found that a good boundary value is 90.
Fig. 9. Measuring the sound in the room (series: normal noise, clapping)
Another boundary value in the system is the light level in the room. If the captured light value falls below the boundary value, the curtains on the ground floor spread out. To determine the boundary value, it was necessary to make several measurements (Fig. 10) in different light conditions. The chosen boundary value is 100.
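The two boundary values described above (90 for sound, 100 for light) could drive the application state roughly as in the following sketch. It is an illustration only, not the authors' code: the class `RoomState`, its method names, and the choice of `>=` and `<` as comparison operators are assumptions.

```python
SOUND_THRESHOLD = 90   # boundary value for clapping, as reported in the text
LIGHT_THRESHOLD = 100  # boundary value for low light, as reported in the text


class RoomState:
    """Hypothetical holder of the light and curtain state of one room."""

    def __init__(self):
        self.light_on = False
        self.curtains_spread = False

    def on_sound(self, level):
        """Toggle the room light when a sound reaches the threshold."""
        if level >= SOUND_THRESHOLD:  # assumption: the boundary itself counts as a clap
            self.light_on = not self.light_on
        return self.light_on

    def on_light(self, level):
        """Spread the curtains out in low light and pull them back otherwise."""
        self.curtains_spread = level < LIGHT_THRESHOLD
        return self.curtains_spread


room = RoomState()
print(room.on_sound(40))   # ambient noise: light stays off
print(room.on_sound(120))  # clap: light turns on
print(room.on_sound(120))  # second clap: light turns off
print(room.on_light(50))   # low light: curtains spread out
print(room.on_light(300))  # bright: curtains pulled back
```

With a continuous stream of readings, a single clap may produce several consecutive values above the threshold, so a real implementation would probably need a short dead time after each toggle; the sketch leaves this out.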
Fig. 10. Measuring the light value
The last boundary value is the temperature, which was set to 27 °C.
5.2. Classes and Data
The class diagram can be seen in Fig. 11. The MySQL database used is closely linked to the API that provides the interface between the database and the application. This API has the characteristics of a RESTful service. For the proposed needs, it is not necessary to use all CRUD operations. Three unique URIs were defined through which the database data can be accessed (Table 2).
5.3. Menu and Interface
The interface of the application should be simple and understandable. For the proposed application, it is best to use the first-person view, as it offers the most realistic experience. It is possible to rotate with the mouse and use the arrow keys for basic movement. Such control is easy to map to virtual reality in the future, which is why we decided to use a point at the center of the screen instead of a mouse cursor. Pressing Escape toggles the main menu (Fig. 12). Another important menu is the GUI for temperature inspection (Fig. 13). This menu is shown when the user is close to the central control panel in the virtual smart house. The exterior of the home has been designed to match the overall visualisation and to create the atmosphere of a luxury smart house with all the equipment, including a collection of cars, a swimming pool and trees. In the interior there are many interactive points that interact with the user in a certain way: televisions, lighting, fans and curtains. Fig. 14 shows the living room of the presented smart home, with several interactive elements. The first one is a television that can be turned on and off by an external controller. The second one is the curtains, which are pulled back and spread out automatically depending on the intensity of the light in the home (i.e. the light sensor connected to Arduino), and the third one is the ceiling fans, which spin when the interior temperature is too high. Fig. 15 shows the light in the house, which can be controlled by clapping. If the user is close to the light and claps, the light turns on or off. As stated, the sound sensor is used for this functionality. On each floor, there is a control unit on the wall (Fig. 16). When the user focuses on it and clicks, the menu of the central unit (Fig. 13) opens. Fig. 17 shows the exterior of the smart house.
6. Conclusion
Nowadays, there is a trend of using interactive 3D applications and virtual reality in virtual tours of houses, cars and other products. This paper describes an interactive 3D application that simulates a virtual tour of the smart house and its exterior. The application is implemented in Unity engine. Since it is an interactive application that responds to perceptions and changes from the environment, it must be connected with external hardware that captures signals from the environment and sends the data to the application. In future research, it would be interesting to use this experience to develop an application for mixed reality (e.g. Microsoft HoloLens) that communicates with real sensors and actuators. In this way, it will be possible to control a real smart house using a mixed reality application.
ACKNOWLEDGEMENTS
This work has been supported by the Cultural and Educational Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic, KEGA 030STU-4/2017, and by the Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic under the grants VEGA 1/0733/16 and VEGA 1/0819/17.
Journal of Automation, Mobile Robotics & Intelligent Systems
VOLUME 12,
N° 1
2018
[Class diagram (Fig. 11). Classes shown: SerialPortLineReader, Distance, ReadSensors, ExitToMenu, CentralInformations, ActualExternalWeather, SetTemperature, BeaconController, HouseDBLoad, RotateVentilator, EnableCentralUnit, OpenCentral, ExitOnClick.]
Fig. 11. Class diagram

Tab. 2. Proposed API for MySQL database

| URI | HTTP method | Response data type | Description |
|---|---|---|---|
| uri_of_server/all | GET | JSON | All records |
| uri_of_server/{id} | GET | JSON | One record with id |
| uri_of_server/add | POST | boolean | Positive/negative answer according to the success of the operation |
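As a rough illustration of the three URIs in Tab. 2, the following Python sketch models the API as an in-memory service. The class name `TemperatureApi`, the method names and the record fields (`external`, `internal`) are hypothetical; the paper does not specify the record schema or the server implementation, and a real deployment would sit in front of the MySQL database.

```python
import json


class TemperatureApi:
    """In-memory stand-in for the three routes of Tab. 2 (hypothetical names)."""

    def __init__(self):
        self._records = {}   # id -> record dict
        self._next_id = 1

    def get_all(self):
        """GET uri_of_server/all -> JSON array with all records."""
        return json.dumps(list(self._records.values()))

    def get_one(self, record_id):
        """GET uri_of_server/{id} -> JSON object for one record (null if missing)."""
        return json.dumps(self._records.get(record_id))

    def add(self, external, internal):
        """POST uri_of_server/add -> True on success, False otherwise."""
        try:
            record = {"id": self._next_id,
                      "external": float(external),
                      "internal": float(internal)}
        except (TypeError, ValueError):
            return False
        self._records[record["id"]] = record
        self._next_id += 1
        return True
```

Only read and create operations are modelled, which matches the remark that not all CRUD operations are needed.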
Fig. 15. Lights in the bedroom
Fig. 12. Main menu
Fig. 16. Central control unit
Fig. 13. Temperature menu
Fig. 17. Exterior
Fig. 14. Interior of the living room
AUTHORS
Erik Kučera∗ – Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, Ilkovicova 3, Bratislava, Slovakia, e-mail: erik.kucera@stuba.sk, www: www.uamt.fei.stuba.sk. Oto Haffner – Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, Ilkovicova 3, Bratislava, Slovakia, e-mail: oto.haffner@stuba.sk, www:
www.uamt.fei.stuba.sk. Erich Stark – Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, Ilkovicova 3, Bratislava, Slovakia, e-mail: erich.stark@stuba.sk, www: www.uamt.fei.stuba.sk. ∗
Corresponding author
REFERENCES
[1] M. Copeland, J. Soh, A. Puca, M. Manning, and D. Gollob. “Microsoft azure and cloud computing”. In: Microsoft Azure, 3–26. Springer, 2015. [2] M. Corporation. “Holotour”, 2017.
[3] D. Parvati, W. L. Heinrichs, and Y. Patricia, “Clinispace: a multiperson 3d online immersive training environment accessible through a browser”, Medicine Meets Virtual Reality 18: NextMed, vol. 163, 2011, 173.
Journal of Automation, Mobile Robotics & Intelligent Systems
VOLUME 12,
N° 1
2018
[4] P. Rauschnabel, A. Brem, and Y. Ro. “Augmented reality smart glasses: �e�inition, conceptual insights, and managerial importance”, 07 2015.
[5] K. Sloan. “Rotor brings toyota showroom 360 to life with unreal engine”, 2016.
[6] A. Technologies. “Animech technologies showreel 2013”, 2013. [7] A. Thomas. “Variant: Limits”, 2017.
[8] E. Uhlemann, “Connected-vehicles applications are emerging [connected vehicles]”, IEEE Vehicular Technology Magazine, vol. 11, no. 1, 2016, 25– 96.
39
Ensembling a Linear Regression Model with an Error Mitigation Component
Submitted: 8th March 2018; accepted: 27th March 2018
Artur Nowosielski, Piotr A. Kowalski, Piotr Kulczycki
DOI: 10.14313/JAMRIS_1-2018/5
Abstract: This paper presents a proposal of a model error mitigation technique based on the error distribution analysis of the original model and creating the additional model that tempers the error impact in particular domain areas identified as the most sensitive. Both models are then combined into a single ensemble model. The idea is demonstrated on the trivial two-dimensional linear regression model.
Keywords: ensemble model, error mitigation, regression model, linear regression, FPA, RSS, RSE

1. Introduction
Error measurement is one of the fundamentals of mathematical modelling. By definition, error is a measure of the deviation of a modelled parameter value from its expected value. The expected value is a theoretical value for the infinite population, which means it is impossible to obtain its exact value. For machine learning models trained with a finite collection of samples, expected values are calculated from that sample set and the model error is measured against that empirical expected value. Depending on the particular modelling technique, a model may have a systematic error, that is, an error dependent on one of the input parameters. Such an error is sometimes referred to as bias or skew. A model may also yield a higher error in some particular ranges of the input parameter domains. In such cases, the model can be extended by a component that mitigates the error impact and that also depends on the input parameter or parameters. Of course, such a component may be included directly in the original model. However, there is a variety of cases when this should not or cannot be done. For example, the model may be a closed, immutable component provided by some external service, or a legacy one. Also, recalculating the whole model may be expensive in terms of computing power, or the data may become burdened with a skew after the model has been trained. An already running model can be hard to reconfigure in a production environment, or its reconfiguration may cause downtime where none is allowed, for example because of service level agreements. For those reasons, this paper presents an alternative approach: building a separate error model and combining it with the original model in an ensemble model that sums the output of the actual phenomenon model with a fix provided by the error mitigation model. It needs to be stated that the presented approach is just a problem mitigation rather than a fix for the root cause. This is just another model and may have the same problems as any other model. However, thanks to being limited to a subset of input variables and being focused on another dimension of the modelling goal (the difference from the training samples instead of the absolute value), there are cases when it performs well.

Model ensembling has proven to be an effective way of combining multiple models to increase the output accuracy over single-technique models [2] [3]. However, a typical use case is to combine multiple models made with different techniques and then judge which output is the best, or aggregate all the outputs into a single model response, for example by taking the average, weighted average or sum of multiple components. This approach is popular in combining classification and clustering models.
2. Model

In order to present the proposed approach, a simple linear regression model is discussed briefly. The bike sharing system data set [1] is used. The set presents the number of municipal bike rentals (by registered and occasional users) in the Washington metropolitan area. Input parameters include weather conditions (temperature, wind, humidity) and the time of year and day. Floating-point parameters are normalized to the range [0, 1]. The data file includes 17379 entries representing hourly registered data points. As there are no outliers or incomplete entries, each record contains 16 features. The output variable is the total count of bike rentals, which spans both casual and registered users. To demonstrate the idea, a single input variable, the hour of day, is used. Model accuracy is measured with two common error metrics: the Residual Sum of Squares (or sum of squared errors; RSS) and the Residual Standard Error (RSE). The RSS is more convenient as the optimization goal for the algorithm; however, its values are unintuitive for human interpretation, because they are orders of magnitude higher than the actual output values. It is calculated as follows:

$$RSS = \sum_{i=0}^{n} \epsilon_i^2 = \sum_{i=0}^{n} (r_i - m_i)^2, \qquad (1)$$

where $r$ is the bike rentals vector and $m$ is the vector of model output values calculated for the same input data as the corresponding $r_i$ samples. The RSE is calculated on
the basis of the RSS with the following formula:

$$RSE = \sqrt{\frac{RSS}{n - p - 1}} \qquad (2)$$

where $n$ is the number of samples and $p$ is the number of optimized parameters in the model, that is, the regression equation coefficients. The whole value $n - p - 1$ is often referred to as the number of degrees of freedom. Figure 1 presents the average and median number of rentals for each hour (yellow and green dots, respectively), as well as a curve representing the model output and the RSE for each hour (red dots). The model is a linear regression model calculated as a 3rd order polynomial and thus has four parameters.

Fig. 1. Average and median bike rentals count and modelling curve

The following equation expresses the general model formula:

$$f_m(x) = w_3 x^3 + w_2 x^2 + w_1 x + w_0 \qquad (3)$$

where $w$ is the coefficients vector. The coefficients are optimized with the Flower Pollination Algorithm [4], and the formula with the supplied coefficients is:

$$f_m(x) = -0.18048221 x^3 + 4.647178 x^2 - 8.96642054 x + 20.47717682 \qquad (4)$$

The RSE of this model is 122.58.
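The two metrics and the fitted polynomial can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are chosen here, and the sample `(hour, rentals)` pairs are made up for the demonstration rather than taken from the data set, so the printed value is not the 122.58 reported above.

```python
import math


def f_m(x):
    """Third-order polynomial model with the coefficients of Eq. (4)."""
    return (-0.18048221 * x**3 + 4.647178 * x**2
            - 8.96642054 * x + 20.47717682)


def rss(observed, predicted):
    """Residual Sum of Squares, Eq. (1)."""
    return sum((r - m) ** 2 for r, m in zip(observed, predicted))


def rse(observed, predicted, p):
    """Residual Standard Error, Eq. (2); p is the number of optimized parameters."""
    n = len(observed)
    return math.sqrt(rss(observed, predicted) / (n - p - 1))


# Made-up (hour, rentals) pairs, not taken from the bike sharing data set.
hours = [0, 4, 8, 12, 17, 20, 23]
rentals = [50, 10, 360, 250, 460, 220, 90]
predictions = [f_m(h) for h in hours]
print(round(rse(rentals, predictions, p=4), 2))
```

Note that `rse` needs more than `p + 1` samples, otherwise the number of degrees of freedom is not positive.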
3. Error Model
The first step in error distribution analysis is a visual assessment of the error metric value. Figure 2 presents a curve representing the number of rentals for each hour of day estimated by the model, together with red dots marking the RSE value per hour. It is visible that the error is relatively higher in rush hours, that is, at 7–8 am and 5–7 pm. In more sophisticated models with multiple input variables, the inputs do not have an equal impact on the output variable. More formally, the impact of input variables can be compared by calculating and comparing some kind of input importance measure. The measure depends strongly on the selected modelling method. For example, in a model where all the inputs are embedded linearly into the model equation, input importance can be inferred directly by taking the absolute values of the coefficients. The same holds for neural networks, where the notion of input weight is one of the fundamental concepts.

As in the actual domain modelling, two approaches can be distinguished when it comes to error modelling. The first one is based on a detailed analysis of the error distribution over the input parameters. This approach is useful when the error distribution can be easily aligned to a commonly known function, such as a logarithmic or linear one. The second one is a black-box approach, where the error distribution is not subject to a detailed analysis; instead, a metaheuristic algorithm is applied to find the best function, or the best coefficients for a predefined class of functions, for example polynomials. Error value distribution analysis makes it possible to choose which input variables should be involved in the error model. However, both error metrics discussed previously are mean metrics, which means they miss important information about the sign of the difference between the empirical samples and the estimated value, and also about the sign of the estimated value itself. Another critical question when choosing a proper function for the purpose is the choice of an appropriate benchmark, that is, whether it should refer to the average values of the training sample set, to the median or, possibly, to some other measure. This decision depends on the variance of the model output value. For highly variable values, any central tendency measure may be inappropriate, and quartiles or n-th deciles could work better. An error mitigation function has multiple desired features.
Obviously, it should decrease the error value at least at some sensitive points without increasing it elsewhere in the domain. The function should not fit the training data too precisely: too strict an alignment to the training data may yield worse results when applied to real data. Overfitting-prevention techniques, such as cross-validation, can be used for both submodels. As stated previously, the error is higher in rush hours. An example error mitigation component formula is:

$$f_e(x) = \begin{cases} 0, & \text{if } x \in [0, 7) \\ -200(x - 8)^2 + 250, & \text{if } x \in [7, 10) \\ -40, & \text{if } x \in [10, 14) \\ -60, & \text{if } x \in [14, 17) \\ 100, & \text{if } x \in [17, 19) \\ 0, & \text{if } x \in [19, 23) \end{cases} \qquad (5)$$

where $x$ is the input variable. The ranges, constant values and function coefficients were chosen arbitrarily. Both models are in fact independent of each other and can be created using any technique. Figure 3 displays the error mitigation function graph.
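The piecewise definition of Eq. (5) translates directly into code. The following Python sketch is one possible transcription; the function name is chosen here, and returning zero for inputs outside the listed ranges (for example the hour 23, which Eq. (5) leaves open) is an assumption of this sketch, not something stated in the paper.

```python
def f_e(x):
    """Error mitigation component of Eq. (5); x is the hour of day."""
    if 7 <= x < 10:
        return -200.0 * (x - 8) ** 2 + 250.0
    if 10 <= x < 14:
        return -40.0
    if 14 <= x < 17:
        return -60.0
    if 17 <= x < 19:
        return 100.0
    # [0, 7) and [19, 23) are zero in Eq. (5); returning 0 for any other
    # value (for example x = 23) is an assumption of this sketch.
    return 0.0
```

For whole hours the parabola contributes 50 at 7, 250 at 8 and 50 at 9, so the largest correction falls on the morning peak.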
4. Ensemble Model
When the error model is ready, it needs to be combined with the original model of the discussed phenomenon. There are multiple ensembling methods. However, in the discussed case, both sub-models have clearly defined roles. The initial model is responsible for estimating the actual result and deals with all the input data. The error model tries to mitigate the skew of the original model: it estimates the error, not the value itself. That makes most of the commonly used techniques, such as voting, stacking, blending or bucketing, inapplicable for the purpose. The presented application uses a simple sum function that adds the basic model output and the error mitigation model output. The general formula is:

$$f(X) = f_m(X) + f_e(X) \qquad (6)$$

where $f_m()$ is the original model, $f_e()$ is the error model and thus $f()$ is the ensemble model. $X$ is the input samples vector. The RSE of the ensemble model is 114.22, which is about 7% lower than the RSE of the initial model.
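The additive combination of Eq. (6) can be sketched as a small higher-order function. In the Python sketch below, `make_ensemble` is a name chosen here and the two sub-models are deliberately simple stand-ins, not the models of the paper; the last lines only redo the arithmetic behind the reported improvement.

```python
def make_ensemble(base_model, error_model):
    """Return f(X) = f_m(X) + f_e(X) applied element-wise to a list of inputs."""
    def ensemble(xs):
        return [base_model(x) + error_model(x) for x in xs]
    return ensemble


# Stand-in sub-models for illustration only.
def base(x):
    return 2.0 * x + 1.0


def fix(x):
    return 5.0 if 7 <= x < 10 else 0.0


f = make_ensemble(base, fix)
print(f([0, 8]))  # [1.0, 22.0]

# Relative RSE reduction from the values reported in the text.
rse_initial, rse_ensemble = 122.58, 114.22
reduction = (rse_initial - rse_ensemble) / rse_initial
print(f"{reduction:.1%}")  # 6.8%
```

Because the sub-models are only summed, either one can be replaced without retraining the other, which is the property the approach relies on.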
5. Summary
This paper presented the idea of ensembling a linear regression model with an additional model that decreases the average error in particular parts of the input variable domain by adding to or subtracting from the estimated output value a constant or a function that depends on the input variable value. A trivial two-dimensional example is used to demonstrate the idea.
AUTHORS
Artur Nowosielski∗ – Findwise Sp. z o.o., 00-023 Warsaw, Poland, e-mail: artnowo@gmail.com, www: Findwise.
Piotr A. Kowalski – AGH University of Science and Technology, 30-059 Cracow, Poland, e-mail: pakowal@ibspan.waw.pl, www: Faculty of Physics and Applied Computer Science.
Piotr Kulczycki – AGH University of Science and Technology, 30-059 Cracow, Poland, e-mail: kulczycki@ibspan.waw.pl, www: Faculty of Physics and Applied Computer Science.
∗
Corresponding author
REFERENCES
[1] H. Fanaee-T and J. Gama, “Event labeling combining ensemble detectors and background knowledge”, Progress in Artificial Intelligence, 2013, 1–15, 10.1007/s13748-013-0040-3.
[2] A. Janusz, T. Tajmajer, and M. Świechowski, “Helping AI to Play Hearthstone: AAIA’17 Data Mining Challenge”. In: M. Ganzha, L. Maciaszek, and M. Paprzycki, eds., Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, vol. 11, 2017, 121–125, 10.15439/2017F573.
[3] Q. H. Vu, D. Ruta, and L. Cen, “An ensemble model with hierarchical decomposition and aggregation for highly scalable and robust classification”. In: M. Ganzha, L. Maciaszek, and M. Paprzycki, eds., Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, vol. 11, 2017, 149–152, 10.15439/2017F564.
[4] X.-S. Yang. “Flower Pollination Algorithm for Global Optimization”. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 7445 LNCS, 240–249. 2012.
Fig. 2. RSE per each hour and modelling curve
Fig. 3. Error mitigation function plot
Fig. 4. Ensemble model curve and RSE values per hour of day
Optimization of Membership Function Parameters for Fuzzy Controllers of an Autonomous Mobile Robot Using the Flower Pollination Algorithm
Submitted: 2nd March 2018; accepted: 30th March 2018
Oscar R. Carvajal, Oscar Castillo, José Soria
DOI: 10.14313/JAMRIS_1-2018/6
Abstract: In this work we describe the optimization of a Fuzzy Logic Controller (FLC) for an autonomous mobile robot that needs to follow a desired path. The FLC is used to simulate the trajectory of the robot, and the parameters of its membership functions had not been optimized previously. We consider the flower pollination algorithm (FPA) as a method for optimizing the FLC: we use the FPA to find the best parameters with the objective of minimizing the error between the trajectory of the robot and the reference. A comparative study of results with different metaheuristics is also presented in this work.
Keywords: Flower Pollination Algorithm, membership functions, optimization problems, fuzzy system
1. Introduction

The use of fuzzy logic as a control technique has been gaining popularity in control systems [1], [2], [3], [4]. Classic control is less intuitive, so in many applications, because of the complexity of implementation, fuzzy systems are preferred over classic control systems. Nowadays many applications work with fuzzy logic, and their number has been growing rapidly [5], [6]. The FPA was introduced by Xin-She Yang in 2012 [7]. It has been used in application areas such as classification and pattern recognition, and in this case for optimization. The FPA is inspired by the flower pollination process, in which pollen is moved by pollinators such as honeybees, some kinds of birds, butterflies and other insects or animals. The rest of the paper is structured as follows. In Section 2 we describe the FPA, its performance and characteristics, the equations that represent the global and the local search, and some of the applications of this algorithm. In Section 3 we show the design of the Fuzzy Logic Controller, in Section 4 we define the parameters that have to be optimized, and in Section 5 we describe the results and the simulation of the robot. Finally, Section 6 concludes this work.
2. Flower Pollination Algorithm

The Flower Pollination Algorithm is a metaheuristic introduced in 2012 by Xin-She Yang. It is mainly applied to single-objective and multi-objective optimization problems. The FPA is inspired by a natural process, the pollination of flowers, in which thousands of plant species take part: the transfer of pollen from one flower to another flower, either of the same plant or of a different plant. The pollination process needs two parts: the first is the pollinator and the second is a plant with at least one flower. There are two types of pollinators, abiotic and biotic. Abiotic pollinators are, for example, air and water; biotic pollinators are animals and insects, such as butterflies, bees and some species of bats. A study shows that 90 percent of pollination is performed in the biotic way, so 10 percent is carried out in the abiotic way. The biotic way can be viewed as the global search of the algorithm, and the abiotic way as the local search of the FPA. Sometimes the biotic way is called cross-pollination and the abiotic way self-pollination. The method has the following four characteristics [7]:
1. Biotic and cross-pollination can be considered processes of global pollination, and pollen-carrying pollinators move in a way that obeys Lévy flights (Rule 1).
2. For local pollination, abiotic pollination and self-pollination are used (Rule 2).
3. Pollinators such as insects can develop flower constancy, which is equivalent to a reproduction probability that is proportional to the similarity of the two flowers involved (Rule 3).
4. The interaction or switching of local pollination and global pollination can be controlled by a switch probability p ∈ [0, 1], slightly biased toward local pollination (Rule 4).
The four characteristics of the FPA have been converted into equations that describe the features of this algorithm. In global pollination the pollen is carried by animals and insects that can travel longer distances and cover a wider range. Equation 1 describes global pollination [7]:

$$x_i^{t+1} = x_i^t + \gamma L(\lambda)\,(g_* - x_i^t) \qquad (1)$$
where $x_i^t$ is the pollen $i$ at iteration $t$ and $g_*$ is the best solution in the current iteration. The parameter $L$, given in Equation 2, represents the strength of pollination and is based on Lévy flights [7]:

$$L \sim \frac{\lambda \Gamma(\lambda) \sin(\pi\lambda/2)}{\pi} \cdot \frac{1}{s^{1+\lambda}}, \quad (s \gg s_0 > 0) \qquad (2)$$

Here $\Gamma(\lambda)$ is the standard gamma function, and the distribution is valid for large steps $s > 0$. Local pollination can be represented by Equation 3 [7]:

$$x_i^{t+1} = x_i^t + \epsilon\,(x_j^t - x_k^t) \qquad (3)$$

where $x_j^t$ and $x_k^t$ are pollen of different plants of the same species. The parameter $\epsilon$ provides the random local search and is drawn from a uniform distribution in [0, 1].
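To make the interplay of the global and local pollination equations concrete, the following Python sketch implements a basic FPA loop for minimization. It is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names, the scaling factor `gamma=0.1`, the exponent `lam=1.5`, the clipping of candidates to the search bounds and the use of Mantegna's algorithm to draw Lévy-distributed steps are all choices made here.

```python
import math
import random


def levy_step(lam=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm (assumed here)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = random.gauss(0.0, sigma)
    v = random.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / lam)


def fpa(objective, dim, bounds, n=20, p=0.8, gamma=0.1, iterations=500, seed=0):
    """Minimize `objective` over a box; returns (best_solution, best_value)."""
    random.seed(seed)
    lo, hi = bounds

    def clip(value):
        return min(hi, max(lo, value))

    flowers = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    fitness = [objective(f) for f in flowers]
    best_idx = min(range(n), key=fitness.__getitem__)
    best, best_fit = flowers[best_idx][:], fitness[best_idx]

    for _ in range(iterations):
        for i in range(n):
            if random.random() < p:
                # Global pollination: move relative to the best solution g*.
                candidate = [clip(x + gamma * levy_step() * (g - x))
                             for x, g in zip(flowers[i], best)]
            else:
                # Local pollination: combine two randomly chosen flowers.
                eps = random.random()
                j, k = random.sample(range(n), 2)
                candidate = [clip(x + eps * (xj - xk))
                             for x, xj, xk in zip(flowers[i], flowers[j], flowers[k])]
            cand_fit = objective(candidate)
            if cand_fit < fitness[i]:          # keep only improvements
                flowers[i], fitness[i] = candidate, cand_fit
                if cand_fit < best_fit:
                    best, best_fit = candidate[:], cand_fit
    return best, best_fit
```

A call such as `fpa(lambda v: sum(x * x for x in v), dim=2, bounds=(-5.0, 5.0))` minimizes a simple sphere function. In the controller setting of this paper the objective would instead run the robot simulation and return the trajectory error.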
3. Fuzzy Controller
The FLC is of Mamdani type; as Figure 1 shows, it has two inputs and two outputs. The inputs are the error in angular velocity (ew) and the error in linear velocity (ev). The outputs are torque one (T1) and torque two (T2), which are applied to the wheels of the robot; in this case the robot has two wheels with servomotors [8], [9], [10], [11].

Fig. 1. Fuzzy Controller Structure

The linguistic values of the input and output membership functions are N, Z, and P, which stand for Negative, Zero, and Positive respectively, and they are in the range [-1, 1]. The output membership functions of the FLC are triangular. The input membership functions are triangular for the Z linguistic value and trapezoidal for the N and P linguistic values [11], [12], [13]. An example is shown in Figure 2.
Fig. 2. Membership functions of the FLC
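Triangular and trapezoidal membership functions can be written as small Python functions, as sketched below. The breakpoints used in `fuzzify_input` are purely illustrative assumptions (they are exactly the kind of parameters the FPA is meant to tune), and all names are chosen here rather than taken from the paper.

```python
def triangular(x, a, b, c):
    """Triangle with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoid with feet at a and d and plateau on [b, c] (a <= b <= c <= d)."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify_input(x):
    """Membership degrees of an input in [-1, 1]; breakpoints are illustrative."""
    return {
        "N": trapezoidal(x, -1.0, -1.0, -0.5, 0.0),
        "Z": triangular(x, -0.5, 0.0, 0.5),
        "P": trapezoidal(x, 0.0, 0.5, 1.0, 1.0),
    }


print(fuzzify_input(0.25))  # {'N': 0.0, 'Z': 0.5, 'P': 0.5}
```

Setting `a == b` or `c == d` in `trapezoidal` gives the shoulder shapes used for N and P at the ends of the range.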
Fuzzy rules can be considered the knowledge of an expert in a specific field; they are expressed in IF-THEN form to associate a condition with a consequence through linguistic variables. There are 9 rules for the FLC that controls the robot, as follows [8], [14]:
1. If (ev is N) and (ew is N) then (T1 is N) (T2 is N)
2. If (ev is N) and (ew is Z) then (T1 is N) (T2 is Z)
3. If (ev is N) and (ew is P) then (T1 is N) (T2 is P)
4. If (ev is Z) and (ew is N) then (T1 is Z) (T2 is N)
5. If (ev is Z) and (ew is Z) then (T1 is Z) (T2 is Z)
6. If (ev is Z) and (ew is P) then (T1 is Z) (T2 is P)
7. If (ev is P) and (ew is N) then (T1 is P) (T2 is N)
8. If (ev is P) and (ew is Z) then (T1 is P) (T2 is Z)
9. If (ev is P) and (ew is P) then (T1 is P) (T2 is P)
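The nine rules can be stored as a lookup table from the pair of input labels to the pair of output labels. In the Python sketch below the table is generated, since in every listed rule T1 takes the label of ev and T2 takes the label of ew. Evaluating the AND with the minimum and aggregating rules with the maximum is the common Mamdani choice and is an assumption here, as the paper does not state which operators were used; the function and variable names are also chosen here.

```python
from itertools import product

LABELS = ("N", "Z", "P")

# (ev label, ew label) -> (T1 label, T2 label), matching rules 1 to 9.
RULES = {(ev, ew): (ev, ew) for ev, ew in product(LABELS, repeat=2)}


def fire_rules(ev_degrees, ew_degrees):
    """Return the aggregated activation of each output label for T1 and T2."""
    t1 = {label: 0.0 for label in LABELS}
    t2 = {label: 0.0 for label in LABELS}
    for (ev, ew), (out1, out2) in RULES.items():
        strength = min(ev_degrees[ev], ew_degrees[ew])  # AND as minimum
        t1[out1] = max(t1[out1], strength)              # aggregate with maximum
        t2[out2] = max(t2[out2], strength)
    return t1, t2


t1, t2 = fire_rules({"N": 0.0, "Z": 0.5, "P": 0.5},
                    {"N": 1.0, "Z": 0.0, "P": 0.0})
print(t1)  # {'N': 0.0, 'Z': 0.5, 'P': 0.5}
print(t2)  # {'N': 0.5, 'Z': 0.0, 'P': 0.0}
```

Defuzzification of the activated output sets into crisp torques is omitted from this sketch.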
4. Optimization of Parameters of the Membership Functions
The parameters were optimized using the FPA. The simulation model, shown in Figure 3, is based on the kinematics of a differential robot [15]; the model has a classic closed-loop control system in which the controller is the FLC. Figure 2 shows an example of the membership function parameters that we optimized. The FPA calls the model to update all the variables of the simulation at short intervals and to determine the current error. This error determines the stopping criterion of the algorithm: when the error is acceptable, the algorithm finishes and shows the results. In some cases the FPA needs more iterations to converge to the solution, and in other cases it needs fewer.

Fig. 3. Simulation model of the differential robot

5. Simulation Results

For the simulation, 30 experiments using the FPA were performed, with good results: the average mean squared error (MSE) is 0.00483803, with a standard deviation of 0.002779863. The FPA parameters that were tuned manually are the population size (n), within a recommended range of 10 to 25, the number of iterations, and the switch probability, for which a value of 0.8 has been recommended since 2012. Table 1 lists all the experiments described above.
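As an illustration of the error measure, the following Python sketch computes a mean squared error between a simulated trajectory and the reference. The paper does not spell out how the error is formed from the two trajectories, so treating each sample as an `(x, y)` point and averaging the squared Euclidean distance is an assumption of this sketch, as is the function name.

```python
def mean_squared_error(trajectory, reference):
    """Average squared distance between matching (x, y) points of two paths."""
    if len(trajectory) != len(reference) or not trajectory:
        raise ValueError("paths must be non-empty and of equal length")
    total = sum((px - rx) ** 2 + (py - ry) ** 2
                for (px, py), (rx, ry) in zip(trajectory, reference))
    return total / len(trajectory)


print(mean_squared_error([(0, 0), (1, 1)], [(0, 0), (1, 2)]))  # 0.5
```

A function of this kind would serve as the objective that the optimizer minimizes, with the trajectory produced by one run of the closed-loop simulation.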
Table 1. Experiments of optimization using the flower pollination algorithm Experiments 1
0.0046
3
0.0035
2 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
MSE Error
20
0.009
FPA Experiments Population
Probability
n = 23
p = 0.8
n = 25
n = 25
p = 0.9
p = 0.8
Iterations 20000 876
2182
0.006
n = 10
p = 0.8
13959
5.69E-04
n = 17
p = 0.8
1211
0.0004 0.0044 0.0062
0.000089862 0.0029 0.0064 0.0053 0.0039 0.0072
5.41E-04 0.0073 0.0065 0.0065 0.0091
8.20E-03
n = 15 n = 18 n = 17 n = 17 n = 20 n = 22 n = 24 n = 18 n = 25 n = 10 n = 11 n = 12 n = 13 n = 14 n = 16
p = 0.8 p = 0.8
2414 4219
p = 0.8
20000
p = 0.3
4080
p = 0.2 p = 0.4
10589 4610
p = 0.5
12091
p = 0.7
9547
p = 0.6 p = 0.8 p = 0.8 p = 0.8 p = 0.8 p = 0.8 p = 0.8
6415 6796 1969 2706 9451
20000 9500
Experiments
MSE Error
FPA Experiments Population
Probability
n = 19
p = 0.8
15101
p = 0.9
10755
p = 0.85
1200
21
0.0085
23
0.0066
n = 20
0.0085
n = 25
22
24 25 26 27 28 29 30
Average
Standard Deviation
0.0010404 0.0041 0.002
0.0032 0.0054 0.0058 0.0014
0.00483803
0.002779863
n = 17 n = 20 n = 23 n = 23 n = 22 n = 23 n = 23 -
-
p = 0.8 p = 0.8
p = 0.85 p=1
Iterations 630
8950 1217 2037
p = 0.95
17500
p = 0.78
9641
p = 0.83 -
-
3405 -
-
Figure 4. Membership functions and trajectory of the robot with respect to the reference in experiment 6
Figure 5. Experiment 6
For example, in experiment 6 we obtained an MSE of 0.000569 with a population size of 17, a probability value of 0.8, and 1211 iterations. Figure 4 illustrates the resulting membership function parameters and compares the trajectory of the robot with the reference. Figure 5 focuses on the trajectory and the reference, which are very close. Table 2 compares the MSE of each metaheuristic; the FPA obtains a lower average MSE than all of them. The comparison is with Genetic Algorithms with type-1 and type-2 inference systems and with Ant Colony Optimization with a type-2 inference system and dynamic adaptation of parameters.

Table 2. Comparison of the methods for the same optimization problem

| | FPA | GA + T1FS [13] | GA + T2FS [13] | ACO + T2FS dynamic [27] |
|---|---|---|---|---|
| MSE Average | 0.00483803 | 0.438709 | 0.400899 | 0.0096 |
| Standard Deviation | 0.00277986 | 0.050195 | 0.00325 | 0.0148 |
| Experiments | 30 | 30 | 30 | 30 |
As we can see, the proposed method obtained better results than the methods mentioned above. With GA + T1FS and GA + T2FS the parameters of the membership functions were also tuned manually; with ACO + T2FS the parameters were adapted dynamically, yet the proposed method still obtained better results.
6. Conclusions
In this work we proposed a methodology to solve the control problem of optimizing the trajectory of an autonomous mobile robot. We used a fuzzy system of Mamdani type to determine the trajectory and a bio-inspired algorithm to optimize its parameters, so that we obtained the best FLC and the best trajectory in the simulation of the robot. We performed a comparative study with other metaheuristics based on the average and standard deviation of the error, and we obtained better results. We found that, in this comparison, the FPA is more effective for the optimization of the simulated differential autonomous mobile robot than the other methods from the literature. In future work we can consider adapting the parameters of the FPA dynamically and comparing with other algorithms. We also envision using the optimization method with type-2 fuzzy controllers for the autonomous mobile robot. Of course, it is more difficult to optimize type-2 fuzzy controllers, but these can be more effective in dynamic and uncertain environments for robots.
AUTHORS
Oscar R. Carvajal, Oscar Castillo*, José Soria – Tijuana Institute of Technology, Tijuana BC México.
* Corresponding author. E-mail: ocastillo@tectijuana.mx
REFERENCES
[1] C. T. Kilian, Modern Control Technology: Components and Systems, USA: Delmar, 2006. DOI: 10.1007/3-540-30368-5_19.
[2] L. A. Zadeh, “A Rationale for fuzzy Control”, J. Dynamic Systems, vol. 94, no. 1, 1972, 3–4. DOI: 10.1115/1.3426540.
[3] L. A. Zadeh, “Fuzzy Sets”, Department of Electrical Engineering and Electronics Research Laboratory, vol. 8, no. 3, 1965, 338–353. DOI: 10.1016/S0019-9958(65)90241-X.
[4] L. A. Zadeh, “Towards a generalized theory of uncertainty”, Information Sciences, vol. 172, no. 1–2, 2005, 1–40. DOI: 10.1016/j.ins.2005.01.017.
[5] H. Erdem, “A Practical Fuzzy Logic Controller for Sumo Robot Competition”. In: Ghosh A., De R.K., Pal S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2007. Lecture Notes in Computer Science, vol. 4815. Springer, Berlin, Heidelberg, 2007. DOI: 10.1007/978-3-540-77046-6_27.
[6] A. Hechri, A. Ladgman, F. Hamdaoui, A. Mtibaa, “Design of Fuzzy Logic Controller for Autonomous parking of Mobil Robot”, International Journal of Science and Techniques of Automatic Control and Computer Engineering, vol. 5, no. 2, 2011, 1558–1575.
[7] X.-S. Yang, Nature-Inspired Optimization Algorithms, US: Elsevier, 2014.
[8] L. Astudillo, O. Castillo, P. Melin, Chemical Optimization Algorithm for Fuzzy Controller Design, Tijuana: Springer, 2014. DOI: 10.1007/978-3-319-05245-8.
[9] O. Castillo, L. Aguilar, S. Cardenas, “Fuzzy Logic Tracking Control for Unicycle Mobile Robots”, Engineering Letters, vol. 13, no. 2, 2006, 73–77.
[10] O. Castillo, L. Amador-Angulo, J. R. Castro, M. Garcia-Valdez, “A comparative study of type-1 fuzzy logic systems, interval type-2 fuzzy logic systems and generalized type-2 fuzzy logic systems in control problems”, Information Sciences, vol. 354, no. 1, 2016, 257–274. DOI: 10.1016/j.ins.2016.03.026.
[11] O. Castillo, C. Caraveo, F. Valdez, “Optimization of fuzzy controller design using a new bee colony algorithm with fuzzy dynamic parameter adaptation”, Applied Soft Computing, vol. 43, no. 1, 2016, 131–142. DOI: 10.1016/j.asoc.2016.02.033.
[12] O. Castillo, P. Melin, L. Amador-Angulo, O. Mendoza, J. R. Castro, A. Rodríguez-Díaz, “Fuzzy Sets in Dynamic Adaptation of Parameters of a Bee Colony Optimization for Controlling the Trajectory of an Autonomous Mobile Robot”, Sensors, vol. 1, no. 1, 2016, 1–27. DOI: 10.3390/s16091458.
[13] O. Castillo, R. Martínez-Marroquín, P. Melin, F. Valdez, J. Soria, “Comparative study of bio-inspired algorithms applied to the optimization of type-1 and type-2 fuzzy controllers for an autonomous mobile robot”, Information Sciences Informatics and Computer Science Intelligent Systems Applications, vol. 192, no. 1, 2012, 19–38. DOI: 10.1016/j.ins.2010.02.022.
[14] M. Lagunes, O. Castillo, J. Soria, “Optimization of membership function parameters for fuzzy controllers of an autonomous mobile robot using the firefly algorithm”. In: Fuzzy Logic Augmentation of Neural and Optimization Algorithms: Theoretical Aspects and Real Applications, Tijuana, Springer, 2018, 199–206.
[15] R. Sharma, D. Honc, F. Dušek, “Predictive control of differential drive mobile robot considering dynamics and kinematics”. In: Proceedings of 30th European Conference on Modelling and Simulation, vol. 1, 2015, no. 1, 1–7. DOI: 10.7148/2016-0354.
[16] G. Dudek, M. Jenkin, Computational Principles of Mobile Robotics, New York: Cambridge University Press, 2010.
[17] C. Rekik, M. Jallouli, N. Derbel, “Optimal trajectory of a mobile robot using”, Computer Applications in Technology, vol. 53, no. 4, 2016, 348–357. DOI: 10.1109/SSD.2010.5585508.
[18] R. Zhao, D. Hwan Lee, H. Kyu Lee, “Mobile Robot Navigation using Optimized”, International Journal of Fuzzy Logic and Intelligent Systems, vol. 15, no. 1, 2015, 12–19. DOI: 10.5391/IJFIS.2015.15.1.12.
[19] R. Martínez Marroquín, O. Castillo, J. Soria, “Optimization of Membership Functions of a Fuzzy Logic”. In: Castillo O., Pedrycz W., Kacprzyk J. (eds) Evolutionary Design of Intelligent Systems in Modeling, Simulation and Control. Series: Studies in Computational Intelligence, Springer, vol. 257, no. 6, 2009, 3–16. DOI: 10.1007/978-3-642-04514-1.
[20] T. Y. Abdalla, A. Abdulkareem, “A PSO Optimized Fuzzy Control Scheme for Mobile Robot Path Tracking”, International Journal of Computer Applications, vol. 76, no. 2, 2013, 11–17. DOI: 10.5120/13217-0608.
[21] R. Martínez, O. Castillo, L. T. Aguilar, “Optimization of interval type-2 fuzzy logic controllers for a perturbed”, Information Sciences, vol. 179, no. 1, 2009, 2158–2174.
[22] M. A. Sanchez, O. Castillo, J. R. Castro, “Generalized Type-2 Fuzzy Systems for controlling a mobile robot and a performance comparison with Interval Type-2 and Type-1 Fuzzy Systems”, Expert Systems with Applications, vol. 42, no. 1, 2015, 5904–5914. DOI: 10.1016/j.eswa.2015.03.024.
[23] F. Valdez, P. Melin, “Comparative Study of PSO and GA for complex mathematical functions”, Journal of Automation, Mobile Robotics and Intelligent Systems, vol. 2, no. 1, 2008, 43–51.
[24] R. Martinez Soto, O. Castillo, L. Aguilar, R. Diaz, “A hybrid optimization method with PSO and GA to automatically design Type-1 and Type-2 fuzzy logic controllers”, International Journal of Machine Learning & Cybernetics, vol. 6, no. 2, 2015, 175–196. DOI: 10.1007/s13042-013-0170-8.
[25] P. Melin, L. Astudillo, O. Castillo, F. Valdez, M. Garcia, “Optimal design of type-2 and type-1 fuzzy tracking controllers for autonomous mobile robots under perturbed torques using a new chemical”, Expert Systems with Applications, vol. 40, no. 8, 2013, 3185–3195. DOI: 10.1016/j.eswa.2012.12.032.
[26] O. Castillo, R. Martinez Marroquin, P. Melin, F. Valdez, J. Soria, “Comparative study of bio-inspired algorithms applied to the optimization of type-1 and type-2 fuzzy controllers for an autonomous mobile robot”, Information Sciences, vol. 192, 2010, no. 12, 19–38. DOI: 10.1016/j.ins.2010.02.022.
[27] F. Olivas, F. Valdez, O. Castillo, C. Gonzales, G. Martinez, P. Melin, “Ant colony optimization with dynamic parameter adaptation based on interval type-2 fuzzy logic systems”, Applied Soft Computing, vol. 53, 2016, 74–87.
Block-Structured Models Composed of Nonlinear Fuzzy Dynamic and Static Parts – a Case Study
Submitted: 7th November 2016; accepted: 20th June 2017
Piotr Bazydło, Piotr Marusak
DOI: 10.14313/JAMRIS_1-2018/7
Abstract: The paper addresses the identification of dynamic fuzzy Takagi-Sugeno models for multi-step ahead prediction. In the case of highly nonlinear processes, standard Takagi-Sugeno models may be hard to identify if they are designed for recurrent prediction generation. In such a case, alternative fuzzy block-structured models composed of fuzzy dynamic and fuzzy static parts may be useful. Two main benefits of the proposed models are: (1) the possibility to speed up the model tuning procedure, (2) the potential to fine-tune an already available, standard Takagi-Sugeno model. The benefits offered by the proposed models are illustrated using the example of identification of a nonlinear process: a system consisting of two tanks of different shapes (cylindrical and conical ones).
Keywords: block-structured models, Takagi-Sugeno models, identification, modelling, fuzzy logic
1. Introduction
Identification of dynamic systems is an important task, as it is crucial in many applications and fields of knowledge, including process control, robotics and economy [3]. Some processes can be efficiently modeled with linear models identified using the Least Squares Method [8]. However, linear dynamic models applied to real systems are often inaccurate, due to the nonlinear character of the processes. Tuning of such models is even more difficult in the case of multi-step ahead prediction. In order to improve modeling accuracy, nonlinear models can be used. The most popular nonlinear models are based on polynomials [4], neural networks [11] and fuzzy logic [12].
An example of a nonlinear dynamic model is a model based on the Takagi-Sugeno (TS) fuzzy system [14]. This model is based on fuzzy membership functions and local linear models. Identification of these models is not an easy task, especially in the case of multi-step ahead prediction [7]. There are several identification methods designed strictly for these models, like ANFIS (Adaptive Neuro-Fuzzy Inference System) [5]. This method first determines the shape of the fuzzy membership functions. Then, it solves a quadratic programming problem in order to identify the parameters of the local linear models. The main disadvantage of this method is a high likelihood of failure when multi-step ahead models are taken into consideration.
Methods of identification of TS models can be divided into global and local approaches [1]. The global approach generates highly accurate global models, but the local models are not proper linearizations of the process in selected steady-state points. In the case of the local approach, the local models are in fact linearizations of the process in several steady-state points; on the other hand, the global output of such a model may not be satisfactory. In order to overcome this issue, multi-objective identification methods have been introduced [6]. Despite many attempts towards the creation of a universal and highly effective Takagi-Sugeno identification method, none of them gives a satisfactory result in the case of multi-step ahead prediction for any plant. Thus, in the case of many processes, individually adapted identification procedures should be used.
Another example of a nonlinear model is the Wiener model, which consists of a linear dynamic block preceding a nonlinear static model [4]. Thus, the Wiener model is a composition of two different models. For example, the linear dynamic part can be modeled as an ARX model and the nonlinear static part can be provided as a TS fuzzy model. Such a structure can be efficiently used in Model Predictive Control (MPC) algorithms [10, 15]. In [2, 13] a Wiener model has been used to model a polymerization reactor and a distillation column, respectively. In [9] an example of a fuzzy Wiener model has been given. Then, it has been shown that this model can be efficiently used in the MPC algorithms. These models belong to the class of nonlinear block-oriented models. In general, several advantages of these models can be highlighted: low cost of identification, low computational complexity, the possibility to approximate nearly all systems (with some exceptions) and the block-oriented structure itself, which can be useful, for example, in control algorithms.
Both Wiener and TS models have some drawbacks. Wiener models (especially those tuned with the traditional approach) are inefficient in the case of processes with highly nonlinear dynamics, because their dynamic part is linear, although other block-oriented nonlinear models can be successfully used in identification of systems with nonlinear dynamics [18]. Moreover, new methods can be used to improve Wiener model tuning, e.g. Maximum Likelihood methods [16, 17]. These methods can be used for the reduction of problems concerning bias. On the other hand, a well-tuned TS model can be efficient for nearly all processes. However, these models may be very hard to identify in the case of models designed for multi-step ahead prediction. Moreover, identification of fuzzy models often requires more heuristic approaches. This impedes the model tuning procedure and is strongly connected with the fact that different heuristics often have to be used for different problems. Despite these issues, TS models are still considered useful in control engineering, because: (1) they are similar to linear models, so they can be easily used in control algorithms, (2) they are considered universal approximators and (3) the TS model structure enables tuning of its individual elements (e.g. individual linear models).
In this paper, advantages of both block-structured and fuzzy models have been merged. The goal of this paper is to provide a simple yet effective method for the improvement of Takagi-Sugeno model identification. Fuzzy block-structured models composed of a nonlinear dynamic part and of a nonlinear static part have been presented. They are a combination of the two aforementioned models. In comparison to the Wiener model or the TS model, the block-structured model can be identified more easily. It can also be used to improve the quality of a roughly tuned TS fuzzy model. As there are many effective methods for identification of block-oriented models for strongly nonlinear systems [18, 19], their structure was used for the improvement of TS models. The main goal of this paper is to show that the advantages of the block-oriented models can be successfully used to facilitate identification of other models.
In chapter 2, the block-structured model and its advantages are described. Chapter 3 contains a description of an example process. In chapter 4, applications of ARX, Wiener and TS models to modeling of the example process are presented. All described models have been compared using the Mean Square Error (MSE). Chapter 5 contains a description of the application of the proposed block-structured model. Additionally, the advantages of block-structured models highlighted in chapter 2 are demonstrated. Finally, chapter 6 concludes the paper.

2. Block-Structured Models
The proposed block-structured models have been inspired by the Wiener models. They are composed of nonlinear models; see Fig. 1. Both the dynamic and the static part of the proposed model can be expressed as any nonlinear model (e.g. fuzzy TS models, multilayer perceptron neural models or polynomials). This work is focused on block-structured models composed of TS fuzzy dynamics and TS fuzzy statics.

Fig. 1. Structure of a SISO block-structured model with nonlinear dynamic and nonlinear static parts; u – input, y – output, y_dyn – input of the static model

As the Takagi-Sugeno model consists of many linear models, the output of the dynamic part of the block-structured model can be expressed as the normalized weighted sum of the outputs of linear dynamic models:
$$ y_{dyn}(k) = \frac{\sum_{i=1}^{l} w_i \, y_{lin}^{i}(k)}{\sum_{i=1}^{l} w_i} \tag{1} $$
where $l$ is the number of rules, $w_i$ is the firing strength (weighting factor) of the $i$-th rule and $y_{lin}^{i}(k)$ is the output of the $i$-th local dynamic model given by:
$$ y_{lin}^{i}(k) = -\sum_{j=1}^{n_a} a_{ij} \, y_{dyn}(k-j) + \sum_{j=1}^{n_b} b_{ij} \, u(k-\tau-j) + c_i \tag{2} $$
where $a_{ij}$ and $b_{ij}$ are parameters of the dynamic model, $n_a$ and $n_b$ define the model dynamics, $\tau$ denotes the delay, $c_i$ denotes a constant value, $y_{dyn}(k-m)$ is the output of the dynamic block in the $(k-m)$-th sampling instant and $u(k-m)$ is the input of the model in the $(k-m)$-th sampling instant. The output of the dynamic TS model is then used as an input to the static part described by:
$$ y(k) = \frac{\sum_{i_s=1}^{l_s} \mu_{i_s}(y_{dyn}(k)) \cdot \left(a_1^{i_s} \cdot y_{dyn}(k) + a_0^{i_s}\right)}{\sum_{i_s=1}^{l_s} \mu_{i_s}(y_{dyn}(k))} \tag{3} $$
where $l_s$ is the number of fuzzy rules in the static part of the model, $\mu_{i_s}(\cdot)$ is the membership function in the $i_s$-th rule of the fuzzy TS static model, and $a_0^{i_s}$, $a_1^{i_s}$ are parameters of the linear models in the consequents of the fuzzy TS static model (i.e. parameters of the local models). It is worth noticing that the consequents of the rules can be assumed constant. Then the model simplifies to:
$$ y(k) = \frac{\sum_{i_s=1}^{l_s} \mu_{i_s}(y_{dyn}(k)) \cdot a_0^{i_s}}{\sum_{i_s=1}^{l_s} \mu_{i_s}(y_{dyn}(k))} \tag{4} $$
After substituting (1) into (3), the proposed block-structured model can be formulated as follows:
$$ y(k) = \frac{\sum_{i_s=1}^{l_s} \mu_{i_s}\!\left(\frac{\sum_{i=1}^{l} w_i \, y_{lin}^{i}(k)}{\sum_{i=1}^{l} w_i}\right) \cdot \left(a_1^{i_s} \cdot \frac{\sum_{i=1}^{l} w_i \, y_{lin}^{i}(k)}{\sum_{i=1}^{l} w_i} + a_0^{i_s}\right)}{\sum_{i_s=1}^{l_s} \mu_{i_s}\!\left(\frac{\sum_{i=1}^{l} w_i \, y_{lin}^{i}(k)}{\sum_{i=1}^{l} w_i}\right)} \tag{5} $$
The main motivation for developing the block-structured models was the improvement of roughly identified dynamic TS models by extending a model using a low number of additional parameters. Wiener models have a linear dynamic part, thus they may not be suitable for processes with highly nonlinear dynamics. Fuzzy TS models are universal approximators. However, in the case of fuzzy models, an increase in the number of parameters leads to the curse of dimensionality (though in the TS models it is not as disruptive as in the case of Mamdani models). Moreover, TS models are often hard to identify in the case of models with recurrent prediction (multi-step ahead prediction).
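The composition of the two fuzzy parts described by equations (1), (3) and (5) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian membership shape in the static part and the numeric rule parameters are assumptions chosen for illustration only, and the local model outputs and firing strengths are passed in as ready-made numbers.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership function (assumed shape, used only for illustration)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def dynamic_part(weights, local_outputs):
    """Eq. (1): normalized weighted sum of the local dynamic model outputs."""
    return sum(w * y for w, y in zip(weights, local_outputs)) / sum(weights)


def static_part(y_dyn, rules):
    """Eq. (3): static TS model; each rule is (center, sigma, a1, a0)."""
    memberships = [gaussian(y_dyn, c, s) for c, s, _, _ in rules]
    numerator = sum(m * (a1 * y_dyn + a0)
                    for m, (_, _, a1, a0) in zip(memberships, rules))
    return numerator / sum(memberships)


def block_structured_output(weights, local_outputs, rules):
    """Eq. (5): the output of the dynamic part feeds the static part."""
    return static_part(dynamic_part(weights, local_outputs), rules)


# Hypothetical numbers: two dynamic rules and two static rules.
weights = [1.0, 3.0]            # firing strengths w_i
local_outputs = [2.0, 4.0]      # local model outputs y_lin^i(k)
rules = [(2.5, 1.0, 1.0, 0.0),  # (center, sigma, a1, a0)
         (4.5, 1.0, 3.0, 0.0)]
y = block_structured_output(weights, local_outputs, rules)
```

Keeping the two parts as separate functions mirrors the block structure: the static rules can be tuned or replaced without touching the dynamic part, which is the property the text relies on when fine-tuning an existing TS model.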
3. Example Control Plant
An example identification of the block-structured model has been performed for a system consisting of two tanks of different shapes (cylindrical and conical ones, Fig. 2). Such a process can be described by the following set of equations:
$$
\begin{aligned}
\frac{dV_1(h_1)}{dt} &= F_1 + F_D - F_2(h_1), \\
\frac{dV_2(h_2)}{dt} &= F_2(h_1) - F_3(h_2), \\
F_2(h_1) &= \alpha_1 \sqrt{h_1}, \quad F_3(h_2) = \alpha_2 \sqrt{h_2}, \\
V_1(h_1) &= C_1 \cdot h_1^3, \quad V_2(h_2) = A_2 \cdot h_2, \\
F_1(t) &= F_{1in}(t - \tau)
\end{aligned} \tag{6}
$$
where $h_1$ and $h_2$ denote liquid levels in tanks 1 and 2, respectively, $F_{1in}$ denotes the input flow to the first tank, $F_2$ denotes the input flow to the second tank, $F_3$ stands for the output flow of the whole system and $F_D$ stands for the disturbance flow. $V_1$ and $V_2$ denote volumes of liquid in tanks 1 and 2, respectively. Values of the process parameters are: $A_2 = 300$ cm², $C_1 = 0.75$, $\alpha_1 = 15.9$, $\alpha_2 = 20$, delay $\tau = 40$ s.

Fig. 2. System of two tanks

In the Single Input Single Output (SISO) case, the first flow $F_1$ is considered to be the input of the system. The output of the system is the liquid level in the second tank, $h_2$. The process can also be treated as a MISO plant with the disturbance input flow $F_D$ taken into account. In order to calculate the output value directly from the differential equations, equations (6) can be transformed into:
$$
\begin{aligned}
\frac{dh_1}{dt} &= \frac{F_1 + F_D - F_2(h_1)}{3 \cdot h_1^2 \cdot C_1}, \\
\frac{dh_2}{dt} &= \frac{F_2(h_1) - F_3(h_2)}{A_2}
\end{aligned} \tag{7}
$$
One should notice that $F_D$ is not delayed. The static characteristics of the control plant and of the model linearized in the steady-state point $h_2 = 8.41$ cm and $F = 51$ cm³/s are presented in Fig. 3 (the disturbance flow $F_D$ was assumed constant and equal to 7 cm³/s). Step responses of the nonlinear model are compared with the ones of the linearized model in Fig. 4.

Fig. 3. Static characteristic of the process

Fig. 4. Control plant dynamics

Observing Figs. 3 and 4, it is clear that the example process has both nonlinear statics and nonlinear dynamics. During the experiments the input value $F$ was changed to the following values: 20, 30, 40, 51, 60, 70, 80 cm³/s. It can be noticed that the bigger the difference between the actual input and the input at the steady-state point, the bigger the difference between the responses of the linear and nonlinear models.

4. Application of Linear, Wiener and Takagi-Sugeno Models
This chapter presents the application of linear, Wiener and TS models to the example plant. Some assumptions concerning model identification have been made:
• The control plant has been identified within the input range $F \in \langle 0, 80 \rangle$ cm³/s.
• During identification, the control plant is considered to be a SISO process, with the input flow $F$ as the input and the liquid level $h_2$ as the output.
• Only recurrent models with multi-step ahead predictions are considered.
• As the considered control plant is slow, the sampling period is equal to $T = 10$ s.
• The disturbance flow $F_D$ was constant and equal to 7 cm³/s.
• All of the models are compared with the ideal nonlinear model, obtained from equations (6). Models have been evaluated and compared using the Mean Square Error (MSE)
$$ MSE = \frac{1}{N} \sum_{k=1}^{N} \left(y_r(k) - y(k)\right)^2 \tag{8} $$
where $y_r(k)$ and $y(k)$ denote the output values of the original model (6) in the discrete sample $k$ and of an identified model, respectively, and $N$ is the number of samples.

4.1. Application of a Linear Model
The presented model has been identified by linearization of the differential equations (6) in the steady-state point $F = 51$ cm³/s and $h_2 = 8.41$ cm. It is given by:
$$ y_{lin}(k) = -a_1 y_{lin}(k-1) - a_2 y_{lin}(k-2) - a_3 y_{lin}(k-3) + b_1 u(k-5) + b_2 u(k-6) + b_3 u(k-7) \tag{9} $$
where $a_1 = -2.83818251$, $a_2 = 2.68214541$, $a_3 = -0.84396289$, $b_1 = 0.86185551 \cdot 10^{-3}$, $b_2 = -0.04738442 \cdot 10^{-3}$, $b_3 = -0.81447108 \cdot 10^{-3}$.

Fig. 5. Test of linear model

The linear model has been verified on the test data set obtained using the original equations (6); see Fig. 5. The linear model fails to imitate both the statics and the dynamics of the tanks at a satisfactory level. Imperfections in statics can be seen mostly at the lower range of input values. For low values of the flow $F$, the predicted liquid level $h_2$ is negative. Bad representation of dynamics can be observed especially for the higher range of input values, where the linear model achieves the steady state too quickly. The test shows that the linear model is not accurate enough and that identification of a nonlinear process model should be done.
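The reference data used in this chapter come from the nonlinear tank model, and every identified model is scored with the MSE (8). A minimal sketch of both pieces is given below. It integrates the transformed equations (7) with the explicit Euler method and the parameter values quoted in the text; the integration step, the initial levels and the assumption that the input was constant before the start of the simulation are illustrative choices, not taken from the paper.

```python
import math
from collections import deque

# Process parameters quoted in the text.
A2, C1, ALPHA1, ALPHA2 = 300.0, 0.75, 15.9, 20.0
TAU = 40.0  # input delay [s]


def simulate_tanks(f1_in, f_d=7.0, h1=5.0, h2=5.0, dt=0.5):
    """Euler integration of (7); f1_in holds input flows sampled every dt seconds.

    The initial levels h1, h2 and the step dt are assumptions of this sketch.
    Returns the list of h2 values, one per step.
    """
    delay_steps = int(round(TAU / dt))
    # Assumption: before t = 0 the input was constant and equal to f1_in[0].
    history = deque([f1_in[0]] * delay_steps)
    levels = []
    for flow in f1_in:
        history.append(flow)
        f1 = history.popleft()          # F1(t) = F1in(t - tau)
        f2 = ALPHA1 * math.sqrt(h1)     # outflow of tank 1
        f3 = ALPHA2 * math.sqrt(h2)     # outflow of tank 2
        dh1 = (f1 + f_d - f2) / (3.0 * C1 * h1 ** 2)
        dh2 = (f2 - f3) / A2
        h1 = max(h1 + dt * dh1, 1e-6)   # keep levels physically meaningful
        h2 = max(h2 + dt * dh2, 0.0)
        levels.append(h2)
    return levels


def mse(y_ref, y_model):
    """Mean Square Error (8) between reference and model outputs."""
    return sum((r - m) ** 2 for r, m in zip(y_ref, y_model)) / len(y_ref)


# Constant input F = 51 cm3/s held for 10 000 s.
h2_trajectory = simulate_tanks([51.0] * 20000)
```

With F = 51 cm³/s and F_D = 7 cm³/s the simulated level settles at the steady-state point h2 = 8.41 cm quoted in the text, since the outflow α2·√h2 must balance the total inflow of 58 cm³/s.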
4.2. Application of a Wiener Model The first simple and natural step towards improving the linear model (9) is to extend it with a nonlinear static part. Such an approach will lead to obtaining static Suchmodel, an approach lead to obtaining the thepart. Wiener with will linear dynamics preceding Wiener model, with linear dynamics preceding nonlinear statics, which is given by: nonlinear statics, which is given by: �� ����� ���� = �
�� ����� ���� = �
�
������ ������ � �∙���
�
������ ������ � �∙���
∑���� �� ����� ���� ∙ ���� ∙ ���� ��� + ��� � �10� (10) ���� = ∑���� �� ����� ����
where µi(·) denote the generalized Gaussian membership where µi(·) denote the generalized Gaussian memberfunctions and ylin(k) denotes output of the linear dynamic ship functions and y (k) denotes output of the linear model (9), the values oflinthe parameters are as follows: dynamic model (9), the values of the parameters are c1=–3.674, c2=7.969, σ1=18.61, σ2=14.44, ��� =35.65, as follows: c =–3.674, c =7.969, σ =18.61, =14.44, 1 2 1 2 static part. Such an approach will lead to obtaining the 2 static part ��� =–36.17, ��� =0.7923, �� =2.053. The fuzzy, a01 =Wiener a02 = –36.17, a11 =dynamics apreceding 35.65, 0.7923, 1 = 2.053. The model, with�linear of the model haspart beenofidentified using Adaptive Neuro- usfuzzy, static the model has been identified nonlinear statics, which is given by: Fuzzy Inference System (ANFIS) MATLAB tool. � ing Adaptive Neuro-Fuzzy Inference System (ANFIS) ������rules, ������ � Nonlinear static part consists of two fuzzy because MATLAB tool. Nonlinear static �∙� part consists of two � � � improved bigger number �of� �� rules has= not the model ��� ���� fuzzy rules, because bigger number of� rules has not significantly. The test of the Wiener model using the test ����� ������ � �test of the Wiethe model significantly. The dataimproved set is presented in Fig. 6. � �∙� �� ��the � set is�presented in Fig. 6. ner model using test = data ��� ���� ���� =
∑���� �� ����� ���� ∙ ���� ∙ ���� ��� + ��� � ∑���� �� ����� ����
�10�
where µi(·) denote the generalized Gaussian membership functions and ylin(k) denotes output of the linear dynamic model (9), the values of the parameters are as follows: c1=–3.674, c2=7.969, σ1=18.61, σ2=14.44, ��� =35.65, ��� =–36.17, ��� =0.7923, ��� =2.053. The fuzzy, static part of the model has been identified using Adaptive NeuroFuzzy Inference System (ANFIS) MATLAB tool. Nonlinear static part consists of two fuzzy rules, because bigger number of rules has not improved the model significantly. The test of the Wiener model using the test data set is presented in Fig. 6. Fig. 6. Test of the Wiener model
Fig. 6. Test of the Wiener model

Although the Wiener model performs much better than the linear model (6), it is still not accurate enough. In the higher range of input flow, the Wiener model not only reaches steady-state values too fast, but also shows a considerable deviation in the steady-state values.
4.3. Application of a Takagi-Sugeno Model
In the case of the example plant, identification of a TS fuzzy model working in a satisfactory way with multi-step ahead prediction is not easy. The aim was to identify a model able to properly predict the output h2 in the whole range of inputs, with as low a number of parameters (and of fuzzy rules) as possible. First of all, standard identification tools like ANFIS are not able to identify a stable recurrent model for the considered system of tanks. In order to identify a properly working model, a lot of optimization procedure calls have been performed. After many experiments, the following model consisting of three local linear models and three membership functions has been identified:

$$
y_{dyn}^{i}(k) = -a_1^i\, y_{dyn}(k-1) - a_2^i\, y_{dyn}(k-2) - a_3^i\, y_{dyn}(k-3) + b_5^i\, F(k-5) + b_6^i\, F(k-6) + b_7^i\, F(k-7) + a_0^i, \quad i = 1, 2, 3
$$

$$
y_{dyn}(k) = \frac{\sum_{i=1}^{3} \mu_i\bigl(y_{dyn}(k-1)\bigr) \cdot y_{dyn}^{i}(k)}{\sum_{i=1}^{3} \mu_i\bigl(y_{dyn}(k-1)\bigr)} \tag{11}
$$

Parameters of the Takagi-Sugeno model are presented in Table 1. In this particular case, $\mu_i(\cdot)$ are the generalized bell membership functions. Results of the test of the fuzzy model are presented in Fig. 7.

Table 1. Parameters of the Takagi-Sugeno model

$a_1^1 = -2.99269755$    $a_2^1 = 3.09986829$     $a_3^1 = -1.10786562$
$b_5^1 = 0.00097673$     $b_6^1 = 0.00144840$     $b_7^1 = -0.00265196$
$a_0^1 = 0.02325236$     $a_1^2 = -2.07214459$    $a_2^2 = 1.40937893$
$a_3^2 = -0.31914971$    $b_5^2 = 0.00743793$     $b_6^2 = -0.00209157$
$b_7^2 = -0.00149540$    $a_0^2 = -0.05903962$    $a_1^3 = -0.49863498$
$a_2^3 = -1.35260971$    $a_3^3 = 0.85935076$     $b_5^3 = 0.02694692$
$b_6^3 = -0.02428196$    $b_7^3 = -0.00135914$    $a_0^3 = 0.03383882$
Generalized bell membership function parameters:
2.38310951               1.20259040               9.65702467
3.71510776               2.18542619               7.20816195
0.08657941               0.50682709               -0.92594015

The TS fuzzy model is very well tuned. Some imperfections can be noticed only in the magnified plot. However, the differences are very small and typical for such identification tasks. In comparison to the Wiener model, the MSE coefficient is an order of magnitude better. One should note that the procedure of TS model identification was very time-consuming, especially in comparison to the rapid identification of the Wiener model. Moreover, the TS model required a heuristic approach, which was tailored to the presented system of tanks. As the heuristic approach was used, it is hard to strictly define the convergence rate. However, the TS model identification time was significantly longer than in the case of the Wiener model, where strict mathematical rules can be used for system identification. The goal of the proposed fuzzy block-structured models presented in the next section is to connect their structure with the fuzzy approach, in order to significantly improve the procedure of fuzzy model identification.
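The way such a recurrent TS model produces a multi-step ahead prediction can be sketched as follows. This is only an illustration under stated assumptions: it uses a hypothetical two-rule, first-order model (not the parameters of Table 1), the standard generalized bell function 1 / (1 + |(x - c) / a|^(2b)), and illustrative names (`gbell`, `ts_step`, `simulate`).

```python
def gbell(x, a, b, c):
    """Generalized bell membership function: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


# Hypothetical two-rule, first-order model (NOT the parameters of Table 1).
# Local model i: y_i(k) = -a1 * y(k-1) + b1 * u(k-1) + a0
RULES = [
    {"mf": (5.0, 2.0, 0.0), "a1": -0.90, "b1": 0.10, "a0": 0.0},
    {"mf": (5.0, 2.0, 10.0), "a1": -0.95, "b1": 0.08, "a0": 0.1},
]


def ts_step(y_prev, u_prev):
    """One model step: blend the local linear models by membership degree."""
    weights = [gbell(y_prev, *rule["mf"]) for rule in RULES]
    local_outputs = [
        -rule["a1"] * y_prev + rule["b1"] * u_prev + rule["a0"] for rule in RULES
    ]
    return sum(w * y for w, y in zip(weights, local_outputs)) / sum(weights)


def simulate(u, y0=0.0):
    """Multi-step ahead prediction: the model's own output is fed back."""
    y = [y0]
    for k in range(1, len(u)):
        y.append(ts_step(y[k - 1], u[k - 1]))
    return y


if __name__ == "__main__":
    trajectory = simulate([10.0] * 50)
    print(trajectory[-1])
```

Note that the membership degrees depend on the previously predicted output, so any prediction error also shifts the blending of the local models in the following steps.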
Fig. 7. Test of the Takagi-Sugeno model

5. Application of the Proposed Block-Structured Model

In the proposed block-structured model, a nonlinear fuzzy static model follows a nonlinear dynamic model. The ANFIS tool has been used to identify the nonlinear static part, the input of which is the output of the dynamic model (11). The static part consists of two generalized Gaussian membership functions and two linear models:

$$
\mu_i\bigl(y_{dyn}(k)\bigr) = \exp\left(-\frac{\bigl(y_{dyn}(k) - c_i\bigr)^2}{2\sigma_i^2}\right), \quad i = 5, 6
$$

$$
y(k) = \frac{\sum_{i=5}^{6} \mu_i\bigl(y_{dyn}(k)\bigr) \cdot \bigl(a_1^i \cdot y_{dyn}(k) + a_0^i\bigr)}{\sum_{i=5}^{6} \mu_i\bigl(y_{dyn}(k)\bigr)} \tag{12}
$$

where $c_5 = -2.103$, $\sigma_5 = 1.818$, $c_6 = 19.59$, $\sigma_6 = 3.257$, $a_1^5 = 0.9964$, $a_0^5 = -0.002777$, $a_1^6 = 0.9908$, $a_0^6 = 0.1114$. Results of the test of the model are presented in Fig. 8. The obtained model is very well tuned and almost perfectly mimics the original equations (6).
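With the parameters quoted for eq. (12), the static block can be checked numerically. The sketch below assumes the Gaussian form exp(-(x - c)^2 / (2 sigma^2)) and uses an illustrative function name (`static_part`); the probe range 0 to 20 is an arbitrary choice, not a range stated in the paper.

```python
import math

# Static-part parameters quoted in the text for eq. (12) (rules 5 and 6).
RULES = [
    {"c": -2.103, "sigma": 1.818, "a1": 0.9964, "a0": -0.002777},
    {"c": 19.59, "sigma": 3.257, "a1": 0.9908, "a0": 0.1114},
]


def static_part(y_dyn):
    """Map the dynamic-model output y_dyn to the corrected model output."""
    num = 0.0
    den = 0.0
    for rule in RULES:
        mu = math.exp(-((y_dyn - rule["c"]) ** 2) / (2.0 * rule["sigma"] ** 2))
        num += mu * (rule["a1"] * y_dyn + rule["a0"])
        den += mu
    return num / den


if __name__ == "__main__":
    for y_dyn in (0.0, 5.0, 10.0, 20.0):
        print(y_dyn, static_part(y_dyn) - y_dyn)  # size of the static correction
```

Both consequents have slopes close to one and small offsets, so under these assumptions the static block acts as a small correction of an already accurate dynamic model.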
Fig. 8. Test of the block-structured model

5.1. Comparison with the Takagi-Sugeno and Wiener Models

The proposed block-structured model slightly improves the already identified Takagi-Sugeno model. Comparing the magnified fragments of Figs. 7 and 8 shows that modeling at the higher values of level h2 has been improved. The difference in the values of the MSE coefficient is equal to ΔE/n = 0.0005. Such a small difference could have been foreseen, as the TS fuzzy model was already very accurate and well tuned. In comparison to the Wiener model, the block-structured model is about 11 times better (comparing the MSE). Still, the block-structured model has a Wiener-specific structure, which can be beneficial in some special applications (like predictive control cooperating with set-point optimization). The tests confirm that block-structured models can offer better performance than Wiener models.

5.2. Improvement of Roughly Tuned Takagi-Sugeno Models

One of the benefits gained from using block-structured models is the possibility to improve already existing TS fuzzy models. Comparing results from subsection 4.3 and section 5, it is hard to clearly confirm such an advantage. However, two factors should be taken into consideration. Firstly, the TS fuzzy model had already been very well tuned. Secondly, as mentioned, the TS fuzzy model for the considered plant was hard to identify. It required hundreds of optimization procedure calls (Sequential Quadratic Programming, Active-set and Genetic Algorithm optimization methods) and considerable computational effort. Thus, the identification process is very time-consuming.

TS fuzzy models can be especially hard to identify in the case of models with multi-step ahead prediction. When one-step ahead prediction is considered, standard tools and methods, like ANFIS, are usually sufficient, because they are designed for non-recurrent problems. This is due to the objective function, which consists of already defined membership functions (treated as constants in the later optimization steps) and linear consequents of the TS fuzzy model. Such an approach leads to a quadratic programming problem (a convex function). In multi-step ahead prediction, membership functions cannot be treated as constants, which leads to a nonlinear optimization problem. In the case of the process under consideration, ANFIS was unable to determine a stable model for the system of two tanks. An approach tailored to the given problem was required.

In such cases, block-structured models can be useful. When fine-tuning of a TS fuzzy model is too hard, it can be stopped and the current best model can be improved by adding a nonlinear static part. Such an approach can considerably improve performance of the fuzzy model without a significant increase in computational complexity. Now, a TS fuzzy model obtained during the identification procedure of the fine-tuned model presented in subsection 4.3 will be improved using the proposed approach.
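The convexity argument for the one-step ahead case can be illustrated with a small example. The sketch below is not the ANFIS procedure itself: it assumes two fixed Gaussian membership functions with hypothetical centres and width, constant consequents, and a measured regressor x, and it solves the resulting linear least-squares problem through the 2x2 normal equations.

```python
import math


def gauss(x, c, sigma):
    """Gaussian membership degree exp(-(x - c)^2 / (2 sigma^2))."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def normalized_weights(x, centres=(0.0, 10.0), sigma=4.0):
    """Fixed (already identified) membership functions, normalised to sum to 1.

    The centres and width are hypothetical values chosen for illustration.
    """
    w = [gauss(x, c, sigma) for c in centres]
    total = sum(w)
    return [wi / total for wi in w]


def fit_constant_consequents(xs, ys):
    """Least-squares fit of theta in y ~ w1(x) * theta1 + w2(x) * theta2.

    With the weights fixed, the model is linear in theta, so the optimum
    follows from the 2x2 normal equations (solved here by Cramer's rule).
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for x, y in zip(xs, ys):
        w1, w2 = normalized_weights(x)
        s11 += w1 * w1
        s12 += w1 * w2
        s22 += w2 * w2
        t1 += w1 * y
        t2 += w2 * y
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det


if __name__ == "__main__":
    xs = [0.5 * i for i in range(21)]
    true_theta = (2.0, 7.0)
    ys = [sum(w * t for w, t in zip(normalized_weights(x), true_theta)) for x in xs]
    print(fit_constant_consequents(xs, ys))
```

In the multi-step ahead case the regressor x is itself a previous model output and therefore depends on the parameters being fitted, so the weights are no longer constants and this closed-form solution is not available.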
Fig. 9. The first roughly tuned TS fuzzy model

The model was obtained quite fast. The value of its MSE coefficient (E/n = 0.1544) is better than in the case of the Wiener model, although visually the Wiener model test looks better (compare Figs. 6 and 9). This is caused by the better representation of the control plant dynamics in the TS fuzzy model, which greatly impacts the MSE coefficient. The Wiener model is in fact better only in prediction of process statics.

The improvement has been made by identifying the nonlinear static part using the ANFIS tool. Several variants of the static part have been tested: with linear or constant consequents and 2, 3 or 4 Gaussian membership functions. The obtained results are presented in Table 2. The initially not too well tuned TS fuzzy model has been improved almost 4 times. In the case of linear consequents, two membership functions are sufficient to significantly improve the TS fuzzy model. In the case of constant consequents, three membership functions are sufficient to provide quality similar to that of the model with linear consequents. Constant consequents may be useful when computation time matters. The test of the block-structured model with 2 membership functions and linear consequents in the static part is presented in Fig. 10.

Table 2. Improvement of first TS fuzzy model

num. of mem. functions    conseq. type: linear    conseq. type: constant
2                         E/n = 0.044             E/n = 0.054
3                         E/n = 0.043             E/n = 0.045
4                         E/n = 0.043             E/n = 0.044

Fig. 10. The first roughly tuned TS fuzzy model followed by the nonlinear fuzzy static part

The second roughly tuned TS fuzzy model is even worse than the first one. The test of this model is presented in Fig. 11. The MSE coefficient of this model is equal to E/n = 0.888.

Fig. 11. The second roughly tuned TS fuzzy model

Similar tests have been performed as in the case of the first roughly tuned TS fuzzy model. The static part has also been identified using the ANFIS tool, with Gaussian membership functions. Table 3 presents results of tests of different block-structured models based on the above-mentioned TS fuzzy model.

Table 3. Improvement of second TS fuzzy model

num. of mem. functions    conseq. type: linear    conseq. type: constant
2                         E/n = 0.206             E/n = 0.238
3                         E/n = 0.203             E/n = 0.207
4                         E/n = 0.203             E/n = 0.205

Like in the previous case, the static model with 2 Gaussian membership functions and linear consequents was enough to significantly improve the TS fuzzy model. The model with constant consequents needed 3 Gaussian membership functions in order to achieve quality similar to the one offered by the model with linear consequents. The test of the model with 2 linear consequents in the static part is presented in Fig. 12.

Fig. 12. The second roughly tuned TS fuzzy model followed by the nonlinear fuzzy static part
In the considered example, the block-structured model is about 4.5 times better than the initial TS fuzzy model. With the ANFIS tool, identification of the nonlinear static model is almost instant; therefore, the identification procedure is much faster and requires much less effort than in the case of the TS fuzzy model described in Sect. 4.3. One should note that using different identification techniques for the static part may provide an even better model.
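The improvement factors can be recomputed from the reported MSE values. The short sketch below assumes that E denotes the sum of squared prediction errors over n samples, and it takes the variant with 2 membership functions and linear consequents as the improved model; the names `mse` and `improvement` are illustrative.

```python
def mse(measured, predicted):
    """MSE coefficient E/n, assuming E is the sum of squared errors."""
    errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return sum(errors) / len(errors)


def improvement(mse_before, mse_after):
    """How many times the MSE coefficient decreased."""
    return mse_before / mse_after


if __name__ == "__main__":
    # Reported values: initial TS model vs. block-structured model
    # (2 membership functions, linear consequents).
    print(improvement(0.1544, 0.044))  # first roughly tuned model
    print(improvement(0.888, 0.206))   # second roughly tuned model
```

With these inputs the ratios are roughly 3.5 and 4.3; the variants with 3 or 4 membership functions (E/n = 0.043 and E/n = 0.203) give slightly larger factors.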
To sum up, this subsection shows that a roughly tuned TS fuzzy model can be significantly improved by adding and identifying a nonlinear static block, the input of which is the output of the nonlinear dynamic model. In the case of highly nonlinear processes and multi-step ahead prediction, such an approach may turn out to be very useful.

5.3. Disturbance Modeling in Block-Structured Models

So far, only SISO models have been considered. However, the liquid level h2 depends not only on the input flow F, but also on the disturbance flow FD. Taking the flow FD into consideration as one of the system inputs leads to a MISO model. In the current work, the disturbance FD has been changed in the range from 0 to 15 cm3/s. The static characteristic of the plant considered as a MISO system is presented in Fig. 13.

Fig. 13. Static characteristic of the process with two input flows: F and FD

In block-structured models, the additional disturbance input can be included in different ways. Two approaches to MISO modeling have thus been investigated. In the first, simplified one, it is assumed that the disturbance is taken into consideration only in the nonlinear static part of the model. The structure of the model in this approach is presented in Fig. 14.

Fig. 14. MISO model with disturbance taken into consideration only in the static part of the model
Therefore, in the first model, the nonlinear dynamic part is the same as in eq. (11) and the nonlinear static part is formulated as follows:

$$
\mu_i\bigl(y_{dyn}(k)\bigr) = \exp\left(-\frac{\bigl(y_{dyn}(k) - c_i\bigr)^2}{2\sigma_i^2}\right), \quad i = 5, 6, 7
$$

$$
y^i(k) = a_2^i \cdot y_{dyn}(k) + a_1^i \cdot F_D(k) + a_0^i, \quad i = 5, 6, 7
$$

$$
y(k) = \frac{\sum_{i=5}^{7} \mu_i\bigl(y_{dyn}(k)\bigr) \cdot y^i(k)}{\sum_{i=5}^{7} \mu_i\bigl(y_{dyn}(k)\bigr)} \tag{13}
$$

Parameters of the model are presented in Table 4. It is worth noting that the disturbance FD is not fuzzified; it is present only in the consequents of the static TS fuzzy model.

Table 4. Parameters of block-structured model with disturbance included in the static part

$c_5 = 0.09089$     $\sigma_5 = 3.998$    $c_6 = 9.491$
$\sigma_6 = 4.001$  $c_7 = 18.9$          $\sigma_7 = 3.996$
$a_2^5 = 1.067$     $a_1^5 = 0.1034$      $a_0^5 = -0.7514$
$a_2^6 = 1.044$     $a_1^6 = 0.3213$      $a_0^6 = -2.606$
$a_2^7 = 1.061$     $a_1^7 = 0.4334$      $a_0^7 = -4.206$
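The effect of including the disturbance only in the consequents can be examined numerically. The following sketch reads the rule parameters from Table 4 as (c, sigma, a2, a1, a0) per rule and assumes the Gaussian form exp(-(x - c)^2 / (2 sigma^2)); the function name `static_miso` and the probe values are illustrative.

```python
import math

# (c, sigma, a2, a1, a0) for rules 5, 6 and 7, read from Table 4.
RULES = [
    (0.09089, 3.998, 1.067, 0.1034, -0.7514),
    (9.491, 4.001, 1.044, 0.3213, -2.606),
    (18.9, 3.996, 1.061, 0.4334, -4.206),
]


def static_miso(y_dyn, f_d):
    """Static part with the disturbance f_d entering only the consequents."""
    num = 0.0
    den = 0.0
    for c, sigma, a2, a1, a0 in RULES:
        # Membership degree depends on y_dyn only: f_d is not fuzzified.
        mu = math.exp(-((y_dyn - c) ** 2) / (2.0 * sigma ** 2))
        num += mu * (a2 * y_dyn + a1 * f_d + a0)
        den += mu
    return num / den


if __name__ == "__main__":
    for f_d in (0.0, 7.5, 15.0):  # disturbance range used in the paper: 0 to 15
        print(f_d, static_miso(10.0, f_d))
```

For a fixed y_dyn the output is an affine function of the disturbance, with a gain that is a membership-weighted average of the a1 coefficients. The disturbance therefore shifts the output instantly and cannot introduce dynamics of its own.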
The second approach to obtaining a MISO block-structured model is more complex. The model contains two nonlinear dynamic models. The structure of this MISO model is presented in Fig. 15.

Fig. 15. MISO model with two nonlinear dynamic models
As in the previous approach, the nonlinear dynamic part for the flow F is the same as in eq. (11). However, another nonlinear dynamic model has to be identified, describing the influence of the disturbance FD on the plant:

$$
\mu_i\bigl(y_{dyn}^{D}(k-1)\bigr) = \exp\left(-\frac{\bigl(y_{dyn}^{D}(k-1) - c_i\bigr)^2}{2\sigma_i^2}\right), \quad i = 5, 6, 7
$$

$$
y_{dyn}^{D,i}(k) = -a_1^i\, y_{dyn}^{D}(k-1) - a_2^i\, y_{dyn}^{D}(k-2) - a_3^i\, y_{dyn}^{D}(k-3) + b_1^i\, F_D(k-1) + b_2^i\, F_D(k-2) + b_3^i\, F_D(k-3) + a_0^i, \quad i = 5, 6, 7
$$

$$
y_{dyn}^{D}(k) = \frac{\sum_{i=5}^{7} \mu_i\bigl(y_{dyn}^{D}(k-1)\bigr) \cdot y_{dyn}^{D,i}(k)}{\sum_{i=5}^{7} \mu_i\bigl(y_{dyn}^{D}(k-1)\bigr)} \tag{14}
$$

Parameters of the disturbance model are presented in Table 5.

Table 5. Parameters of the disturbance model

$c_5 = 6.461$          $\sigma_5 = 2.009$     $c_6 = 8.124$
$\sigma_6 = 1.458$     $c_7 = 10.26$          $\sigma_7 = 1.925$
$a_1^5 = -1.09$        $a_2^5 = -0.3266$      $a_3^5 = 0.4494$
$b_1^5 = 0.003863$     $b_2^5 = 0.002509$     $b_3^5 = 0.00137$
$a_0^5 = 0.21$         $a_1^6 = -1.226$       $a_2^6 = -0.3538$
$a_3^6 = 0.5818$       $b_1^6 = 0.0001568$    $b_2^6 = 0.0003302$
$b_3^6 = 0.0002166$    $a_0^6 = 0.01735$      $a_1^7 = -1.229$
$a_2^7 = -0.354$       $a_3^7 = 0.5867$       $b_1^7 = 0.0003254$
$b_2^7 = 0.0005275$    $b_3^7 = 0.0001986$    $a_0^7 = 0.02841$
In the model, the nonlinear static part is composed of four fuzzy rules:

$$
\mu_i\bigl(y_{dyn}(k)\bigr) = \exp\left(-\frac{\bigl(y_{dyn}(k) - c_i\bigr)^2}{2\sigma_i^2}\right), \quad i = 8, 9
$$

$$
\mu_i\bigl(y_{dyn}^{D}(k)\bigr) = \exp\left(-\frac{\bigl(y_{dyn}^{D}(k) - c_i\bigr)^2}{2\sigma_i^2}\right), \quad i = 10, 11
$$

$$
w_8 = \mu_8\bigl(y_{dyn}(k)\bigr) \cdot \mu_{10}\bigl(y_{dyn}^{D}(k)\bigr), \qquad
w_9 = \mu_8\bigl(y_{dyn}(k)\bigr) \cdot \mu_{11}\bigl(y_{dyn}^{D}(k)\bigr)
$$

$$
w_{10} = \mu_9\bigl(y_{dyn}(k)\bigr) \cdot \mu_{10}\bigl(y_{dyn}^{D}(k)\bigr), \qquad
w_{11} = \mu_9\bigl(y_{dyn}(k)\bigr) \cdot \mu_{11}\bigl(y_{dyn}^{D}(k)\bigr)
$$

$$
y^i(k) = a_2^i \cdot y_{dyn}(k) + a_1^i \cdot y_{dyn}^{D}(k) + a_0^i, \quad i = 8, \ldots, 11
$$

$$
y(k) = \frac{\sum_{i=8}^{11} w_i \cdot y^i(k)}{\sum_{i=8}^{11} w_i} \tag{15}
$$

Parameters of the nonlinear static part are presented in Table 6.

Table 6. Parameters of nonlinear static part in the block-structured model with disturbance included in the dynamic model

$c_8 = 0.0751$         $\sigma_8 = 8.018$       $c_9 = 18.84$          $\sigma_9 = 8.073$
$c_{10} = 6.724$       $\sigma_{10} = 2.076$    $c_{11} = 11.13$       $\sigma_{11} = 2.195$
$a_2^8 = 0.8044$       $a_1^8 = 0.4272$         $a_0^8 = -3.234$       $a_2^9 = 1.299$
$a_1^9 = 0.4489$       $a_0^9 = -4.359$         $a_2^{10} = 0.8786$    $a_1^{10} = 1.427$
$a_0^{10} = -9.973$    $a_2^{11} = 1.164$       $a_1^{11} = 1.13$      $a_0^{11} = -12.57$
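The four-rule static part with two inputs can be sketched as below. Several points are assumptions rather than facts from the paper: the Gaussian form of the membership functions, the grid-style pairing of membership functions into rules (8 with 10, 8 with 11, 9 with 10, 9 with 11), and all identifier names. The numerical values are those listed in Table 6.

```python
import math

# (c, sigma) of the membership functions, from Table 6.
MF_Y = {8: (0.0751, 8.018), 9: (18.84, 8.073)}     # on the output of model (11)
MF_D = {10: (6.724, 2.076), 11: (11.13, 2.195)}    # on the disturbance-model output

# (a2, a1, a0) of the linear consequents of rules 8..11, from Table 6.
CONSEQUENTS = {
    8: (0.8044, 0.4272, -3.234),
    9: (1.299, 0.4489, -4.359),
    10: (0.8786, 1.427, -9.973),
    11: (1.164, 1.13, -12.57),
}

# ASSUMED pairing of membership functions into rules (grid partition order).
PAIRING = {8: (8, 10), 9: (8, 11), 10: (9, 10), 11: (9, 11)}


def gauss(x, c, sigma):
    """Gaussian membership degree (assumed form)."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def static_two_inputs(y_dyn, y_dist):
    """Blend four linear consequents using product firing strengths."""
    num = 0.0
    den = 0.0
    for rule, (i, j) in PAIRING.items():
        w = gauss(y_dyn, *MF_Y[i]) * gauss(y_dist, *MF_D[j])
        a2, a1, a0 = CONSEQUENTS[rule]
        num += w * (a2 * y_dyn + a1 * y_dist + a0)
        den += w
    return num / den


if __name__ == "__main__":
    print(static_two_inputs(10.0, 8.0))  # arbitrary probe point
```

Unlike in the simplified model, the second input here is the output of a separate dynamic model, so the disturbance affects both the firing strengths and the consequents.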
Both MISO models have been verified using a test data set. This data set is very similar to the one used earlier, but it also includes changes of the disturbance FD. In the case of the simplified approach, with the disturbance considered only in the static part of the model, the MSE is equal to E/n = 0.0178. In the case of the more complex model with two dynamic parts, E/n = 0.0064; thus, the MSE coefficient of the MISO model with two dynamic parts is about 2.5 times better than that of the simplified MISO model. Fig. 16 compares both MISO models on a fragment of the test data set in which changes of the disturbance FD occur. It can be noticed that the model with the disturbance included only in the static part has significant problems with modeling the influence of FD on the system dynamics. In contrast, the model with two separate nonlinear dynamic models reconstructs the dynamic behavior of the process appropriately. To sum up, the MISO model with two dynamic parts is very accurate and can be used as a multi-step ahead predictor for the considered control plant.
6. Conclusions
�� = �� ����� � ���� � ��� ����� � ����
��� 2�
��
��� �
� ��� + ��� � � ���� ��� � ��� = �� � ���� � ����� ��� + ��� ���� �� �� ��� ����� � ���� = � � 1 2018 ��� + ���� �12, ��� �� ��� = ��� � ����� � ����� ���VOLUME ����N° �� �� = �� ����� ������� � ��� ����� � ���� + �� � �
�
In this paper, block-structured models based on TS fuzzy systems have been proposed. The models have been tested during modeling of the example process, consisting of two tanks (cylindrical and conical ones). All models have been designed for multi-step ahead prediction. The designed model has been compared with other widely used dynamical models, such as linear ARX, the Wiener model and the standard TS fuzzy model. The goal of the method proposed in this paper was to merge advantages of block-structured models and TS models. Block-structured models can be used for rapid identification of many systems, while Takagi-Sugeno models can be used to relatively easy synthesize control algorithms.
Fig. 16. Comparison of MISO models with one (top) and two (bottom) dynamic parts

The tests have shown that the drawbacks of the other models can be reduced relatively easily by means of the proposed block-structured models, in which a nonlinear static model follows the TS fuzzy dynamic model. Thus, the following benefits of the block-structured models (in contrast to Wiener or dynamic TS fuzzy models) can be listed:
• Better performance than that offered by the Wiener models in the case of systems with highly nonlinear dynamics;
• Possibility to improve an already identified TS fuzzy model. This benefit is especially noticeable when the TS fuzzy model is hard to identify;
• Possibility to significantly shorten identification time;
• Possibility to include another input in the model without much effort;
• Possibility to identify separate dynamic models for different inputs and connect them with a single (or multiple) nonlinear static model(s), which demonstrates the versatility of the proposed approach.

The structure of the proposed models can be easily used in Model Predictive Control algorithms cooperating with set-point optimization. Future work will concern application of the block-structured models in such control algorithms. To sum up, fuzzy block-structured models can be a good alternative to TS fuzzy models, especially when it is hard to find a proper heuristic approach for system identification. They can significantly simplify the model identification procedure without sacrificing modeling quality. Moreover, their structure can be easily extended to MISO models, if needed.

AUTHORS
Piotr Bazydło* – Research and Academic Computer Network (NASK), Kolska 12, 01-045 Warsaw, Poland, piotr.bazydlo@nask.pl
Piotr Marusak – Institute of Control and Computation Engineering, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland, P.Marusak@ia.pw.edu.pl
*Corresponding author