A Distributed Discovery of Communicating Resource Systems Models
Abstract: In communicating resource systems (CRS), a set of independent, hierarchically composed resources communicate to realize a business process. Examples of CRSs include RESTful Web services, cloud computing platforms and hierarchical distributed systems. Unfortunately, their complexity, distributed nature, and usually long lifespan can lead to problems such as livelocks, deadlocks and invocation loops. This makes CRSs hard to maintain and manage; it also leads to a loss of QoS or to system failures. Moreover, system management must take into account exactly how the system reaches a desired or undesired state. Therefore, it is crucial to analyze the processes executed in a system in order to find and fix the above problems. Unfortunately, CRS process models are hardly ever available. Therefore, we present a new distributed algorithm, dRMA, which discovers process models from CRS event logs. In order to reduce the representational bias between the discovered process model and the real process, we have defined a Communication Net, which can express features of CRSs: directed communication channels, hierarchy, and a resource perspective. Finally, we have evaluated dRMA on a cluster and have shown that distributing jobs can significantly reduce the time of process model discovery.
Existing system: To fulfill the requirements for high availability and performance, systems that provide services are distributed. One way to implement distributed systems is the Service Oriented Architecture (SOA) paradigm. In SOA, system functionalities are split into independently developed elements called Web Services (WS). Web Services expose standardized interfaces which are invoked remotely by other services. Services can be developed independently of each other and are loosely coupled, which means that they are usually self-contained and have no knowledge of other services.
Proposed system: In this section we introduce the dRMA algorithm and the Communication Net. The main idea behind the proposed algorithm is to expose the workflow perspective of a process and the resource perspective of that workflow. The algorithm is distributed by discovering the local resource models, each on an independent cluster node. A Communication Net is a Petri net suited for modeling CRSs. Firstly, this approach shows how independent communication requests are processed by independent resources. Secondly, it is possible to see when resources communicate and with which resources. Finally, it is possible to see dependencies among the interacting resources and their internal events. Additionally, the refined alpha algorithm and the alpha plus algorithm have been proposed, and they allow one to find even better models (including so-called "one long" loops or implicit dependencies). These algorithms are based on the same idea as the alpha algorithm, so they can be used with our approach in a straightforward way by extending Step IV in the same way these algorithms extend the original alpha algorithm. Nevertheless, such a simple algorithm allows one to discover a wide range of even quite complicated processes. Based on our evaluation presented in this paper, we believe this approach is fully sufficient.
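The idea of discovering local resource models independently can be illustrated with a minimal Python sketch. It is not the dRMA algorithm itself: it only computes the directly-follows relation, which is the first step of the alpha algorithm, separately for each resource. The event format `(case_id, resource, activity)`, the function names, and the use of a thread pool as a stand-in for cluster nodes are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Assumed event format (hypothetical): (case_id, resource, activity),
# with the log already sorted by time.


def split_by_resource(log):
    """Group events into local traces: resource -> case_id -> [activity, ...]."""
    local = defaultdict(lambda: defaultdict(list))
    for case_id, resource, activity in log:
        local[resource][case_id].append(activity)
    return local


def directly_follows(traces):
    """Return all (a, b) pairs where b directly follows a in some trace."""
    pairs = set()
    for trace in traces.values():
        pairs.update(zip(trace, trace[1:]))
    return pairs


def discover_local_relations(log, workers=4):
    """Compute the directly-follows relation of each resource independently.

    The thread pool only stands in for independent cluster nodes.
    """
    local = split_by_resource(log)
    resources = sorted(local)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(directly_follows, (local[r] for r in resources))
        return dict(zip(resources, results))


if __name__ == "__main__":
    example_log = [
        ("c1", "/orders", "receive"),
        ("c1", "/orders", "validate"),
        ("c1", "/stock", "reserve"),
        ("c1", "/orders", "confirm"),
    ]
    print(discover_local_relations(example_log))
```

Because each resource's relation depends only on that resource's events, the per-resource jobs share no state, which is what makes this step easy to distribute.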
Advantages: Despite the clear advantages of SOA and CRS, they are not problem-proof. Their distributed nature, the focus on providing functionality through composition, and the reliance on local knowledge of resources and services may lead to unexpected behavior.
A CRS may consist of hundreds or even thousands of cooperating resources. Each of these resources performs simple local operations. The most important advantage of the presented algorithm is that it takes the resource perspective as well as the workflow perspective. The resource perspective is the ability to discover how a CRS is composed of resources and to find the way the resources interact with one another. The workflow perspective is the ability to express the local processes of system resources and the global process of the whole system. Therefore, the resulting model shows how the local processes of resources cooperate in order to execute the global process of a system. This leads to a process model that is easier to read and comprehend, i.e. a composition of several smaller Petri nets with additional information about the communication of resources as well as the direction of that communication.
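A composition of smaller Petri nets with directed communication could be represented roughly as in the sketch below. This is a hypothetical data structure, not the formal Communication Net definition: the class names, fields, and the `callees` helper are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalNet:
    """Petri net of a single resource (hypothetical minimal form)."""
    resource: str
    places: set = field(default_factory=set)
    transitions: set = field(default_factory=set)
    arcs: set = field(default_factory=set)  # (source, target) pairs


@dataclass
class CommunicationNetSketch:
    """Local nets plus directed channels between transitions of resources."""
    local_nets: dict = field(default_factory=dict)  # resource -> LocalNet
    # Each channel: (sender resource, sender transition,
    #                receiver resource, receiver transition)
    channels: set = field(default_factory=set)

    def add_local_net(self, net):
        self.local_nets[net.resource] = net

    def add_channel(self, src_res, src_t, dst_res, dst_t):
        # Both endpoints must be known transitions of known resources.
        for res, t in ((src_res, src_t), (dst_res, dst_t)):
            net = self.local_nets.get(res)
            if net is None or t not in net.transitions:
                raise ValueError(f"unknown transition {t!r} on resource {res!r}")
        self.channels.add((src_res, src_t, dst_res, dst_t))

    def callees(self, resource):
        """Resources that the given resource sends messages to."""
        return sorted({dst for src, _, dst, _ in self.channels if src == resource})
```

Keeping the channels separate from the local nets preserves both perspectives: each local net can be read on its own, while the channel set records who communicates with whom and in which direction.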
Disadvantages: An initial approach to mining processes in SOA has been presented, where the idea was to gather distributed logs and discover interactions between services. Unfortunately, the authors use classical PM algorithms which discover a big, flat model of the process executed by the entire system, which leads to a huge representational bias between reality and the discovered process model. In addition, these approaches were domain-specific to SOA. Models of business processes are hardly ever available a priori. Therefore, to analyze the execution of these processes, models of processes should be constructed (discovered) from the gathered (historical) execution logs. Model discovery is often referred to in the literature as process mining (PM). Process mining as a research discipline has proven its value in discovering, verifying and enhancing process models, mainly in enterprises. However, there are also other domains where PM methods and tools can be used in order to get information about a system's behavior, find its strengths and weaknesses, and finally propose some improvements.
Modules: Process learning: Models of business processes are hardly ever available a priori. Therefore, to analyze the execution of these processes, models of processes should be constructed (discovered) from the gathered (historical) execution logs. Model discovery is often referred to in the literature as process mining (PM). Process mining as a research discipline has proven its value in discovering, verifying and enhancing process models, mainly in enterprises. However, there are also other domains, such as CRSs, where PM methods and tools can be used in order to get information about a system's behavior, find its strengths and weaknesses, and finally propose some improvements.
Process discovery: An initial approach to mining processes in SOA has been presented, where the idea was to gather distributed logs and discover interactions between services. Unfortunately, the authors use classical PM algorithms which discover a big, flat model of the process executed by the entire system, which leads to a huge representational bias between reality and the discovered process model. In addition, these approaches were domain-specific to SOA. The presented approach extends the reference alpha algorithm by decomposing the problem into a set of smaller problems (one model for each resource in a CRS). Then, based on these partial models, it combines them into a model of the entire process, composed through communication. However, RMA introduces two problems. Firstly, being a modification of the alpha process discovery algorithm, it inherits its complexity, which naturally leads to inefficiency when discovering models of systems with a significant number of distinct events. Secondly, RMA is not fully suited for CRSs because it does not take into consideration the hierarchy of resources, which is analogous to the URL address hierarchy. Our algorithm shows that dividing the process discovery problem into subproblems is an efficient and viable way to drastically cut down the computation time of discovering the whole global process model. We chose 10 computing nodes since our test examples had a maximum of 10 resources (NRES) that could be mapped onto computation nodes. We were able to do so because our approach discovers local resource processes in parallel and then reduces them to a global process by discovering communication. Still, there are two ways to obtain further reduction in computation time.
Distributed computing: Now, a new paradigm called Resource Oriented Architecture (ROA) [5] is gaining more and more attention in both research and industry. In contrast to SOA, ROA uses a declarative, data-centric approach to modeling and implementing distributed systems [6]. In ROA, the functionality of a distributed system is scattered over a collection of communicating resources, and complex functionality is provided by a composition of the resources. In practice, ROA systems are implemented according to the REST model, where RESTful Web Services (REST-WS) take the form of resources which communicate using the HTTP protocol. In ROA, the main focus is put not only on resources, but also on their local process execution (behavior) and on the global process execution (global behavior) resulting from communication among them. ROA systems are examples of a more general class of distributed systems, called Communicating Resource Systems (CRS), which comprise a set of independent and hierarchically composed resources that communicate in order to realize complex business processes (global processes).
Process modeling: Each resource performs simple local operations, but can also orchestrate other resources by invoking them. This is called the execution of a global business process of the system. In addition, systems are implemented with a long lifespan and rolling updates in mind. Therefore, they are implemented in a modular and incremental way, which can lead to bad usage of some system components. Moreover, over time, the requirements of a system's clients may change, so the system must adapt. New or changed functionalities are implemented, and these may be based on technologies that are often obsolete. That makes the systems hard to understand and hard to use.
In addition, in CRSs there are usually multiple resources performing the same or very similar functionalities, so their business processes may define alternative execution paths if something goes wrong. Therefore, the management of the system must take into account not only the final result, but also exactly how the system reaches a desired or undesired state, and behavior such as deadlocks, livelocks, resource invocation loops, stateful communication, and redundant resources. In this context, it is crucial to be able to inspect the model of the business process [11] to find out how the system works and how it can be optimized or fixed.
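As one illustration of inspecting a discovered model, the sketch below looks for a resource invocation loop. It assumes, hypothetically, that the communication part of the model has already been reduced to a mapping from each resource to the resources it invokes; the function name and input format are not taken from the paper. The search is a standard depth-first cycle detection on a directed graph.

```python
def find_invocation_loop(calls):
    """Return one invocation loop as a list of resources, or None.

    `calls` maps a resource to the resources it invokes (assumed format).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in calls.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a loop is closed.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in list(calls):
        if colour.get(start, WHITE) == WHITE:
            loop = visit(start)
            if loop:
                return loop
    return None


if __name__ == "__main__":
    example = {
        "/orders": ["/stock"],
        "/stock": ["/billing"],
        "/billing": ["/orders"],
    }
    print(find_invocation_loop(example))
```

The recursive form keeps the sketch short; for models with very long invocation chains an iterative traversal would avoid Python's recursion limit.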