(Swinburne Decentralised Workflow for Grid)
SwinDeW-G (Swinburne Decentralised Workflow for Grid) is a grid workflow management system that runs on GT4-compatible grid infrastructures or testbeds. It combines peer-to-peer (p2p) and grid technologies to simulate the enactment of business processes in a decentralised manner. This prototype uses the JXTA platform (www.jxta.org) for p2p and the Globus Toolkit for grid.
This report explains how to use and develop on SwinDeW-G. It assumes that the reader has a basic understanding of workflow and process enactment.
Before commencing, SwinDeW-G and its related software must be installed. Installation instructions are available in the SwinDeW-G Setup Guide, http://www.swinflow.org/docs/Setup_Manual.pdf.
The overall system architecture is depicted in Fig. 1 below. The following sub-sections will describe each of these layers in more detail. As we can see, SwinDeW-G sits on a grid infrastructure named SwinGrid (Swinburne Grid).
[Figure placeholder. Recoverable layer labels: Grid Workflow Applications; Core Grid Middleware; (High performance computers, 2.0 Tera FLOPS supercomputers)]
Fig. 1. Overall System Architecture of SwinDeW-G Environment
An overall picture of SwinGrid is depicted in Fig. 2. It contains many grid nodes distributed across different locations. Each grid node contains many computers, including high performance PCs and/or supercomputers composed of many computing units. The primary hosting nodes include the Swinburne CITR (Centre for Information Technology Research) Node, the Swinburne ESR (Enterprise Systems Research laboratory) Node, the Swinburne Astrophysics Supercomputer Node, and the Beihang CROWN Node in China. They run Linux with either GT (Globus Toolkit) 4.04 or CROWN grid toolkits 2.5; CROWN (China R&D Environment Over Wide-area Network) is an extension of GT4 with additional middleware and is therefore compatible with GT4. The CROWN Node is also connected to other nodes, such as those at the Hong Kong University of Science and Technology and the University of Leeds in the UK. The Swinburne Astrophysics Supercomputer Node cooperates with organisations such as APAC (Australian Partnership for Advanced Computing) and VPAC (Victorian Partnership for Advanced Computing).
Fig. 2. Physical Network Layout of SwinGrid
Fig. 3. Structure of SwinDeW-G
Currently, SwinDeW-G is deployed at all primary hosting nodes in Fig. 2. It is planned to extend to other nodes in the near future. SwinDeW-G is a peer-to-peer based grid workflow management system. A grid workflow is executed by different peers that can be distributed at different grid nodes. Different peers communicate with each other directly in a peer-to-peer fashion. As shown in Fig. 2, each grid node can have a number of peers. A peer can be simply viewed as a capable grid service.
The detailed structure of SwinDeW-G is depicted in Fig. 3. At build-time, grid workflow specifications are defined. At run-time, they are executed by different peers, which draw on the computing and resource-sharing power of the underlying grid infrastructure, SwinGrid.
Fig. 4 shows a sample grid workflow execution in SwinDeW-G. Grid workflow activities are executed by different peers located in different grid nodes.
Fig. 4. Sample Grid Workflow Execution in SwinDeW-G Environment
When a grid workflow is executed, each activity is assigned to a peer. The assignment is based on which peer is the most suitable to execute the activity. To be suitable, a peer must be capable of executing that activity and must not be busy with other tasks. Once every activity has been assigned to a peer, the grid workflow is executed from start to finish. Each peer that has a task assigned to it communicates with the other peers so that the grid workflow executes in the correct order.
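The suitability rule described above (capable of the activity and not busy) can be sketched as follows. This is only an illustration, not the SwinDeW-G implementation: the `Peer` and `PeerSelector` classes are hypothetical, and the sketch simply takes the first suitable peer because the source does not say how "most suitable" is ranked among several candidates.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical model of a peer: a name, a set of capabilities and a busy flag. */
final class Peer {
    final String name;
    final Set<String> capabilities;
    final boolean busy;

    Peer(String name, Set<String> capabilities, boolean busy) {
        this.name = name;
        this.capabilities = capabilities;
        this.busy = busy;
    }
}

final class PeerSelector {
    /** Returns the first peer that is capable of the activity and not busy, if any. */
    static Optional<Peer> selectPeer(String activity, List<Peer> peers) {
        return peers.stream()
                .filter(p -> p.capabilities.contains(activity)) // must be capable
                .filter(p -> !p.busy)                           // must not be busy
                .findFirst();
    }

    public static void main(String[] args) {
        List<Peer> peers = List.of(
                new Peer("peer1", Set.of("Scan Cheque"), true),
                new Peer("peer2", Set.of("Scan Cheque", "Analyse Image"), false),
                new Peer("peer3", Set.of("Access Accounts"), false));
        for (String activity : List.of("Scan Cheque", "Analyse Image", "Access Accounts")) {
            String chosen = selectPeer(activity, peers).map(p -> p.name).orElse("none");
            System.out.println(activity + " -> " + chosen);
        }
    }
}
```

With the sample data in `main`, peer1 is skipped for Scan Cheque because it is busy, so the first two activities go to peer2 and Access Accounts goes to peer3.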
SwinDeW-G peers can be launched automatically via a Java client. This client is located in the ‘clients’ directory under the name … . Running this client executes the shell script … . The client lets the user launch any number of peers via the GUI shown in Fig. 5.
Fig. 5. The SwinDeW-G peer launching client
Note: Please ensure that the ports that SwinDeW-G uses are not blocked by a local firewall. They need to be open for communication on the local machine.
When the ‘Create peers’ button is pressed, a data directory is created for each peer. Each peer then loads in its own command window, which displays the output of the peer’s actions. A description of this output can be found in section 5.
Each peer automatically has a directory created for it. The peer uses this directory to access its configuration and capabilities files and to cache data for JXTA communication. When the ‘Create peers’ button is pressed, the program creates a unique directory for each peer and then copies the files from the ‘defaults’ directory into it. Data directories are only created for peers that do not already have one. In other words, if peer1 already has a data directory containing its configuration files, then no changes will be made to it when peers are launched.
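The create-only-if-missing behaviour described above can be sketched as below. This is a minimal illustration under assumptions, not the client's actual code: the class name `PeerDataDirs`, the method `ensureDataDir` and the directory layout passed in as parameters are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class PeerDataDirs {
    /**
     * Creates dataRoot/peerName and copies the regular files from defaultsDir into it,
     * unless that directory already exists. Returns true if the directory was created.
     */
    static boolean ensureDataDir(Path defaultsDir, Path dataRoot, String peerName)
            throws IOException {
        Path peerDir = dataRoot.resolve(peerName);
        if (Files.isDirectory(peerDir)) {
            return false; // existing configuration is left untouched
        }
        Files.createDirectories(peerDir);
        try (DirectoryStream<Path> defaults = Files.newDirectoryStream(defaultsDir)) {
            for (Path src : defaults) {
                if (Files.isRegularFile(src)) {
                    Files.copy(src, peerDir.resolve(src.getFileName()));
                }
            }
        }
        return true;
    }
}
```

Calling `ensureDataDir` a second time for the same peer returns false and copies nothing, which mirrors the rule that an existing data directory is never modified at launch.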
The file ‘platformConfig’ is the configuration file for JXTA. JXTA reads it to obtain information about the peer before the peer loads. This information includes the peer’s name, IP address, port and login data. The SwinDeW-G client modifies the ‘platformConfig’ file of each peer to set its configuration data dynamically, based on the user input.
The file ‘log4j.properties’ configures how the output will be displayed.
Another file that is copied into each peer’s data directory is ‘capabilities.xml’. This file contains a list of the capabilities that the peer will have. By default each peer will be assigned all the capabilities in the Cheque Processing workflow. These are Scan Cheque, Analyse Image and Access Accounts.
To further experiment with SwinDeW-G it is possible to limit or change the capabilities that each peer has. This can be done by modifying ‘capabilities.xml’ and reloading the peer.
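As a rough illustration of how a capability list could be read from such a file, the sketch below parses a small XML document with the standard Java DOM parser. The XML layout shown (a `capabilities` root with `capability` child elements) is an assumption made for this example; the real schema of ‘capabilities.xml’ is not described here and may differ. `CapabilitiesReader` is likewise a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class CapabilitiesReader {
    /** Returns the text of every <capability> element (assumed element name). */
    static List<String> readCapabilities(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("capability");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical file content listing the default Cheque Processing capabilities.
        String xml = "<capabilities>"
                + "<capability>Scan Cheque</capability>"
                + "<capability>Analyse Image</capability>"
                + "<capability>Access Accounts</capability>"
                + "</capabilities>";
        System.out.println(readCapabilities(xml));
    }
}
```

Under this assumed layout, limiting a peer would amount to deleting one of the `capability` entries before the peer is reloaded.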
The SwinDeW-G client is also able to upload and enact workflow processes. This can be done once a number of peers have been launched.
Each SwinDeW-G peer will have its own command window. This window will show the output from the different tasks that the peer is performing. An example of this output is shown below.
The output of SwinDeW-G is divided into four columns. These columns are labelled in the image shown in Fig. 7 and are described below.
All the output that is displayed in the command window is also written to a text file. This text file can be found in the … directory under the name … . If unknown errors occur, it is recommended to email the generated log files to the SwinDeW-G development team.
Note: The log files are emptied every time a peer loads, so each log file only contains the output of the most recent peer execution.
An example workflow has been included in the SwinDeW-G distribution to demonstrate the prototype’s functionality. The file that contains this workflow can be located under ‘clients/bank_workflow.xml’. The following section describes how to upload and enact this process along with a description of the workflow process.
Fig. 8 is a description of the workflow that we will use in our test case.
This is a graphical representation of the workflow contained in ‘bank_workflow.xml’. It is intended as a simplified version of how cheques are processed in a bank once they have been received. A description of the different elements in the diagram is provided below.
To start with, we need to load three peers on which the workflow will be enacted. To do this, open the SwinDeW-G client and open the ‘Peer Launcher’ tab. Use the following parameters in the input fields:
Once these parameters have been entered then press the ‘Create peers’ button.
Three command windows should now open which will show each peer loading and joining groups according to their capabilities.
The workflow now needs to be uploaded onto one of the peers. It does not matter which peer we use because when a SwinDeW-G peer receives a workflow it will automatically distribute it amongst its neighbours. In this example we will upload the workflow to peer1.
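Why the choice of upload peer does not matter can be illustrated with a simple propagation sketch. It assumes that every peer forwards a newly received workflow to its own neighbours, which goes slightly beyond what is stated above; the topology map and the `WorkflowDistribution` class are hypothetical and no JXTA calls are involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class WorkflowDistribution {
    /**
     * Returns the peers that end up holding the workflow when it is uploaded to startPeer.
     * neighbours maps a peer name to the names of its directly connected peers.
     */
    static Set<String> distribute(String startPeer, Map<String, List<String>> neighbours) {
        Set<String> received = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>();
        received.add(startPeer);
        toVisit.add(startPeer);
        while (!toVisit.isEmpty()) {
            String peer = toVisit.remove();
            for (String next : neighbours.getOrDefault(peer, List.of())) {
                if (received.add(next)) { // forward only to peers that have not seen it yet
                    toVisit.add(next);
                }
            }
        }
        return received;
    }

    public static void main(String[] args) {
        Map<String, List<String>> neighbours = Map.of(
                "peer1", List.of("peer2", "peer3"),
                "peer2", List.of("peer1"),
                "peer3", List.of("peer1"));
        System.out.println(distribute("peer1", neighbours));
        System.out.println(distribute("peer2", neighbours));
    }
}
```

In this sample topology, both starting points reach all three peers; only the order in which the peers receive the workflow differs.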
First, open the SwinDeW-G client and open the ‘Client’ tab. Now enter the URI of peer1 and click ‘Connect’. Next, browse to where ‘bank_workflow.xml’ is located. Once this is done, click the ‘Load’ button.
In peer1’s command window you should now see the workflow being loaded and distributed to peers 2 and 3.
Processes should only be enacted on peers that have the first capability in the workflow. This does not affect our example as all three peers have all the capabilities of the workflow.
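The rule above can be expressed as a one-line check, sketched here with hypothetical names (`EnactmentCheck`, `canEnactOn`). The activity order used in `main` is assumed for illustration only; the actual order is defined by the workflow file.

```java
import java.util.List;
import java.util.Set;

final class EnactmentCheck {
    /** True if the peer holds the capability required by the first activity of the workflow. */
    static boolean canEnactOn(Set<String> peerCapabilities, List<String> workflowActivities) {
        return !workflowActivities.isEmpty()
                && peerCapabilities.contains(workflowActivities.get(0));
    }

    public static void main(String[] args) {
        // Assumed activity order for illustration.
        List<String> activities = List.of("Scan Cheque", "Analyse Image", "Access Accounts");
        System.out.println(canEnactOn(Set.of("Scan Cheque", "Analyse Image"), activities)); // true
        System.out.println(canEnactOn(Set.of("Access Accounts"), activities));              // false
    }
}
```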
To enact the process, first enter the process name “Cheque Processing” into the text box, then click the ‘Enact’ button. Make sure that the client is still connected to peer1.
In the command window for each of the peers you should see the process enacting from start to finish.
To further experiment with the functionality of SwinDeW-G, you may want to run this test using additional peers. It is also possible to limit the capabilities of each peer so that certain tasks in the workflow can only be enacted on some of the peers.