Enabling High-Level Access to Grid Computing Services
by José Miguel Alonso, Vicente Hernández, Germán Moltó
The Globus Toolkit is considered the de facto standard for Grid computing. However, its steep learning curve has discouraged widespread adoption. To overcome this drawback, an object-oriented middleware has been developed that provides an intuitive interface and completely hides direct interaction with Globus.
Grid computing technology enables the collaborative usage of computational resources across different organizations. Nowadays, the Globus Toolkit represents the industrial standard for Grid computing, and is involved in a large number of research projects. Even though Globus offers all the necessary tools to perform Grid-based executions, its steep learning curve means an intense training process is required before its benefits can be fully exploited in each particular area of interest.
In the framework of the project GRID-IT (TIC2003-01318), funded by the Spanish Ministry of Science and Technology, a middleware has been developed by the Networking and High performance Computing Research Group (GRyCAP) of the Valencia University of Technology in order to provide easy and transparent access to Grid computing facilities.
Purpose and Description of the Middleware
This middleware offers an object-oriented, high-level application programming interface which allows the process of remote task execution in a Grid deployment to be simplified. It is intended to give support to parameter sweep applications, which involve the execution of multiple instances of a task. These sorts of executions are typically resource-starved and thus Grid computing offers significant benefits by enlarging the computational capabilities of a single organization with new computational resources from abroad.
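To make the notion of a parameter sweep concrete, the following minimal Java sketch expands one executable and a list of parameter values into independent task instances. It is an illustration only: `SweepInstance`, `ParameterSweep`, the `./simulate` executable and the `--dose` flag are hypothetical names that do not come from the middleware.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: these names are not part of the middleware.
final class SweepInstance {
    final String executable;
    final String arguments;

    SweepInstance(String executable, String arguments) {
        this.executable = executable;
        this.arguments = arguments;
    }
}

final class ParameterSweep {
    // Builds one independent task instance per parameter value.
    static List<SweepInstance> expand(String executable, String flag, double[] values) {
        List<SweepInstance> instances = new ArrayList<>();
        for (double value : values) {
            instances.add(new SweepInstance(executable, flag + " " + value));
        }
        return instances;
    }

    public static void main(String[] args) {
        double[] doses = {0.1, 0.5, 1.0};
        // Each printed line is one instance that could run on a different resource.
        for (SweepInstance s : expand("./simulate", "--dose", doses)) {
            System.out.println(s.executable + " " + s.arguments);
        }
    }
}
```

Because the instances share no state, they can be executed concurrently, which is why this kind of workload benefits from additional resources.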
As represented in Figure 1, the middleware was developed on top of the Java Commodity Grid Kit, allowing the user to completely avoid direct interaction with Globus services such as GRAM (for job execution), GridFTP (for high-performance file transfer), or MDS (for resource characteristics discovery). Instead, the user is provided with a high-level API that exposes a very natural and convenient point of entry to the Grid services.
Figure 1: Middleware location with respect to the applications and Globus Toolkit.
It is important to point out that, when interacting directly with Globus, Grid users must combine all the services provided by the toolkit in order to achieve their purpose, which is generally remote task execution. This implies that users must concentrate on how to perform the execution, rather than on what to execute, which is in fact what they are interested in. Since the purpose of the Grid is to offer advantages to users, the easier it is to use this technology, the sooner the benefits are achieved.
Figure 2 is a diagram showing some of the most important classes of the middleware. The user directly interacts with instances of these intuitive classes. For example, a GridTask represents an abstraction for a task that must be executed in the Grid, a GridResource represents an abstraction for a computational resource in the Grid infrastructure and the BasicScheduler provides scheduling capabilities for the allocation of GridTasks to GridResources.
Figure 2: Simplified class diagram of the middleware.
With this middleware, the user need only describe the application, and the supporting classes engage the entire infrastructure needed to carry out the remote task execution.
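The relationship between the three classes named above can be pictured with a short Java sketch. Only the class names `GridTask`, `GridResource` and `BasicScheduler` come from the article; every constructor, field and method shown here is an assumption made for illustration, and the round-robin policy is a stand-in, since the article does not describe the scheduler's actual allocation strategy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Assumed shape: the real GridTask API is not described in the article.
final class GridTask {
    final String name;
    GridTask(String name) { this.name = name; }
}

// Assumed shape: the real GridResource API is not described in the article.
final class GridResource {
    final String host;
    GridResource(String host) { this.host = host; }
}

// Assumed behaviour: a simple round-robin stand-in, not the actual policy.
final class BasicScheduler {
    private final List<GridTask> tasks = new ArrayList<>();
    private final List<GridResource> resources = new ArrayList<>();

    void addTask(GridTask task) { tasks.add(task); }
    void addResource(GridResource resource) { resources.add(resource); }

    // Assigns each task to a resource, cycling through the resource list.
    Map<GridTask, GridResource> allocate() {
        if (resources.isEmpty()) {
            throw new IllegalStateException("no resources registered");
        }
        Map<GridTask, GridResource> plan = new LinkedHashMap<>();
        for (int i = 0; i < tasks.size(); i++) {
            plan.put(tasks.get(i), resources.get(i % resources.size()));
        }
        return plan;
    }

    public static void main(String[] args) {
        BasicScheduler scheduler = new BasicScheduler();
        scheduler.addResource(new GridResource("node-a.example.org"));
        scheduler.addResource(new GridResource("node-b.example.org"));
        for (int i = 1; i <= 3; i++) {
            scheduler.addTask(new GridTask("simulation-" + i));
        }
        scheduler.allocate().forEach(
            (task, resource) -> System.out.println(task.name + " -> " + resource.host));
    }
}
```

The point of the sketch is the division of labour: the user code describes tasks and resources, and the scheduler decides where each task runs.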
For each task, the middleware allows the dependent input file set to be described and appropriately staged into the resource before execution. The output files can also be specified, and they will be transferred back to the local machine after execution. The task may also specify an a priori quality of service, requesting a minimum amount of available RAM or a minimum number of processors in the remote resource, and refusing to be executed if these are not fulfilled.
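The task description outlined in this paragraph (input files, output files and minimum requirements) can be sketched in Java as follows. All names here (`TaskDescription`, `ResourceInfo`, `QosCheck`, the file names and the units) are hypothetical; the sketch only shows how a requirements check of this kind might look, not how the middleware implements it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical description of a task: files to stage and minimum requirements.
final class TaskDescription {
    final List<String> inputFiles;   // staged to the resource before execution
    final List<String> outputFiles;  // transferred back after execution
    final int minRamMb;              // minimum available RAM, in megabytes
    final int minProcessors;         // minimum number of processors

    TaskDescription(List<String> inputFiles, List<String> outputFiles,
                    int minRamMb, int minProcessors) {
        this.inputFiles = inputFiles;
        this.outputFiles = outputFiles;
        this.minRamMb = minRamMb;
        this.minProcessors = minProcessors;
    }
}

// Hypothetical snapshot of what a remote resource currently offers.
final class ResourceInfo {
    final int availableRamMb;
    final int processors;

    ResourceInfo(int availableRamMb, int processors) {
        this.availableRamMb = availableRamMb;
        this.processors = processors;
    }
}

final class QosCheck {
    // True only if the resource meets every minimum the task asks for.
    static boolean satisfies(ResourceInfo resource, TaskDescription task) {
        return resource.availableRamMb >= task.minRamMb
            && resource.processors >= task.minProcessors;
    }

    public static void main(String[] args) {
        TaskDescription task = new TaskDescription(
            Arrays.asList("mesh.dat", "params.cfg"),
            Arrays.asList("result.out"),
            512, 2);
        System.out.println(satisfies(new ResourceInfo(1024, 4), task)); // meets both minimums
        System.out.println(satisfies(new ResourceInfo(256, 4), task));  // too little RAM
    }
}
```

A task whose check fails on a given resource would simply not be started there, which matches the refusal behaviour described above.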
The middleware also provides fault-tolerant dynamic scheduling capabilities for the allocation of tasks to resources. Fault tolerance is achieved by the application-dependent checkpointing support, periodically retrieving the checkpoint files to the local machine, so that an execution can be resumed in another resource in the case of failure.
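The resume-after-failure idea can be illustrated with a small self-contained simulation in Java. Everything in it is an assumption made for illustration: the checkpoint is reduced to a single step counter, `SimulatedResource` fails after a fixed number of steps, and `CheckpointedRun` is a hypothetical name rather than a middleware class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical resource that completes a limited number of steps, then fails.
final class SimulatedResource {
    final String name;
    final int stepsBeforeFailure;

    SimulatedResource(String name, int stepsBeforeFailure) {
        this.name = name;
        this.stepsBeforeFailure = stepsBeforeFailure;
    }

    // Runs from the given step and returns the last step that was checkpointed.
    int run(int fromStep, int totalSteps) {
        return Math.min(totalSteps, fromStep + stepsBeforeFailure);
    }
}

final class CheckpointedRun {
    // Tries the resources in order, resuming each time from the last checkpoint.
    // Returns the names of the resources that were used.
    static List<String> execute(int totalSteps, List<SimulatedResource> resources) {
        List<String> used = new ArrayList<>();
        int checkpoint = 0; // stands in for the checkpoint files kept locally
        for (SimulatedResource resource : resources) {
            used.add(resource.name);
            checkpoint = resource.run(checkpoint, totalSteps);
            if (checkpoint >= totalSteps) {
                return used; // finished without needing another resource
            }
        }
        throw new IllegalStateException("all resources failed at step " + checkpoint);
    }

    public static void main(String[] args) {
        List<SimulatedResource> resources = Arrays.asList(
            new SimulatedResource("node-a", 4),    // fails after 4 steps
            new SimulatedResource("node-b", 100)); // completes the remaining steps
        System.out.println(execute(10, resources));
    }
}
```

Because the checkpoint lives on the local machine, the work done on the failed resource is not lost: the second resource starts from step 4 rather than from step 0.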
This middleware is currently being used by the GRyCAP in a number of computational fields. First of all, in the area of biomedical applications, the study of the cardiac electrical activity, especially under pathological conditions, requires the execution of several parametric simulations in order to analyse, for example, the influence of certain anti-arrhythmic drugs.
Secondly, the structural analysis of buildings in civil engineering design can require the simulation of a large number of different structural alternatives under diverse load conditions, in order to find the one that best complies with all the economic limitations, design aspects and safety requirements.
In conclusion, the development of high-level interfaces to Grid computing facilities greatly enhances the usability of and interest in this technology, allowing wider adoption of the Grid by non-experts.
José Miguel Alonso, Vicente Hernández, Germán Moltó,
Universidad Politécnica de Valencia/SpaRCIM, Spain
Tel: +34 963 87 73 56