Guest Expert
Martin C. Brown
Freelance Writer and Consultant, MCslp

Build grid applications based on SOA

Concepts behind SOA and how to move grid applications to an SOA model

By Martin C. Brown (questions@mcslp.com), Freelance Writer and Consultant, MCslp

First published by IBM at IBM developerWorks.

Grids and the Service-Oriented Architecture (SOA) are two systems that appear to be on a collision course. The SOA is a standard for building discrete services, potentially across multiple machines, which can be combined to build an application while reducing integration costs. Like the SOA, most modern grids employ Web services, but there is more to merging the two systems than simply employing Web services. You must also adjust the architecture of your grid solution. This article explains the concepts behind SOA and what you should consider when moving your grid applications toward an SOA model.

Moving to SOA and grids

A perennial problem with grid applications is making them flexible enough to be used across a range of potential platforms and environments. While older grids used a dedicated solution with rigidly controlled hardware and environments, it has recently become clear that making your grid application run on a wider range of platforms enables you to easily expand the scope and power of your grid simply by adding more machines.

However, minor differences between platforms can cause significant headaches. For example, differences between Windows versions, even between Windows NT and Windows 2000, can cause problems for the rigidly designed and optimized applications usually employed in a grid environment. An obvious solution is to remove the highly platform-specific elements and move to a more generalized environment.

The SOA is built on this principle. It is a component-based model for building applications that divides an application into a number of discrete services. Individually, each service performs a specific function; put together, the services make up the components of a larger application.

The fundamentals of SOA are not new. Object orientation has been a buzzword for years and distributed objects have been available for some time using technologies like CORBA. The key difference is that the SOA is based on a combination of the principles of object orientation and Web services and also marries this structure with an open system describing the interfaces available. By making Web services easier to find and identify, SOA makes it easier to deploy and distribute an SOA-based application. Because Web services are based on open standards and are, by definition, architecture- and platform-neutral, SOA-based applications can be deployed across a wide range of platforms.

In short, SOA is a method for exposing services and allowing computers to talk to each other and share power and functionality. Grids have slowly been moving toward a Web services architecture, first with the move by Globus to the Open Grid Services Infrastructure (OGSI) and, further, with the Globus Toolkit 4.0 (GT4) release. SOA and grid technology are both moving toward Web Services Interoperability technology, based on solutions such as the Web Services Resource Framework (WSRF) and others.

You can also see that SOA and grid have a lot to offer each other. This is not simply a case of grid technology making use of SOA principles, or vice versa. From the SOA perspective, grids offer an exceptional model for the distribution of information and resources, a key feature of the SOA model. From the grid perspective, SOA offers alternative, flexible methods for adjusting the architecture of grid solutions, making them more transparent and supporting them on a wider range of platforms and environments.

Let's look at a traditional grid model, then at an SOA-based grid model to compare how the two differ and how you can start to think about your grid and SOA applications as a single resource.

Traditional grid model

To fully understand how SOA can improve your grid service -- and how to modify your applications -- let's look at a typical grid service based on traditional grid technology, including basic Web services. The basic structure is fairly simple and straightforward.

You can see the structure diagram of a typical grid environment in Figure 1. I've deliberately avoided specifying any type of grid software, as most work on the same basic principles.

Figure 1. The monolithic grid model

Overall, the structure is fairly simple. We have a grid coordinator responsible for distributing information and work to individual nodes. The role of the coordinator -- also known by other names, including the distribution and management node -- is to run the grid. Communication between the coordinator and work nodes can use a variety of mechanisms, although most systems (Globus included) rely on Web services. The model used here is often referred to as a cascade service, as the information and work requests are cascaded through the service from a single point and distributed down into the various nodes.

However, irrespective of the communication system used, the methodology is largely the same:

  • The rigid structure means nodes are contacted using a specific Web service or Web service interface to a specialized piece of software that handles requests from the node coordinator.
  • Work submissions are handled through the coordinator and on to the individual nodes, usually using a single Web service to submit the work. On the work nodes, a similar client communicates completed work back to the grid coordinator.
  • Any additional information required by the grid nodes (large data structures or reference materials, for example) may be accessible on the network through another service, which may or may not be supported through Web services. For example, some grids use a centralized SQL database for the storage of information accessed by a node directly, rather than through a Web service interface.
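The cascade structure described above can be sketched in a few lines of Python. This is only an in-memory illustration, not the behavior of any particular grid toolkit: the Coordinator and WorkNode classes, the round-robin distribution, and the squaring "computation" are all assumptions made for the example, and direct method calls stand in for the Web service calls a real grid would use.

```python
from dataclasses import dataclass, field


@dataclass
class WorkNode:
    """Hypothetical work node that only ever receives work from the coordinator."""
    name: str
    completed: list = field(default_factory=list)

    def process(self, unit: int) -> int:
        # Stand-in for the real computation: square the input.
        result = unit * unit
        self.completed.append(result)
        return result


@dataclass
class Coordinator:
    """Single point through which all work is submitted and all results return."""
    nodes: list
    results: dict = field(default_factory=dict)

    def submit(self, units):
        # Cascade each unit down to a node (round-robin is an assumption)
        # and collect the result back at the coordinator.
        for i, unit in enumerate(units):
            node = self.nodes[i % len(self.nodes)]
            self.results[unit] = node.process(unit)
        return self.results


coordinator = Coordinator(nodes=[WorkNode("node-a"), WorkNode("node-b")])
print(coordinator.submit([1, 2, 3, 4]))
```

Every request and every result passes through the one Coordinator object, which is the single point of control that the rest of this article sets out to remove.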

Overall, most grid services have -- until recently -- been based on monolithic code bases using proprietary methods for communication and exchanging information. This is changing to a Web services model, but even with the introduction of Web service standards, many solutions work on a Web service interface to the original monolithic application. For example, as can be seen in our traditional model above, submission to a node occurs through a single Web service interface to the grid node, which is actually an exposed Web services-oriented interface to the original application.

The problem with the monolithic approach, even when it uses Web services, is that it restricts your ability to expand and grow. With a monolithic style, porting applications to other platforms and environments can be more complex. If your system is not based on Web services, the problem is even more severe. Growth is limited because of the reliance on a single coordination system responsible for distributing information across the network. If your client is also monolithic (even if coupled with a Web services exposed interface), deploying your grid application across a number of machines will be more difficult.

SOA application model

The SOA isn't simply another term for using Web services to integrate and communicate between parts of your application. SOA goes much deeper and defines a way of developing an application that switches the focus from a single application made up of multiple functions or objects into a structure that divides the entire application into a number of individual services. For example, consider an accounting application that has components that raise invoices and provide a method for paying them. In the traditional component application model, you might define objects and associated methods for the two tasks.

In a simple Web service environment, you would build an interface to the objects and methods you wanted to make remotely accessible. For example, creating invoices might be a task you would want to be accessible over a network connection. In all likelihood, the Web service would have a dedicated interface on a dedicated machine, and building a client for it would require knowing which server the service was running on, as well as the full details of the interface it exposes.

In an SOA, every function of the accounting application would technically be a Web service. Each service would advertise its existence on the network, and you could perform any operation from any suitably authorized machine. Furthermore, because the individual services are components in their own right, they could reside anywhere on the network. We no longer need a dedicated server to handle the requests. We could be using a server farm, for example, and because the services advertise themselves, we don't need to worry about how to find them.

From this description, we can determine that the key elements of the SOA are:

  • Available as a discrete service -- The level of the granularity provided has yet to be determined as a standard. For example, it's not yet known whether you would expose a single invoice service capable of multiple operations or multiple services, each one supporting a different aspect, such as raising, paying, or altering an invoice.
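  • Independent -- We don't care how services achieve the task we ask of them. They just do it. Similarly, the services don't worry about how they achieve the result, either.
  • Self-advertising -- Services advertise their existence on the network, so clients don't need to know in advance where to find them.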
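To make the three elements concrete, here is a minimal sketch based on the invoicing example. The ServiceRegistry class, the service names, and the in-memory invoices dictionary are hypothetical stand-ins; a real SOA would advertise services through a network directory and describe them with WSDL rather than register Python functions.

```python
class ServiceRegistry:
    """Hypothetical in-memory directory where services advertise themselves."""

    def __init__(self):
        self._services = {}

    def advertise(self, name, handler):
        # A service announces its existence under a well-known name.
        self._services[name] = handler

    def lookup(self, name):
        # A client finds a service by name, not by host or location.
        return self._services[name]


registry = ServiceRegistry()
invoices = {}  # stand-in for whatever storage the services use internally


def raise_invoice(customer, amount):
    invoice_id = len(invoices) + 1
    invoices[invoice_id] = {"customer": customer, "amount": amount, "paid": False}
    return invoice_id


def pay_invoice(invoice_id):
    invoices[invoice_id]["paid"] = True
    return invoices[invoice_id]


# Each discrete service advertises itself.
registry.advertise("invoice.raise", raise_invoice)
registry.advertise("invoice.pay", pay_invoice)

# The caller only knows service names; how each task is done is hidden.
new_id = registry.lookup("invoice.raise")("ACME", 120.0)
paid = registry.lookup("invoice.pay")(new_id)
print(new_id, paid)
```

This sketch picks the finer-grained option mentioned above, with one service per operation. A single invoice service supporting several operations would be registered in the same way.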

In theory, therefore, it should be possible to divide up an application into small components, which can then be connected together (by calling each other), in order to make the final application. Because these units can also be spread across multiple machines, we move from the monolithic, cascade structure, into a more flexible distributed, communicative structure. You can see an example of an SOA structure in Figure 2.

Figure 2. An example SOA grid model

SOA grid model

It should be obvious from the description of the SOA model, and from the current move of grid technology toward a Web services structure, that the two are converging. In simple terms, the grid is a distributed system for sharing resources, while an SOA is a distributed architecture most concerned with service interoperability, easy integration, and simple, extensible, secure access. Both face common problems, including latency, concurrency, and partial failure.

Both also use Web services, and Figure 3 shows a simple layout that could apply to both SOA and grid environments. Both make heavy use of SOAP, XML, the Web Services standards, and the associated security, management, and other systems.

Figure 3. The SOA operational model

Whether your existing grid application uses Web services or not, you can see that the open standards detailed in Figure 3 make up the bulk of the SOA standard. To summarize, SOA makes use of the following technologies:

  • XML forms a core part of all the standards, from the SOAP protocol used to exchange information to the methods used to share service descriptions. For example, WSDL, an XML-based description language, is used to describe available services to potential clients.
  • SOAP provides the base methods for exchanging objects and calling methods. The underlying transport protocol is largely irrelevant, although it's likely you will be using HTTP.
  • A number of extensions are used to provide core services for interoperability between services. WS-Reliability and WS-Resource, for example, are used to help publicize services and improve the reliability of communication.
  • Background standards, such as WS-Security and the new WS-Distributed Management standards, are used to help provide secure communications and to manage services remotely.
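As a rough illustration of how XML and SOAP fit together, the following sketch builds and then reads back a SOAP 1.1 envelope using only the Python standard library. The SubmitWork operation, its parameters, and the urn:example:grid namespace are invented for the example and do not come from any grid toolkit. In practice, a SOAP library and a WSDL description would generate this plumbing for you.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:grid"  # hypothetical application namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for key, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{key}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Extract the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("SubmitWork", {"jobId": 42, "frames": 10})
operation, params = parse_request(message)
print(operation, params)
```

The message is plain text, so it can travel over HTTP or any other transport, which is why the underlying protocol is largely irrelevant. Note that parameter values come back as strings; typed values are something a WSDL-driven toolkit would add.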

These principles and technologies are already being employed by the Globus Toolkit and built into the grid standards and applications.

To build them into your own applications will require some changes to the way you approach your application and development. For example, to merge grid applications with SOA principles, you need to:

  • Migrate your applications to a service model. If you are not already doing this, you must think about the individual operations of your application. We'll cover this in more detail later.
  • Divide your grid applications into smaller, discrete components. For example, instead of a single monolithic application with a Web service interface, split your application into individual Web services-based components. You might split the application servicing your grid nodes into individual services for accepting work, returning work, and reporting statistics.
  • Ensure that your nodes and controller are capable of independent working. Don't rely on permanent connections to database servers or other resources.
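The split into separate services for accepting work, returning work, and reporting statistics might look like the sketch below. The JobStore class and the three function names are assumptions for illustration. In a deployed grid, each function would be exposed as its own Web service, and the store would sit behind a service of its own rather than being a shared Python object.

```python
from collections import deque


class JobStore:
    """Hypothetical store shared by the services in this sketch."""

    def __init__(self):
        self.pending = deque()
        self.results = {}


def accept_work(store, job_id, payload):
    """Service 1: accept a unit of work and queue it."""
    store.pending.append((job_id, payload))
    return job_id


def return_work(store, job_id, result):
    """Service 2: record the result of a completed unit of work."""
    store.results[job_id] = result


def report_statistics(store):
    """Service 3: report queue depth and completed count."""
    return {"pending": len(store.pending), "completed": len(store.results)}


store = JobStore()
accept_work(store, "job-1", [1, 2, 3])
accept_work(store, "job-2", [4, 5])

# A node takes the oldest job, does the work (a sum here), and returns it.
job_id, payload = store.pending.popleft()
return_work(store, job_id, sum(payload))

stats = report_statistics(store)
print(stats)
```

Because each operation is a separate entry point, any one of them can be moved, replicated, or replaced without touching the other two.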

Conversely, to merge SOA applications with grid principles, you need to:

  • Eliminate areas of your application that rely on a single host or environment to operate.
  • Incorporate statistical and monitoring information into your SOA services to determine typical grid load information, such as CPU and storage resources.
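A service can gather basic load information with the Python standard library alone, as in this small sketch. The load_report function and its field names are hypothetical. Reporting only CPU count and disk space is a simplification; a production service would likely also track CPU utilization and memory.

```python
import os
import shutil


def load_report(path="."):
    """Collect a few basic resource figures for the host running this service."""
    usage = shutil.disk_usage(path)  # total, used, and free bytes for the volume
    return {
        "cpu_count": os.cpu_count(),  # may be None if it cannot be determined
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
    }


report = load_report()
print(report)
```

Exposing a report like this from every service gives a scheduler the information it needs to decide where work should go next.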

The result in both cases should be a more flexible and practical application.

Migrating applications

Migrating an existing application is the most difficult route to a combined grid and SOA application, because you need to find a way of modifying the application without significantly altering the way it works or upsetting existing services.

The primary concern is to map out the organization of your application. Build lists of the operations performed, how communication operates between nodes and components in the grid, and compile sequences of processes. For example, you will want to produce a model of the submission sequence: what happens when a job is submitted, when the response is sent back, and when information about the status of a given resource is required. Armed with this information, build a list of services required to support these operations.

You now need to consider how you will implement these individual services. There is a slight difference between the approaches of SOA and grid-based services, in that SOA components are considered to be stateless while grid services are stateful. The two need not be mutually exclusive. You can develop a stateless application that provides stateful information, provided that you include a method for recording the service's state so that it can be reported.
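One way to reconcile the two, sketched below under stated assumptions, is to keep the service handlers stateless and push all state into a separate store that a status service can query. The StateStore class, the state names, and the render_service and status_service functions are hypothetical, and an in-memory dictionary stands in for whatever persistent storage a real deployment would use.

```python
class StateStore:
    """Hypothetical external record of job state, kept outside the services."""

    def __init__(self):
        self._states = {}

    def record(self, job_id, state):
        self._states[job_id] = state

    def report(self, job_id):
        return self._states.get(job_id, "unknown")


def render_service(store, job_id, frames):
    """Stateless handler: everything it needs arrives with the call."""
    store.record(job_id, "running")
    output = [f"frame-{n}" for n in range(frames)]  # stand-in for real work
    store.record(job_id, "done")
    return output


def status_service(store, job_id):
    """Stateless handler that reports the recorded state of a job."""
    return store.report(job_id)


store = StateStore()
before = status_service(store, "job-7")
output = render_service(store, "job-7", 2)
after = status_service(store, "job-7")
print(before, output, after)
```

Neither handler keeps anything between calls, so any copy of either service on any machine can answer a request, while the grid still gets the job status it expects.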

Remember that even with the migration to the new SOA model, core parts of your application will remain the same. In a CPU-intensive grid, for example, the core calculation functions that do the actual work in your grid will remain the same, as will any code that accesses data structures and information. For example, if you consider an image rendering grid, the data that makes up the original image description and the code (functions or applications) that converts this vector data into the final image will remain the same. Only the way in which you communicate the information and requests to the individual nodes will change.

Finally, with all this information in hand, start to migrate your application to the new component-based model. There are, unfortunately, no easy ways to produce this information, nor currently any tools to make it easier. Much of the information you need to determine will depend on your application and the environment you are converting your application from.

Summary

Even to a casual observer, SOA and grid seem to have similar goals and aims: to simplify an application, extend its abilities and supported platforms, and enable you to distribute work more easily across multiple machines. They certainly share many of the same principal components -- Web services and XML -- but their approaches are slightly different. You can also see that the benefits each solution could provide to the other are significant enough to consider making use of the technology. For example, the independent nature of grid technology has benefits for the deployment of SOA-based applications. And the flexible and heterogeneous nature of SOA would be ideal for grids to expand their platform base.

With an SOA-based grid, we gain a number of advantages. The monolithic model disappears. Instead of a single grid coordinator that controls the execution of units through the grid, work could be submitted to any node. When work is distributed, it should even be possible for individual nodes to send work directly to other nodes when they are unable to process the work in time. This autonomous architecture is possible because the individual components can talk to each other.
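The idea of nodes passing work directly to one another can be sketched as follows. The Node class, the fixed capacity per node, and the rule of forwarding to the first peer with spare capacity are all assumptions made for illustration; a real grid would base the decision on deadlines and reported load rather than a simple queue length.

```python
class Node:
    """Hypothetical grid node that accepts work from anyone, including peers."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # how many units this node can hold
        self.queue = []
        self.peers = []

    def has_room(self):
        return len(self.queue) < self.capacity

    def submit(self, unit):
        """Accept the unit locally, or forward it to a peer with spare capacity."""
        if self.has_room():
            self.queue.append(unit)
            return self.name
        for peer in self.peers:
            if peer.has_room():
                return peer.submit(unit)
        raise RuntimeError("no node has spare capacity")


a = Node("a", capacity=1)
b = Node("b", capacity=2)
a.peers = [b]
b.peers = [a]

# All work is submitted to node "a"; no central coordinator is involved.
placements = [a.submit(unit) for unit in ["u1", "u2", "u3"]]
print(placements)
```

Work enters at node "a", which keeps the first unit and hands the rest straight to its peer. Any node can play this role, so there is no single coordinator whose failure stops the grid.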


About the author

Martin Brown has been a professional writer for over eight years. He is the author of numerous books and articles across a range of topics. His expertise spans myriad development languages and platforms -- Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows, Solaris, Linux, BeOS, Mac OS/X and more -- as well as Web programming, systems management and integration. Martin is a regular contributor to ServerWatch.com, LinuxToday.com and IBM developerWorks, and a regular blogger at Computerworld, The Apple Blog and other sites, as well as a Subject Matter Expert (SME) for Microsoft. He can be contacted through his Web site at http://www.mcslp.com.
