Globus MDS
By nature, Grids have far more moving parts than the typical siloed architecture that today's enterprises are moving away from, where dedicated hardware stacks support specific applications in a very static way. In Grids, resources come online into the production environment, and they can just as quickly go back offline. Jobs start, take up a certain quantum of resources to run, and then free those resources for the next job.
So how in the heck do Grid pros get real-time insight into what's happening, where and when?
The authors of Grid Computing: The Savvy Manager's Guide sum up the challenge: "In a dynamic on-demand environment, it cannot be assumed that applications have static, permanent knowledge of the available services and resources. Rather, at some reasonable time before executing partner tasks, the available services need to be interactively discovered, reserved and subsequently provided to the application."
In Globus Toolkit Grid environments, this need is satisfied by the Monitoring and Discovery System (MDS), the information services component of the Globus Toolkit that provides information about the available resources on the Grid and their status. On the client side, users reference the MDS user interface to get the real-time view they need of their Grid resources. On the provider end, MDS allows Grid participants to create the necessary interfaces that allow other users to get access to their resources. According to Jeffrey Hollingsworth and Brian Tierney (in Grid 2): "A major focus of MDS's design is achieving scalability in a system with large numbers of information providers and consumers."
Just what sorts of information does MDS take into consideration?
Well, when the Globus Toolkit moved from the Open Grid Services Infrastructure (OGSI) in GT3 to the Web Services Resource Framework (WSRF) in GT4, so did MDS. Where OGSI was a service-based model, WSRF is a resource-based model. With this transition, instead of just having services, the Grid had to account for services and resources, so there was no longer a one-to-one mapping between the two. As a result, notions such as service data in GT3 essentially became resource properties in GT4 and MDS.
Today, MDS can ingest almost any data expressed in XML, and clients retrieve that data through standard WS query interfaces to the MDS index service.
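To give a feel for what querying XML-based monitoring data looks like, here is a minimal local sketch in Python. It is not the MDS client API: the document, its element names (Index, Entry, Host, FreeCPUs) and the function name are all hypothetical, and the query runs against an in-memory string rather than a live index service.

```python
import xml.etree.ElementTree as ET

# Hypothetical aggregated document. The element and attribute names are
# illustrative assumptions, not the schema an MDS index actually publishes.
INDEX_XML = """
<Index>
  <Entry>
    <Host name="node01"><FreeCPUs>4</FreeCPUs></Host>
  </Entry>
  <Entry>
    <Host name="node02"><FreeCPUs>0</FreeCPUs></Host>
  </Entry>
</Index>
"""


def hosts_with_free_cpus(xml_text):
    """Return the names of hosts that report at least one free CPU."""
    root = ET.fromstring(xml_text)
    names = []
    # ElementTree supports a limited XPath subset, enough for a path walk.
    for host in root.findall("./Entry/Host"):
        if int(host.findtext("FreeCPUs", default="0")) > 0:
            names.append(host.get("name"))
    return names


print(hosts_with_free_cpus(INDEX_XML))
```

The point of the sketch is only that once everything is XML, a consumer can ask path-style questions of it without knowing which provider produced each entry.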
"MDS uses a variety of plug-in modules to ingest XML data," said Mike D'Arcy, programmer at ISI. "From processes, to files, to remote servers, to connections over a socket to a non-WSRF service -- the system is very flexible on ingest methods. Anything that's a WSRF service and that publishes its own resource properties can very easily be pulled into MDS using its mechanisms. You can also advertise any kind of data that's expressed in XML by creating your own provider."
To collect information, MDS has several options for interfaces that can be plugged in -- so developers can use their own Java classes or executable programs to both publish information and pull it in. MDS users can also ingest data from other service entities to present a composite view of the data (otherwise known as "aggregation," which is tied to a service group mechanism).
"The other notion is where we have these ingest methods where the target, or the endpoint, if you will, is a resource property, and you can collect data from a variety of sources to synthesize a resource property in your server," said D'Arcy. "And that's really the bread and butter of what MDS does from the monitoring and collection standpoint."
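The aggregation idea D'Arcy describes can be sketched roughly as follows. This is an illustration under stated assumptions, not MDS code: the `aggregate` function, the AggregatedView and Entry element names, and the two stand-in sources are hypothetical, and real sources would be scripts, files or remote services rather than hard-coded strings.

```python
import xml.etree.ElementTree as ET


def aggregate(sources):
    """Call each source and wrap the XML it returns in one composite document.

    `sources` maps a label to a zero-argument callable returning an XML string.
    """
    root = ET.Element("AggregatedView")  # hypothetical wrapper element
    for name, fetch in sources.items():
        entry = ET.SubElement(root, "Entry", {"source": name})
        try:
            entry.append(ET.fromstring(fetch()))
        except Exception as exc:
            # A failing source is recorded rather than aborting the whole view.
            entry.set("error", str(exc))
    return root


# Stand-ins for real ingest methods (an executable, a file, a remote server).
def from_script():
    return "<Load>0.42</Load>"


def from_file():
    return "<Disk freeGB='120'/>"


view = aggregate({"script": from_script, "file": from_file})
print(ET.tostring(view, encoding="unicode"))
```

Keeping each source behind the same callable interface is what lets new ingest methods be plugged in without touching the aggregation step.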
"MDS also has a concept of 'soft state,'" said Laura Pearlman, system architecture lead at ISI. "MDS is, in a sense, 'self-pruning.' If a resource disappears off the network, it will eventually go away -- so MDS isn't out there looking for things that no longer exist. Pretty much any system that does any kind of aggregation has this capability, but MDS provides a standard that gets the information from many different sources and reports it in a consistent way."
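The "self-pruning" behavior Pearlman describes can be illustrated with a small registry in which every registration carries a lifetime and silently expires unless it is refreshed. This is a generic soft-state sketch, not the MDS implementation; the class name, method names and lifetimes are assumptions made for the example, and a fake clock is injected so the expiry can be shown without waiting.

```python
import time


class SoftStateRegistry:
    """Registrations expire on their own unless the resource refreshes them."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the demo can control time
        self._expires = {}       # resource name -> expiry timestamp

    def register(self, name, lifetime_s):
        """Add a resource, or refresh it if it is already registered."""
        self._expires[name] = self._clock() + lifetime_s

    def live(self):
        """Drop expired entries and return the names that remain, sorted."""
        now = self._clock()
        self._expires = {n: t for n, t in self._expires.items() if t > now}
        return sorted(self._expires)


# Demo with a fake clock: node02 never refreshes, so it is pruned.
now = [0.0]
registry = SoftStateRegistry(clock=lambda: now[0])
registry.register("node01", 60)
registry.register("node02", 10)
now[0] = 30.0
print(registry.live())
```

Nothing ever has to notice that node02 went away; the registry simply stops reporting it once its lifetime runs out.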
IT pros with just a basic understanding of XML have the necessary knowledge to make use of MDS. Essentially, they only need to be able to write a simple shell script that the MDS framework invokes. Once a script collects the data and formats it as XML, configuring the MDS framework to run that script is a simple step.
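A provider of that kind might look something like the sketch below, written in Python rather than shell for consistency with the other examples. The element names (HostInfo, Name, OS, CPUCount) are hypothetical and do not follow any schema MDS defines, and how the script is registered with the MDS framework is deliberately left out because it depends on the deployment.

```python
import os
import platform
import socket
import xml.etree.ElementTree as ET


def build_host_report():
    """Collect a few host facts and return them as an XML string."""
    root = ET.Element("HostInfo")  # hypothetical element names throughout
    ET.SubElement(root, "Name").text = socket.gethostname()
    ET.SubElement(root, "OS").text = platform.system()
    # os.cpu_count() can return None, so fall back to 0 in that case.
    ET.SubElement(root, "CPUCount").text = str(os.cpu_count() or 0)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A framework that invokes the script would read the XML from stdout.
    print(build_host_report())
```

The script's only contract is "print well-formed XML to standard output," which is what keeps the barrier to writing a new provider low.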
"In some cases, we have some schemas pre-defined," said Pearlman. "For example, to advertise information about a cluster, you would use the WSRF Service Group schema. So if you are running GRAM and you have some queuing system on the back end that no one's ever heard of and you want to write a provider, then you need to be able to write some programs and get the information. But you don't need to think about schema, because that's pre-defined for you."
Organizations already using MDS4 and/or planning to use it include OSG, TeraGrid, Earth System Grid, UCLA and China Grid. According to Pearlman, most users are still in the ramp-up phase.
"I think more people will come to the party as the list of provider interfaces is expanded," said D'Arcy. "Because those are at the end of the value chain, and that's where MDS will save Grid users the most time."
Among the efforts to expand those interfaces, the MDS team is working on an interface to Nagios, which would allow Grid users to read Nagios state files and have a solution in place that interacts with the Nagios setup (and hosts that information in MDS). Many Globus GRAM users are also interested in MDS incorporating the ability to peer deeper into the clusters and hosts associated with GRAM servers -- so that in addition to seeing the hardware and operating system, MDS would be able to see other variables such as software catalogues.