GCJ: As a preface to our discussion of the Incubator project, could you give us an overview of dev.globus, including how and why it came about?
Schopf: Globus began ten years ago as an open source project with a very liberal license so the general community had access to our source code and could contribute to the project. This worked well for the first few years, when most of the outside contributions were from "friends and family" - groups we had close working relationships with or formal collaborations already set up. As we've grown, however, the ad hoc way we were handling bug fixes wasn't scaling well for the project as a whole, and we didn't have a simple way for new, larger pieces of functionality to be easily included.
So in 2005 we set up dev.globus - a source code development infrastructure and governance model to more easily expand the Globus community, allow additional projects to join Globus, and in general to make the process around Globus easier to understand and more transparent. We based what we did very strongly on the Apache Jakarta model. Control over each individual software component, called a project, is in the hands of its most active and respected contributors, called committers, with the Globus Management Committee (GMC) providing overall guidance and conflict resolution.
The dev.globus infrastructure comprises repositories, email lists, wikis, and bug trackers configured to support per-project community access and management. Like Apache, we feature voting for new development decisions, so on a per-project basis, anyone can say, "Hey, have you thought about adding this functionality or changing that interface?" Individual projects are guided by democratic and meritocratic processes, with the committers for a given project guiding that course. Globus has always been open source, but now many more of the decisions are transparent and contributions are much more straightforward.
GCJ: What types of projects comprise dev.globus?
Schopf: Dating back to the 1.1.3 release, the Globus Toolkit has been defined by five basic project groups - two that provide basic services and three that provide higher-level functionality. The first of the two basic groups is the security projects, which cover all the infrastructure you need for basic secure services: authorization, authentication, delegation, and the like. Then there are the C, Java, and Python core projects, sometimes called the common runtime components, which provide all the plumbing a service needs in terms of addressing, notification, lifetime management, error handling, and such.
Those two sets of functionality underpin the other three groups. Those are the data projects, including GridFTP, the reliable file transfer service (RFT), replica location, data replication services, and OGSA-DAI, which gives you interfaces to databases; the execution projects, namely GRAM and MPICH-G2, a Grid-enabled version of the MPI standard; and finally, what we're calling information services projects, which is just the Monitoring and Discovery Service (MDS4) at the moment.
Within dev.globus there are also three types of non-technology projects - distribution projects, documentation projects, and the Incubation projects. We actually have just one distribution project right now, the Globus Toolkit distribution. This project decides what will be included in a GT4 distribution and supervises the release process. But you can imagine someone might have their own distribution project that might encompass a subset - or a superset! - of what's normally in GT4.
Presently, we have one documentation project, the GT release manuals, although we've talked about converting the tutorials into a separate project. Again, in the future we might have a large collaboration like the Open Science Grid or TeraGrid start up a documentation project within dev.globus that would give details about the specific Grid software they were using, how it's been tuned for that collaboration, or cookbooks for deployment and such specific to their work.
The third and final type of project is the incubation project, which is how new software gets added to dev.globus.
GCJ: Could you tell us more about the Incubator projects?
Schopf: Incubator projects are fledgling Globus projects. These aren't just patches or smaller bits for existing projects, but software that is entirely new to Globus.
We created the Incubation process to incorporate this work in an orderly fashion. In order to become an Incubator project, applicants answer a few simple questions, and then the Incubation Management Project (IMP) committee evaluates the incoming proposals. The approved projects then become what we're calling proto-projects (Apache calls these podlings). We familiarize proto-projects with the basic dev.globus infrastructure: the wiki pages, repositories, licenses, and the mailing lists. We also assign them a mentor to help with voting, recruiting new members, and negotiating licenses.
We then conduct quarterly reviews. The formal criteria are online, but the bottom line is, do you understand what it means to be Globus? Do you understand how to be an open project with outside contributors? Is your project set up in a way that is acceptable to the open source community? One of three things can happen as a result of the review. You can stay an incubator project until the next review if we feel you're not quite ready, for example if not all of your committers have signed the license yet or if you haven't mastered the mailing lists. You can be asked to retire, although we don't envision that happening very often. The much more common exit point is what we've been calling escalation, at which point you become a full-blown Globus project.
So we've more or less adopted Apache's application process, with a few simplifications. And it's working for us: we're accepting new projects, and we're hoping to escalate some projects shortly after GlobusWorld!
GCJ: What kinds of projects are you most receptive to?
Schopf: In general, just about anything can get submitted to us as long as it has something to do with Grids and Globus. If you look at the current projects, some of them are normal everyday software, like GridWay, which is a metascheduler, or the GridShib project, which interfaces between two security technologies, and some of them are a bit more unusual, like the Metrics project, which defines what metrics in terms of usage and performance Grid software should take into account.
One thing I should say is that the Incubation Management Project does not make any judgments based on technology. For example, we would never reject a scheduling project because we already have two included in dev.globus. As long as you show your project is relevant to Globus, our only consideration is whether you will escalate to a full Globus project.
GCJ: What if I were to propose something closely related to an existing project, like a tutorial?
Schopf: We would basically help the existing projects and the applicants determine how they fit into what we have, and what would make the most sense for all parties. So going with the example of a new tutorial project, we'd ask whether the new tutorial would fit better within an ongoing project (for example, the current release documentation project, or another tutorial project if one were already underway), or not. We'd ask, have you talked to the existing guys? And if they said, we have, and our work doesn't fit within the existing project for these reasons, and the existing project agreed, then they would very likely be accepted as an incubator project. We have no objections to having multiple approaches, or similar projects in general.
GCJ: Can you talk about some of the current projects in the incubator process?
Schopf: We've actually ramped up quite quickly from the start of the Incubator work in April and now have a dozen active projects, including a few we've grandfathered in that were already in de facto incubation before we had set up dev.globus.
Two of the pre-existing projects are the dynamic accounts and virtual workspaces projects, which are both overseen by Kate Keahey. Together, they project a vision of the Grid as a virtual environment, in which the locations of jobs and outgoing files are unspecified.
Among the newer projects, GridShib is a great one. They're incorporating Shibboleth's authentication procedures into Globus's security so that Shibboleth's substantial user base at universities can more easily participate in Globus projects.
GridWay is another project I quite like. Ignacio Llorente of the Universidad Complutense de Madrid oversees this scheduling project, which is built on top of GRAM, for job submission, and MDS4, for basic resource data. Altogether, it's a much-needed piece of work that fills a gap we simply haven't had the opportunity to address. It's also worth noting that no one from the old Globus Alliance group is in any way associated with GridWay.
If you visit dev.globus, you'll find links to all 12 projects and their wiki pages.
GCJ: Finally, what should we look for in the next few months?
Schopf: Well, these projects have actually been alpha testing the incubator process itself. Alpha testing is just about finished now, and next, we're going to finalize the process, end to end.
For those projects in incubation, we will be conducting a final review prior to escalating them. So you can look for them to come online as full-blown Globus projects soon.
And as you know, we're accepting incubator projects pretty much all the time. If your readers are interested, they should fill out a candidate proposal, and send it to incubator-committers@globus.org. Of course, everyone is more than welcome to get in touch with me, as IMP chair, with questions, and I'll be at GlobusWorld as well if they want to talk to me in person.
In general, I think Globus is entering a period of stabilization. The old corps of Globus folks can't do it all on their own, and the Incubation process is proving a powerful method of encouraging open contributions and accruing new and interesting functionalities.
Editor's Note:
Be sure to see Jennifer Schopf's talk on the Globus Incubation Management Project at GlobusWORLD / GridWorld this month.