Earlier this year, Linux management vendor Levanta made headlines with its sponsored study, "Get the Truth on Linux Management." In the study, feedback from more than 200 Linux users corroborated that Linux is not "high touch, tough to manage" - and that the management criticisms being made in the "Get the Facts" campaign are largely without merit.
Formerly known as LinuxCare - the early Linux services and support start-up that was born at the same time as Red Hat and VA Software - Levanta today manufactures a Linux management appliance. In this issue of the Globus Consortium Journal, we speak with Levanta's Vice President of Engineering, Adam Fineberg, who tells us about some of the key characteristics that make Linux especially desirable for clustering, Grids and virtualization.
GCJ: What are the characteristics of Linux that make it an attractive choice for Grid environments?
Fineberg: One of the key characteristics that makes Grids so useful is the ability to keep scaling quickly and inexpensively. Linux is obviously well-suited for that, simply because it runs on commodity hardware and the OS itself is free.
From a more technical perspective - some of the key aspects of an operating system that you really need to take advantage of in a Grid computing environment are the networking and file systems. The networking side is very important because of the large number of nodes and the need, at times, to exchange information between the nodes with low latency, as well as to access shared storage systems and devices. Linux does very good 'zero copy' networking, meaning that once the data reaches the network stack, it doesn't have to be copied again all the way through the rest of the operating system. That really keeps the networking efficient in Linux systems.
With respect to file systems - because of the very strong interface that's defined within Linux, there are a great number of file systems available to you. And that's something that's fairly unique to the Linux OS. Most other operating systems don't actually have a large number of file systems available for them, other than some standard ones like NFS. That makes it relatively easy to pick a file system that's well-suited for your particular application. So having access to, for instance, XFS or JFS - which are two very high performance file systems that have good characteristics, but by the same token have very different implementations and therefore very different operating characteristics - you can optimize by choosing the file system that's best suited for your application.
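Editor's sketch: the 'zero copy' networking Fineberg mentions lives inside the kernel's network stack, so it cannot be demonstrated directly from application code. A related and more visible facility is the sendfile path, which lets the kernel move file data to a socket without copying it through a user-space buffer. The Python example below is only an illustration under that assumption and is not taken from the interview; the function names `send_file_zero_copy` and `demo` are hypothetical. It relies on the standard library's `socket.sendfile`, which uses `os.sendfile` where the platform supports it and falls back to ordinary `send` calls otherwise.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock, path):
    """Send the file at `path` over a connected stream socket.

    socket.sendfile() hands the transfer to os.sendfile() when available,
    so the data does not pass through a user-space buffer. Returns the
    number of bytes sent.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo(payload=b"grid node data " * 1024):
    """Round-trip a small payload through a local socket pair."""
    # Write the payload to a temporary file that stands in for node data.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name

    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(sender, path)
        sender.close()  # signal end-of-stream to the receiver

        chunks = []
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
        return sent, b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)


if __name__ == "__main__":
    sent, received = demo()
    print(sent, len(received))
```

The payload is kept small so that it fits in the socket buffer without a concurrent reader; a real transfer between Grid nodes would use a connected TCP socket and a reader on the other side.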
GCJ: Is the Linux virtual file system itself uniquely suited for Grid environments?
Fineberg: I don't think it's so much the VFS itself, but the fact that they have a very strong published VFS that enables all of these different file systems to be made available to Linux. For instance, if you look at Solaris or AIX or one of the other UNIX systems, they typically don't have a published VFS, or if they do, it's a published interface that's mostly well-suited for the particular file systems that they've already written. The nice thing about the Linux VFS is that it's evolved over the years specifically to enable supporting large numbers of different file systems with varied characteristics.
GCJ: Linux systems provisioning is one of the key focal areas for Levanta. How would you say the provisioning demands in a Grid environment differ from those in a normal production environment in enterprise?
Fineberg: It really just depends on what sort of application you're looking at. One application for Grids is weather modeling - and that's generally not an application where you need to do anything above the ordinary in the way of provisioning. The algorithms really don't change all that quickly, and because of that, you can essentially have a single image that you provision all of your servers from.
But then you look at something like a rendering farm for motion pictures, where you have huge numbers of servers all working on rendering the frames of a picture. What you find there is that they're continuously changing the algorithms - for things like motion and texture and all of the different characteristics - and they're constantly re-engineering all the algorithms to create the particular outcome they're looking for. It's the creative aspects of the application that make it one that is continuously changing, whereas, looking at the weather scenario, there's not really any significant creative element there. It's a matter of just putting more and more nodes in the Grid to get finer and finer resolution. The algorithm itself is essentially static.
GCJ: What's Levanta's take on the Globus Toolkit?
Fineberg: Globus is sort of at a different level than where we tend to focus - Globus is a set of tools and APIs for the applications, and while we can provision the applications and we can manage the deployment and change configuration, we don't actually involve ourselves in the application itself. So while some of the people that we've been talking to about using Levanta in their Grid configuration may use the Globus Toolkit, it's not something we would likely interact with directly.
Globus is for the underpinnings of Grid. Globus will help the commercial user develop applications in a way that is standard, has some open APIs, and does not require you to be a Grid expert while you're developing the applications. So I certainly expect to see many more Linux Globus Grids in use outside of the educational and scientific arena.
GCJ: Levanta was formerly LinuxCare - focused on professional services and support around Linux. One of the perceptions today about Grid is that it's very "high touch, tough to manage" - that you need an army of operations folks going around and configuring and fixing everything. Do you see parallels between the "high touch, tough to manage" fears / concerns that existed in the early days of Linux, and those that Grid will need to overcome?
Fineberg: Sure, and I think that's why Grid hasn't taken off yet, to a large extent. That's why people haven't invested in the applications to make them well-suited for deploying on a Grid. Grids are used in industry today primarily where there just isn't any alternative approach.
So again, going back to the rendering example, they just ran out of horsepower in any other approach. They didn't go to Grid because they thought it was more efficient and less costly. Frankly, while they care a lot about the cost, the cost is not the driving factor for the studios that are rendering movies. They have to get the movie done and they have to make it more exciting than everyone else's movies. So if that means going out and buying the latest and greatest SGI or specific HPC solution, that's what they are going to do.
But I think we're approaching a new phase where we're going to start seeing people deploying Grid applications because it really is more cost-effective and they really can scale better and they have much finer-grained control over their scaling.
No one wants to settle for saying, OK, I have ten boxes that each cost me a million dollars and I'm now at 95% capacity, so I have to spend another million dollars and have that machine running at 2% capacity for the next year. They want to look to Grid to significantly decrease the unit of cost to add more capacity. So I'm looking at $4000 to put another node in place, and yeah, that node is only a small fraction of what that big million-dollar machine was, but that's OK, I can make sure to get much higher utilization out of the machines that I'm actually investing in today. And when I need to buy another node tomorrow, I'm going to be buying a faster, cheaper node, because every day these things get faster and cheaper.
And it's really going to be more cost-effective once the Grid applications are out there and the tools to manage Grids are commonly known to people.
I would love to see Grid applications actually start being used very widely, because I'm a big believer in having that very fine-grained control over what you're doing, over what you're provisioning, over what you're managing, and over what your costs are. It's just that it's always been very hard to do. My background really is from the academic and the research areas, and I have worked on a lot of applications that - had we been able to deploy them in a Grid configuration - would have let us do so much more for so much less.
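Editor's sketch: the capacity arithmetic in Fineberg's answer above can be made concrete with a few lines of Python. The dollar figures ($1,000,000 per large machine, $4,000 per node) and the 2% figure come from the interview. The relative capacity of a node is not stated there ("only a small fraction"), so the value used below, 1/100 of a large machine, is purely an assumption for illustration, as is the function name `cost_to_add_capacity`.

```python
def cost_to_add_capacity(needed, unit_capacity, unit_cost):
    """Work out the purchase required to cover `needed` extra capacity.

    Capacity is measured in integer "percent of one large machine".
    Returns (units_bought, total_spend, utilization_of_new_capacity).
    """
    if needed <= 0 or unit_capacity <= 0:
        raise ValueError("needed and unit_capacity must be positive")
    units = -(-needed // unit_capacity)  # integer ceiling division
    spend = units * unit_cost
    utilization = needed / (units * unit_capacity)
    return units, spend, utilization


# Figures from the interview, except the node capacity, which is assumed.
BIG_MACHINE = {"capacity": 100, "cost": 1_000_000}
GRID_NODE = {"capacity": 1, "cost": 4_000}  # capacity of 1 is hypothetical

if __name__ == "__main__":
    extra = 2  # need 2% of a large machine's worth of extra capacity
    for name, unit in (("big machine", BIG_MACHINE), ("grid node", GRID_NODE)):
        units, spend, util = cost_to_add_capacity(
            extra, unit["capacity"], unit["cost"]
        )
        print(f"{name}: buy {units}, spend ${spend:,}, new capacity {util:.0%} used")
```

Under these assumptions, covering the extra 2% costs $1,000,000 with the new machine sitting at 2% utilization, versus $8,000 for two nodes that are fully used. The exact ratio depends entirely on the assumed node capacity; the point is only that a smaller purchase unit keeps spending close to actual demand.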
GCJ: What are a couple of examples of the types of applications you were dealing with?
Fineberg: The one that I spent most of my time working on was statistical data modeling for speech recognition applications at Apple. Some of the modeling that I needed to do actually required me to get time on the Cray supercomputer to run that app. But the costs of doing that were just huge. And the problem wasn't so much just the cost, but that any given run that I was doing, I had no way of knowing ahead of time how successful it was going to be. So I didn't have a way of judging what the cost/benefit ratio was going to be for using that time.
And a Grid would have been a much more cost-effective way of doing it. Also, the whole idea of modeling the application in a granular fashion like that would have enabled me to do smaller tests and not essentially have a big monolithic thing that had to run to completion in order to know how successful we were going to be.
And when I was at IBM, I also worked on speech recognition. We had clusters of SP2 nodes that were ungodly expensive machines. But we were able to - using our own software that we developed - do a very, very, very primitive form of Grid. We had job schedulers and you could set up your job in such a way that the job scheduler would know that it could run a whole bunch of these things on different machines at the same time in order to try and make use of these very, very expensive machines, to try and keep them busy all the time. So we spent a whole lot of time and effort making sure that we could schedule all these things in a very efficient way, because we had to keep the machines busy. Otherwise we couldn't justify the cost of them.
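Editor's sketch: the interview does not describe how the IBM job scheduler worked, so the Python example below is not a reconstruction of it. It is a minimal illustration of the general idea Fineberg describes, spreading independent jobs across machines so that none sits idle, using a common greedy heuristic (longest job first, assigned to whichever machine frees up earliest). The function name `schedule`, the job names, and the machine names are all hypothetical.

```python
import heapq


def schedule(jobs, machines):
    """Greedily assign independent jobs to machines.

    jobs:     dict mapping job name -> estimated duration
    machines: list of machine names
    Returns (assignment, makespan), where assignment maps each machine to
    the ordered list of jobs it runs and makespan is the time at which the
    last machine finishes.
    """
    if not machines:
        raise ValueError("at least one machine is required")

    # Min-heap of (time the machine becomes free, machine name).
    free_at = [(0, m) for m in machines]
    heapq.heapify(free_at)
    assignment = {m: [] for m in machines}

    # Longest jobs first; each goes to the machine that frees up earliest.
    for name, duration in sorted(jobs.items(), key=lambda kv: -kv[1]):
        busy_until, machine = heapq.heappop(free_at)
        assignment[machine].append(name)
        heapq.heappush(free_at, (busy_until + duration, machine))

    makespan = max(t for t, _ in free_at)
    return assignment, makespan


if __name__ == "__main__":
    jobs = {"a": 4, "b": 3, "c": 3, "d": 2}
    assignment, makespan = schedule(jobs, ["sp2-1", "sp2-2"])
    print(assignment, makespan)
```

With the four sample jobs and two machines, both machines finish at time 6 and neither is ever idle. A production scheduler would also have to handle job dependencies, failures, and inaccurate duration estimates, none of which this sketch attempts.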