MEDICUS - Globus Based Radiology Image Delivery
Stephan G. Erberich
Assistant Professor of Research Radiology and Biomedical Engineering
The Saban Research Institute
Children's Hospital Los Angeles

GCJ: Can you please tell us about MEDICUS, what it is, and how it works and what it does?

Erberich: The MEDICUS project actually started about three years ago, when Dr. Nelson, who's chair of radiology at Children's Hospital Los Angeles, and I were awarded a grant by the NIH to build up a data center and connect what were, at the time, 23 hospitals which perform pediatric clinical trials for the Children's Oncology Group. At the time, we looked at grid technology, and it was still kind of unclear which grid technology would prevail and become the standard. Globus was obviously one candidate, but it wasn't clear. There were other technologies available as well.

So we started off connecting these sites using virtual private network technology, but after a year we found out that that's not a very feasible technology. Because it's peer-to-peer it's very limited, and it doesn't allow cross-communication between different sites. It's basically a star topology. So we looked into grid technology again, and at the time, web services were becoming a topic within the Globus Alliance's Globus Toolkit. So we thought that would be the right model, building up an infrastructure based on web services. And most of the things which are needed within the MEDICUS project are already part of the Globus Toolkit. Basically, data storage and data transport using GridFTP, for instance, and a security model using X.509 certificates were in place. So for us it was just a very small step to integrate medical devices into the grid.

What we really needed to have in a medical Grid is the capability to exchange and to manage Digital Imaging and Communications in Medicine (DICOM) images. The DICOM protocol is basically a network protocol and data format for medical images. And today, every vendor of medical devices, like GE, Siemens, Philips, and so on, has to comply with that industry standard. So what needed to be done is that this DICOM protocol had to be translated into the grid domain. The goal of the MEDICUS project was to develop this interface, which can communicate DICOM images smoothly and transparently, totally invisible to the end user, into Grids, back and forth. As a result, Radiologists can now share images globally using the underpinning Grid.

So we decided to develop this interface, which we call the DICOM grid interface service, and that's the core part right now in the MEDICUS project. At this point we were already collaborating with Carl Kesselman and Ann Chervenak from the Information Sciences Institute. They helped us a tremendous deal to understand the Grid technology involved in a project of this scale. So we teamed up and designed the whole project together. About a year ago, we had the first release of the DICOM grid interface service ready to deploy, and we deployed it at a few sites. And we demonstrated this workflow, from medical devices and local image databases, called PACS systems, to the grid and back, at the 2005 annual meeting of the Radiological Society of North America.

Then we continued to develop certain technical aspects and improve certain methods within that DICOM grid interface service, and we submitted the project to the Globus Incubator mechanism, and it got accepted. That was in - I think August or September - when the project got its name, Globus Medical Imaging and Computing for Unified Information Sharing (MEDICUS), and its official status as open-source software. Besides the DICOM grid interface service, a web-based tool, the DICOM Clinical Trial Identifier, which preserves privacy by removing patient-identifying information from the images, is also part of MEDICUS. So we found that the name for this project had to be something which covers all these different tools which are in there. So that's what MEDICUS stands for. It's basically an environment with a set of tools which allow you to share and process medical images.

A first use-case of MEDICUS was for clinical trials, where we have, at this point, 40 medical institutions joined in a Globus Grid communicating images. One key feature of the MEDICUS design is that Radiologists will continue to use their existing DICOM legacy infrastructure. So you have these very expensive radiology display workstations, on the order of tens of thousands of dollars, which are used daily for diagnostic reading. They load the images on the display workstation, go through the files and review all the images, and then report the case. Now MEDICUS delivers these images over the Grid to virtually any review end-point, conveniently reusing the individual Radiologist's environment.

You have to understand that the typical hospital scenario - and that's a use case which we are targeting for 2007 - is not on the limited scale of clinical trials. Our clinical trial data center currently only has about 500 cases or about 40 gigabytes of data, because these patients are well selected depending on their disease. That's not a lot. If you think about clinical use of images, a typical radiology practice actually creates, I would assume, about 10 gigabytes or more of data per day. So you can easily see that the real application is not within clinical trials, because these are only a very few selected research study cases, but within the broader picture of total clinical image workflow. I think the really interesting and challenging use-cases are where community hospitals start sharing images across hospitals and practices. Then Radiologists can tap into that image pool from anywhere using grid technology.

So that also brings some benefit, not only in terms of workflow, but also in data safety and data security. First of all, you have to understand that medical images, especially for the clinical domain, are required to be stored for a certain period of time. Usually that's ten years. Maybe there are some variations from state to state, but the hospital is obliged to store the data for a long time, and that causes a big problem because you have to create backups of image data, and because image data is so large in scale, it is a hassle.

So with grid technology and data replication, you have a mechanism on hand which lets you really easily leverage cheap replicated data storage devices, where we showed that you can get about a terabyte for $1000, deployed on very cheap, very readily available PC hardware using the Linux operating system, and obviously GridFTP and the rest of the Globus suite, to create replication of data. That's a very interesting concept, and it has actually caught the eye of some of the PACS vendors. And we teamed up with Fuji Medical Systems to do a demonstration at this year's RSNA, the 2006 Radiological Society of North America meeting. And we demonstrated what we call the "Next generation enterprise Grid PACS system", which uses grid as the federation of storage devices and replicates data for clinical use.
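
To put the figures quoted above side by side, here is a rough back-of-the-envelope sketch in Python. It combines the interview's own estimates (about 10 gigabytes per day, a ten-year retention period, about $1000 per terabyte) with two assumptions that are not from the interview: imaging on 365 days a year and decimal terabytes. It covers hardware only and ignores growth in data volume, power, and staffing.

```python
GB_PER_DAY = 10        # interview's rough estimate for one radiology practice
RETENTION_YEARS = 10   # retention period quoted as "usually" ten years
USD_PER_TB = 1000      # quoted cost of commodity replicated storage
DAYS_PER_YEAR = 365    # assumption: images are produced every day of the year
GB_PER_TB = 1000       # assumption: decimal terabytes


def retention_storage_tb(gb_per_day, years):
    """Total volume accumulated over the retention period, in terabytes."""
    return gb_per_day * DAYS_PER_YEAR * years / GB_PER_TB


def storage_cost_usd(tb, replicas):
    """Hardware cost of keeping the given number of full copies."""
    return tb * replicas * USD_PER_TB


total_tb = retention_storage_tb(GB_PER_DAY, RETENTION_YEARS)
cost_two_copies = storage_cost_usd(total_tb, replicas=2)
print(f"{total_tb} TB retained, about ${cost_two_copies:,.0f} for two replicas")
```

Under these assumptions a single practice accumulates tens of terabytes over the retention period, which is why cheap replicated storage matters for the backup problem described above.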

You have to understand that it's really mission critical for a hospital or medical practice that if an image flow is interrupted because of a failure, there has to be a plan B scenario, where another device can take over. With grid, you can do just that - you find a new service which will provide you that data and take over for the faulty device. So using, for instance, the RLS, the Replica Location Service, you can have multiple instances of a single image represented in the index, and the DICOM grid interface service can find another replica of that data in an instant, then get the image from there and fail over from a faulty storage device. That's very key and very critical in the healthcare industry.
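
The failover idea described above can be sketched in a few lines of Python. This is an illustration only, not the MEDICUS implementation and not the Globus RLS API: the index is a plain dictionary, the host names are made up, and `demo_fetch` is a hypothetical stand-in for a real GridFTP transfer.

```python
from typing import Callable, Dict, List


class ReplicaUnavailable(Exception):
    """Raised when a storage endpoint cannot deliver the requested image."""


def fetch_with_failover(
    logical_name: str,
    replica_index: Dict[str, List[str]],
    fetch: Callable[[str], bytes],
) -> bytes:
    """Try each registered replica in turn and return the first that succeeds."""
    errors = []
    for location in replica_index.get(logical_name, []):
        try:
            return fetch(location)
        except ReplicaUnavailable as exc:
            errors.append(f"{location}: {exc}")
    raise ReplicaUnavailable(f"no replica of {logical_name} reachable: {errors}")


# Hypothetical index: one logical image name mapped to two physical copies.
index = {
    "study-001/image-0001": [
        "gsiftp://storage-a.example.org/images/0001.dcm",
        "gsiftp://storage-b.example.org/images/0001.dcm",
    ]
}


def demo_fetch(location: str) -> bytes:
    # Simulate storage-a being down; a real client would transfer the file here.
    if "storage-a" in location:
        raise ReplicaUnavailable("connection refused")
    return b"DICOM-bytes-from-" + location.encode()


image = fetch_with_failover("study-001/image-0001", index, demo_fetch)
```

In this sketch the first location fails and the image is served from the second one, so the caller never sees the faulty device.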

GCJ: You mentioned a number of times medical devices. Can you give some kind of flavor for what types of devices you're talking about?

Erberich: If I'm talking about medical devices, I'm talking about medical imaging devices. Medical imaging devices like magnetic resonance imaging, MRI, or computed tomography, CT, imaging devices all create a large number of images, and they're used daily. A typical practice creates about 20 MRI cases a day. And that's, as I said, on about the 10 gigabyte scale per day for the whole practice. So they have to ship off that data. Usually the larger hospitals, like 400 beds and up, have their own PACS system, the picture archive and communication system. And that software market alone is about $1.6 billion annually in the United States.

What that means is that grid can actually be an intelligent partner to that PACS system. These PACS operate locally, they're completely enclosed within the radiology department and nobody else outside the hospital can ever see these images. So imagine now, when you leave the hospital and you have to be treated somewhere else, the procedures have to be repeated unless you have a CD-ROM with your data with you. You can try to recover the images from the previous hospital, but it's cumbersome and unlikely to happen in busy radiology operations. They just will repeat the image examination and get the new images. That's not really good for the patients, because that means, in the case of CT, for instance, another dose of radiation for you. But it also is not smart in terms of cost, because it's basically reproducing the same image data set. This is really something which happens.

So by aggregating the images with grid technology, you are able to share these images with different medical facilities. And this is something which is possible now with MEDICUS. We are kind of the first project tapping into that potential, using a pure standards-based Globus Toolkit as the underpinning infrastructure.

GCJ: And how has MEDICUS been received by the physicians and healthcare providers and researchers?

Erberich: Well, the researchers like it because they don't see the grid, they don't have to change their environment. They can still use their own display workstations. Before, typically in a clinical trial, you had to install separate software, you had to use a special viewer that did not have the capabilities of professional display workstation software. And you have to understand, radiology is not only a science, but it's also an art, so every radiologist prefers to have his own set of tools to read images. The software which a radiologist decides on is really his workbench. And if you can deliver the images to that workbench, then this radiologist is more efficient, this is obvious. I would assume that keeping the existing environment in place also impacts the diagnostic value, especially for research cases.

Because of that, I think that's one of the biggest benefits which we see from the response of radiologists today. They think that they see a huge PACS system, a huge data pool with all images available at their finger tips. And that's what is provided by MEDICUS. MEDICUS basically links the local database with the grid, and everybody on the grid can trade the images which are available on the grid.

GCJ: That brings up the question of platforms for a grid application like MEDICUS. What are the platforms you're seeing folks use MEDICUS on, or what's supported?

Erberich: Right now, the current development release of MEDICUS is geared towards the Linux platform. That's because the Globus Toolkit is primarily developed for the Linux environment. But it can be used on Windows platforms and Macintosh platforms as well. MEDICUS deployed on a Linux computer, small or server-class, serves the whole hospital. So the deployment is actually very easy, with a light-weight footprint. And that's how we deployed MEDICUS at these 40 hospitals: they have a laptop with MEDICUS, and it translates all the DICOM queries and requests from the DICOM devices into grid requests.

We don't actually interfere with the hosting environment of the doctors. And that's one really critical component, because it's very difficult to get any software deployed within a hospital because of the strict security concerns about patients' privacy and operational interference.

GCJ: And the follow-up question there is for hospital IT staff is this something they could install and get running themselves, or do they need external outsourced help to do it?

Erberich: Well, as far as MEDICUS goes, it's actually straightforward and very simple. But as you might know, installing the whole backend infrastructure, like the Globus Toolkit, as well as the data services like data storage providers, the meta catalogues, and the security mechanisms, is probably something for which somebody has to have experience with grid installation.

So I think one would say that if there is a provider of the underlying or underpinning grid infrastructure, the deployment of the MEDICUS instance is very simple and can be done as part of what typically would be a PACS administrator's job. Every radiology film library or radiology department already has dedicated IT staff. So that would be the right person to train in using MEDICUS. But we have to see - 2007 will be interesting for MEDICUS, because it'll be the first year where we have it out there as an open source project. It's available for people to tap into.

GCJ: I'm glad you brought up 2007. Greg Nawrocki, our president, is calling for 2007 to be the year of the grid application. I'd like, from a higher level, to get your thoughts on what that means, how it happens, and how MEDICUS and other projects fit into that kind of calling for this year to be the year of the grid app.

Erberich: Yeah, I think that this follows smoothly from your previous question. I think that Globus, at this point, has a very well-documented installation procedure, and that allows more and more people to actually utilize grid technology, meaning the Globus Toolkit. Having said that, you will see many more applications coming up now, because you have a very rich set of core infrastructure components, like data storage, data management, execution management, and discovery and publication tools, within the Globus Toolkit as the core infrastructure. Those are pretty sorted out and pretty settled, mature services.

Now, I think the MEDICUS project is kind of an interesting project, too, because it fits this application level 100%, because it vertically integrates all these core services, like MyProxy security management, Shibboleth for role-based attributes, the OGSA-DAI service as database backend for the DICOM patient and study information, as well as GridFTP for data storage and the Replica Location Service for indexing. Using the core infrastructure of the pure standards-based Globus Toolkit in application projects like MEDICUS helps not only the application development cycle, but it also supports continued development and maturation of the core Toolkit. It is very essential to open-source software like the Globus Toolkit that adopters do not reinvent the wheel, as we have seen in many recent Grid projects.
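
One way to picture this vertical integration is as a routing table from DICOM network operations to grid components. The Python sketch below is purely illustrative: C-FIND, C-STORE, and C-MOVE are standard DICOM operations, but the step names and the pairing of operations with components are assumptions made for this example, not a description of how the DICOM grid interface service is actually built.

```python
from typing import Dict, List

# Hypothetical routing table: which grid components a gateway might call for
# each DICOM network operation. The pairing is an illustrative assumption.
ROUTES: Dict[str, List[str]] = {
    "C-FIND": ["metadata-database"],                       # query patient/study info
    "C-STORE": ["gridftp-put", "replica-index-register"],  # store, then index
    "C-MOVE": ["replica-index-lookup", "gridftp-get"],     # locate, then retrieve
}


def plan(dicom_operation: str) -> List[str]:
    """Return the ordered grid steps for one DICOM operation."""
    try:
        return ROUTES[dicom_operation]
    except KeyError:
        raise ValueError(f"unsupported DICOM operation: {dicom_operation}") from None


retrieve_steps = plan("C-MOVE")
```

The point of the table is only that the DICOM device keeps speaking DICOM, while each request is answered by existing Toolkit services rather than by newly written storage or indexing code.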

So I think - to answer your question, I think that 2007 is probably a good year to call an application year, because I think that there are more applications out there besides MEDICUS that really don't try to reinvent the grid wheel, but leverage what has been developed so far within the Globus Alliance and bring higher-level applications based on these core services. And MEDICUS is just one of these applications which tap into the potential of grid technology.

GCJ: Right. Another big theme, more of an industry theme - it was for 2006 and will be for 2007 - is the whole notion of virtualization and SOA, service-oriented architecture. How do those technologies fit into the MEDICUS mix? How does MEDICUS support them? Is there any thought there?

Erberich: Well, that was one of the themes which we tried to see as the future direction within the MEDICUS project, and we demonstrated that and discussed it with industry and academic leaders at the Radiological Society of North America meeting about three weeks ago in Chicago. If you use an open source and standardized environment like the grid, like the Globus Alliance's Globus Toolkit, then you come naturally to the point where you provide web services, for a fee at the commercial endpoint or for free in research. But the underpinning Grid infrastructure must remain open source and standards based.

Right now, all these hospitals and medical practices have their own PACS system and whatever vendors they have selected for providing that storage service, they pay a lot of money for that. And it's actually better for the patient, as well as for the radiologist, to open up and share these images, and then these hospitals contribute this data into a larger grid, in a kind of a healthcare grid. And then from a commercial side, you can provide on that grid a web service. For instance, for storage or for image processing or for data mining, just to name a few.

I like to compare that to our roads, for instance. You like to drive on streets for free. Nobody wants to use toll roads unless it's necessary, and then you make a clear decision if you want to use toll roads. So the infrastructure has to be public domain and thus is paid for by everybody through taxes. Now, given this public domain infrastructure, people run shops along these public streets, and they can provide services to you. And that makes total sense in healthcare, because it will consolidate a lot of very pricey services which are currently billed to every single hospital, and which can be consolidated very nicely.

One example of SOA is the utilization of installed equipment - and I actually got a lot of response to that idea from healthcare vendors. What they see right now is that most of the hospitals, especially the larger hospitals - 400-plus bed hospitals - already have their PACS system installed, and they have their own staff, and it's very expensive for them, but they have to do it anyhow. Now the smaller hospitals apparently are a market to conquer, and installing big equipment there is not feasible because of high cost vs. low utilization. So a service-based business model is clearly favorable and a win-win situation for both industry and healthcare providers.

And they saw that grid technology is maybe the way to do that, because then they can use an open source, standards-based domain - the grid - and provide services on a per-case basis. That's a very interesting concept, I think, because at that point it's commercially feasible, and then it's probably pushed by the industry. So that's a new development which we see happening.

GCJ: Speaking of new developments, what would be next for MEDICUS after imaging? Is there some other application of the technology that is planned or could be used?

Erberich: Actually, I like the Incubator project definition from the Globus Toolkit so much because it opens up the concept that everybody can contribute to the MEDICUS project or other projects. Everybody can become a committer. And we really want to encourage other people to think about it. We are not the experts in the electronic health record domain, which is what you refer to. But from a grid perspective, that's a logical next step, to have not only medical images, which we are focusing on, but also lab reports, medical history, and other diagnostic information available within the grid. And we are talking mostly about text information, which is, from a scale perspective, very easy to cover with the current grid technology.

But there are other leaders in that field, whom we want to encourage to commit either to MEDICUS or to start their own projects within the grid domain, so that these entities can be blended within a grid. I think the grid concept, and especially web services, have the potential to be the right paradigm, because people have tried to develop the electronic health record as one monolithic data set. And that's very difficult, because you're crossing so many disciplines and sociological barriers. And that was probably the reason why there hasn't been much movement in the past years to come up with a final monolithic data structure for describing the electronic health record.

Web services, on the other hand, break that apart and say, OK, we can start with medical imaging, and then we have another web service come out for pathology data. And so on. In this way, you break it up, but you still federate all these information entities within the grid. It's probably the more feasible approach to the complex data which you find in the medical domain.

GCJ: Are there any big announcements or developments we'll see in calendar year 2007 for MEDICUS? What should we be on the lookout for?

Erberich: I think the roadmap for MEDICUS at this point is to stabilize some of the development which we have done. We are targeting in the coming weeks to have the RSNA 2006 release available, which brings some changes in the security model. We want to add the SAML attributes provided by the Shibboleth project, using the GridShib mechanism, to MEDICUS. That's a very important step, because right now our use case is, as I said, clinical trials, and in clinical trials you have de-identified images. But if you want to go for clinical images, then you have the patient identifiers on the images, and you have to remove these patient identifiers in order to share the information.

Some people should be able to see the identifiers, because they are physicians who read the case and write a report on it, but there are other people who should not see the identifiers but who provide a service. A company which provides an image processing service, for example, only needs to see the raw image data and not the patient identifiers, but it still needs to reference the case. And there's a regulation with which all manufacturers and all healthcare professionals have to comply. It's called the HIPAA regulation. That's a government regulation which prohibits private data from being seen by unauthorized personnel. So that's one of the issues which we will address in the next release.

And as for 2007, I think we probably will provide an image processing service model, at least a kind of reference implementation for other people to go into and look at how they can utilize grid technology to provide web services which can manipulate medical images and create new derived data sets from them, which are then available using the MEDICUS project. So, from our point of view, bringing in the clinical use cases, federating clinical hospitals, and providing clinical data access is our mission for 2007.

GCJ: You mentioned HIPAA. Does working within this strict set of regulations complicate things?

Erberich: Yeah, we already have the solution for that. We just have a prototype implementation. And the way we do that is it uses the Shibboleth method, where you get authorization based on assertions about you, like which institution you belong to and which department you belong to. And these assertions about you can be trusted within the grid domain, which is a very, very important feature that's totally critical. And with DICOM data, there's no credential mechanism attached. These images represent a flat data model where everybody connects with everything.

So what we added to that is, using the Shibboleth method, we can ensure that only a specific person can see specific data sets. So when I send data into the grid, this image data set gets attributed with my assertions. So for instance, this belongs to Institute ABC, and I'm the owner of the data, and this is preserved within the grid. So if somebody comes with another credential, he might be able to see the data portion of the images - so the pixels - but because that credential doesn't have the assertions to belong, for instance, to the institution I belong to, he doesn't get the private identifiers, which are patient name, patient ID and so on. Instead he gets de-identified information. All this is hidden from the user and the DICOM-specific end-points.
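
As a minimal sketch of the idea described above, the Python below filters a record based on the requester's asserted institution. It does not use Shibboleth, GridShib, or any DICOM library: assertions are a plain dictionary, the record is a dictionary keyed by DICOM attribute names, and the set of private fields is an assumption (the interview names only patient name and patient ID "and so on").

```python
from typing import Dict, Set

# Fields treated as private identifiers in this sketch (an assumption).
PRIVATE_FIELDS: Set[str] = {"PatientName", "PatientID"}


def view_for(
    record: Dict[str, object],
    owner_institution: str,
    requester_assertions: Dict[str, str],
) -> Dict[str, object]:
    """Return the full record to members of the owning institution,
    and a copy without private identifiers to everyone else."""
    if requester_assertions.get("institution") == owner_institution:
        return dict(record)
    return {k: v for k, v in record.items() if k not in PRIVATE_FIELDS}


# Made-up example record; the values are not real patient data.
record = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "Modality": "MR",
    "PixelData": b"\x00\x01\x02",
}

insider = view_for(record, "Institute ABC", {"institution": "Institute ABC"})
outsider = view_for(record, "Institute ABC", {"institution": "Imaging Vendor X"})
```

Here `insider` keeps the patient name and ID, while `outsider` still receives the pixel data and modality but no identifiers, which mirrors the distinction between the reading physician and an external image processing service.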

I think that's very critical - the most critical component whenever you want to federate hospitals. And again, we don't reinvent the wheel. We take what the Globus Alliance is providing here within the Toolkit and integrate that on the application level.

One of the projects which we want to try - this is actually with the colleagues in Chicago and the USC group - is to set up a test bed and demonstrate how clinical workflows can be handled in a Grid. What we are aiming for is a real-life clinical scenario where a hospital turns the image workflow over to the Grid using multiple storage Grid service providers. This would be a compelling but challenging use-case, and not only from a technical point of view.

You always have to understand that there is a great deal of sociology involved. The radiologist is not very interested in sharing his images, because that means competition. And the second thing is the hospital is not very interested in sharing the images, because that means reduced revenue. But at the end of the day, it's the patient and it's the patient advocacy group who will probably be a big partner here, because they want to make these images available, because they belong to the patient. And if the patient is moving, as you see a lot of movement, especially in this country, these images cannot be redone. It's not medically feasible, it's not ethical, and from a cost perspective, it doesn't make any sense.

And think about the scale we're talking about: if you bring even only 1% of all the hospitals into a nationwide medical grid, these are thousands of instances. So think about the scale and the potential within the radiology domain alone, not even talking about the electronic healthcare records with all the other medical information that you asked about before. I'm just looking at the medical imaging domain. This is a huge, huge data deluge that you have to handle, and I think grid technology is the only feasible technology at this time to implement this.
