Mesoscale Meteorology and Grid Computing - the L.E.A.D. Project
When we discuss Grid computing's role in large-scale data crunching and predictive analysis, many of us think first of the financial services industry, where Grid powers complex Monte Carlo models that help large brokerages glean new insights into capital markets scenarios.
Today, a group of mesoscale meteorologists ("mesoscale" referring to a range of 1-1,000 kilometers) is applying similar Grid principles to making sense of weather patterns - and specifically, to predicting and analyzing major weather anomalies such as tornadoes and hurricanes.
Predictive analysis and data crunching are certainly not new phenomena for meteorologists - but what we're seeing today is an increasing sophistication in the tools at their disposal. The next generation of Doppler radar is pushing the envelope in terms of sensitivity. The volume of data these radars retrieve and the granularity of that data are continually on the rise. Thus, the sheer amount of data that meteorologists must crunch and analyze on the back end is growing every day. In addition, this next-gen Doppler technology has both increased configurability and the ability to react to new weather patterns on the fly. Hence the need to analyze data quickly and simulate possible outcomes, so that these instruments can make active adjustments, is also a concern.
Currently, the most common approach to weather prediction (from an IT infrastructure perspective) is to run fixed simulations on static data from recent (often outdated) radar feeds. The results are often very brittle, one-time predictions. Today's approaches also require a high degree of human touch - it takes an individual in the loop to notice something interesting and manually reconfigure the simulation to focus on the specific areas of interest.
The LEAD (Linked Environments for Atmospheric Discovery) project out of the University of Oklahoma is driving the effort for a new Grid cyberinfrastructure that can take best advantage of the next generation of Doppler radar - and make the simulations much more dynamic. The stated goals of the program are to: (a) change configuration rapidly and automatically in response to weather; (b) continually be steered by new data; (c) respond to decision-driven input from users; (d) initiate other processes automatically; and (e) steer remote observing technologies to optimize data collection for the problem at hand.
"The idea here, and the connection to Grid, is that we would like to be able to monitor these radars in real time and have data-mining services that are looking at their output," said Dennis Gannon, Professor of Computer Science at Indiana University (one of nine institutions participating in the program). "The data mining, if it's well done, can actually take that radar stream and pinpoint interesting things developing, automatically. And when that happens, it will trigger a particular simulation workflow on the Grid. That workflow will do a number of what we call 'data assimilation tasks' and assimilate all the observable weather data that they've got in a given region and prepare it for a set of simulations. So we then might use TeraGrid or a large grid to launch maybe 100 different simulation scenarios for a single county in Oklahoma."
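The trigger Gannon describes can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not LEAD's actual software: the `RadarScan` and `SimulationJob` types, the reflectivity threshold used as the "interesting" rule, and all function names are hypothetical, and no real Grid or TeraGrid API is called. Only the ensemble size of 100 comes from the quote.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class RadarScan:
    """One hypothetical radar observation for a named region."""
    region: str
    max_reflectivity_dbz: float


@dataclass
class SimulationJob:
    """One member of a simulation ensemble for a region."""
    region: str
    scenario_id: int


def detect_regions_of_interest(scans: Iterable[RadarScan],
                               threshold_dbz: float = 50.0) -> List[str]:
    """Return each region (once) whose reflectivity meets the threshold.

    The threshold rule is an assumption made for illustration; a real
    data-mining service would use far richer detection logic.
    """
    regions: List[str] = []
    for scan in scans:
        if scan.max_reflectivity_dbz >= threshold_dbz and scan.region not in regions:
            regions.append(scan.region)
    return regions


def build_ensemble(region: str, n_scenarios: int = 100) -> List[SimulationJob]:
    """Create n_scenarios job descriptions for one region."""
    return [SimulationJob(region, i) for i in range(n_scenarios)]


def trigger_workflows(scans: Iterable[RadarScan],
                      threshold_dbz: float = 50.0,
                      n_scenarios: int = 100) -> List[SimulationJob]:
    """Turn detections into job descriptions; nothing is submitted anywhere."""
    jobs: List[SimulationJob] = []
    for region in detect_regions_of_interest(scans, threshold_dbz):
        jobs.extend(build_ensemble(region, n_scenarios))
    return jobs
```

In this sketch a quiet radar stream produces no jobs at all, while a single flagged county produces the full ensemble, which mirrors the "trigger on detection" idea rather than running fixed simulations on a schedule.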
In order to get to this new level of autonomic computing to support weather prediction, the LEAD project is developing a Grid service infrastructure, built on the Globus Toolkit and GGF standards and protocols.
"In order to do all the required computation, you've got to move data around all over the place and grab the computing power on the fly," said Gannon. "You also have to be able to handle the Grid policy negotiation to automatically kick other users off of compute resources when you need them for your prediction. You have to be able to dynamically allocate these Grid resources. The Grid service infrastructure plays that role for us."
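As a rough illustration of the policy Gannon mentions, the Python sketch below shows priority-based preemption on a fixed pool of compute nodes. It is a toy model under stated assumptions: the `Job` and `NodePool` classes and the numeric priorities are hypothetical, and it does not represent the Globus Toolkit or any real Grid scheduler interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    priority: int  # higher number = more urgent (an assumed convention)


class NodePool:
    """A fixed set of compute nodes, each running at most one job."""

    def __init__(self, node_count: int) -> None:
        self.slots: List[Optional[Job]] = [None] * node_count
        self.preempted: List[Job] = []  # jobs that were kicked off a node

    def allocate(self, job: Job) -> bool:
        """Place the job on a free node, or preempt a lower-priority job.

        Returns False if the pool is empty or every running job has a
        priority at least as high as the new job's.
        """
        if not self.slots:
            return False
        # Prefer an idle node when one exists.
        for index, current in enumerate(self.slots):
            if current is None:
                self.slots[index] = job
                return True
        # All nodes are busy: find the least urgent running job.
        victim_index = min(range(len(self.slots)),
                           key=lambda i: self.slots[i].priority)
        victim = self.slots[victim_index]
        if victim.priority < job.priority:
            self.preempted.append(victim)
            self.slots[victim_index] = job
            return True
        return False
```

Here an urgent forecast job displaces the least urgent running job only when no node is idle, and a low-priority request is simply refused when the pool is full.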
LEAD is also working extensively with GridFTP. As radar data is dumped into file directories, data mining tools use GridFTP to pull that data to the machine(s) running the simulation service, where it can be scheduled into the analysis. If an anomaly (or something otherwise interesting) is detected, the data mining services push that data to other services, which assimilate it with other radar data and other sets of weather data. Those might include humidity sensor data, barometric pressure data, terrain map data and lightning strike data - there's an enormous range of data in weather prediction, and the assimilation of those different types of data is an ongoing challenge. Ultimately, after the assimilation has occurred, GridFTP might take the output from the simulations and move it to a visualization service, which generates a picture for a meteorologist to analyze.
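The data flow described above (transfer, mine, assimilate, simulate, visualize) can be sketched as a chain of small Python functions. This is a hedged toy model, not LEAD's implementation: every function name, the destination labels, the reflectivity threshold, and the "severity" score are assumptions made for illustration, and the `transfer` function is only a stand-in that records a hop rather than calling any GridFTP client.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]


def transfer(payload, destination: str, log: List[str]):
    """Stand-in for a file transfer step; it only records the destination."""
    log.append(destination)
    return payload


def mine(radar: List[Record], threshold: float) -> List[Record]:
    """Keep only radar records that meet the (assumed) reflectivity threshold."""
    return [r for r in radar if r["reflectivity_dbz"] >= threshold]


def assimilate(flagged: List[Record], other_sources: Record) -> List[Record]:
    """Merge each flagged radar record with other observations for the region."""
    return [{**r, **other_sources} for r in flagged]


def simulate(state: List[Record]) -> List[Record]:
    """Placeholder 'simulation': a toy severity score, not a weather model."""
    return [{**s, "severity": s["reflectivity_dbz"] / 10.0} for s in state]


def visualize(results: List[Record]) -> str:
    """Render results as text, standing in for a generated picture."""
    return "\n".join(f"severity={r['severity']:.1f}" for r in results)


def run_pipeline(radar: List[Record], other_sources: Record,
                 threshold: float = 50.0) -> Tuple[str, List[str]]:
    """Run all stages in order and return the rendering plus the transfer log."""
    log: List[str] = []
    staged = transfer(radar, "simulation-host", log)
    flagged = mine(staged, threshold)
    if not flagged:
        return "", log  # nothing interesting, so no simulation is run
    merged = assimilate(flagged, other_sources)
    results = simulate(merged)
    image = visualize(transfer(results, "visualization-service", log))
    return image, log
```

Keeping each stage as a separate function reflects the article's picture of independent services that hand data to one another, with transfers happening only at the boundaries between them.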
While this near-term analysis has the potential to provide an important life-saving service for storm prediction, there are long-term analysis benefits as well. Better modeling to predict long-term weather patterns more accurately is of keen interest to the energy sector. When energy companies can better determine their future electrical generation requirements, or the amount of natural gas or fuel oil they will be selling in the coming months (a prediction based almost entirely on future weather patterns), they can use available resources more efficiently and therefore increase profits.