Performance Prediction Studies and Adaptive Refinements for Local Area Weather Models
by Kläre Cassirer, Reinhold Hess, Wolfgang Joppich and Hermann Mierendorff
The new generation of numerical weather prediction models is designed for parallel machines. Since the development of such a code can take several years, the increase in computer power over that period has to be considered in the design. Beyond growing computational power, research on new and faster algorithms also continues, with the aim of predicting the weather more accurately and reliably in the future. Current activities within the theme cluster METEO of the GMD Institute for Algorithms and Scientific Computing concentrate on performance prediction studies for meteorological models and algorithmic research on local adaptive refinements.
Numerical weather forecasting and climate prediction require enormous computing power to achieve reliable results. Six of the twenty-six most powerful computer systems in the world are dedicated to weather research (November 1997). All of them are parallel architectures with the number of nodes ranging between 64 and 840.
In order to exploit this high computing power, existing codes have to be adapted to parallel computers, and new parallel algorithms have to be developed.
After the year 2001 the operational Local Model (LM) of the Deutscher Wetterdienst (DWD) will run at a horizontal resolution of approximately 3 km mesh size, with 50 vertical layers and time steps of only 10 seconds. Since a one-day prediction on about 800x800x50 grid points has to complete within half an hour of real time in operational mode, enormous computing power is demanded.
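The size of this workload follows directly from the figures above. The short Python sketch below only multiplies them out; the function name lm_workload and its parameter names are illustrative and not part of the LM, and the count is in grid-point updates, not floating point operations, since the cost per update is not stated here.

```python
def lm_workload(nx=800, ny=800, nz=50, dt_s=10,
                forecast_s=24 * 3600, budget_s=30 * 60):
    """Multiply out the operational LM figures quoted in the text.

    All defaults are taken from the article: 800x800x50 grid points,
    10 second time steps, a one-day forecast, a half-hour time budget.
    """
    grid_points = nx * ny * nz                # size of the 3D grid
    time_steps = forecast_s // dt_s           # steps for one forecast day
    point_updates = grid_points * time_steps  # grid-point updates per run
    return {
        "grid_points": grid_points,
        "time_steps": time_steps,
        "point_updates": point_updates,
        # updates the machine must sustain per second of wall-clock time
        "updates_per_second": point_updates / budget_s,
        # how much faster than the real atmosphere the model must run
        "faster_than_real_time": forecast_s / budget_s,
    }


if __name__ == "__main__":
    for name, value in lm_workload().items():
        print(f"{name}: {value:,}")
```

With the default values this gives 32 million grid points and 8640 time steps, so the forecast has to advance 48 times faster than real time.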
Currently no available machine in the world can fulfill these computational requirements, and direct run-time measurements for the LM with the operational resolution are not possible. Therefore, a study was initiated to predict the performance of the LM and to define specifications of adequate parallel systems.
The computational complexity was modelled with a series of LM runs on an IBM SP2 with up to 32 nodes, using different spatial and temporal resolutions and different numbers of processors. The coefficients of the approximation functions were determined by a least squares fit. With calibration runs of the LM at lower resolution, the model could be scaled for various existing machines. These run-time measurements are presented in Figure 1. Note that different parts of the LM (computation and physics) scale differently; a separate model was therefore set up for each part.
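The fitting step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that run time grows linearly with the number of grid-point updates per processor; the actual approximation functions used in the study are not given here. The function names and the calibration numbers are hypothetical and are not measurements from the IBM SP2.

```python
def fit_linear(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_runtime(slope, intercept, updates_per_processor):
    """Extrapolate the fitted model to a workload that was not measured."""
    return slope * updates_per_processor + intercept


if __name__ == "__main__":
    # Hypothetical calibration runs: (grid-point updates per processor,
    # measured seconds). These values are made up for the example.
    work = [1.0e6, 2.0e6, 4.0e6, 8.0e6]
    seconds = [2.1, 4.0, 8.2, 15.9]

    slope, intercept = fit_linear(work, seconds)
    print("seconds per update:", slope)
    print("fixed overhead (s):", intercept)
    print("predicted time for 1.6e7 updates:",
          predict_runtime(slope, intercept, 1.6e7))
```

Fitting one such model per part of the code, as described above, keeps differently scaling components from distorting each other's coefficients.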
The communication requirements were determined by a structural analysis of the LM. The communications, that is, the number and sizes of the messages to be exchanged, were parameterized by the size of the global grid and its partitioning. For existing machines the communication model is based on communication benchmarks. However, standard benchmarks such as ping-pong measurements were not fully sufficient, and a special communication benchmark measuring the specific data exchange of the LM had to be implemented. Hypothetical architectures can be modelled by assumed characteristics for computation and communication, such as sustained flop rate, latency and bandwidth.
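A parameterization of this kind can be sketched in Python as follows. The sketch assumes a two-dimensional partitioning of the horizontal grid, an interior subdomain with four neighbours, and the common cost model of latency plus message size divided by bandwidth per message. The halo width, the bytes per value, the latency value and all names are assumptions made for the example, not details of the LM or of the study.

```python
def halo_exchange_time(nx, ny, nz, px, py, halo_width,
                       bytes_per_value, latency_s, bandwidth_bytes_per_s):
    """Estimate one boundary exchange for an interior subdomain.

    The global nx x ny x nz grid is split into px x py subdomains.
    Each interior subdomain sends one message to each of its four
    horizontal neighbours (assumed communication pattern).
    Returns (number of messages, total bytes, estimated seconds).
    """
    local_nx = -(-nx // px)  # ceiling division: points per subdomain in x
    local_ny = -(-ny // py)  # ceiling division: points per subdomain in y

    # Boundary strips span all vertical layers and halo_width grid rows.
    west_east = local_ny * nz * halo_width * bytes_per_value
    south_north = local_nx * nz * halo_width * bytes_per_value

    message_sizes = [west_east, west_east, south_north, south_north]
    total_bytes = sum(message_sizes)
    seconds = sum(latency_s + size / bandwidth_bytes_per_s
                  for size in message_sizes)
    return len(message_sizes), total_bytes, seconds


if __name__ == "__main__":
    # 1024 nodes as a 32 x 32 processor grid; latency is a made-up value.
    n_msg, n_bytes, t = halo_exchange_time(
        nx=800, ny=800, nz=50, px=32, py=32,
        halo_width=2, bytes_per_value=8,
        latency_s=1e-5, bandwidth_bytes_per_s=250e6)
    print(n_msg, "messages,", n_bytes, "bytes,", t, "seconds")
```

Varying latency_s and bandwidth_bytes_per_s in such a model is one way to compare hypothetical architectures before any machine is available for measurement.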
The study shows that the predicted computational demands are tremendous. Indeed, no currently available machine meets the requirements mentioned above. A hypothetical parallel computer would require at least 1024 nodes with 8-10 GFlops each and a network with about 250 MB/s bandwidth. However, it can be assumed that machines of this class will be available on the market in the year 2001.
These requirements for an operational system make it obvious that algorithmic improvement of meteorological models is very important. Since new numerical models also have to run on parallel systems, parallelism becomes a very important factor for modern algorithms alongside numerical efficiency.
A promising idea is to apply dynamically adaptive local refinements to numerical weather simulations. The computational costs could be substantially reduced if high resolution is provided only where it is necessary (e.g. weather fronts, strong low pressure areas). Calm regions could be calculated on a coarser mesh. However, since the weather situation changes during the simulation, the refinement areas have to be adapted over time.
For the Shallow Water Equations, which form the dynamical core of numerical weather prediction, a parallel local model with dynamically adaptive local refinements has been developed and implemented. On a structured global grid, refinement areas are composed of adjacent rectangular patches, which are aligned to the global grid with a refinement ratio of 1:2. With a suitable mathematical criterion the refinement areas are dynamically adapted to the calculated solution.
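How a criterion might select rectangular patches can be sketched in Python. The refinement criterion used in the model is not specified here, so the sketch substitutes a simple stand-in: a patch is flagged when the difference between neighbouring grid values inside it exceeds a threshold. The function name, the patch size and the threshold are assumptions for illustration only.

```python
def flag_patches(field, patch_size, threshold):
    """Return the sorted (row, column) indices of patches to refine.

    field is a 2D list of values on the global grid. A grid point is
    marked when the jump to its right or upper neighbour exceeds the
    threshold (an assumed stand-in criterion); the patch containing a
    marked point is then flagged for refinement.
    """
    ny, nx = len(field), len(field[0])
    flagged = set()
    for j in range(ny):
        for i in range(nx):
            # One-sided differences; no neighbour beyond the grid edge.
            dx = abs(field[j][i + 1] - field[j][i]) if i + 1 < nx else 0.0
            dy = abs(field[j + 1][i] - field[j][i]) if j + 1 < ny else 0.0
            if max(dx, dy) > threshold:
                flagged.add((j // patch_size, i // patch_size))
    return sorted(flagged)


if __name__ == "__main__":
    # 8 x 8 test field: flat everywhere except a sharp step ("front")
    # between columns 5 and 6.
    field = [[1.0 if i >= 6 else 0.0 for i in range(8)] for j in range(8)]
    print(flag_patches(field, patch_size=4, threshold=0.5))
```

Re-evaluating such a criterion as the solution evolves lets the flagged patches follow a moving front, which is what makes the refinement dynamic; with a 1:2 ratio each flagged patch would be covered by a grid of half the mesh size.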
The major problems on parallel computers with distributed memory are the organization of data and dynamic load distribution. In this approach, asynchronous, non-blocking communication is used to combine adaptivity and parallelism as effectively as possible.
Reinhold Hess - GMD
Tel: +49 2241 14 2331