3 Questions: Sean FitzGerald on application scaling

Rogue Wave Software has acquired debugging technology leader TotalView Technologies. With its other 2009 acquisition of Visual Numerics, Rogue Wave is now the largest independent provider of cross-platform, embeddable development tools and libraries for supercomputing application design, testing and deployment. As supercomputing moves to the mainstream, the company's tools and components will enable software developers to take advantage of the performance and scalability that multi-core and parallel technologies offer. Sean FitzGerald, Rogue Wave’s Senior VP of Engineering and CTO, discusses the trends and the potential disruptions, and offers insights into how these drivers could change the marketplace.

SC Online: Are there more appropriate performance metrics available for supercomputing applications? Most users seem to agree that HPL, percent of peak, and parallel efficiency are not good measures, so why do people use them? You can't translate time to solution across applications, and coming up with universal benchmarks is very difficult: no one wants to pay for them, and if you work on metrics you don't get published.

FitzGerald: Since parallel performance can vary so widely, a single number that averages performance is somewhat meaningless.

There is a need for other tests of a system's capability that measure how performance varies as a function of the computing environment, the size of the problem, and the number of processors. It will have to be something similar to the RAPS (Real Applications on Parallel Systems) benchmark effort that ECMWF headed up in Europe, where hardware was evaluated on the performance of real weather forecast applications.
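As a concrete illustration of why a single averaged figure hides this behavior, here is a minimal sketch in C that derives strong-scaling speedup and parallel efficiency as a function of processor count; the wall-clock times are made-up placeholders, not results from any real benchmark or system.

```c
/* Minimal sketch: speedup and parallel efficiency for a fixed-size problem
 * as a function of processor count. Timing data is hypothetical. */
#include <stdio.h>

int main(void)
{
    /* Processor counts and measured wall-clock times (seconds) for the
     * same fixed-size problem -- illustrative numbers only. */
    const int    procs[]  = { 1, 16, 64, 256, 1024 };
    const double time_s[] = { 4800.0, 330.0, 95.0, 31.0, 14.0 };
    const int    n = sizeof(procs) / sizeof(procs[0]);

    printf("%6s %12s %10s %12s\n", "procs", "time (s)", "speedup", "efficiency");
    for (int i = 0; i < n; ++i) {
        double speedup    = time_s[0] / time_s[i];   /* T(1) / T(p) */
        double efficiency = speedup / procs[i];      /* speedup / p */
        printf("%6d %12.1f %10.2f %11.1f%%\n",
               procs[i], time_s[i], speedup, 100.0 * efficiency);
    }
    return 0;
}
```

The point of the table this prints is that efficiency at 16 processors tells you very little about efficiency at 1,024; averaging the column throws away exactly the variation that matters.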

I believe LINPACK and the Top 500 are not indicative of a system's entire capability, yet they still provide value as a benchmark in terms of:

  • a system's comparable performance for a particular benchmark;
  • its independent view of performance as a standard test;
  • its comparative and market value in driving competitive development; and
  • the visibility it gives to current trends in hardware architecture deployments, among other market data.

SC Online: Similar to the disruptive technologies thought to be needed in many hardware roadmaps to reach exascale, are there looming challenges whose resolution will be game-changing over the next decade? What is the role of accelerators like GPUs in the next three to five years? Will NVIDIA's CUDA architecture dominate the supercomputing field?

FitzGerald: Computational science fuses three elements: algorithms, computer and information science, and the computing infrastructure. The largest challenge in achieving exascale will be developing the awareness and coordination of the software, hardware, and computing/communications infrastructure of these massively parallel systems. Each element of the system must not only be aware of the computing architecture and the various computational elements, but will also need to adapt to changes in problem sizes and types to use the entire system most efficiently. Today, applications on large parallel systems are hand-tuned; in the future, the coordination between the hardware and software interfaces will need to be more aware and adaptable.

Over the next three to five years, the role of accelerators (GPUs) will remain predominantly that of accelerators in areas where large amounts of streamed data and/or graphics need to be analyzed or rendered. Advances in vendor-supplied BLAS libraries such as CUDA BLAS (cuBLAS) and higher-performance double-precision accelerators will attract more science and research adoption on large systems where the class of problem is well suited to the GPU.
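As an example of the kind of vendor-supplied BLAS offload mentioned above, here is a minimal sketch of a double-precision matrix multiply routed through NVIDIA's cuBLAS library; the matrix size, fill values, and reduced error handling are illustrative choices, not a tuned implementation.

```c
/* Minimal sketch: double-precision DGEMM offloaded to the GPU through
 * NVIDIA's cuBLAS library. Sizes and values are arbitrary placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main(void)
{
    const int n = 1024;                    /* square matrices, n x n      */
    const size_t bytes = (size_t)n * n * sizeof(double);
    const double alpha = 1.0, beta = 0.0;  /* C = alpha*A*B + beta*C      */

    /* Host matrices filled with trivial data. */
    double *hA = malloc(bytes), *hB = malloc(bytes), *hC = malloc(bytes);
    for (int i = 0; i < n * n; ++i) { hA[i] = 1.0; hB[i] = 2.0; }

    /* Device allocations and host-to-device copies. */
    double *dA, *dB, *dC;
    cudaMalloc((void **)&dA, bytes);
    cudaMalloc((void **)&dB, bytes);
    cudaMalloc((void **)&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    /* Double-precision GEMM on the accelerator. */
    cublasHandle_t handle;
    cublasCreate(&handle);
    cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    printf("C[0] = %f (expected %f)\n", hC[0], 2.0 * n);

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```

On a system with the CUDA toolkit installed, this would typically be built with nvcc and linked against cuBLAS (for example, `nvcc dgemm_sketch.c -lcublas`).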

There is a good chance that NVIDIA will be the dominant player in the market. Whether CUDA or OpenCL dominates as the language of choice remains to be seen. Will GPUs dominate the HPC market? If domination means displacing CPUs, then this is unlikely, given the limits on the types of problems that are best suited for many-core CPUs versus GPUs. There will certainly be more and more interplay between the architectures.

SC Online: Errors in an application like hurricane tracking could kill a lot of people. Should the development of such applications continue with current programming models? Are there more evolutionary approaches to debugging applications on hybrid x86/GPU systems?

FitzGerald: The current programming models will continue until new programming models are proven for critical systems. Evolutionary approaches to developing, optimizing, and debugging parallel distributed applications will emerge as adoption of hybrid architectures increases. Unfortunately (and as I stated earlier), the challenge of tool/application interplay with the hardware (architectures), software (algorithms), and infrastructure/communications limits the ability to develop and debug applications in a managed way. The TotalView debugger is a good example of a tool that supports a managed infrastructure and communications layer and also links into the debugging interfaces of the compilers and operating systems, as well as the application's execution. Continued advancements in hybrid systems (accelerators (GPUs), many-core (SMP), and distributed computing (MPI)) will result in evolutionary, and even some revolutionary, changes in the SDLC and the tools that support developers.
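To make that layering concrete, here is a minimal sketch of the kind of hybrid MPI-plus-CUDA program such tools have to span: each MPI rank binds to a GPU, offloads a partial dot product through cuBLAS, and the pieces are combined across ranks with MPI_Allreduce. The round-robin device selection and the toy problem size are illustrative assumptions, not a recommended configuration.

```c
/* Minimal sketch of a hybrid MPI + CUDA program: each MPI rank binds to a
 * GPU, offloads a partial dot product through cuBLAS, and the partial
 * results are combined with MPI. Illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Naive GPU binding: round-robin over whatever devices are visible. */
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(ndev > 0 ? rank % ndev : 0);

    /* Each rank owns a chunk of a vector of ones; its partial dot
     * product (the chunk with itself) is simply the chunk length. */
    const int chunk = 1 << 20;
    double *h = malloc(chunk * sizeof(double));
    for (int i = 0; i < chunk; ++i) h[i] = 1.0;

    double *d;
    cudaMalloc((void **)&d, chunk * sizeof(double));
    cudaMemcpy(d, h, chunk * sizeof(double), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);
    double partial = 0.0, total = 0.0;
    cublasDdot(handle, chunk, d, 1, d, 1, &partial);   /* GPU layer */
    MPI_Allreduce(&partial, &total, 1, MPI_DOUBLE,
                  MPI_SUM, MPI_COMM_WORLD);            /* MPI layer */

    if (rank == 0)
        printf("global dot product = %.0f (expected %ld)\n",
               total, (long)nranks * chunk);

    cublasDestroy(handle);
    cudaFree(d);
    free(h);
    MPI_Finalize();
    return 0;
}
```

Even in this toy case there are three layers in play at once: the distributed MPI processes, the host code within each rank, and the work handed to each rank's GPU, which is the kind of multi-layer view the answer above describes.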

[Embedded SlideShare presentation: Rogue Wave corporate vision]

SC Online wishes to thank Sean FitzGerald for his time and insights.