PRACE evaluated additional prototypes for next generation architectures

In addition to the well-known PRACE prototypes for the first generation of European Tier-0 centers, the PRACE work package for “Future Petaflop/s computer technologies beyond 2010” has evaluated 12 additional prototypes. The in-depth assessment of prototypes has been a perfect complement to the continuous technology survey established by PRACE.

The investigated next-generation architecture prototypes are full systems, system components or software prototypes. Additionally, several research activities have been carried out. Both the prototype assessments and the research activity results are summarized in the PRACE deliverable D8.3.2. The following hardware, software and research activities have been assessed:

Systems

  • CINES and LRZ have jointly evaluated a hybrid system architecture combining thin nodes, fat nodes and compute accelerators with a shared file system, built from SGI and ClearSpeed/PetaPath components.
  • FZJ has extended the communication capabilities of their Cell-based QPACE system to estimate its suitability for a wider range of applications. This enhancement allowed Linpack to be run on the full system, which made it No. 1 on the Green500 list of energy-efficient supercomputers.
  • NCF has assessed a system composed of ClearSpeed/PetaPath accelerator boards together with the ClearSpeed programming language Cn.

Software

  • CEA has studied the performance of GPUs using the CAPS Hybrid Multicore Parallel Programming (HMPP) workbench on NVIDIA Tesla.
  • CSC studied the maturity of OpenCL and performance improvements for multi-GPU programming on NVIDIA Tesla and AMD FireStream cards.
  • CSCS evaluated the ease of use of the PGAS programming model by using the Cray Compiler Environment for UPC and CAF.
  • EPCC evaluated the HARWEST Compiling Environment for developing programs on their FPGA-based supercomputer “Maxwell”.
  • LRZ assessed code and performance portability of the RapidMind multicore development platform across architectures (Cell, Tesla & Nehalem-EP).


Tools

  • BSC carried out an in-depth performance analysis and performance prediction for full PRACE application codes to show the capabilities of their tools Paraver and Dimemas.

Components

  • CINECA evaluated the performance of I/O and the Lustre file system, and assessed the advantages of SSD technology for metadata handling.

Energy efficiency

  • PSNC and STFC have jointly assessed the power efficiency of different hardware solutions together with the power consumption profile of HPC servers.
  • SNIC-KTH studied the achievable energy efficiency of commodity parts and commodity interconnects for cost efficiency and a minimal impact on the programming model.

The results show that some hardware accelerators do indeed have the potential to substantially increase the performance and/or power efficiency of traditional HPC systems. However, software environments for hardware accelerators are not yet tailored to the demands of the scientific computing community. They need to become more stable, easier to use and better supported by debugging and optimization tools.

A good software solution that enables highly scalable codes is essential in the new era of multi- and many-core chips. Scalability will soon be the most limiting factor for application performance. Concerning system architecture, both homogeneous and accelerated clusters with ten thousand compute nodes and massively parallel systems with several hundred thousand low-power compute nodes seem likely to be the dominant architectures for the next five years.

PRACE will bring together everyone actively involved in the assessment of the prototypes, the programming languages and the research activities in a workshop on March 1 and 2, 2010. The “New Languages & Future Technology Prototype” workshop will be held at the Leibniz Supercomputing Centre in Garching near Munich, Germany.

More information: Iris Christadler, e-mail: christadler(at)lrz.de, Dr. Herbert Huber, e-mail: huber(at)lrz.de