Japan's RIKEN designs supercomputer on a chip

RIKEN, Japan's Research Institute of Physical and Chemical Research, presented details of the MDGrape 3 processor yesterday at the Hot Chips conference at Stanford University. Makoto Tanji, a researcher with RIKEN's high-performance computing group, said the processor will become the basis for a supercomputer capable of operating at a petaflop.

Samples of the chip, which was designed for life sciences research, can now perform 230 gigaflops while running at 350MHz, outperforming standard general-purpose chips. In a worst-case scenario, the chip performs 160 gigaflops at 250MHz.

The computational power comes from specialization: the chip is built for workloads that involve numerous, similar calculations on a comparatively small data set. This sort of workload is common in the life sciences, where researchers need to examine, for example, how a single protein interacts with thousands of different molecules.

"We can obtain about a 100 times better performance through specialization. The number of operations are more limited on a general purpose computer," Tanji said. For the MDGrape 3 to shine, "the amount of computation must be much larger than the data," he added.

The University of Tokyo initiated the MDGrape project 15 years ago to develop a chip for astrophysics. RIKEN, which is one of the world's largest biosciences institutes, has worked over the last several years to extend the chip's architecture to life sciences and molecular dynamics because the range of applications is wider, Tanji explained.

The group will create computers based on the chip for its Protein 3000 project, which aims to determine the characteristics of 3,000 proteins. Those machines should appear sometime in 2007. Commercial systems using the MDGrape 2, which runs at 100MHz and delivers 16 gigaflops, are currently on the market, he said. The MDGrape 3, also known as the Protein Explorer, should start running applications in 2006.
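
Tanji's condition that "the amount of computation must be much larger than the data" can be illustrated with a rough sketch. It assumes an all-pairs interaction calculation, a common pattern in molecular dynamics and astrophysics but not something the article attributes to the MDGrape 3 specifically. The function names and the figure of 30 floating-point operations per pair are hypothetical and chosen only for illustration.

```python
def pair_interactions(n_particles: int) -> int:
    """Number of unique particle pairs in an all-pairs calculation."""
    return n_particles * (n_particles - 1) // 2


def compute_to_data_ratio(n_particles: int, flops_per_pair: int = 30) -> float:
    """Floating-point operations performed per coordinate value loaded.

    flops_per_pair is an illustrative assumption, not an MDGrape 3 figure.
    """
    values_loaded = 3 * n_particles  # x, y and z for every particle
    total_flops = flops_per_pair * pair_interactions(n_particles)
    return total_flops / values_loaded


if __name__ == "__main__":
    # The data grows linearly with n, the work grows roughly with n squared.
    for n in (100, 1_000, 10_000):
        print(n, pair_interactions(n), round(compute_to_data_ratio(n)))
```

Under these assumptions the work grows with the square of the particle count while the data grows only linearly, so larger problems perform more arithmetic per value loaded. That is the kind of workload the article says the chip is specialized for.
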
Research also continues at the University of Tokyo to develop a quasi-general-purpose chip capable of 1 teraflop. IBM and the University of Texas have a similar teraflop-on-a-chip project.

Architecturally, the MDGrape 3 differs substantially from most other chips. It has 20 calculation pipelines, each the processor's equivalent of an assembly line. Commercial chips typically have one or two. The chip also features what RIKEN calls a broadcast memory architecture, in which data is force-fed to the different pipelines simultaneously. The design is also optimized for parallelization, in which many calculations are carried out at the same time rather than one after another. Despite the differences from other chips, the MDGrape 3 is built on the 130-nanometer process, a manufacturing technology that has been in use for the past few years.

These design choices give the chip large cost and power advantages over general-purpose processors. Tanji said the 350MHz MDGrape 3 can provide a gigaflop of computing power for $15, compared with $400 per gigaflop for a Pentium 4, $640 per gigaflop for the chips inside IBM's Blue Gene/L and a whopping $4,000 per gigaflop for NEC's Earth Simulator. In terms of power consumption, the 350MHz MDGrape 3 consumes 14 watts, or 0.1 watts per gigaflop. A 3GHz Pentium 4 runs at 82 watts, or 14 watts per gigaflop, he said. The Blue Gene/L chip and the Earth Simulator come in at 6 and 128 watts per gigaflop, respectively, he said.

RIKEN is also designing the computer that will house the MDGrape 3. Twelve chips will fit on a board, and two boards will fit into a 2U-high box (3.5 inches). The chips are all connected to each other through an 81-bit bus, and the boards are connected to the rest of the computer through PCI Express. The petaflop computer will consist of 6,144 processors on 512 boards clustered together. In all, the system will fit into 32 boxes that will stand on 19-inch pedestals.
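
The system figures quoted in the article can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the article (12 chips per board, 512 boards, 230 and 160 gigaflops per chip, 14 watts per chip). The function and constant names are hypothetical, and the power total covers the chips alone, ignoring host machines, memory and interconnect.

```python
CHIPS_PER_BOARD = 12      # from the article
BOARDS = 512              # from the article
GFLOPS_AT_350MHZ = 230    # sample chips, per the article
GFLOPS_AT_250MHZ = 160    # worst-case figure, per the article
WATTS_PER_CHIP = 14       # 350MHz part, per the article


def total_chips(boards: int = BOARDS, chips_per_board: int = CHIPS_PER_BOARD) -> int:
    """Number of MDGrape 3 chips in the full system."""
    return boards * chips_per_board


def aggregate_petaflops(gflops_per_chip: float, chips: int) -> float:
    """Combined peak throughput in petaflops (1 petaflop = 1,000,000 gigaflops)."""
    return gflops_per_chip * chips / 1_000_000


def watts_per_gigaflop(watts: float, gflops: float) -> float:
    """Power drawn per gigaflop of peak throughput."""
    return watts / gflops


if __name__ == "__main__":
    chips = total_chips()
    print("chips:", chips)
    print("petaflops at 350MHz:", aggregate_petaflops(GFLOPS_AT_350MHZ, chips))
    print("petaflops at 250MHz:", aggregate_petaflops(GFLOPS_AT_250MHZ, chips))
    print("chip power in kilowatts:", chips * WATTS_PER_CHIP / 1000)
    print("watts per gigaflop:", watts_per_gigaflop(WATTS_PER_CHIP, GFLOPS_AT_350MHZ))
```

With these inputs, 512 boards of 12 chips give the 6,144 processors mentioned in the article, for about 1.41 petaflops at the 350MHz rate and about 0.98 petaflops at the worst-case 250MHz rate. Dividing 14 watts by 230 gigaflops gives roughly 0.06 watts per gigaflop, which the article reports as 0.1.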