Jean Gonnord, CEA, discussed the Bull 60 TeraFlop/s acquisition

By Uwe Harms -- In December 2004, CEA/DAM, the Military Applications Division of the French Atomic Energy Commission, decided to acquire a 60 TeraFlop/s peak supercomputer from Bull, France. The computer is based on 6,160 Intel Itanium (Montecito) compute nodes, and Quadrics will deliver the high-speed interconnect. The acquisition process in Europe is very formal and governed by law; I have never heard anything comparable about the acquisition of the ASCI systems. I had the chance to talk to Jean Gonnord, Program Director for Numerical Simulation & Computer Sciences at CEA. In detail he described the acquisition process, the decision, and the role of open source software.

Uwe Harms (UH): Readers in America probably don't know that a public acquisition in Europe is subject to very specific and strict rules, and that the decision can be reviewed by a court if a losing vendor challenges it. Can you please describe this process?

Jean Gonnord (JG): The purchase is the result of a call for tender, following the French rules for public procurement in a procedure called "request for procurement on performances". This very strict procedure is reviewed a posteriori, but before the signature of the contract, by an independent CEA market commission, which met in mid-November. Thus all the vendors are treated equally. The procedure includes the following steps:

- A call for interest was published in the official EEC journal at the end of January 2004. Eight vendors answered it by the beginning of March: Bull, Cray, Dell, HP, IBM, Linux Networx, SGI, and Sun.

- A request for procurement (RFP) was then issued in mid-March, with answers due at the beginning of May. The specifications, written by CEA, were sent to these eight vendors. They had to submit their best technical proposal with its cost, without knowing CEA's budget target. The complete technical specification can be summarized in a table of 258 criteria:
  o 205 are functionalities, to be answered YES/NO or with figures;
  o 53 are measurements, made on CEA benchmarks, run on a machine as close as possible to the final one and extrapolated by the vendor, who has to explain how the extrapolation is done and commit to the result.

Only five vendors finally answered the RFP: Bull, Dell, HP, IBM, and Linux Networx. In the next step we communicated the budget to the vendors, and Dell withdrew. Then, in May and June 2004, we held three rounds of discussions with the remaining vendors. The goal of these discussions was to get from the initial proposal to a final proposal respecting the budget by better adapting the answer to the request; where necessary we could lower the level of specification. No "commercial" discount is allowed, and this procedure guarantees equal treatment for each competitor.

At the end of this stage, four vendors, Bull, Hewlett-Packard, IBM, and Linux Networx, delivered a final answer at the beginning of July. Each answer included a risk study and a commitment to demonstrate, before the CEA decision in September, some technologies that CEA considered critical to the proposal: for example, the existence, at least as a prototype, of the proposed processor; the existence, at least as a prototype, of the proposed board and chipset; and the development status of critical software components such as the parallel file system. We analyzed the four offers throughout the summer. Clearly the proposals were different, but we received four good answers against our criteria, and all the vendors fulfilled most of our demands.
UH: Did the vendors demonstrate their technology, and what about the risk assessment?

JG: As we asked in the RFP, the vendors had to do it in the first two weeks of September. They showed prototype processors, a year before those processors go into production. Some did the same with the boards, some did not. They also showed their level of development of the software and of all the other requested criteria. Some vendors were good, others less so. Together with the risk study, this gave us the basis for our choice. By the way, Bull demonstrated a 2 TeraFlop/s machine of an identical architecture using Intel Madison processors, and with the TERA benchmark Intel showed the performance of the dual-core Montecito. We got results, which naturally remain undisclosed as part of the technological demonstration. We will have direct access to Montecito machines very soon. The big milestone will be the demonstration, at the end of this year, of a sustained performance of 10 TeraFlop/s on the TERA benchmark. We are quite confident that Bull will achieve it and that Intel will deliver the Montecito to us on time for our demonstration.

UH: What about your TERA benchmark?

JG: In order to obtain our 55 points of measurement we use about 40 benchmarks. Some of them are very specialized; some, like TERA, are complete computation codes. Most of them had to be run on one processor, on one node, and on a large configuration. For tests on one processor or one node, the measurement had to be made both in isolation, on a single processor or node, and under load, i.e. with the same job repeated on all processors/nodes. As we can't disclose our applications, we wrote the TERA benchmark, which is as close to them as possible but unclassified. This worked quite well on TERA-1: at acceptance we obtained 1.13 TeraFlop/s out of a 5 TeraFlop/s peak, a ratio of 22.6 percent, not very far from the average measured production rate of around 20 percent. We think that on TERA-10 the ratio will be somewhat smaller (about 20 percent) due to the architecture of the Itanium, but we hope to get significantly more than 10 TeraFlop/s from the machine; this will be the challenge for the Bull and CEA teams.

UH: What are the reasons for the Bull decision? Because it is a French company?

JG: The basis of the decision is the commitment of the vendors in their final proposal on the 258 criteria, confirmed (or not) by the technology demonstration (the risk), and the total cost. On these criteria, Bull was the best of the four final answers. You have to remember that we are buying a complex machine, not a single processor or a network or a specific piece of software. Bull was globally the best on all these aspects, especially some of particular importance to us:

- the quality and performance of the network (latency, barriers, ...), which are of primary importance for our very large parallel applications. On these points Quadrics is far ahead; we noted that some proposals were not even better than what we got from Quadrics in 2001 on the TERA-1 machine;

- the quality and performance of the I/O subsystem, which, from our experience with TERA-1, we consider of primary importance and for which Bull made a special effort;

- last but not least, Bull's commitment to open source.

All these technical reasons led us to the choice of Bull. Naturally we are happy and proud that such a challenging RFP has been won by a pair of European companies, Bull and Quadrics. But we cannot risk our mission, the French deterrent, for economic reasons. Another aspect that makes us proud is that the world has changed in the last five years and that European industry is back in the field of computing.
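The sustained-versus-peak arithmetic Gonnord cites above is easy to restate. The short Python sketch below only reproduces the figures quoted in the interview; the 20 percent ratio assumed for TERA-10 is Gonnord's own estimate, not a measured value.

    # Illustrative restatement of the sustained-vs-peak figures quoted above.
    tera1_sustained_tf = 1.13   # TeraFlop/s on the TERA benchmark at TERA-1 acceptance
    tera1_peak_tf = 5.0         # TERA-1 peak performance in TeraFlop/s
    print(f"TERA-1 ratio: {tera1_sustained_tf / tera1_peak_tf:.1%}")  # -> 22.6%

    tera10_peak_tf = 60.0       # TERA-10 peak performance in TeraFlop/s
    assumed_ratio = 0.20        # Gonnord's estimate for the Itanium architecture
    print(f"TERA-10 estimate: {tera10_peak_tf * assumed_ratio:.0f} TeraFlop/s")  # -> 12, above the 10 TF milestone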
UH: You mentioned open source; what does this mean for you?

JG: CEA decided to choose open source over proprietary solutions. This was a choice for the future. The reason is not cost, even though open source is cheaper, but that we can control the software. The machine is dedicated to a classified defense program, especially the Simulation program, so we don't want to open the machine to vendors. We have a skilled team to debug and correct the software. We are taking a big part in open source: Linux as the operating system, Lustre as the file system, and open batch systems. We expect 80 percent open source. In the case of our TERA-10, Bull is developing an open source HPC version of Linux based on a standard kernel. The cluster management system is proprietary, from Quadrics, which is the exception to our open source commitment. The global interface will be developed by Bull.

UH: Such a machine is surely producing a lot of I/O; can you comment on that?

JG: First, we are not only a gigantic consumer of computing, we are also a gigantic producer of data. On this point we face a problem very similar to that of the LHC in Geneva. Each experiment, in our case a virtual experiment, produces enormous amounts of data, and as with any experiment we follow its behavior as a function of time. In our case we will use, for example, several thousand time steps, at each of which we follow several tens of parameters defined on a billion mesh cells. Today, with the TERA-1 machine, we are producing more than 3 Terabytes of data per day, which is more than a petabyte per year. This will be multiplied by a factor of 10 with TERA-10. Second, a large simulation can run on the machine for several weeks, so we must have a strong capacity to save the computation every few time steps and to restart it. This dimensions the I/O system, as we don't want more than 10 percent of the computation time to be used for I/O. This specification leads to the gigantic bandwidth (100 gigabits/s) we demand of the file system.

UH: Besides your classified TERA-10, do you offer open systems for science and research?

JG: Since September 2003, DAM has also been responsible for the open CEA computing centre (CCRT) that CEA shares with EDF, SNECMA, and ONERA. The CCRT today offers a total of 3.6 TeraFlop/s:

- 2.4 through an HP machine very similar to TERA-1 (Alpha ES45, Elan-3),
- 0.8 through an HP cluster of Opterons,
- 0.4 through a vector NEC SX-6 machine.

The CCRT will be completely renewed at the beginning of 2007, and in 2006 there will be an RFP for one or two machines with a total power of several tens of TeraFlop/s. Perhaps it could be two machines; one could be a vector system for fundamental climate research, but this is still open.

UH: Thank you very much for this interview.

http://www.cea.fr
http://www.bull.com
http://www.quadrics.com

Uwe Harms