Supercomputing 'Grid' Passes Test

UK scientists at CCLRC's Rutherford Appleton Laboratory (RAL) in Oxfordshire recently joined computing centres around the world in a networking challenge that saw RAL transfer 60 million megabytes of data over a ten-day period. A home user with a 512 kilobit per second broadband connection would have to wait around 30 years to complete a download of the same size.

RAL is a member of the GridPP project, the UK effort by particle physicists to prepare for the massive data volumes expected from the next generation of particle physics experiments. The exercise was designed to test the global computing infrastructure for the Large Hadron Collider (LHC), the world's biggest particle physics experiment, currently being built at CERN in Switzerland. To get ready for the LHC's unprecedented data rates, the worldwide collaboration is carrying out a series of "Service Challenges", the most recent of which (Service Challenge 2) has just been successfully completed.

The eight laboratories involved sustained an average continuous data flow of 600 megabytes per second (MB/s) from CERN for 10 days. The total amount of data transmitted during the challenge (about 500 million megabytes) would take roughly 250 years to download over a typical 512 kilobit per second household broadband connection.

"This service challenge is a key step on the way to managing the torrents of data anticipated from the LHC," said Jamie Shiers, manager of the service challenges at CERN. "When the LHC starts operating in 2007, it will be the most data-intensive physics instrument on the planet, producing more than 1500 megabytes of data every second for over a decade."

The service challenge participants included laboratories in the US (Brookhaven National Laboratory and Fermilab), Germany (Forschungszentrum Karlsruhe), France (CCIN2P3), Italy (INFN-CNAF), the Netherlands (SARA/NIKHEF) and the UK (Rutherford Appleton Laboratory).

LHC computing aims to use a worldwide Grid infrastructure of computing centres to provide sufficient computational, storage and network resources to fully exploit the scientific potential of the four major LHC experiments. The infrastructure relies on several national and regional science grids. The service challenge used resources from the LHC Computing Grid (LCG) project, the Enabling Grids for E-sciencE (EGEE) project, Grid3/Open Science Grid (OSG), INFNGrid and GridPP.

The LHC service challenges will ramp up to the level of computing capacity, reliability and ease of use that will be required by the worldwide community of over 6,000 scientists working on the LHC experiments. During LHC operation, the major computing centres involved in the Grid infrastructure will collectively store the data from all four LHC experiments. Scientists at more than two hundred other computing facilities in universities and research laboratories around the globe, where much of the data analysis will be carried out, will access the data via the Grid.

Vicky White, head of the Computing Division at Fermilab in the US, welcomed the results of the service challenge. "High energy physicists have been transmitting large amounts of data around the world for years," White said. "But this has usually been in relatively brief bursts and between two sites. Sustaining such high rates of data for days on end to multiple sites is a breakthrough, and augurs well for achieving the ultimate goals of LHC computing."
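The headline figures above can be verified with simple arithmetic. The short Python sketch below does so; it assumes decimal units (1 MB = 10^6 bytes, 512 kbit/s = 512,000 bits per second), which is why the computed totals come out close to, but not exactly at, the rounded numbers quoted in the article:

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

# 600 MB/s sustained from CERN for 10 days is consistent with the
# quoted total of roughly 500 million megabytes.
print(f"{600 * 10 * SECONDS_PER_DAY / 1e6:.0f} million MB")  # -> 518

# A 512 kbit/s home connection moves 512,000 / 8 bytes = 0.064 MB per second.
home_mb_per_s = 512_000 / 8 / 1e6

# Download times for the quoted volumes over that home connection.
for label, megabytes in [("RAL's share (60 million MB)", 60e6),
                         ("full data set (500 million MB)", 500e6)]:
    years = megabytes / home_mb_per_s / SECONDS_PER_YEAR
    print(f"{label}: about {years:.0f} years")  # ~30 and ~248 years
```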
"The challenge here is not just the inherently distributed nature of the Grid infrastructure for the LHC," Bos said, "but also the need to get large numbers of institutes and individuals, all with existing commitments, to work together on an incredibly aggressive timescale." RAL is the UK's national computer centre for the LCG project. It works with centres at UK universities to form the UK particle physics Grid, which currently consists of more than 2,000 CPUs and one million Gigabytes of storage capacity. The UK's contribution to LCG is managed by the £33m GridPP project, funded by the Particle Physics and Astronomy Research Council (PPARC). In order to meet the unprecedented data rates it expects to receive from the LHC in 2007, RAL must substantially increase its data acceptance rates from the current 70MB/s. During the next Service Challenge in July, RAL will sustain rates of 150MB/s over one month while simultaneously archiving part of the data to tape. By April 2006, rates of 220MB/s will be handled, with tests to accept 72 hour bursts of data at twice this rate. By 2007, the RAL computing centre will have to manage all the above while also serving data to its downstream UK university sites and at the same time feeding data to its vast 1000 processor computing cluster. RAL is well placed to meet this challenge having already carried out internal load tests of almost 400MB/s using only 4 of its total of 60 available disk servers. Dr Andrew Sansum, the manager of the particle physics computing centre at RAL, is enthusiastic about the implications of the Service Challenge for e-Science in the UK, saying, "As data-set sizes grow, the ability to rapidly move large datasets across the UK Grid will be of vital importance to many UK projects. Demonstrating the ability to meet the LHC Data Challenge has acted as a proof of capability that will inspire UK network providers and other UK science projects alike." RAL made use of a new national high-speed research network known as UKLight. UKLight is managed by UKERNA (United Kingdom Education and Research Networking Association) on behalf of the JISC (Joint Information Systems Committee). It complements the SuperJANET 4 IP production network and brings the UK into a global facility aimed at pioneering new ways of networking to enable research. Multi-Gbit/s connections are available between sites in the UK and many overseas locations via 10 Gbit/s circuits to Chicago and Amsterdam from a hub in London. The service challenge was one of the first research projects to make use of the network, which has also now been used by Radio Astronomy, SuperComputing and health applications. UKLight enables the UK to join several other leading networks in the world creating an international experimental testbed for optical networking. It is part of the GLIF consortium which comprises the major proponents of advanced networking. UKLight will bring together leading-edge applications, Internet engineering for the future, and optical communications engineering, and enable UK researchers to join the growing international consortium which currently spans Europe and North America. These include STARLIGHT in the USA, SURFnet in the Netherlands (NetherLIGHT), CANARIE (Canadian academic network), CERN in Geneva, and NorthernLIGHT bringing the Nordic countries onboard. UKLight will connect JANET users to the testbed and also provide access for UK researchers to the Internet2 facilities in the USA via StarLIGHT. 
GridPP is a six-year PPARC project with additional associated funding from HEFCE, SHEFC and the European Union. A collaboration of twenty UK universities and research institutes together with CERN, it will provide the UK's contribution to the Large Hadron Collider Computing Grid. For more information see www.gridpp.ac.uk.

The GridPP Collaboration involves: The University of Birmingham; The University of Bristol; Brunel University; CERN, European Particle Physics Laboratory; The University of Cambridge; the Council for the Central Laboratory of the Research Councils; The University of Durham; The University of Edinburgh; The University of Glasgow; Imperial College London; Lancaster University; The University of Liverpool; The University of Manchester; Oxford University; Queen Mary, University of London; Royal Holloway, University of London; The University of Sheffield; The University of Sussex; University of Wales Swansea; The University of Warwick; and University College London.