IBM Shatters Data Management Challenge at CERN

IBM and CERN, the European Organization for Nuclear Research, today announced that IBM's storage virtualization software has achieved breakthrough performance results in an internal data challenge at CERN. The data challenge was part of an ongoing test at CERN to simulate the computing needs of the Large Hadron Collider (LHC) Computing Grid, the largest scientific computing grid in the world. The LHC is expected to produce 15 million gigabytes of data per year once it is operational in 2007.
The recent results represent a major milestone for CERN, which is testing cutting-edge data management solutions in the context of the CERN openlab, an industrial partnership. Using IBM TotalStorage SAN File System storage virtualization software, the internal tests shattered performance records in a CERN data challenge by reading and writing data to disk at rates in excess of 1GB/second, for a total I/O of over 1 petabyte (1 million gigabytes) in a 13-day period. This result shows that IBM's pioneering virtualization solution has the ability to manage the anticipated needs of what will be the most data-intensive experiment in the world. First tests of the integration of SAN File System with CERN's storage management system for the LHC experiments have already obtained excellent results.
"CERN has a long-standing collaborative relationship with IBM, and we are delighted that IBM is pushing the frontiers of data management in the context of CERN openlab," said Wolfgang von Rüden, Information Technology Department Leader at CERN and Head of the CERN openlab. "What we learned from these data challenges will surely influence our technological choices in the coming years, as we continue to deploy the global LHC Computing Grid."
In 2003, IBM started working with CERN openlab for Data Grid Applications to collaborate in testing solutions for a massive data-management system that could support this sort of grid environment.
"CERN's data challenges are the kind of man-on-the-moon project that can push IBM technology to the limit, so we can make sure that IBM TotalStorage SAN File System can satisfy our most demanding corporate clients for a long time to come," said Jens Tiedemann, General Manager, Storage Software, IBM.
IBM's innovative storage virtualization software and file management technology, conceived in IBM Research and marketed under the name IBM TotalStorage SAN File System, is designed to provide scalable, high-performance and highly available management of large amounts of data using a single file namespace, regardless of where or on what supported operating system the data reside.
For IBM, the recent data challenge involved writing and reading over 300,000 files, each 2GB in size, into and out of storage managed by IBM TotalStorage SAN File System. The storage consisted of 15 iSCSI targets. There were 12 IBM file system writer clients and 12 reader clients; each reader client ran four reader threads, and each writer client ran either four or six writer threads. All clients and iSCSI storage were connected via a 1 gigabit Ethernet network. A significant goal of the data challenge was to test the IBM virtualization solution under a range of failure scenarios, such as disconnecting iSCSI targets, to confirm the robust performance of this data management solution.
CERN openlab is a collaboration between CERN and leading industrial organizations, which aims to implement and test data-intensive Grid-computing technologies that will aid the LHC scientists. Because the same issues facing CERN are becoming increasingly important to the IT industry, the CERN openlab and its collaborators, which also include Enterasys, HP, Intel and Oracle, are eager to explore new computing and data management solutions based on open standards in a cooperative framework. For example, the recent data challenge relied on high-speed switching technology provided by Enterasys.
As part of the CERN openlab work, IBM has involved several leading storage management experts from its Almaden Research Center in California, USA, and its Zurich Research Lab in Switzerland. In addition, through its Shared University Research (SUR) program, IBM supplied CERN with 28 terabytes of iSCSI disk storage, a cluster of six eServer xSeries systems running Linux, and on-site engineering support and services from IBM Switzerland.
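As a rough cross-check, the figures quoted in this release for the data challenge can be reproduced with simple arithmetic. The Python sketch below is illustrative only and is not part of the IBM or CERN test setup. It uses decimal units (1 petabyte = 1 million gigabytes, as stated above) and assumes that each file was written once and read once; the function names are hypothetical and chosen for this example.

```python
# Back-of-the-envelope check of the quoted data challenge figures.
# Assumptions (not stated in the release): each file is written once and
# read once, and the 1 GB/second rate is sustained for the full 13 days.

SECONDS_PER_DAY = 24 * 60 * 60
GB_PER_PB = 1_000_000  # decimal units, matching "1 petabyte (1 million gigabytes)"


def sustained_total_pb(rate_gb_per_s: float, days: float) -> float:
    """Total I/O in petabytes for a constant rate held over a number of days."""
    return rate_gb_per_s * days * SECONDS_PER_DAY / GB_PER_PB


def file_io_total_pb(n_files: int, file_size_gb: float, passes: int = 2) -> float:
    """Total I/O in petabytes when every file is transferred `passes` times
    (2 = one write plus one read, an assumption for this sketch)."""
    return n_files * file_size_gb * passes / GB_PER_PB


def total_threads(clients: int, threads_per_client: int) -> int:
    """Number of concurrent I/O threads across a group of clients."""
    return clients * threads_per_client


if __name__ == "__main__":
    print(f"1 GB/s for 13 days: {sustained_total_pb(1.0, 13):.4f} PB")
    print(f"300,000 x 2 GB, write + read: {file_io_total_pb(300_000, 2):.1f} PB")
    print(f"Reader threads: {total_threads(12, 4)}")
    print(f"Writer threads: {total_threads(12, 4)} to {total_threads(12, 6)}")
```

Under these assumptions, both routes give a total slightly above 1 petabyte, which is consistent with the "over 1 petabyte" figure, and the client configuration corresponds to 48 reader threads and between 48 and 72 writer threads.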