EGI offers scientists and researchers borderless access to distributed supercomputing infrastructures
Scientific research today is no longer conducted within national boundaries – the huge amount of scientific data from supercomputing simulations and instruments in international facilities means that scientists are becoming increasingly dependent on cross-border distributed computing and storage resources for the large-scale analysis of data.
The EGI-INSPIRE project aimed to establish a sustainable European Grid Infrastructure (EGI) by bringing together National Grid Initiatives (NGIs) and other organisations across the EU. It was a collaborative effort involving more than 50 institutions. The original impetus for EGI-INSPIRE was the heavy computational requirements of big data users at CERN, the European Organisation for Nuclear Research, where physicists adopted the paradigm of distributed computing worldwide to solve their big data problems. They used EGI to analyse the data from the Large Hadron Collider that led to the discovery of the elusive Higgs boson.
‘Soon after our initial success, we realised that our model could be replicated to serve any pan-European research community facing the problem of scalable access to large datasets – from research infrastructures to the long tail of science,’ says Tiziana Ferrari, EGI.eu’s Technical Director.
The largest distributed supercomputing infrastructure in the world
The model was widely replicated to the point where, in terms of geographic footprint, EGI is now the largest distributed supercomputing infrastructure in the world. Via EGI-INSPIRE, EGI offers users high throughput distributed data analysis by federating the computing, storage and data management capacity of 350 affiliated data centres and 21 cloud service providers worldwide.
The key to this success is federation: secure access to data, high throughput analysis, cloud supercomputing, cloud storage and a library of tools, scientific applications and software via a single set of user credentials. EGI-INSPIRE was fundamental to realising all this. In EGI, research communities can federate their own data and computing infrastructures, scale these up to increase existing capacity, or simply get a grant from a centrally managed pool.
Not just hardware and software
A major part of EGI’s work involves service management: harmonising operational policies and processes across the members of its federation. This is where the diversity within Europe has been both a challenge and an advantage. Agreeing on a minimum set of standard policies and procedures is not easy, but once this has been done within the EU, the federation can easily be extended to non-EU countries. As a result, the federation today integrates e-infrastructures from 57 countries representing nearly every corner of the globe.
EGI has grown rapidly to the point where it now federates 650 000 CPU cores and 300 petabytes of disk space, running 1.5 million supercomputing jobs per day for 38 000 users. It has also grown beyond its beginnings, providing services to scientists and researchers in the natural sciences, engineering, medicine, health, agriculture and the humanities. All this has enabled the publication of over 2 000 peer-reviewed scientific papers.
EGI-INSPIRE ran from 1 May 2010 to 31 December 2014 and received EUR 25 million in EU funding. The work carried out during EGI-INSPIRE will be continued and further developed in the EGI-ENGAGE project.
‘We observed that nearly 25% of the computational capacity that scientists and researchers access is outside their own country,’ says Ferrari. ‘To us this suggests that there is great scope for international collaboration.’