Kraken set to deliver 2 billionth CPU hour, sustains 96 percent utilization
Funded by the National Science Foundation (NSF) and managed by the University of Tennessee’s National Institute for Computational Sciences (NICS), Kraken is located at Oak Ridge National Laboratory. Kraken is one of the integrated digital resources of the eXtreme Science and Engineering Discovery Environment (XSEDE), successor to NSF’s TeraGrid project.
These noteworthy achievements highlight different aspects of NSF’s largest machine. Delivering 2 billion hours emphasizes the long-term success of the machine and of the NICS staff who maintain it and aid users in their scientific endeavors. The high utilization underscores a tremendous short-term success: high-performance computing users across the nation choose Kraken for their research on a daily basis.
“Kraken meets the user community’s needs across a broad range of scientific domains and job sizes,” said NICS executive director Sean Ahern. “NICS provides roughly 60 percent of all the hours allocated on NSF resources, and we’re able to do this while maintaining extremely high utilization and delivering billions of hours.”
Kraken supports scientific projects in fields as diverse as astrophysics, biology, climate change, materials research, and the humanities, to name a few. And while the domains are diverse, their computational needs are even more varied, a challenge that Kraken handles easily.
“Other centers with large machines have policies stipulating that projects must use a sizeable portion of the machine to even be considered for an allocation,” said NICS system administrator Troy Baer. “Kraken runs everything from projects that need one node to teams who would run the entire machine for weeks if we would let them.” This mix of jobs permits consumption of nearly all of Kraken’s 112,896 cores as smaller jobs take advantage of unused nodes in the spaces between larger jobs.
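The gap-filling behavior described above is commonly called backfill scheduling. The article does not say which scheduler or policy Kraken uses, so the following is only a minimal first-fit sketch under simplifying assumptions: every backfilled job starts at the beginning of the gap, and the names `Job` and `backfill` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int      # nodes the job requests
    hours: float    # requested wall-clock time


def backfill(idle_nodes: int, gap_hours: float, queue: list[Job]) -> list[Job]:
    """Pick queued jobs that fit in the idle nodes and finish within the gap.

    Illustrative first-fit pass in queue order; not Kraken's actual policy.
    """
    started = []
    for job in queue:
        if job.nodes <= idle_nodes and job.hours <= gap_hours:
            started.append(job)
            idle_nodes -= job.nodes  # those nodes are now busy
    return started


# Hypothetical example: 100 nodes sit idle for 4 hours before a large job starts.
queue = [
    Job("a", 64, 2.0),
    Job("b", 50, 1.0),   # too wide once "a" has started
    Job("c", 32, 6.0),   # would run past the gap
    Job("d", 30, 3.0),
    Job("e", 1, 0.5),
]
chosen = [job.name for job in backfill(100, 4.0, queue)]
print(chosen)
```

In this made-up queue, jobs "a", "d", and "e" fill 95 of the 100 idle nodes, which is the sense in which a mix of small and large jobs keeps utilization high.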
This type of high utilization is not new to the mythically named Cray. Kraken underwent its most recent upgrade in January, bringing its peak performance to 1.17 petaflops (more than a thousand trillion calculations per second). Since March, the first full month of production after the upgrade, Kraken has delivered at least 70 million CPU hours each month. Adding to that impressive feat, Kraken has maintained an overall average of 91 percent utilization since November 1, 2010, a figure that does not compensate for downtime such as preventative maintenance or system outages.
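To put the monthly figure in context, the sketch below relates delivered CPU hours to the machine's theoretical capacity. It assumes that one CPU hour equals one core running for one hour and that utilization is delivered hours divided by cores times wall-clock hours; the article does not spell out how NICS computes its percentage, and the function names are hypothetical.

```python
CORES = 112_896  # core count quoted in the article


def core_hours_available(days: int, cores: int = CORES) -> int:
    """Theoretical capacity: every core busy for every hour of the period."""
    return cores * 24 * days


def utilization(delivered_hours: float, days: int, cores: int = CORES) -> float:
    """Fraction of theoretical capacity actually delivered (assumed definition)."""
    return delivered_hours / core_hours_available(days, cores)


capacity_31 = core_hours_available(31)      # 83,994,624 core-hours
floor_31 = utilization(70_000_000, 31)      # roughly 0.83
floor_30 = utilization(70_000_000, 30)      # roughly 0.86

print(f"31-day capacity: {capacity_31:,} core-hours")
print(f"70M hours in a 31-day month: {floor_31:.1%}")
print(f"70M hours in a 30-day month: {floor_30:.1%}")
```

Under these assumptions, the 70 million hour monthly floor corresponds to roughly 83 to 86 percent of capacity, so the higher utilization figures quoted in the article imply months that delivered well above that floor.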
“Part of the reason for our recent high utilization is because we had an incredibly high availability on the machine,” said Baer. “Kraken was only down for about two and a half hours in October, which is less than normal. That says a great deal about us having shaken out the machine well.”
With extreme versatility and mounting accolades, Kraken is sure to preserve its long and productive history as XSEDE’s strongest computational resource.