Cornell Theory Center Opens eScience Unit

The Cornell Theory Center (CTC) announced the establishment of an eScience Unit (eSU) to provide a breadth of services to researchers with data-intensive applications and to conduct leading-edge research in related data-mining topics. Led by computer science professor and CTC associate director Johannes Gehrke, eSU provides database systems and data storage management, database programming and consulting, data curation, and data-mining services; it is supported by CTC's long-standing experience in cyberinfrastructure. Gehrke's research group brings deep data management and data-mining expertise to the unit.

"Many Cornell research communities are developing datasets on the terabyte and petabyte scale," said Gehrke. "CTC's capabilities in cyberinfrastructure, and particularly in data-intensive computing, will assist researchers in building new knowledge environments."

Gehrke and CTC are part of a Research Infrastructure (RI) grant funded by the Computer and Information Science and Engineering (CISE) directorate at the National Science Foundation in 2004. With funding from the CISE RI grant, an interdisciplinary group at Cornell has started to build infrastructure to support three petabyte-scale data-intensive projects: Large-scale Astronomical Surveys Using the Arecibo Radiotelescope, Physically Accurate Rendering in Computer Graphics, and a study of the Structure and Evolution of the World Wide Web.

CTC's expertise in data management, in particular SQL Server, acquired through its Windows HPC partnership with Microsoft over the last four years, is invaluable to the new eSU. "We have introduced database technologies into engineering fields to develop more efficient computational environments," noted Anthony Ingraffea, CTC acting director and head of the center's Computational Materials Institute (CMI). "For example, our CMI finite element analysis is supported by a SQL Server back end. Dr. Gerd Heber from CMI has worked closely and published with Jim Gray of Microsoft to create this innovative capability." CMI creates, verifies, and validates state-of-the-art computational fracture mechanics simulators for crack propagation in engineered and natural structures.

"New research opportunities created by the ability to form new data archives, to analyze massive data sets, and to share data within scientific communities are impacting researchers in all disciplines," said Gehrke. "Establishment of the eSU recognizes that eScience is the new paradigm for research in disciplines as diverse as physics and agriculture."

eSU is modeled after CTC's successful Computational Biology Service Unit (CBSU), which provides cyberinfrastructure support for life sciences researchers. In addition to faculty research projects, eSU is working with companies through CTC's Corporate Partnership Program.