SDSC Hosts GEON 2004 All-Hands Meeting

More than sixty participants representing over 25 institutions gathered at the San Diego Supercomputer Center (SDSC) on the campus of the University of California, San Diego in mid-August for the second annual All-Hands Meeting of the NSF "Cyberinfrastructure for the Geosciences" project, known as GEON. The meeting was the centerpiece of an ambitious series of five San Diego events marking the importance of the geosciences in driving development of the emerging national cyberinfrastructure. The series started with the ESRI 2004 International User Conference on August 9-13, to which GEON contributed a moderated paper session. Four more events followed in close succession at SDSC: a two-day GEON NSF site visit, the first meeting of GEON's new Advisory Board, the GEON All-Hands Meeting, and finally the five-day Cyberinfrastructure Summer Institute for the Geosciences, held August 16-20 and attended by 40 researchers from 30 different institutions.

GEON, a ground-breaking five-year NSF large Information Technology Research project, brings together information technology (IT) and geoscience researchers from multiple institutions in a large-scale collaboration that is building a modern cyberinfrastructure for the earth sciences and beyond. The GEON team is creating data-sharing frameworks, identifying best practices, and developing useful capabilities and tools to enable dramatic advances in how geoscience is done, democratizing access to scientific tools and data and vastly extending the scope of scientific questions that can be answered.

A Growing Collaboration

"This year's GEON meeting had a greater number of participants representing many more institutions than last year, reflecting the growing need for cyberinfrastructure," said Chaitan Baru, director of SDSC's Data and Knowledge Systems program, recognized as a world leader in cyberinfrastructure for data-intensive science. "In addition to providing the face-to-face communication vital for IT-geoscience collaboration within GEON, having all of these events together created a rich environment for building partnerships with the broader geosciences and IT communities." For example, GEON researchers met with ESRI colleagues on GIS developments for the geosciences community, as well as with researchers from Cal-(IT)2, the California Institute for Telecommunications and Information Technologies; CICESE, the Centro de Investigación Científica y de Educación Superior de Ensenada (Center for Scientific Research and Higher Education of Ensenada); Purdue University; the National Aeronautics and Space Administration (NASA); and potential partners of the fast-growing project.

Participants in the IT research component of GEON, coordinated by SDSC's Baru, include SDSC, Penn State University, San Diego State University, and the University of Texas at El Paso. The geoscience research component of GEON is divided into two testbeds: the Rocky Mountains, coordinated by Professor Randy Keller of UT El Paso, and the Mid-Atlantic, coordinated by Professor Krishna Sinha of Virginia Tech. The testbed science efforts include researchers from eight other institutions -- Arizona State University, Bryn Mawr College, Rice University, University of Arizona, University of Idaho, University of Missouri, University of Utah, and UNAVCO. The Digital Library for Earth System Education (DLESE) is coordinating the GEON education and outreach program.

Other major GEON partners include the U.S. Geological Survey (USGS), the Geological Survey of Canada (GSC), the Lawrence Livermore National Laboratory (LLNL), and NASA. In addition, GEON partnerships include the earth system history project, CHRONOS; the hydrology project, the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Hydrologic Information System (HIS); and the Southern California Earthquake Center (SCEC).
Industrial partners supporting GEON include ESRI, Hewlett Packard, and IBM. Sun Microsystems and Cal-(IT)2 generously sponsored a dinner at the recent GEON All-Hands Meeting (AHM).

To make full use of existing resources, GEON is leveraging related research by collaborating with projects in other disciplines, for example, the NIH Biomedical Informatics Research Network (BIRN) in neuroscience; the NSF Science Environment for Ecological Knowledge (SEEK); the GRid Assessment Probes (GRASP) project; the NMI GRIDS Center; the NSF TeraGrid; and the National Laboratory for Advanced Data Research (NLADR). By providing leading-edge data integration and grid computing services that support geosciences research and collaboration on unprecedented scales, GEON is equipping geoscientists to discover new insights into the dynamics of complex, interrelated Earth systems. GEON efforts will be critical for integrating and interpreting data collected by projects such as the NSF EarthScope initiative and CHRONOS.

The 2004 GEON All-Hands Meeting opened with remarks by SDSC director Fran Berman on the importance of cyberinfrastructure for the nation. She noted that over the last couple of decades science has become a large-scale, multidisciplinary "team sport" driven by technology, with collaborating groups addressing larger and more complex problems. The NSF is pushing the development of cyberinfrastructure to support this, with technology as the "enabler" now extending beyond supercomputers to include a rich environment of networks, visualization capabilities, data storage, remote instruments, handheld devices, and more. The growing recognition that integrated software systems provide the glue for these new capabilities highlights the importance of cyberinfrastructure such as that being developed in GEON.

GEON Science: Asking Big Questions

At the AHM, GEON researchers reported on the project's testbed approach to scientific integration.
In the Rocky Mountain region, GEON is addressing multi-disciplinary geosciences questions in the Dynamics, Structure, and Cenozoic Evolution of the Rocky Mountains (DYSCERN) project, coordinated by PI G. Randy Keller of the University of Texas at El Paso. This region is the apex of a broad, dynamic orogenic, or mountain-building, plateau bounded by the stable interior of North America and the active plate margin along the west coast. Beginning 1.8 billion years ago, new continental lithosphere formed and stabilized. During the last 600 million years of the Phanerozoic, intraplate deformation has occurred -- Ancestral Rocky Mountain building, the Laramide Orogeny, and late Cenozoic uplift and extension, which is still active today. In each case, the geological processes involved in these events remain the subject of considerable scientific debate, explains Keller, and GEON is playing a critical role by providing infrastructure that facilitates the integration of the diverse data types required to "connect the dots" and build a comprehensive picture of these complex geological processes.

DYSCERN presentations included "Building Distributed Computational Environments" by Dogan Seber, GEON project manager and director of the Geoinformatics lab at SDSC; "Toward a 4D simulation of continental deformation in the Rocky Mountain Testbed and western US" by Mian Liu of the University of Missouri, Columbia; a presentation by Chuck Meertens of UNAVCO; "Active Tectonics, Digital Elevation Model Analysis, and Remote Sensing in GEON" by Ramon Arrowsmith of Arizona State University; and a kinematic model of the Northern Rocky Mountains by John Oldow of the University of Idaho.

The other science focus in GEON is the US mid-Atlantic Appalachian region, in the Crustal Evolution: Anatomy of an Orogen (CREATOR) project, coordinated by PI A. Krishna Sinha of Virginia Tech.
The Appalachian Orogen, or mountain-building region, is a continental-scale mountain belt that provides a geologic template for examining the growth and breakup of continents through plate tectonic processes. As a first-order science question, Sinha explains, geoscientists would like to ask: What is the geologic history of accretionary orogens, or mountain-building, in this region? Such accretionary orogens play a role in the growth of continents as a major site of juvenile continental crust production at convergent plate margins, through the addition of crust (known as terranes) by accretion, and in the recycling of continental and oceanic crust.

Geoscientists in GEON are focusing on the Appalachian Orogen as a natural laboratory to develop methods for integration of data, tools, and models, with an emphasis on 4D management of data and knowledge. Being able to ask such broad scientific questions will give geoscientists insights not previously possible. To enable this, GEON is integrating the diversity of geologic information necessary to analyze this crustal evolution, which ranges from metamorphism and igneous activity to stratigraphy, geophysics, and more.

The meeting included CREATOR science presentations on "Cybernetwork integration of chronostratigraphic data" by Emil Platon of the University of Utah; on "Fossil and sedimentary data and tools development" by Allister Rees of the University of Arizona; and on "Adapting metamorphic data for geoinformatics: a case study from the mid-Atlantic region" by Maria Luisa Crawford of Bryn Mawr College.
Sinha points out that beyond the technical aspects of developing cyberinfrastructure for geoscience research, GEON is promoting leadership in geoscience education reform, and revolutionizing how earth scientists do their science by democratizing access to services and data, allowing on-line replication of results, increasing awareness of scientific knowledge "pathways," and facilitating a fundamental cultural change in the practice of the geosciences.

GEON Cyberinfrastructure

In addition to GEON science, researchers described IT advances in GEON, demonstrating powerful new tools that enable geoscientists to integrate, analyze, model, and visualize today's enormous multidisciplinary 4-D Earth science data sets. "Our goal in GEON is to create a services-based, distributed environment that also enables local control by applications scientists," said Seber. "It's important that we develop cyberinfrastructure to support the day-to-day practice of geoscience, not only large-scale 'hero computations.'"

For example, a key capability that GEON provides is giving researchers rapid, convenient access to shared data sets. To provide flexibility for sharing data in the multiple and controlled forms necessary for today's complex collaborations, GEON offers two main approaches. Researchers can register data through the GEON portal, which then hosts the data (providing long-term, state-of-the-art technical support and relieving scientists of this task), or they can register just the data schemas, or descriptions, making other scientists aware of their data while retaining the data sets in their home environments.

As part of data sharing, GEON is also providing a powerful search capability, GEONSearch. Searches in GEON can be spatial, temporal, or ontology-based, and in the future will also support natural language-based conceptual queries.
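
The two registration routes and the three search modes described above can be pictured with a small in-memory sketch. It is an illustration only, not the GEON portal or GEONSearch: the `Registration` record, the toy `NARROWER` ontology, the `search` function, and the sample catalog entries (including the example.org URL) are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical names throughout; this is not the GEON portal's actual API.
BBox = Tuple[float, float, float, float]  # (west, south, east, north), degrees


@dataclass
class Registration:
    name: str
    bbox: BBox                       # spatial coverage
    years: Tuple[int, int]           # temporal coverage (start, end), inclusive
    keywords: List[str]              # terms drawn from a shared vocabulary
    hosted: bool                     # True: portal hosts the data; False: schema only
    home_url: Optional[str] = None   # where schema-only data stays


# Toy ontology: each term maps to its narrower terms (illustrative only).
NARROWER: Dict[str, List[str]] = {"igneous rock": ["granite", "basalt"]}


def expand(term: str) -> Set[str]:
    """Return the term plus every narrower term reachable in NARROWER."""
    seen: Set[str] = set()
    stack = [term.lower()]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(NARROWER.get(t, []))
    return seen


def overlaps(a: BBox, b: BBox) -> bool:
    """True when two bounding boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalog, bbox=None, years=None, term=None) -> List[Registration]:
    """Filter registrations by any mix of spatial, temporal, and ontology criteria."""
    hits = []
    for reg in catalog:
        if bbox is not None and not overlaps(reg.bbox, bbox):
            continue
        if years is not None and (reg.years[1] < years[0] or reg.years[0] > years[1]):
            continue
        if term is not None and not expand(term) & {k.lower() for k in reg.keywords}:
            continue
        hits.append(reg)
    return hits


# Made-up sample entries: one hosted data set, one schema-only registration.
catalog = [
    Registration("gravity_rockies", (-115.0, 35.0, -102.0, 49.0), (1990, 2003),
                 ["gravity"], hosted=True),
    Registration("appalachian_granites", (-85.0, 33.0, -72.0, 42.0), (1985, 2000),
                 ["granite"], hosted=False, home_url="http://example.org/granites"),
]
```

In this sketch, a search for the broader term "igneous rock" finds the schema-only granite entry because the toy ontology lists "granite" as a narrower term, which is the kind of benefit an ontology-based search offers over plain keyword matching.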
Because data sets in GEON are accompanied by rich metadata, or descriptive information about the origin, uses, quality, and constraints of the data sets, researchers can efficiently identify desirable data sets and be confident about the quality and suitability of the data they find for their intended use. The GEON portal will also collect statistics on data use, helping improve the quality of GEON's ontologies and providing progressively better searching over time. In addition to such targeted searching, GEON will also support crawling searches for broader, though less precise, searching, leveraging the National Laboratory for Advanced Data Research (NLADR) "deep Web" work. NLADR is a collaborative effort between SDSC and the National Center for Supercomputing Applications (NCSA).

To move tools efficiently into production use, GEON is following a two-tier approach: to identify commercial tools that can be leveraged, avoiding unnecessary duplication of effort; and where needed, to develop advanced technologies as open-source products.

Other IT advances described in the meeting are the GEON grid system, including user registration, the GEON Portal, and the growing number and size of GEON Point of Presence computer nodes. Researchers also demonstrated data integration efforts, including the use of ontologies with data registration for improved searching, and workshops to develop ontologies for geosciences disciplines such as seismology and geochemistry. In addition, GEON is participating in the major open-source scientific workflow effort, Kepler.

Break-out sessions included the Science Integration Group and the Software Integration Group. Because GEON is a complex, large-scale collaboration, its outreach efforts range from GEON education projects to international collaborations, as well as efforts with projects such as SEEK and BIRN.

"People say that the most interesting research happens at the borders of disciplines," said Baru, "and we're finding that our collaborations between computer scientists and geoscientists are very fruitful areas for innovation." Essential to GEON is enabling both groups to learn each other's "culture" and vocabularies, and the AHM and other meetings play a vital role in this.

The AHM was Webcast live, and the Webcast and further information about GEON are available through the GEON Portal at http://www.geongrid.org/.

-- Paul Tooby

Related links

GEON, Cyberinfrastructure for the Geosciences – http://www.geongrid.org/
San Diego Supercomputer Center (SDSC) – http://www.sdsc.edu/
U.S. Geological Survey (USGS) – http://www.usgs.gov/
Geological Survey of Canada (GSC) – http://gsc.nrcan.gc.ca/index.html
Digital Library for Earth System Education (DLESE) – http://www.dlese.org/
Lawrence Livermore National Laboratory (LLNL) – http://www.llnl.gov/
National Aeronautics and Space Administration (NASA) – http://www.nasa.gov/home/index.html
CHRONOS – http://www.chronos.org/
Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI), Hydrologic Information System (HIS) – http://www.cuahsi.org/, http://www.iihr.uiowa.edu/~cuahsi/his/
Southern California Earthquake Center (SCEC) – http://www.scec.org/
Biomedical Informatics Research Network (BIRN) – http://www.nbirn.net/
Science Environment for Ecological Knowledge (SEEK) – http://seek.ecoinformatics.org/
Kepler, Scientific Workflows – http://kepler.ecoinformatics.org/
GRid Assessment Probes (GRASP) – http://grail.sdsc.edu/projects/grasp/
GRIDS Center – http://www.grids-center.org/
NSF TeraGrid – http://www.teragrid.org/
ESRI, leading-edge GIS software – http://www.esri.com/