NSF Awards $1.8 Million for Information Access and Analysis System
The National Science Foundation (NSF) has awarded Cornell University $1.8 million to develop an information access and analysis system that will meet the data-intensive needs of three landmark research projects at the university. The award, from the NSF’s directorate of Computer and Information Science and Engineering, will support projects conducted by Cornell’s Computer Science and Astronomy departments and the Program of Computer Graphics, with assistance from the Cornell Theory Center (CTC). Microsoft, Unisys, and Intel Corporation are also contributing to the project. Research results will be available to a world-wide community through a proposed Web-services-based infrastructure that will allow applications to interoperate across programming languages, platforms, and operating systems.
“We were impressed with the level and depth of collaboration involved in this initiative,” said University Provost Carolyn Martin. “The credentials of the departments, all recognized as leaders in their fields, and the partnership with a production facility that is skilled in providing scalable, high-performance computing, represent a powerful combination of expertise and potential.”
In the first year of the grant, the projects will be supported by one Unisys ES7000/430 server with 32 Intel Itanium 2 processor-based nodes and 100 terabytes of online disk space. Additional storage and networking upgrades will be purchased in subsequent years to meet the growing data storage needs of the projects. By the final year of the grant, total storage will be more than a petabyte. The large-scale information access and analysis system will be housed at and maintained by CTC and will be tightly coupled with the Center’s high-performance computing complex.
“With the technology that this grant will fund, research has the potential to become more data-driven and exploit modeling, an activity that is central to science and engineering,” said professor of computer science Alan Demers, who is the principal investigator of the five-year grant. “Service-oriented interfaces and easily accessible Web interfaces to data management and analysis tools will revolutionize how scientists conduct research and interact with data and data processing capabilities.”
“The rapid growth in the generation of digital data is changing computational science in a fundamental way,” said David Lifka, chief technology officer of CTC. “Traditionally, the scope of computational problems was limited by the available processing power. High-performance computing’s ability to handle data- and computationally intensive problems has made this a non-issue. However, the lack of an infrastructure capable of managing, searching, and interpreting [data] creates a new bottleneck for data-intensive problems. Modern data-intensive applications need both high-performance computing resources and an infrastructure capable of information access and analysis.”
The new infrastructure supported by this grant will further research being conducted by Professor of Astronomy James Cordes, Assistant Professors of Computer Science Steve Marschner and Kavita Bala, and Associate Professor of Computer Science Jon Kleinberg and Professor Dan Huttenlocher.
Large-Scale Astronomical Surveys Using the Arecibo Telescope
Led by Cordes, this team of researchers will analyze data from the Arecibo Telescope to find pulsars and other exotic objects in the project titled “Large-Scale Astronomical Surveys Using the Arecibo Telescope.” The Arecibo Telescope is the world’s largest radio telescope in terms of collecting area and thus can conduct the most sensitive surveys for point-like objects. A new multi-beam feed system has increased the power of the facility by a factor of seven.
“The pulsar surveys will be the deepest (reaching to the greatest distances) ever undertaken and are expected to yield not only about 1000 new pulsars, but also exotic objects, including millisecond pulsars spinning near the break-up speed of a neutron star; neutron stars in compact binaries with orbital periods of a few hours or less; and companion stars that are other neutron stars or black holes,” said Cordes. “These discoveries are expected to provide numerous opportunities for follow-up research on the equation of state of nuclear matter, gravitation physics, and gravitational waves.”
The proposed pulsar surveys include a search of the entire Galactic plane of the Milky Way visible with Arecibo and a shallower survey farther out of the Galactic plane to find millisecond pulsars and binary pulsars. To analyze the raw data from these surveys efficiently, the astronomers on the project will collaborate with the Department of Computer Science’s database group, which has developed some of the fastest known data mining algorithms.
“We anticipate that our results will be of considerable value to astronomers and astrophysicists world-wide,” said Cordes. “The service-oriented interface to the data that will be implemented will allow users from all over the world to interactively query the multidimensional search space and to allow interactive and efficient exploration of the data set.”
Physically Accurate Rendering in Computer Graphics
In this project, Marschner, Bala, and the team will research how light reflects from complex objects and structures and how reflection can be handled efficiently in rendering systems. The data will originate from a spherical gantry, a versatile four-axis motion system designed for optical scattering measurement.
Complex three-dimensional objects, as well as complex materials such as skin, hair, and cloth, will be illuminated from thousands of directions, and the reflected light will be measured using a camera from thousands more directions. The spherical gantry will generate approximately 50 terabytes of data per scanned object that can be used to computationally model the actual object. This data will be used in research on the fundamental properties of materials as well as on how to represent complex objects efficiently and realistically.
“This will be the first study ever undertaken at this level of accuracy,” said Marschner. “Physically accurate rendering is an important goal in computer graphics, and of interest to archaeologists, librarians, and others. One application for this kind of data is a virtual museum. Rare artifacts can be digitized into representations that are accurate enough to produce highly realistic views from any reasonable distance and under any kind of lighting. The digitized images could be placed in a single environment, creating a real-time, fully realistic, immersive experience of a museum collection that could never be assembled in reality.”
Currently, Marschner is using measurements of real materials to develop better reflectance models, and Bala is exploring new rendering approaches that can use measured or precomputed data to produce interactive, physically accurate renderings. Both of these research thrusts require a very large and scalable storage infrastructure to achieve their full potential.
The Structure and Evolution of the World Wide Web
In the project led by Kleinberg and Huttenlocher, the research team will develop new and precise models for how the Web evolves with time. These models will distinguish measurement-independent properties from those that are influenced by the method of measurement.
The research will include algorithms for understanding the structure and evolution of the Web, studies of the Deep Web, and the related areas of scientific publishing and digital libraries.
“Our goal is to develop techniques for simultaneously studying the evolution of the full Web and of specific, highly visible Web sites,” the researchers said. “Because of current relationships with organizations such as Internet Archive and resident expertise, Cornell is extremely well positioned to examine these kinds of questions on large Web data sets, but is currently limited by the lack of availability of large disk storage coupled with the fast computing power necessary to run our algorithms on large data sets.”
The research will define techniques to identify and analyze rapidly changing content on the Web, for example, detecting hot topics and determining the evolution of topics over time. These techniques can be used to highlight portions of the Web that are undergoing rapid change at any point in time, to archive and summarize the Web content surrounding a fast-breaking news story, and to provide a means of structuring the content of emerging media like Weblogs.
Background Information
The Department of Computer Science at Cornell University, which was organized in 1965, is one of the oldest departments of its kind in the country. Research areas include Architecture, Artificial Intelligence, Computational Biology, Databases and Digital Libraries, Languages and Compilation, Graphics, Operating Systems, Networks, Distributed Computing, Scientific Computing, Security, Theory of Computing, and Algorithms. It has a full-time faculty of 36, approximately 110 resident Ph.D. graduate students, and 100 M.Eng. students, and its undergraduate program graduates about 200 C.S. majors each year. The department is typically ranked as one of the top six in the country. It is part of the Faculty of Computing and Information Science. For more information, visit http://www.cs.cornell.edu.
The Department of Astronomy of the College of Arts and Sciences at Cornell University and the Graduate Field of Astronomy and Space Sciences are associated with two research centers: the Center for Radiophysics and Space Research (CRSR) and the National Astronomy and Ionosphere Center (NAIC). Housed in the Space Sciences Building, they form one of the leading centers for astronomy in the world. Traditional areas of excellence include infrared astronomy, theoretical astrophysics, radio and radar astronomy, and planetary science. The department places strong emphasis on undergraduate and graduate teaching and on the participation of students in ongoing research projects. It strives to foster an interdisciplinary approach to solving astronomical problems and maintains strong ties with other departments, including Applied Physics, Computer Science, Earth and Atmospheric Sciences, Electrical and Computer Engineering, and Physics. Contact information: http://www.astro.cornell.edu; Professor James Cordes, 607 255-0608, cordes@astro.cornell.edu; Professor Joseph Veverka (Chair), jfv4@cornell.edu.
At Cornell, research and teaching in computer graphics are centered in the Department of Computer Science and the closely affiliated Program of Computer Graphics (PCG), one of the world’s leading computer-graphics laboratories and a dominant force in the international computer-graphics community for more than thirty years. The PCG is particularly famous for its work in realistic rendering, simulating environments that are physically accurate and perceptually indistinguishable from real-world scenes. The interests of the computer-graphics group are broadly centered on the topic of high-quality rendering. Current research thrusts focus on the interrelated topics of improving the models of light scattering that underlie realism, deepening our understanding of how human viewers perceive computer-generated images, and developing algorithms for high-quality rendering at interactive rates.
Other areas of interest include image-based modeling and texturing, architectural modeling, animation, graphics-hardware programming, and digital photography. The PCG’s state-of-the-art facility includes many tools for advanced research, including a sophisticated light-measurement laboratory with unique capabilities for directional light measurement, a large PC cluster, and a high-resolution tiled projection display.