Cornell Offers Data Analysis on Ranger and Spur Workshop

On October 12-13, members of the Cornell Center for Advanced Computing (CAC) will present a National Science Foundation-sponsored training workshop, “Data Analysis on Ranger and Spur,” on the campus of Cornell University. There is no fee to attend this workshop.

The two-day workshop, which combines lectures, labs, and discussion, will focus on large-scale data computation and analysis on Ranger, the 3,936-node Linux Sun cluster with a peak performance of 579.4 teraflops located at the Texas Advanced Computing Center. The concepts covered will readily transfer to other high-performance computing (HPC) platforms.

The first day of the workshop will introduce attendees to the HPC environment with an emphasis on data transfer, movement, and storage. Topics will include database formats; data analysis with The MathWorks MATLAB, Python, and R; and MapReduce with Hadoop. Second-day sessions will focus on scientific visualization and code improvement. Topics will include ParaView, the open-source data analysis and visualization application; optimization techniques; and computational steering. The workshop will conclude with a presentation on scientific workflows and provenance. Cornell CAC staff members will be available during the workshop to help attendees with code porting or development.

To register for the workshop, please visit http://portal.teragrid.org/training. Questions regarding the event may be directed to “Cornell CAC Help” at help@cac.cornell.edu.

The Ranger supercomputer is funded through the NSF Office of Cyberinfrastructure “Path to Petascale” program. The system is a collaboration among the Texas Advanced Computing Center (TACC), The University of Texas at Austin’s Institute for Computational Engineering and Sciences (ICES), Sun Microsystems, Advanced Micro Devices, Arizona State University, and Cornell University.

Ranger is a key resource of the NSF TeraGrid (www.teragrid.org), a nationwide network of academic HPC centers sponsored by the NSF Office of Cyberinfrastructure that gives scientists and researchers access to large-scale computing power and resources. TeraGrid is a partnership of people, resources, and services that enables discovery in U.S. science and engineering by providing researchers with access to large-scale computing, networking, data-analysis, and visualization resources and expertise.

Spur is a Sun Visualization Cluster with 128 compute cores, 1 TB of aggregate memory, and 32 GPUs. Spur shares the InfiniBand interconnect and Lustre parallel file system of Ranger. Thus, Spur not only acts as a powerful stand-alone visualization system but also enables researchers to perform visualization tasks on Ranger-produced data without migrating to another file system.

The Cornell Center for Advanced Computing (CAC) is a leader in high-performance computing system, application, and data solutions that enable research success. CAC (www.cac.cornell.edu) is funded by Cornell University, the NSF, and other leading public agencies, foundations, and corporations.