Cornell Offers Data Analysis Training Class

On December 8-9, 2010, members of the Cornell University Center for Advanced Computing (CAC) will present a National Science Foundation-sponsored training class, "Data Analysis on Ranger," on the campus of Cornell University in Ithaca, NY.

The two-day workshop will focus on data analysis on Ranger, the 3,936-node Sun Linux cluster with a peak performance of 579 teraflops, located at the Texas Advanced Computing Center. The concepts covered will readily transfer to other high-performance computing platforms. There is no fee for this workshop.

The class will include lectures, labs, and discussions on:

  • Data formats, transfer, movement, and storage
  • Data analysis with R, Python, and MATLAB
  • MapReduce with Hadoop
  • Visualization
  • Optimization
  • Scientific workflows and provenance

The Cornell University Center for Advanced Computing (CAC) receives support from Cornell University, the National Science Foundation, and other leading public agencies, foundations, and corporations.

The Ranger supercomputer is funded through the NSF Office of Cyberinfrastructure "Path to Petascale" program. The system is a collaboration among the Texas Advanced Computing Center (TACC), The University of Texas at Austin's Institute for Computational Engineering and Sciences (ICES), Sun Microsystems, Advanced Micro Devices, Arizona State University, and Cornell University.

Ranger is a key resource of the NSF TeraGrid, a partnership of people, resources, and services that enables discovery in U.S. science and engineering by providing researchers with access to large-scale computing, networking, data-analysis, and visualization resources and expertise.