ENGINEERING
Indiana University Moves Big Data Faster
From predicting the path of severe weather to creating drugs that combat disease, big data is critical to the discoveries that improve human life. However, the current production of digital data exceeds the ability to move it over computer networks. A new Indiana University-business collaboration is changing that dynamic.
A recent networking breakthrough from IU researchers, in collaboration with Orange Silicon Valley and DataDirect Networks, showed that data sharing can be faster and more efficient over wide area networks (WANs). The team performed the world's first demonstration of RDMA (Remote Direct Memory Access) over Converged Ethernet (RoCE) across a wide area network using the Lustre file system.
The advancement came at the recent Supercomputing 12 (SC12) conference in Salt Lake City. SC12 is one of the most important events in the field of advanced computing, attracting thousands of attendees from around the world.
RoCE, pronounced "Rocky," is a network protocol that enables remote direct memory access over an Ethernet network, a process that dramatically speeds up data transfer over networks. Usually, data must travel from the main memory of a source computer to the memory in that computer's network interface. It then travels over the network to the network interface of the recipient computer and on to that computer's main memory.
RDMA removes layers of protocol and software to transfer data from server memory to client memory in the most efficient way possible. RoCE migrates this approach from specialized networks to the widely deployed Ethernet.
"RoCE is incredibly powerful, and broadens the reach of this advanced approach from a single data center to statewide, national and global networks," said Martin Swany, IU associate professor of computer science, associate director of the Data to Insight Center of the IU Pervasive Technology Institute and director of the Indiana Center for Network Translational Research and Education (InCNTRE). "The technology behind RoCE's use of advanced networks will aid data transfer in applications as diverse as videoconferencing and the supercomputers that help us understand our world."
"This demonstration combines the speed of moving data from disk to memory that we have pioneered with IU's Data Capacitor and Professor Swany's work with network optimization," said Matthew Link, director of systems for IU's Research Technologies division and associate director for the IU Pervasive Technology Institute Center for Research in Extreme Scale Technologies. "In the past, the roadblock in moving data over long distances has been the network interface and network efficiency. RoCE solves that. RoCE and Lustre will be tremendously important in allowing us to understand the massive amounts of data all around us."
Orange Silicon Valley, a research and innovation subsidiary of telecommunications firm France Telecom-Orange, is considering RoCE over WAN for its business model. "We are always investigating the best possible mechanism for delivering maximum efficiency from our core carrier assets in a cost-effective way. Having the ability to accomplish that with an all open source stack is very interesting for us," said Christian Eychene, vice president of IT infrastructure technologies and engineering at Orange.
"This kind of university-industry collaboration is an example of the value that the Indiana University Pervasive Technology Institute creates for the United States—new technology developed by computer science and transformed into usable software," added Swany.
Read on for technical details about the RoCE over WAN demonstration
The IU demo at SC12 featured a RoCE-enabled Lustre file system constructed in Indianapolis using DDN-provided storage with four clients deployed in the IU booth. The booth featured screens that displayed multiple high-definition video files from Indianapolis.
The 40 gigabits per second (Gb/s) path used Internet2's 100 Gb/s network from Salt Lake City to Chicago, then Indiana's Monon100 network from Chicago to Indianapolis. For one client/server pair, the IU team reported peak throughput of 92% of the theoretical maximum link capacity.
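To put the reported figure in concrete terms, the short Python sketch below converts a percentage of link capacity into throughput. It assumes the 92% peak is measured against the 40 Gb/s path rate, which the article implies but does not state outright. The function names and the 1 TB (decimal) dataset are hypothetical, chosen only for illustration.

```python
def achieved_gbps(link_gbps: float, utilization: float) -> float:
    """Throughput in Gb/s for a given fraction of link capacity."""
    return link_gbps * utilization


def transfer_seconds(size_gigabytes: float, throughput_gbps: float) -> float:
    """Seconds to move size_gigabytes (decimal GB) at throughput_gbps."""
    # 1 byte = 8 bits, so gigabytes * 8 gives gigabits.
    return size_gigabytes * 8 / throughput_gbps


# Assumption: the 92% peak refers to the 40 Gb/s path described above.
rate = achieved_gbps(40.0, 0.92)
print(f"{rate:.1f} Gb/s, about {rate / 8:.1f} GB/s")
# prints "36.8 Gb/s, about 4.6 GB/s"

# Hypothetical 1 TB dataset, ignoring protocol and file system overhead.
print(f"1 TB in about {transfer_seconds(1000, rate):.0f} s")
# prints "1 TB in about 217 s"
```

Under those assumptions, a single client/server pair at peak would move roughly 4.6 gigabytes per second between Indianapolis and Salt Lake City.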
At their core, IU's advanced data management techniques use the Lustre open-source file system to improve the speed with which data can be loaded from computer disk to computer memory. IU's work with Lustre was assisted by DataDirect Networks (DDN), and continues the long association between IU and DDN in advanced data management.
"This groundbreaking demonstration is the next logical step forward for the Lustre WAN work that IU began pursuing in 2006," said Stephen Simms, manager of IU's high-performance file systems group. "Lustre WAN built on RoCE will allow users to more efficiently utilize their data across geographically distributed resources."
IU's demonstration also illustrates how software-defined networking (SDN) improves the efficiency of data transfer. "SDN allows fine-grained control of network forwarding behavior," said Swany. "Configuring the network to use protocols like RoCE currently takes a team of experts. We are working to use SDN to automate this process, and we believe this will be useful to larger groups in the future."