Gauging the Network Effect of Grids

By Carolyn Duffy Marsan, Network World -- As more companies pilot grid-based applications, the question for network executives is what effect grid computing will have on the design and performance of their infrastructures. Initially, the answer appears to be none. That's because most companies are being driven to grid computing by a desire to increase utilization of their LANs and WANs. But long term, the trend toward distributed processing of compute-intensive and data-intensive applications might result in the need for more redundant network designs, smarter middleware and high-availability network services.
"Typically, you don't need to make significant modifications to your network infrastructure to make it work with grid computing," says Peter Jeffcock, group marketing manager for grid computing at Sun. "The primary change in adding a grid is in increased utilization and the number of systems connected."
But as network utilization rises, companies might need to upgrade bandwidth or storage, Jeffcock says. "You may need to make some changes if your environment goes from running at 20% capacity to 80% capacity, because if you have a failure you'll start to lose significant functionality," he says. "You might need to rearchitect parts of the network around potential failure points."
Most companies testing grid applications today are trying to increase use of their existing networks, particularly 100M bit/sec or 1G bit/sec Ethernet LANs installed in the late 1990s. Many companies also have T-3 or higher WAN connections that were built for peak loads but experience significant excess capacity. By distributing key applications across a LAN or WAN grid, network managers can increase the speed at which processing is done or the size of the data set that is processed without having to purchase additional hardware or software.
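Jeffcock's point about running at 20% versus 80% capacity can be put in rough numbers. The sketch below is illustrative only: it assumes a set of identical, load-balanced links whose traffic is spread evenly over the survivors when one fails, which is a simplification not described in the article. The function name `utilization_after_failure` is made up for this example.

```python
def utilization_after_failure(links: int, utilization: float) -> float:
    """Per-link utilization after one of `links` identical links fails.

    Assumption (not from the article): traffic from the failed link is
    spread evenly across the remaining links.
    """
    if links < 2:
        raise ValueError("need at least two links to survive a failure")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    # Total load is utilization * links; it now lands on links - 1 survivors.
    return utilization * links / (links - 1)


# Two redundant links, at the two utilization levels Jeffcock mentions.
for before in (0.20, 0.80):
    after = utilization_after_failure(links=2, utilization=before)
    status = "fits" if after <= 1.0 else "overloaded"
    print(f"{before:.0%} before -> {after:.0%} after one failure ({status})")
```

Under these assumptions, a pair of links running at 20% absorbs a failure comfortably, while the same pair at 80% would be asked to carry more than the surviving link can handle, which is the kind of failure point a redesign would target.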
"Over the past 10 years, companies have spent an enormous amount of money buying networks and computers, and most of the time they are just space heaters," says Wade Hennessey, CTO and vice president of engineering at Kontiki, a video-transmission software company that recently introduced its first grid product. "Companies are not spending tons of money on hardware anymore. They're looking for something they can use with what they've already bought."
Most corporate grids run across intranets, while Internet-based grids are popular with the academic community. Many companies are testing grids initially on LANs in a particular location and then spreading them out across their WANs.
As companies gain more experience with grids, they are finding that they might need to redesign their networks or add new services. Mission-critical grid applications require some traffic engineering to ensure prioritization. Grid applications require higher availability and redundancy because they're distributed across a network.
"If you manage your own grid environment, you want high redundancy in your network, which can be fairly expensive across the globe," says Bernhard Borges, senior technologist with IBM Business Consulting Services. "You need to be very cognizant about the network. Rather than looking at it as a transport mechanism, you have to look at it as part of the operating environment. . . . With e-mail, you can apologize if something doesn't get sent. With a grid application, you can't afford to do that."
Borges expects more features to be built into network hardware to accommodate grid applications. He says that routers will need additional smarts to understand the context of data being handled and how it fits into the grid. He also expects network administration and management software to require additional features for grid computing. "Grids will require more administration than your average network," he says.
Grid computing also is likely to influence the future direction of the Internet, experts say. The academic organizations involved in the nation's largest grid project - the National Science Foundation's multiyear, multimillion-dollar TeraGrid project - already can see some effect on the backbone network, which operates at 10G bit/sec.
"Grid computing represents the most leading-edge applications that we have over the backbone," says Steve Corbato, director of backbone network infrastructure for Internet2 and a collaborator on the TeraGrid. For grids, you "clearly need bandwidth, and you need a well-engineered local network environment. One of the things our community has discovered is that most performance problems happen at the edge," he says. Corbato says TeraGrid researchers need at least gigabit-speed LANs to take advantage of the high-performance WAN connections available on the project.
High-performance grids also require some tweaks to Internet protocols. Corbato says TeraGrid researchers need larger packet sizes to sustain high TCP loads. By doubling the size of the packets that the TeraGrid network can handle, researchers have found they can significantly reduce the number of packets that need to be sent, thereby increasing network performance. "Large [packet sizes] reduce the load on the network interface card and improve the ability to drive high-performance loads across the backbone," Corbato says.
One challenge with TeraGrid is scheduling the network resource, says Wesley Kaplow, CTO for Qwest's Government Services division. Qwest is providing four 10G channels to the TeraGrid project, which links government-funded high-performance computer centers in Illinois and California.
"When you only have four major sites to start with, scheduling is not that big of a deal," Kaplow says. "But as you increase the number of sites, you basically get back to the same kind of best-effort network we have on the Internet today. . . . Part of the issue with the TeraGrid is to understand the computer-to-network connectivity issue and to do resource scheduling and assignment."
Kaplow says the key to solving the network scheduling challenge is the development of next-generation middleware, which is being tackled under a separate NSF-funded initiative. "The middleware must understand the network topology and components, including optical switches and routers," Kaplow says. "The challenge is to go from the Internet model of today, with thousands of small flows that aggregate up to large traffic, to systems with relatively few flows that [can take full advantage] of the characteristics of the channels."
Experts say that if grid computing achieves its promise, it could have a profound effect on the way that public and private networks are designed and operated. However, these changes will take place over the next decade.
"I don't think people realize how pervasive [grid computing] is going to be and how big a shift it is going to be in how people get e-mail, how files are backed up and the level of communications on each machine," Hennessey says. "I really do think it's going to be the biggest shift we see over the next eight years or so."
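As a back-of-the-envelope illustration of Corbato's point about packet sizes, the sketch below counts how many packets a fixed transfer needs at two packet sizes. The article does not give the actual sizes used on the TeraGrid, so the 1,500-byte and 3,000-byte figures, the 1G-byte transfer and the 40-byte header allowance (a minimal IPv4 header plus a minimal TCP header, no options) are all assumptions chosen for the example, as is the function name `packets_needed`.

```python
# Assumption: 20-byte IPv4 header + 20-byte TCP header, no options.
HEADER_BYTES = 40


def packets_needed(data_bytes: int, packet_bytes: int) -> int:
    """Number of packets required to carry `data_bytes` of payload."""
    payload = packet_bytes - HEADER_BYTES
    if payload <= 0:
        raise ValueError("packet must be larger than its headers")
    # Integer ceiling division: a partial final packet still counts as one.
    return (data_bytes + payload - 1) // payload


transfer = 1_000_000_000  # hypothetical 1G-byte data set
for size in (1500, 3000):  # hypothetical "before" and "doubled" packet sizes
    print(f"{size}-byte packets: {packets_needed(transfer, size):,} needed")
```

Because the header cost is paid once per packet, doubling the packet size cuts the packet count by slightly more than half in this example, and each packet the network interface card does not have to process is load taken off the end system.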