Nanoscientists at the Gate

By Kathleen Ricker, NCSA -- As research communities make the leap to scientific computing, at some point they all face the hurdle of user-friendliness. The engineering community is no exception, according to Gerhard Klimeck, technical director of the NCN (National Science Foundation Network for Computational Nanotechnology) and professor of electrical and computer engineering at Purdue University. "I'm an engineer -- I want to get results done, and I have plenty of examples of how supercomputing can get in the way of getting results," says Klimeck. Back in the early days of the Web, the lack of application portability got in the way of results. "The theorists were writing Unix-based applications for semiconductor device modeling," Klimeck explains. "But the experimentalists who wanted to use these tools didn't have Unix systems -- they deal with PCs."

[Image: Two simulations generated using NEMO3D.]

Thus, in the mid-1990s came the development of PUNCH (Purdue University Network Computing Hub), middleware that allowed non-Unix users to run Unix-based applications. Since that time, PUNCH has supplied the infrastructure for several research networks or "hubs," including the nanoHUB. Since 2000, the nanoHUB has provided a Web interface that makes tools and instructional materials available to experimentalists, theorists and students in a number of areas of nanoscience, a set of disciplines that examine the physics, biology and chemistry of extremely small objects. "In a practical sense, it solves a lot of problems," says Klimeck, "especially for educators at many universities who don't have the IT staff to install all these different applications on local computers. Instead, they can just manage them from their browsers."

Upgrading to the TeraGrid

Now, the nanoHUB's architecture is being redesigned. Sometime in 2006, the original PUNCH middleware will be replaced entirely by middleware based on In-VIGO, a distributed environment that provides users with their own individual, secure virtual environments in which to run applications -- all coexisting on the same physical resource. The most elegant aspect of the new and improved nanoHUB, however, is that these features are utterly invisible to the end user. This is especially important as the nanoHUB takes on its new role as one of the science gateways being created to give various research communities access to the TeraGrid's computing power. The integration of In-VIGO with Condor, and particularly with Condor-G, a task manager capable of managing thousands of jobs on a distributed grid, means that researchers running applications on the nanoHUB will be able to submit these jobs to the TeraGrid, drastically reducing the amount of time it takes to run them.

[Image: One of the tools available on the nanoHUB, Molecular Conduction (Toy), in action.]

"We want our users to be able to run on the TeraGrid without having to be geeks, writing allocation proposals and installing certificates and having multiple logins," says Klimeck. "They may not even realize they're running on the TeraGrid ... they may say, 'What's the TeraGrid? I don't care.' This is our definite end-goal: for people to see as little of what's under the hood as possible and still be able to run their stuff." Is it real, or is it In-VIGO? Formerly, says Jose Fortes, professor of computer science at the University of Florida and principal developer of In-VIGO, "applications would never be able to run in a physical machine if they expected a different environment, and users who might have conflicting requirements would not be able to share that machine. Now we have software that makes the physical machine appear as multiple virtual machines." Although two users share the same machine, one user - -a theoretical researcher, for example -- might be running his or her applications on a Unix platform, tailored to that user's specific requirements, while the other -- an experimentalist -- might be running on Windows XP. However, they would never be aware of each other's activities, because the physical machine would appear, simultaneously and separately, as multiple virtual machines. In-VIGO creates a virtual address space which assigns each virtual machine its own IP address independent of the IP address of the actual physical machine. The virtual addresses are each mapped to the physical address and translation mechanisms are established between the physical and each of the virtual addresses that rout messages to the appropriate users. "It's like renaming the address of your house for just a few friends and then having the postman know how you did the renaming," explains Fortes. The capability for multiple, separate virtual environments also has important implications for user security. 
"You can think of the virtual machines as two separate machines located in the same room -- it would be impossible for what happens to one machine to affect what happens to the other," says Fortes. Thus, the damage caused by any serious but unintended errors a user makes is limited only that user's virtual machine. Likewise, in an era of devastating viral attacks that can bring down entire large machines or networks, any malicious code inserted by a rogue user is limited only to that user's environment and does not affect other users or applications. Expanding the Grid User Community The nanoHUB's success has been overwhelming: in the past year alone, it has provided the interface for more than 65,000 job submissions launched by more than 1000 users -- 70 percent of whom are students using it for university coursework. And another 6,000 members of the nanoscience community have used the nanoHUB to access online workshops and seminars. "The NCN is defining a new model to serve the computational community and the public at large," says Sebastian Goasguen, a research scientist at ITaP (IT at Purdue) who leads the deployment effort of the nanoHUB funded through the NSF National Middleware Initiative (NMI). "Each application is packaged with all the materials researchers or students need to learn and do research on specific subjects." This packaging takes the form of portable, XML-based "learning modules" that can be downloaded via Web browser and accessed in any user environment. Ultimately, the hope is that the nanoHUB's function as a TeraGrid science gateway will provide other science and engineering communities with an idea of what's possible -- that what works for one area of science can be transferred into other areas as well. 
Says Tim Cockerill, project manager for the TeraGrid, "TeraGrid's relationship with nanoHUB is a prime example of how our Science Gateway partners utilize some of the National Science Foundation's most powerful high performance computing resources. Through Science Gateways we are able to reach out to communities of engineers, scientists, and educators who most likely would not have this opportunity otherwise."

This research is funded by the National Science Foundation, Indiana 21st Century Fund, and ARO. For further information, visit its Web site.