Grid Computing: An Interview with NERSC’s William Johnston

By Steve Fisher, Editor in Chief – Grid computing will play a significant role at SC2001, as it already does in high performance computing generally. As a lead-in to next week's show, and simply to learn more, Supercomputing Online spoke with NERSC grid expert William Johnston. To hear more of Johnston's insights, be sure to attend his presentation at the NERSC booth Wednesday at 10:45 a.m.

Supercomputing: Please tell the readers a little bit about your computational and data grid work at NERSC.

JOHNSTON: We became interested, probably ten years ago, in how you manage lots of data on high-speed networks. Much of the motivation was the idea of bringing major instrumentation systems online so that we could connect them directly to our high performance computing and storage systems. We were involved in almost all of the early network testbeds, and we worked with Pac Bell to build the first OC-3 network in the Bay Area. That was actually a partnership with Kaiser Permanente in which we brought one of their cardio-angiography systems online over in San Francisco. That work forms the background for much of the grid work we've done, and in fact for grid technology in general, I would say.

There are two flavors of NERSC here at the lab. One is the NERSC division, which houses essentially three departments: the NERSC supercomputer center, the computational science department, and the distributed computing department, which is the one I run. As part of DOE's new SciDAC program, we've organized essentially five institutions around building a DOE Science Grid: LBL, PNNL, Argonne, Oak Ridge, and NERSC, the supercomputer center. Our focus there is on how you build virtual organizations to support the very large scientific collaborations that characterize DOE's Office of Science work; how you deal with the massive data sets that come out of the kinds of instruments DOE operates; and how you bring those instruments and the supercomputers online in such a way that the instruments can be coupled directly to the supercomputers. Some of the real projects around that are things like PPDG, the Particle Physics Data Grid, some of whose work the Science Grid will be supporting; the Earth System Grid, where the climate people have to deal with lots and lots of data; and one of our benchmark applications, the Supernova Cosmology Project here at LBL, which is a very complicated worldwide workflow management problem. Those are the kinds of things we're focusing on in the DOE Science Grid and in NERSC's involvement in grids.

Supercomputing: I understand you've been working with NASA on some of their grid computing efforts. Would you mind telling us a little about that?

JOHNSTON: Sure. I've been a detailee to NASA Ames for three years now, roughly half time, as the project manager for their Information Power Grid project, which is the NASA grid project. From the beginning, that project has been focused on a new service delivery mechanism for supercomputer services. In particular, if there's a single type of application that motivates NASA here, it's what I call multidisciplinary simulations.
These are essentially collections of computational simulations that are coupled together to simulate whole systems. These kinds of problems tend to come up more in the engineering environment than in the scientific environment. There's a big aviation safety program in which they're trying to do detailed computational simulations of whole passenger aircraft, that is, of all the systems of an operating aircraft. They're also looking at simulations aimed at the safety of the next-generation space shuttle, and so on. So much of the work at NASA is focused on how we couple computing systems together, and how we couple the existing simulations into larger-scale design systems that are really computationally driven.

Supercomputing: This is one of those "futurist" questions that I'm sure some people hate, and I apologize, but where do you see grid computing in five years, or ten?

JOHNSTON: I can say where I see aspects of it, but I'm not going to try to say where I see the whole ball of wax in grid computing. I think that in five years grid computing will be dominated by an area which is just emerging now, called "grid Web services." Grid Web services look at what the commercial sector is doing in Web services and say: we need to make grid services, and the kinds of application functions that would be useful for scientific simulations (things like visualization, interface builders, collaboration tools, numerical grid generators, and so on), available as Web services. Being a Web service means a service probably has a SOAP interface for access. It probably also has auxiliary I/O mechanisms to handle the fact that most of our scientific applications have data flow requirements far beyond what SOAP was ever intended to do, though there are people working on that. And the services are described in WSDL, the Web Services Description Language, which gives a complete description of each service and its interface. If you have the WSDL for all these different services, you have the possibility of building higher-level portals that compose the services into complete systems. If you're familiar with things like AVS, the graphical network-based visualization system where you pick up boxes and string them together (you can string them together because there are certain constraints on what sort of data they can pass back and forth), this is a fairly dramatic generalization of that idea. It's the sort of thing IBM is doing with WebSphere, Microsoft is doing with .NET, and Sun is doing as well. I really see the grid services moving into that arena. The reason I say that is all the tools for building the user interfaces and the portals, and by portals I mean discipline-specific science frameworks: collections of tools, data, and so forth with a structure imposed on them that makes them useful to scientists in a particular discipline. Under the assumption that these services are packaged up as Web services, I think we'll see lots of commercial tools that make building those kinds of things much easier than it is today.
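To make the pattern concrete, here is a minimal sketch of a client driving a WSDL-described service over SOAP. The service URL and the RenderIsosurface operation are hypothetical, and zeep is a present-day Python SOAP client rather than anything used in these projects; the point is only that the WSDL alone is enough for a generic client, or a portal builder, to discover and invoke the interface.

```python
# Minimal sketch of the "grid Web services" pattern: a WSDL document fully
# describes a service, so a generic client can bind to it and call it.
# The endpoint and operation below are hypothetical; zeep is a modern
# Python SOAP client, used here purely for illustration.
from zeep import Client

# Fetching the WSDL builds Python bindings from the interface description.
client = Client("https://grid.example.org/services/viz?wsdl")

# Dump the discovered interface: every operation, message, and type.
# This machine-readable completeness is what makes portal-style
# composition of services possible.
client.wsdl.dump()

# Invoke a (hypothetical) operation exactly as the description advertises.
result = client.service.RenderIsosurface(
    datasetURI="gsiftp://datahost.example.org/run42.h5",  # bulk data moves out of band
    level=0.5,
)
print(result)
```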
In addition to description there is also discovery; the other buzzword is UDDI, which amounts to a directory service for finding the service descriptions, and those descriptions in turn tell you exactly what each service is and how to use it. In some sense it's objects for the Web. So that's where I think we'll be going in five years.

What I think we'll see in ten years is something I've seen termed the knowledge grid. I'm indebted to some of my colleagues in the systems and informatics institute of Italy's CNR for that term. What they mean is this: once you have portals that, like AVS, let you bring analysis functions together to perform a specific task, the assumption is still that you know exactly how to string those functions together for the particular question you want answered, and that you have to do it explicitly each time. But if you talk to scientists who really derive science from data analysis, they say: we know the information we're collecting gives us velocity, mass, position, and time, and we have all the tools to analyze that sort of thing, but we want to be able to walk up to a system that has those components and simply ask the question. This particular example was provided by Stu Loken, a physicist who ran the Computing Sciences division here for a long time. He said, 'I want to be able to walk up to a system that has those capabilities and ask it to show me all of the particles that were emitted in a jet of a particular orientation, and all those particles that have a particular momentum.'

You're not asking for anything magic there. What you're saying is: provide me a set of derived quantities, which I will describe in standard terms, a jet being a cone of particles of a certain momentum and a certain angular orientation, and so on. All the tools to derive those quantities exist underneath; I want the system to take my description of what I want to see and assemble those tools automatically. That is, parse my description and recognize that to get momentum we need mass and velocity; we have the mass and the velocity, and we know how to get momentum from them, so the pieces that automatically give us momentum can be put together; then filters can be generated automatically that restrict the results to a cone of a certain angle, and so on. I think that's the next big step. This is exactly what the CNR folks mean by the knowledge grid: the mechanisms for describing and parsing such a query and turning it into the relationships needed to connect existing computational components so they produce the answer. Because now you're providing a service to the scientists at what is called the appropriate level of abstraction; they ask their question in terms of the science they're interested in doing. I really think that ten years from now, building systems like that is the sort of thing we'll be focused on.
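The essence of that composition step can be shown in a few lines. This toy sketch is entirely hypothetical, not drawn from any of the projects mentioned; it resolves a requested derived quantity by recursively chaining the analysis rules that produce it from measured values, which is the automatic assembly Johnston describes, reduced to its skeleton.

```python
# Toy illustration of the "knowledge grid" idea: given a registry of
# derivations, resolve a requested quantity by automatically chaining
# the components that produce it. All names here are hypothetical.

# Each derived quantity maps to (inputs it needs, function that computes it).
DERIVATIONS = {
    "momentum": (("mass", "velocity"), lambda m, v: m * v),
    "kinetic_energy": (("mass", "velocity"), lambda m, v: 0.5 * m * v ** 2),
}

def resolve(quantity, measured):
    """Return `quantity`, deriving it recursively from measured values."""
    if quantity in measured:                 # directly measured: done
        return measured[quantity]
    inputs, fn = DERIVATIONS[quantity]       # otherwise look up a rule
    return fn(*(resolve(q, measured) for q in inputs))

# The instrument gives us mass and velocity; the system assembles the
# momentum pipeline on its own, rather than the scientist wiring it up.
print(resolve("momentum", {"mass": 2.0, "velocity": 3.5}))   # -> 7.0
```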
Supercomputing: What do you folks have planned for SC2001 next week?

JOHNSTON: Well, actually we have a collection of demos. The one that will probably get the most attention is the RAGE demo. That's the…

Supercomputing: Ahh… that's the robot.

JOHNSTON: Yes, that's the robot. Essentially it's a mobile Access Grid node, and it incorporates a lot of the instrument management and collaboratory work that we've done over the years under Mary Anne Scott's, I think, very wisely directed funding from the MICS office. She's also the program manager now funding the grids work in the MICS office; it's still called the collaboratories program. We took a lot of the work developed in her program over the last few years, much of which had to do with secure access to robotically controlled cameras and things like that, combined it with the Access Grid node technology, added a lot of native enthusiasm on the part of the folks here for building a robot, and turned it into RAGE. It will be fun to see what happens there.

On a very practical, nitty-gritty level, we'll be doing initial demonstrations of a self-configuring network monitoring project that we've got. Keith Jackson, Jason Lee, Deb Agarwal, and I sat down with Vern Paxson and posed a question to ourselves. The problem with network monitoring is that it's extremely administratively intensive: if you want to monitor a whole bunch of intermediate nodes for a particular end-to-end traffic flow, it's hard to go in and set those things up, for lots of reasons. You have to touch a lot of different boxes, you have to know what the active transit paths are, and so on. So we asked, 'How can we do this automatically?' That is, how can we set up a monitoring system that is self-configuring, one that gives you a mechanism for automatically turning on the monitors you want along all the components of the path where your data flow is going to occur? And we came up with a way of doing that, a very clever way actually. Vern wrote the Bro system, an intrusion detection system that is used here at the lab, and we're going to use bits and pieces of it to detect packets that, instead of triggering an intrusion detection alert, trigger the turn-on of the monitors we want turned on. So I think that's a very interesting project.
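The control flow behind that trigger can be sketched in a few lines. Everything here, the hostnames, the path table, and the helper functions, is hypothetical and stands in for the Bro-based machinery; the sketch only shows the inversion Johnston describes, where a matched packet configures monitoring along the flow's path instead of raising an alert.

```python
# Hypothetical sketch of the self-configuring monitoring idea. None of
# these hosts or helpers come from the actual Bro-based system; they just
# make the control flow concrete.

PATHS = {  # which intermediate nodes carry a given end-to-end flow
    ("data.lbl.example", "cs.indiana.example"): [
        "rtr-a.lbl.example", "snv.esnet.example",
        "chi.esnet.example", "gw.indiana.example",
    ],
}

active_monitors = set()

def activate_monitor(node, flow):
    """Stand-in for remotely starting a monitor daemon on `node`."""
    active_monitors.add((node, flow))
    print(f"monitoring {flow[0]} -> {flow[1]} at {node}")

def on_packet(src, dst):
    """Called for each packet the detector matches (the role the Bro
    pieces play in the demo). Instead of raising an intrusion alert,
    configure monitors along the flow's path."""
    flow = (src, dst)
    for node in PATHS.get(flow, []):
        if (node, flow) not in active_monitors:
            activate_monitor(node, flow)

# The first matched packet of the flow self-configures monitoring
# along the entire path; later packets find the monitors already on.
on_packet("data.lbl.example", "cs.indiana.example")
```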
The data handling in that project will be pulled together with our work on the grid monitoring architecture, which is about how you deal with the monitors and the data that comes out of them.

We've also got several grid projects that will be demonstrated. The DOE Science Grid will show up in four or five different booths around the floor. The Fusion Collaboratory, one of the applications we're working on with the DOE Science Grid, is a collaboratory for the controlled thermonuclear fusion community. They have both computational and security requirements, together with visualization, so some of those things will be demonstrated in that particular demo. There will also be demonstrations of our grid workflow management work for the Supernova Cosmology Project.

And moving toward the five-year goal of the grid Web services I was talking about, Keith Jackson, Jason Novotny, and some folks in Dennis Gannon's group at Indiana have been working very hard on wrapping some of the existing grid services in Python and then using SOAP, combined with the Grid Security Infrastructure, to get remote access via the standard Web services access mechanisms. So we will be demonstrating SOAP over GSI between the Python-wrapped grid services here at LBL and the Java-wrapped services at Indiana University. I think that's pretty much a first, getting SOAP to play with GSI, the Grid Security Infrastructure. It's a first step toward building the grid Web services. I think that's probably about it.

Supercomputing: You're giving a presentation too, aren't you?

JOHNSTON: I'm giving one of the booth presentations, yes. It's an overview of the DOE Science Grid, the kinds of things we hope to be doing, and the technology and structure of the Science Grid.

----------

Supercomputing Online wishes to thank Bill Johnston for his time and insights.

----------