NERSC’s Simon Examines the Developments of the Last Two Weeks

By Steve Fisher, Editor in Chief

To see what was new and exciting at NERSC, Supercomputing Online talked to Director Horst Simon. Our conversation ranged from DTF/Teragrid and SciDAC to computing power at NERSC and a new perspective on Moore's Law.

Supercomputing: Concurrently with this week's SciDAC funding announcements, NERSC received DOE funding for a number of other new projects. Would you tell our readers a little about those projects?

SIMON: Well, maybe I should talk about both of these together, both the SciDAC projects and the other ones, because they can be seen in combination. All in all, what is happening is that DOE's Office of Science has finally gotten a very big increase, and DOE is using this increase, through SciDAC, to really push one of its strengths, that is, the focus on high-end simulation using supercomputers and supporting the traditional DOE applications. So the focus is really on the supercomputing applications that were all there a long time ago, plus some new ones. I think it's very important to keep that direction in mind, because as much as NSF has done great research, actual application support was not really a primary focus of ITR. So I think it's really good for DOE and SciDAC to complement what NSF has done in the last couple of years with ITR and focus on application research.

So there are a couple of projects which focus directly on the applications. There's another set of projects, the so-called Integrated Software Infrastructure Centers, or ISICs, which will develop the tools to improve the performance of applications or enable terascale applications. A lot of what principal investigators, DOE labs, and researchers proposed to SciDAC turned out to be still fairly basic research, more on the research side than on the advanced development and deployment side that is SciDAC's focus. Those grants and projects got funded anyway because they were good ideas that could be funded out of a base program. So that explains the whole picture.

The big focus is on the SciDAC projects, because that's where the majority of the funding is going. What gets funded out of a base program falls into two categories. One is the benchmarking project, which I'm pretty excited about because it's an effort to introduce some concepts that make benchmarking more scientific. As you know, there's a debate about benchmarking that has been going on for ten years in the supercomputing world, and this is another attempt to make progress in that area. The other set of proposals funded out of a base program are all tools, technologies, and research in grid computing. So that's my big overview. I've spent more time on the smaller ones, but the really important point is that DOE is working on getting the terascale applications ready to run on the big platforms we will have in the next couple of years.

Supercomputing: Is NERSC going to be participating in the six Lawrence Berkeley-led SciDAC projects in addition to the concurrently announced DOE projects?

SIMON: NERSC, the center, is definitely going to be involved in SciDAC … because within the SciDAC program we are labeled the flagship facility. That means DOE offers its scientists one large facility, which is NERSC, and several, you might say, midrange or experimental facilities. For example, Oak Ridge is labeled an advanced computing research facility, and so is Argonne.
But in terms of the actual production workload, NERSC will carry the majority of the SciDAC applications. In fact, and this is not really well known yet, and it is independent of SciDAC (not part of SciDAC funding), it just happens to come at the same time: we upgraded our current machine with additional nodes, so we will have a five-teraflop machine. So by October 1st our capability will actually increase by a considerable amount, which means that compared to last year, fiscal '01, fiscal '02 will have a five-fold increase in available computer cycles. At this point we are clearly the leading supercomputing center with respect to what's available in cycles. SciDAC applications will get access to those additional cycles, so we're ready to handle a large number of new applications.

Supercomputing: I know you have a big workshop coming up in a month or so on advanced computational testing and simulation. Please tell the readers about the workshop. A bit of general information, who should plan on being there, etc.

SIMON: This workshop is actually the second in a series on the Advanced Computational Tools, or ACTS, project. This is a set of tools that was developed over the last decade. If you remember, back under HPCC funding there was a lot of investment made in software and in the productivity of scientists on supercomputers. In the DOE world there were on the order of a dozen different tools, ranging from numerical libraries to visualization tools to performance monitoring and program development environments. They were put into this toolkit. One of the problems government-funded research and university research generally have in high performance computing is that it's just that: research. Not much effort has ever really been put into actual deployment and long-term support, and so two years ago DOE decided to fund us for a project that would take some of the tools developed elsewhere and provide more of the deployment and long-term maintenance and support funding.

If you remember, in the old days of supercomputing there was the model that, 'well, we write ( ) and it's so useful that eventually some commercial company may pick it up or it will be generally supported.' I think we've now come to the realization that a lot of these tools are very specific to the high end and will never have mass-market commercial appeal, and therefore it is the responsibility of the sponsors and the agencies that fund high performance computing to continue supporting tools even after they have moved past the research stage and into the deployment stage. That's what the ACTS toolkit project is about.

So you could think of the workshop as the kind of training program a software vendor would put on: 'Here are the things we have in our offering, this is how you use them, and we'd like you to try them out and see if they can help you be more productive using terascale machines.' That's what the workshop is about, and it's really aimed at users and students who are actually working on the codes, to give them the opportunity to familiarize themselves with the tools. We have student fellowships, so somebody who needs travel support can have things like that taken care of. Basically, everybody is encouraged to apply.

Supercomputing: With the NSF Teragrid/DTF and the SciDAC announcements it's been a busy week or so for HPC. What do you think of the DTF project?

SIMON: I think this is a great and wonderful project.
When we talked about different roles and I mentioned SciDAC in the beginning, I said SciDAC is really focusing on the high-end applications, on scientific supercomputing; I think this is sort of the other end of the spectrum. This is really pushing some of the computer science issues forward. This is building a facility in a real-life setting, so that we as a community can look at what it really means to run a distributed terascale computing application, where the data may sit at one site and the teraflop machine is at a different site. So I think it's really wonderful that NSF is willing to invest a significant amount of money in what I would fairly call experimental technology and say, 'Let's try to build it and see what issues we can learn from.' … I'm very much looking forward to seeing how successful this will be, and I hope that as a community we can all learn from what NSF is doing.

I'd like to make one other point. I think we've seen two great projects come to reality in the last couple of weeks: SciDAC, with a very decided application focus, and the terascale facility, really focusing on some of the computer science issues, the networking and the grid issues. What's really missing in the federal picture of high performance computing funding is a vigorous new architecture and systems research activity. The fact is that we're all computing on big web servers and systems that have a strong commercial base, and even PC clusters are really leveraged from commercial technology. I think there's an increasing recognition in the high performance computing community that we need to go back and reinvigorate architecture research, systems research, and even basic technology, to make sure that two generations in the future, by 2007 or by 2010, there will really be high performance computing platforms, supercomputing platforms, that keep up the big growth track in performance on terascale systems. In other words, to get to the petaflop by the end of the decade, it is a little bit risky to assume that this will happen simply by building the same type of machine, with the same type of architecture, processors, and SMPs, and that we don't need to do anything. I'm very skeptical that this is the architectural approach that can get us to the petaflop. The concerns are that the systems get physically larger and much more complex in terms of the systems software, and even if Moore's Law holds we may reach some complexity limits, some physical limits, with these types of systems. So I think now is the time to go back and also invest significantly, at the same level, in base technology, systems, and architecture research.

Supercomputing: You mention Moore's Law there, and although I say it as a total layman, I absolutely agree with you that it will hit a physical limit.

SIMON: It will hit a physical limit somewhere, but there's another corollary which hasn't really been discussed as much as the physical limit, and that is the economic limit of Moore's Law.
So right now the reinforcing cycle works this way: semiconductor manufacturers feel compelled to produce higher and higher performance processors and technology on this cycle of performance doubling every eighteen months because the market was pushing; there was always more desktop software and more gaming software that needed more and more performance. But what we see now, in 2001, is that the demand for computer systems is no longer there; the basic economic push for buying more technology, more PCs, and more servers is not there anymore. As long as this economic driver is slackening and no new economic driver is pushing Intel and the other semiconductor manufacturers to produce faster and bigger chips, I think there is the potential of an economic slowdown in Moore's Law. To put it to the point, and I hate to be quoted on something like this in five years, but do we really need a 10 GHz processor to run Excel? That's really what it is. So if we don't need a 10 GHz processor to run Excel, if we reach a point where processing power has outrun the economic drivers that have been pushing the computing market for the last decade, then the chip manufacturers don't have a big incentive to innovate anymore.

--------

Supercomputing Online wishes to thank Director Simon for his time and insights. It would also like to thank NERSC's Jon Bashor and Zaida McCunney for their assistance.
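--------

Editor's note: Simon's skepticism about reaching a petaflop on processor improvements alone can be illustrated with a rough back-of-the-envelope sketch. The short Python snippet below is our own illustration, not anything from NERSC; it simply takes the figures mentioned in the interview (a roughly 5 teraflop/s system in late 2001 and performance doubling every eighteen months) and extrapolates to 2010. The function name and the exact years are assumptions made for the example.

    # Back-of-the-envelope extrapolation (illustrative assumptions: ~5 TF/s in 2001,
    # capability doubling once every 18 months, target year 2010).
    def projected_tflops(base_tflops, start_year, end_year, doubling_months=18):
        """Projected peak performance if capability doubles every `doubling_months`."""
        months = (end_year - start_year) * 12
        return base_tflops * 2 ** (months / doubling_months)

    if __name__ == "__main__":
        projection = projected_tflops(5.0, 2001, 2010)
        print(f"Projected capability by 2010: ~{projection:.0f} TF/s")      # ~320 TF/s
        print(f"Gap to a petaflop (1000 TF/s): ~{1000 / projection:.1f}x")  # ~3x short

Under these assumptions, doubling alone lands at roughly 320 TF/s by 2010, about a factor of three short of a petaflop, which is consistent with Simon's argument that renewed architecture and systems research, not just faster commodity parts, would be needed to close the gap.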