Supercomputing Texas Style: An Interview with Jay Boisseau

By Steve Fisher, Editor In Chief

For six months now former SDSC Assistant Director for Scientific Computing Jay Boisseau has been the Director of the Texas Advanced Computing Center (TACC) at the University of Texas in Austin. Supercomputing Online spoke to Dr. Boisseau this week to discuss how things have been going in the Lone Star State, and also to discuss TACC’s recent acquisition of three HPC systems from IBM.

Supercomputing: As a little background, please tell the readers a bit about the Texas Advanced Computing Center.

BOISSEAU: TACC has been operational (by various names) for almost 16 years. It was originally a unit of the University of Texas System, but for the past 8 years has been affiliated with the main campus at UT-Austin. TACC has many longtime users at UT-Austin and has played an important role in enhancing the research programs of both individual researchers and of major research centers, such as the UT Center for Space Research and the Texas Institute for Computational & Applied Mathematics (TICAM).

TACC has also been an NPACI resource partner since the beginning of the PACI program in 1997. TACC has provided more cycles to the national academic user community than any other PACI mid-range resource partner. We currently provide a large number of cycles on two Cray T3Es, a Cray SV1 and an IBM SP system to national users as well as researchers at UT.

About two years ago, an external review committee made several recommendations for enhancing TACC in order to provide more powerful capabilities for the UT community. There have been several major steps forward since then, including:

- TACC was assigned directly to the Vice President for Research of the University of Texas at Austin, Dr. Juan M. Sanchez.
- The world-class visualization facility in the new Applied Computational Engineering & Sciences (ACES) building--both of which were donated by the O'Donnell Foundation, a tremendous supporter of computational research at UT--was transferred to TACC and new visualization staff positions were created.
- Additional new positions for HPC and Grid software development were created for TACC.
- The 272-processor Cray T3E at SDSC was transferred to TACC under an NPACI arrangement. This system is a production workhorse and is used by many nationally prominent researchers.
- IBM and TACC signed an agreement to install three new HPC systems at TACC. This agreement includes systems based on both IBM's POWER4 processor and Intel's new 64-bit Itanium processor.

Thus, my first six months as the new director of TACC have been extremely busy! In this time, we have: hired 8 new staff with a ninth on the way; transferred and installed one HPC system and installed two new systems with another on the way; assumed full oversight and responsibility of a visualization facility and integrated it into TACC's infrastructure; and instituted new research and development activities in HPC, SciVis, and Grid computing. We are already achieving successes in our R&D activities, including receiving new funding from NSF and DoD, and we are very excited about future plans for both R&D activities and for new resources. We intend to make TACC the leading university-scale advanced computing center in the country, while continuing and expanding our role in the national community as an NPACI partner and through other federal programs.

Supercomputing: In technical detail, and beginning with the Regatta system, please describe each of the new systems acquired from IBM.

BOISSEAU: We are purchasing four Regatta-HPC eServers. Each system has 16 POWER4 1.3GHz processors and 16GB of memory (which we may increase in 2002).
Each system will have 144GB of local disk, and the four nodes will share 1TB of additional disk. We are actually the first US institution to purchase the Regatta-HPC configuration of these eServers. This configuration has only 16 processors instead of 32--there is only one active processor per processor card instead of two. This means there is less sharing of caches and of memory bandwidth. We are very excited about this configuration because of the higher cache-per-processor and the increased bandwidth-per-processor it will provide throughout the memory hierarchy. Coupled with the POWER4 processor, this bandwidth will facilitate leading-edge computations that might be bandwidth-starved on other systems. When these systems are coupled in 2Q02 into a single system, the aggregate system will have a third of a teraflop of theoretical peak performance, with enough bandwidth to sustain a reasonable fraction of that peak. If all goes as planned, which we expect, we hope to increase the size of this Regatta-HPC cluster as well.

In addition to being one of the earliest customers of IBM's newest 64-bit microprocessor, we have also just installed a cluster based on Intel's new 64-bit processor. Our Itanium cluster has 40 800MHz processors, 80GB of memory, and 1.6TB of disk. The dual-processor Itanium nodes will be connected with a Myrinet 2000 interconnect and the cluster will run the NPACI Rocks cluster kit, which is developed by our partners at SDSC.

The third new system, also just installed, is a 64-processor cluster based on Pentium III 1GHz processors. This cluster has 32GB of memory and 1.4TB of distributed local disk. We are installing a high-performance parallel I/O file system--IBM's GPFS--with 0.75TB on this system to facilitate I/O-intensive applications. This cluster will also run NPACI Rocks and will also use a Myrinet 2000 interconnect.

Supercomputing: Please tell us about how TACC will be using each of the systems.
What types of research and other functions will you focus on?

BOISSEAU: The Regatta-HPC systems will be allocated 50% to the UT research community and 50% to the NPACI community. We believe the interest in these systems will be tremendous due to the excellent bandwidth of these systems relative to other microprocessor-based systems. Memory bandwidth is often the main bottleneck in the performance of scientific and engineering applications, so we expect requests for cycles on these systems from researchers in a wide range of technical fields. When the systems are integrated into a larger system in 2Q02, we expect demand to increase further. Additionally, TACC is just across the street from IBM's POWER4 division in Austin. We believe this will facilitate many interesting partnership opportunities between TACC and IBM staff in performance modeling and applications optimization.

The Itanium cluster will be allocated primarily to a handful of TICAM researchers who helped prepare the proposal for funding for this system. These researchers will be exploring problems in computational engineering, especially in computational fluid dynamics, computational microelectronics, and oil reservoir and subsurface modeling. TACC staff will also use this system to help develop Grid software for two Grid activities: the NSF TeraGrid (since TACC is an NPACI partner) and a new Texas-based Grid effort that is just being initiated by UT in collaboration with Texas A&M University, Texas Tech University, Rice University, and the University of Houston.

The Pentium III cluster will serve three roles: production computing; a source for accounts for training and education; and a model for campus researchers who wish to set up their own local cluster. The last of these functions may impact the largest number of UT researchers over time. Many UT researchers have set up clusters or plan to, but most find the effort required to configure and operate them quite challenging and distracting.
Using NPACI Rocks and vendor-supported hardware, we will provide a robust cluster that researchers can 'copy' and that enables us to provide them with answers to questions (if they use our model).

Supercomputing: I understand that when operational the new Regatta system will be the most powerful academic computing system in the state of Texas. Congratulations. Can you tell us what this new status means to the folks at TACC? How about you personally?

BOISSEAU: Well, as everyone knows, 'bigger is better' in Texas! Seriously, it will be very important to the research community at the University of Texas at Austin to have access to more powerful HPC systems. These researchers already have access to one of the most powerful visualization laboratories in the country through TACC, but larger simulation engines are needed for many leading-edge research activities in science and engineering. Since almost every graduate science and engineering department at UT is ranked in the top 10 in the country, and since every field of science and engineering depends on computers more than ever, the need is obvious.

TACC staff are excited to be working towards filling this need to ensure that UT researchers have the resources--and expertise--available locally to enable breakthrough science. Many of the staff have been here for most of TACC's history, and this new growth in TACC's staff and resources is particularly exciting for them.

Personally, I am thrilled to be part of this growth in TACC's resources and activities and honored to be chosen by the largest university in the country to lead such an important activity. I want UT to be world-class in advanced computing just as it is in so many fields of science and engineering, and in fact this is necessary for those science and engineering fields to remain world-class.
I am proud that I have had a small part in getting things moving in that direction and excited about our plans for the future, but obviously the bulk of the credit goes to the TACC staff and our supporters in the UT faculty and administration.

Supercomputing: Like a great many of us, you were at SC2001. How did things go? What was your impression of the show in general?

BOISSEAU: I always love SC. I find it interesting to hear other people's opinions of how the show compared to previous ones, but in fact I have greatly enjoyed every show I've ever attended. The main objective for me at this point is to try to identify the technologies that TACC should invest in to continue to enhance our capabilities--and through us, the capabilities of our users.

I thought this year's show was particularly interesting for the diversity in the types of technology companies that were present. Storage vendors seemed to have a larger presence than ever, as did network companies. This decade was proclaimed the 'Data Decade' on the coffee mugs given out in the NPACI booth, and this claim received confirmation from the large number of companies present who provide technologies to move and store data.

The presence of 'commodity' vendors continues to increase, from processor companies like Intel and AMD to PC companies to cluster configuration companies. The 'big iron' companies are still highly visible and still doing great things, but the overlap of HPC with the commodity world is obviously at an all-time high. I think two of the three new systems TACC has purchased from IBM bear this out: they are the first HPC systems TACC has purchased that are based entirely on commodity technologies.

Supercomputing: Is there anything else you'd like to add?
BOISSEAU: I would like to ask your readers--not just from UT but also at other academic sites and at government and industrial organizations affiliated with UT (or desiring to be)--to check out our web site and let us know if they are interested in using our resources or services. I would also encourage them to contact me directly if they are interested in partnering in research projects to develop new HPC or advanced scientific visualization software, or if they are interested in our new Grid-building activities in Texas. Finally, I'd like to thank you for the opportunity to tell the readers about the Texas Advanced Computing Center.

----------
Supercomputing Online thanks Jay Boisseau for his time and insights.
----------