Teaching computers how to write fast software

Researchers at Carnegie Mellon University develop a framework to replace the human programmer in high performance numerical library development: The computer revolution has taken us to the point where we use machines for nearly every possible thing. We use them in schools, work environments, and our homes; even our cell phones, cameras, and cars have computers. Scientists use very high performance computers to collect, analyze and manage large amounts of information. To perform these complex tasks computers run software or programs written by human experts--it's the part most of us don't see.

The library generation work is part of the larger SPIRAL project, a multi-university effort that studies automation in software and hardware development in a more general context and comprises the research teams mentioned in the discovery article and the group of Professor James Hoe, CMU. A part of the larger SPIRAL team is shown in the picture. From left to right. Back row (4 people): David Padua, Aliaksei Sandryhaila, Yevgen Voronenko, Peter Milder. Middle row (7 people): James Hoe, Xiaoming Li, Maria Garzaran, Jeremy Johnson, Franz Franchetti, Srinivas Chellappa, Roland Wunderlich. Front row (3 people): Jose Moura, Marek Telgarsky, Markus Püschel., Roland Wunderlich. Front row (3 people): Jose Moura, Marek Telgarsky, Markus Püschel. Credit: Markus Püschel, Carnegie Mellon University

National Science Foundation (NSF) supported researchers at Carnegie Mellon University (CMU) have found a way to 'teach' computers to write high performance libraries for certain types of numerical functionality. Libraries are modular segments of code that perform a performance-critical task or function when called by a program. In developing software, programmers call libraries instead of having to rewrite the same code over and over. If a library runs fast, then so will the application that uses it. The team, led by Markus Püschel, an associate research professor of electrical and computer engineering at CMU, is the first to demonstrate that all the necessary tasks in writing and optimizing these types of libraries for highest performance can be fully automated. "It has become extraordinarily difficult to write libraries that run as fast as possible," Püschel says. "Particularly affected are numerical applications, those that require extensive mathematics such as any form of audio-image-video processing, communication, medical imaging, scientific modeling and many others." As an example, Püschel said that one straight-forward implementation of a mathematical routine may run 10-100 times slower than an optimal one, even though both may perform the same number of mathematical operations. "Needless to say, this is not an acceptable loss," he adds. So what is so difficult about writing fast libraries for numerical functionality? These libraries are written and optimized by hand, a time-consuming task that has to be repeated whenever a new computing platform is released. To make things worse, the complexity of modern computing platforms "puts performance optimization into the realm of 'guru programmers' who do things like restructure the computation to avoid cache misses, use special vector instructions, and parallelize for multiple processor cores," Püschel says. Automating these and other tasks in library development posed the main challenge that his team took up as part of the SPIRAL project for a domain of functions called linear transforms. The team succeeded due to a number of major ideas and techniques developed by team member Franz Franchetti, an assistant research professor, and recent PhD graduate Yevgen Voronenko who also designed and implemented the library generation system as part of his thesis work. The result: SPIRAL is the first to demonstrate that the development of highest performance libraries can be fully automated. Given only a short mathematical description of a linear transform, SPIRAL can automatically design and implement a library and perform all the necessary optimizations to achieve very high performance on a state-of-the-art computer. These optimizations include the traditionally difficult problems of parallelization, vector processing and memory hierarchy optimization. Further, benchmarking shows that these computer generated libraries are as fast as, and sometimes even faster, than their human written counterparts. Integrated Electronics Corporation (Intel)--one of the leading companies for microprocessors manufacturing--has started to use SPIRAL to generate parts of its commercial libraries. "This may mark the first time that commercial library development is done by a computer," Püschel notes. SPIRAL combines innovative techniques from different disciplines including tensor algebra, domain-specific languages, symbolic computation, and compilers. "The key idea behind SPIRAL is a novel framework that combines several techniques including the use of mathematical domain-specific languages and rewriting to raise the level of abstraction and 'teach' SPIRAL algorithm knowledge and algorithm optimization," Püschel notes. "When SPIRAL writes software, different available algorithms are considered and evaluated and undergo sophisticated optimizations so the final code is not only correct, but also runs as fast as possible. And it does, sometimes beating all competitors," Püschel continues. "SPIRAL can, to date, only implement a limited set of functionality, but one has to start somewhere and we believe that SPIRAL's approach can be carried much further. Approaches like SPIRAL may be crucial in dealing with future computing platforms," he adds. The latest version of the SPIRAL library generation system builds on an earlier prototype jointly developed by the teams of Professors Jose Moura, Püschel, and Manuela Veloso of CMU, Professor David Padua of the University of Illinois at Urbana-Champaign, and Professor Jeremy Johnson of Drexel University.

Teaching computers how to write fast software

Beamforming the future: BeammWave's 6G push signals the rise of orbital-terrestrial wireless networks

Wall Street wants to trade supercomputing power like oil

IBM’s quantum foundry gamble reveals a troubling reality about the future of computing

Pulsars as galactic scales: Supercomputer simulations reveal a new way to weigh neighboring galaxies

NVIDIA’s fiscal 2027 surge shows the new face of supercomputing

Huawei’s Tau Scaling ambition tests the limits of post-Moore semiconductor reality

Memory has become the new compute: Why Micron, SK Hynix crossing $1 trillion matters to supercomputing

AI breaks conservation barriers: Australia’s Wildlife Observatory leverages supercomputing to protect biodiversity

Snowflake's $6 Billion AWS bet signals the next phase of enterprise AI infrastructure

Dell’s fiscal 2027 surge shows supercomputing demand has become mainstream infrastructure

POPULAR RIGHT NOW