A popular compiler yields

Compilers are programs that convert computer code written in high-level languages intelligible to humans into low-level instructions executable by machines.

But there’s more than one way to implement a given computation, and modern compilers extensively analyze the code they process, trying to deduce the implementations that will maximize the efficiency of the resulting software.

Code explicitly written to take advantage of parallel computing, however, usually loses the benefit of compilers’ optimization strategies. That’s because managing parallel execution requires a lot of extra code, and existing compilers add it before the optimizations occur. The optimizers aren’t sure how to interpret the new code, so they don’t try to improve its performance.

At the Association for Computing Machinery’s Symposium on Principles and Practice of Parallel Programming next week, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory will present a new variation on a popular open-source compiler that optimizes before adding the code necessary for parallel execution.

As a consequence, says Charles E. Leiserson, the Edwin Sibley Webster Professor in Electrical Engineering and Computer Science at MIT and a coauthor on the new paper, the compiler “now optimizes parallel code better than any commercial or open-source compiler, and it also compiles where some of these other compilers don’t.”

That improvement comes purely from optimization strategies that were already part of the compiler the researchers modified, which was designed to compile conventional, serial programs. The researchers’ approach should also make it much more straightforward to add optimizations specifically tailored to parallel programs. And that will be crucial as computer chips add more and more “cores,” or parallel processing units, in the years ahead.

The idea of optimizing before adding the extra code required by parallel processing has been around for decades. But “compiler developers were skeptical that this could be done,” Leiserson says.

“Everybody said it was going to be too hard, that you’d have to change the whole compiler. And these guys,” he says, referring to Tao B. Schardl, a postdoc in Leiserson’s group, and William S. Moses, an undergraduate double major in electrical engineering and computer science and physics, “basically showed that conventional wisdom to be flat-out wrong. The big surprise was that this didn’t require rewriting the 80-plus compiler passes that do either analysis or optimization. T.B. and Billy did it by modifying 6,000 lines of a 4-million-line code base.”

Schardl, who earned his PhD in electrical engineering and computer science (EECS) from MIT, with Leiserson as his advisor, before rejoining Leiserson’s group as a postdoc, and Moses, who will graduate next spring after only three years, with a master’s in EECS to boot, share authorship on the paper with Leiserson.

Forks and joins

A typical compiler has three components: the front end, which is tailored to a specific programming language; the back end, which is tailored to a specific chip design; and what computer scientists oxymoronically call the middle end, which uses an “intermediate representation,” compatible with many different front and back ends, to describe computations. In a standard, serial compiler, optimization happens in the middle end.

The researchers’ chief innovation is an intermediate representation that employs a so-called fork-join model of parallelism: At various points, a program may fork, or branch out into operations that can be performed in parallel; later, the branches join back together, and the program executes serially until the next fork.

In the current version of the compiler, the front end is tailored to a fork-join language called Cilk, pronounced “silk” but spelled with a C because it extends the C programming language. Cilk was a particularly congenial choice because it was developed by Leiserson’s group — although its commercial implementation is now owned and maintained by Intel. But the researchers might just as well have built a front end tailored to the popular OpenMP or any other fork-join language.