Creating bespoke programming languages for efficient visual AI systems

Lauren Hinkel | MIT-IBM Watson AI Lab • May 3, 2024

A single photograph offers glimpses into the creator’s world — their interests and feelings about a subject or space. But what about creators behind the technologies that help to make those images possible? 

MIT Department of Electrical Engineering and Computer Science Associate Professor Jonathan Ragan-Kelley is one such person, who has designed everything from tools for visual effects in movies to the Halide programming language that’s widely used in industry for photo editing and processing. As a researcher with the MIT-IBM Watson AI Lab and the Computer Science and Artificial Intelligence Laboratory, Ragan-Kelley specializes in high-performance, domain-specific programming languages and machine learning that enable 2D and 3D graphics, visual effects, and computational photography.

“The single biggest thrust through a lot of our research is developing new programming languages that make it easier to write programs that run really efficiently on the increasingly complex hardware that is in your computer today,” says Ragan-Kelley. “If we want to keep increasing the computational power we can actually exploit for real applications — from graphics and visual computing to AI — we need to change how we program.”

Finding a middle ground

Over the last two decades, chip designers and programming engineers have witnessed a slowing of Moore’s law and a marked shift from general-purpose computing on CPUs to more varied and specialized computing and processing units like GPUs and accelerators. With this transition comes a trade-off: the ability to run general-purpose code somewhat slowly on CPUs, for faster, more efficient hardware that requires code to be heavily adapted to it and mapped to it with tailored programs and compilers. Newer hardware with improved programming can better support applications like high-bandwidth cellular radio interfaces, decoding highly compressed videos for streaming, and graphics and video processing on power-constrained cellphone cameras, to name a few applications.

“Our work is largely about unlocking the power of the best hardware we can build to deliver as much computational performance and efficiency as possible for these kinds of applications in ways that that traditional programming languages don't.”

To accomplish this, Ragan-Kelley breaks his work down into two directions. First, he sacrifices generality to capture the structure of particular and important computational problems and exploits that for better computing efficiency. This can be seen in the image-processing language Halide, which he co-developed and has helped to transform the image editing industry in programs like Photoshop. Further, because it is specially designed to quickly handle dense, regular arrays of numbers (tensors), it also works well for neural network computations. The second focus targets automation, specifically how compilers map programs to hardware. One such project with the MIT-IBM Watson AI Lab leverages Exo, a language developed in Ragan-Kelley’s group.

Over the years, researchers have worked doggedly to automate coding with compilers, which can be a black box; however, there’s still a large need for explicit control and tuning by performance engineers. Ragan-Kelley and his group are developing methods that straddle each technique, balancing trade-offs to achieve effective and resource-efficient programming. At the core of many high-performance programs like video game engines or cellphone camera processing are state-of-the-art systems that are largely hand-optimized by human experts in low-level, detailed languages like C, C++, and assembly. Here, engineers make specific choices about how the program will run on the hardware.

Ragan-Kelley notes that programmers can opt for “very painstaking, very unproductive, and very unsafe low-level code,” which could introduce bugs, or “more safe, more productive, higher-level programming interfaces,” that lack the ability to make fine adjustments in a compiler about how the program is run, and usually deliver lower performance. So, his team is trying to find a middle ground. “We're trying to figure out how to provide control for the key issues that human performance engineers want to be able to control,” says Ragan-Kelley, “so, we're trying to build a new class of languages that we call user-schedulable languages that give safer and higher-level handles to control what the compiler does or control how the program is optimized.”

Unlocking hardware: high-level and underserved ways

Ragan-Kelley and his research group are tackling this through two lines of work: applying machine learning and modern AI techniques to automatically generate optimized schedules, an interface to the compiler, to achieve better compiler performance. Another uses “exocompilation” that he’s working on with the lab. He describes this method as a way to “turn the compiler inside-out,” with a skeleton of a compiler with controls for human guidance and customization. In addition, his team can add their bespoke schedulers on top, which can help target specialized hardware like machine-learning accelerators from IBM Research. Applications for this work span the gamut: computer vision, object recognition, speech synthesis, image synthesis, speech recognition, text generation (large language models), etc.

A big-picture project of his with the lab takes this another step further, approaching the work through a systems lens. In work led by his advisee and lab intern William Brandon, in collaboration with lab research scientist Rameswar Panda, Ragan-Kelley’s team is rethinking large language models (LLMs), finding ways to change the computation and the model’s programming architecture slightly so that the transformer-based models can run more efficiently on AI hardware without sacrificing accuracy. Their work, Ragan-Kelley says, deviates from the standard ways of thinking in significant ways with potentially large payoffs for cutting costs, improving capabilities, and/or shrinking the LLM to require less memory and run on smaller computers.

It's this more avant-garde thinking, when it comes to computation efficiency and hardware, that Ragan-Kelley excels at and sees value in, especially in the long term. “I think there are areas [of research] that need to be pursued, but are well-established, or obvious, or are conventional-wisdom enough that lots of people either are already or will pursue them,” he says. “We try to find the ideas that have both large leverage to practically impact the world, and at the same time, are things that wouldn't necessarily happen, or I think are being underserved relative to their potential by the rest of the community.”

The course that he now teaches, 6.106 (Software Performance Engineering), exemplifies this. About 15 years ago, there was a shift from single to multiple processors in a device that caused many academic programs to begin teaching parallelism. But, as Ragan-Kelley explains, MIT realized the importance of students understanding not only parallelism but also optimizing memory and using specialized hardware to achieve the best performance possible.

“By changing how we program, we can unlock the computational potential of new machines, and make it possible for people to continue to rapidly develop new applications and new ideas that are able to exploit that ever-more complicated and challenging hardware.”

A collage of four pictures of a yellow robot dog.
By Alex Shipps | MIT CSAIL August 8, 2024
A new algorithm helps robots practice skills like sweeping and placing objects, potentially helping them improve at important tasks in houses, hospitals, and factories.
A man wearing glasses and a blue shirt is smiling for the camera.
By Sara Feijo | MIT Open Learning August 8, 2024
Leveraging more than 35 years of experience at MIT, Bertsimas will work with partners across the Institute to transform teaching and learning on and off campus.
Two men are standing next to each other in front of a table with a robot on it.
By Rachel Gordon | MIT CSAIL July 31, 2024
CSAIL researchers introduce a novel approach allowing robots to be trained in simulations of scanned home environments, paving the way for customized household automation accessible to anyone.
A bunch of green thermometer on a pink background.
By Adam Zewe | MIT News July 31, 2024
More efficient than other approaches, the “Thermometer” technique could help someone know when they should trust a large language model.
A bunch of dice are flying in the air in a dark room.
By Adam Zewe | MIT News July 24, 2024
Introducing structured randomization into decisions based on machine-learning model predictions can address inherent uncertainties while maintaining efficiency.
A computer generated image of a brain on a motherboard.
By Rachel Gordon | MIT CSAIL July 23, 2024
MAIA is a multimodal agent that can iteratively design experiments to better understand various components of AI systems.
A computer generated image of a molecule on a green background
By David L. Chandler | MIT News July 23, 2024
Analysis and materials identified by MIT engineers could lead to more energy-efficient fuel cells, electrolyzers, batteries, or computing devices.
A hand is touching a screen with its finger.
By Adam Zewe | MIT News July 23, 2024
A new study shows someone’s beliefs about an LLM play a significant role in the model’s performance and are important for how it is deployed.
A nurse is looking at a computer screen while a woman is getting a mammogram.
By Adam Zewe | MIT News July 22, 2024
The model could help clinicians assess breast cancer stage and ultimately help in reducing overtreatment.
A grid of colorful balls connected to each other on a white background.
By Poornima Apte | Department of Materials Science and Engineering July 18, 2024
An MIT team uses computer models to measure atomic patterns in metals, essential for designing custom materials for use in aerospace, biomedicine, electronics, and more.
More Posts