The Texas Advanced Computing Center (TACC) at the University of Texas at Austin has built some of the most powerful supercomputers in the world. Those HPC systems enable cutting-edge research devoted to solving some of society’s most pressing problems.
Jay Hartzell, president of the University of Texas at Austin, explained, “Two decades ago, UT made a big bet on TACC and supercomputing. It’s an investment that’s paid off handsomely. And, given the proliferation of data science, AI and machine learning across fields and throughout society, there’s no limit to TACC’s impact over the next 20 years.”
TACC is particularly active in applied science that relies on artificial intelligence. Dan Stanzione, executive director of TACC, noted, “We build large computer systems, data systems, AI systems, to support open science research. We’re mainly funded by the National Science Foundation, so we support research projects in all fields of science all around the country and all around the world, actually several thousand projects at the moment.”
Recently, Stanzione and Rajesh Pohani, vice president of PowerEdge and Core Compute portfolio management at Dell, spoke with John Furrier of SiliconANGLE about Lonestar6, one of TACC’s newest supercomputers.
The Lonestar6 supercomputer
“Lonestar6 is a Dell Technologies system that we developed with TACC,” said Pohani. “It consists of more than 800 Dell PowerEdge 6525 servers that are powered with 3rd Generation AMD EPYC processors.” It also includes NVIDIA A100 GPU accelerators, NVIDIA HDR InfiniBand networking, BeeGFS-based storage on Dell PowerEdge servers and PowerVault storage, and Green Revolution Cooling immersion.
Thanks to this powerful hardware, Lonestar6 can perform three quadrillion operations per second. To put that in perspective, a human being doing one calculation per second would need roughly 95 million years to accomplish what this supercomputer does every single second.
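The comparison is simple arithmetic, and a short sketch makes it easy to check. The snippet below is illustrative only: the function name and the choice of a 365.25-day year are assumptions made here, not anything published by TACC or Dell, and the three-quadrillion figure is taken directly from the article.

```python
# Back-of-the-envelope check of the human-versus-supercomputer comparison.
# All names here are illustrative; only the 3e15 figure comes from the article.

OPS_PER_SECOND = 3e15                      # three quadrillion operations per second
HUMAN_RATE = 1.0                           # assumed: one calculation per second, no breaks
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumed year length: 31,557,600 seconds


def human_years_for_one_machine_second(ops_per_second: float,
                                       human_rate: float = HUMAN_RATE) -> float:
    """Return the years a person would need to match one second of machine work."""
    human_seconds = ops_per_second / human_rate
    return human_seconds / SECONDS_PER_YEAR


years = human_years_for_one_machine_second(OPS_PER_SECOND)
print(f"{years / 1e6:.0f} million years")  # prints "95 million years"
```

Under these assumptions the result is about 95 million years, which is in line with the familiar rounded figure of 100 million.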
The need for urgent computing
So what is the TACC doing with this highly advanced high-performance computing system?
According to Stanzione, “There’s really a huge range from new microprocessors to materials design, photovoltaics, climate modeling, basic science and astrophysics, quantum mechanics, and things like that.” He added, “But I think the nearest-term impacts that people see are what we call ‘urgent computing,’ which is one of the drivers around Lonestar and some other recent expansions that we’ve done.”
The TACC defines “urgent computing” as research that requires a fast turnaround approaching real-time. For example, the very first urgent computing challenge the TACC addressed was the Deepwater Horizon BP oil spill in the Gulf of Mexico in April 2010. The TACC supercomputers helped responders figure out where the oil was likely to spread and how best to mitigate the effects of the disaster.
Since then, the TACC has been involved in numerous efforts surrounding other disaster events, including the Japanese tsunami. “We’re just ramping up a project in disaster information systems that’s looking at the probabilities of flooding in coastal Texas,” said Stanzione. “Disasters are all the time.”
TACC supercomputers have also aided in space exploration. When a piece of debris hit the space shuttle in orbit, the HPC systems helped figure out whether the spacecraft could make reentry safely, a problem that required an immediate response. As Stanzione explained, “You have until they come back to get that problem done. You don’t have months or years to really investigate that.” The TACC also helped analyze the data that confirmed the Event Horizon Telescope’s first-ever image of a black hole.
In addition, the TACC has been very involved in health care research, including the response to the COVID-19 pandemic. “We got good and ready for COVID through SARS and the swine flu and through HIV work,” said Stanzione. The facility supported more than 50 different COVID research teams and enabled the creation of the first atomistic model of the SARS-CoV-2 virus. Scientists using TACC computers also created models that enabled daily forecasts that policymakers used to inform their decisions.
Pohani noted that Dell has also been very involved in COVID research. “Even at Dell, we’re making cycles available for COVID research using our Zenith cluster that’s located in our HPC & AI Innovation Lab,” he said.
The human factor in high-performance computing
Throughout the interview, Stanzione repeatedly pointed out that having high-performance computing systems isn’t enough on its own to solve these urgent computing problems. You also need personnel with the right skills and experience to use the systems.
“We’ve really changed our preparedness and our operational model around urgent computing in the last ten years,” said Stanzione.
He explained, “We can do the computing very fast, but you need to know how to do the work.” And he added, “The trick is going to be not only growing the computation, but also growing the people and the software along with it.”
Stanzione concluded, “TACC’s growth has been remarkable and is a testament to the people who work here and the organizations that have supported us, notably UT Austin, UT System, the National Science Foundation, the O’Donnell Foundation, and Dell Technologies — our longest and most consistent champions.”
With the Lonestar6 and its other HPC systems, the TACC is prepared to solve the challenges of tomorrow — whatever they may be.