Accelerating Science with AI in HPC

By Louis Vistola, Technology Evangelist

October 16, 2023

Sponsored Content by Intel

High-performance computing (HPC) has played a major role in advancing scientific research for decades, using extremely large datasets and sophisticated modeling that mimics the physical world. What is advancing rapidly is the ability to complement the power and capabilities of HPC with artificial intelligence (AI) to accelerate innovation and deliver faster outcomes.

I had the opportunity to talk with Radhika Rao, Senior Director of Data Center GPU Product Management at Intel, to discuss this AI transformation and its impact on the HPC landscape.

Q: With the rise of AI, what do HPC leaders and developers need to consider around AI? Why now?

Rao: In the last year alone, we’ve seen explosive growth in the use of AI across all industries. In HPC, we have reached a pivotal moment where we’re seeing a true HPC and AI convergence. We have been talking about it for a long time, but we can see it happening now. AI is helping advance models and codes in physics, weather, manufacturing, and many more areas.

What is making this so relevant now is that AI has become mainstream due to the popularity and wide-scale use of ChatGPT (large language models) and generative AI. This trend is making it more important to view HPC and AI as a converged space to drive advances in science.

Q: What are the evolving requirements HPC leaders should consider when looking to invest in the next-gen environments for accelerating HPC and AI?

Rao: HPC workloads have traditionally had a very specific CPU-to-GPU ratio and compute profile. In the last couple of years, we’ve seen models change and become far more dynamic with increasing compute and scale requirements. As a result, the need for architecture flexibility to run these data-intensive workloads across heterogeneous environments has become critical, as well as the need for increased memory bandwidth and memory capacity. Another area to consider is sustainability requirements with respect to power and environmental impact. When building out large clusters to solve such problems as climate change and sustainability, you don’t want to be part of the problem. We must consider the sustainability footprint of the data center and new technology investments to ensure they are not creating a negative impact on the environment (see June article on Top Considerations for HPC, AI and Sustainability (hpcwire.com)).

Q: How has Intel’s portfolio advanced to address this HPC and AI convergence?

Rao: CPUs, and particularly Intel x86 process technologies, have been the backbone of HPC systems for decades. And now we are seeing powerful AI capabilities being infused into every aspect of compute, including in the HPC space. Intel’s CPUs are now complemented with a variety of built-in and discrete accelerators and GPUs. For example, the Advanced Matrix Extensions (AMX) built into the 4th Gen Intel® Xeon® Scalable processors deliver 10x higher inference and training performance.¹ Intel® Data Center GPU Max Series delivers up to 2x performance gain on HPC and AI workloads over competition.² Recent MLPerf AI inference results spotlight the Intel® Gaudi®2 accelerator as the only viable alternative on the market for dedicated AI compute needs. Additionally, Intel is the only vendor to submit public CPU results on 4th Gen Intel Xeon and Intel Xeon Max Series with industry-standard, deep-learning ecosystem software.

Our portfolio is supported by a full suite of AI and HPC software development tools. Developers have traditionally been required to use proprietary software to code and run AI and HPC models specific to each platform. With a new suite of open-source software, such as the Intel oneAPI toolkit, developers now have freedom of choice. They can program once and then run the code on different hardware, even shifting the underlying hardware mix over time to suit the needs of a particular workload. The oneAPI programming model supports Intel’s full hardware portfolio, as well as solutions from competitors.

Q: What are some examples of how Intel is working across the ecosystem on this HPC and AI convergence?

Rao: Technology adoption is the key to converging HPC and AI into one system to advance scientific research. One example is the work we are doing with the Aurora Exascale Supercomputer at the Argonne Leadership Computing Facility (ALCF), a Department of Energy Office of Science User Facility at Argonne National Laboratory, and Hewlett Packard Enterprise. Aurora, which is being built on the full Intel® Max Series of CPUs and GPUs, will offer researchers high computing speed and artificial intelligence capabilities to enable science that is not possible today. Earlier this year, Intel and Argonne National Lab announced the full Aurora specifications and efforts (with partners) to bring the power of generative AI and large language models (LLMs) to science and society.

Beyond Aurora, there is much work being done to bring HPC and AI together. We have several software partners that are using oneAPI on Intel hardware to bring AI into some of the places that are very specific to HPC use cases. One example is Ansys, which is combining the power of both the Intel Max Series GPUs and 4th Gen Intel Xeon processors to add AI capabilities to its applications. We are also deeply engaged with the AI and HPC software ecosystem, optimizing popular developer tools like PyTorch and TensorFlow.

Q: What’s one piece of advice you have for HPC leaders looking to invest in AI?

Rao: When adding AI capabilities to an HPC environment, the last thing anyone wants is to incur more costs or delays because complex codes have to be ported from one programming model to another. Intel has made significant investments in both the hardware and software needed to run, scale and protect investments in modern HPC centers. The convergence of HPC and AI is making it even more important to adopt open standards, like oneAPI, so researchers can focus on delivering scientific breakthroughs faster and with greater precision.

One of the newest ways HPC technologists and developers can build, test and optimize AI and HPC applications is on the newly launched Intel® Developer Cloud. The Intel Developer Cloud provides developers access to the latest Intel HPC and AI technologies, including Intel Gaudi2 processors for deep learning, and the latest Intel hardware platforms, such as the 5th Gen Intel® Xeon® Scalable processors and Intel® Data Center GPU Max Series 1100 and 1550.

Learn more about how Intel’s HPC and AI portfolio is helping customers achieve outstanding results for demanding workloads and the complex problems they solve here.

¹See [A16] and [A17] at intel.com/processorclaims: 4th Gen Intel® Xeon® Scalable processors. Results may vary.
²Visit intel.com/performanceindex (Events: Supercomputing 22) for workloads and configurations. Results may vary.

HPCwire is a registered trademark of Tabor Communications, Inc. Use of this site is governed by our Terms of Use and Privacy Policy.

Reproduction in whole or in part in any form or medium without express written permission of Tabor Communications, Inc. is prohibited.
