Nvidia’s Next-Gen AI Chips Are Coming to AWS and Google Cloud

(Bloomberg) — Riding the surge of hype around ChatGPT and other artificial intelligence products, Nvidia Corp. introduced new chips, supercomputing services and a raft of high-profile partnerships Tuesday intended to showcase how its technology will fuel the next wave of AI breakthroughs.

At the chipmaker’s annual developer conference on Tuesday, Chief Executive Officer Jensen Huang positioned Nvidia as the engine behind “the iPhone moment of AI,” as he’s taken to calling this inflection point in computing. Spurred by a boom in consumer and enterprise applications, such as advanced chatbots and eye-popping graphics generators, “generative AI will reinvent nearly every industry,” Huang said.

The idea is to build infrastructure that can make AI apps faster and more accessible to customers. Nvidia’s graphics processing units have become the brains behind ChatGPT and its ilk, helping them digest and process ever-greater volumes of training data. Microsoft Corp. revealed last week it had to string together tens of thousands of Nvidia’s A100 GPUs in data centers to handle the computational workload in the cloud for OpenAI, ChatGPT’s developer.

Other tech giants are following suit with similarly colossal cloud infrastructures geared for AI. Oracle Corp. announced that its platform will feature 16,000 Nvidia H100 GPUs, the A100’s successor, for high-performance computing applications, and Nvidia said a forthcoming system from Amazon Web Services will be able to scale up to 20,000 interconnected H100s. Microsoft has likewise started adding the H100 to its server racks.

These kinds of chip superclusters are part of a push by Nvidia to rent out supercomputing services through a new program called DGX Cloud, hosted by Oracle and soon Microsoft Azure and Google Cloud. Nvidia said the goal is to make accessing an AI supercomputer as easy as opening a webpage, enabling companies to train their models without the need for on-premise infrastructure that’s costly to install and manage.

“Provide your job, point to your data set, and you hit go — and all of the orchestration and everything underneath is taken care of,” said Manuvir Das, Nvidia’s vice president of enterprise computing. The DGX Cloud service will start at $36,999 per instance per month, with each “instance” — essentially the amount of computing horsepower being rented — equating to eight H100 GPUs.
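At that starting price, the implied per-GPU cost is easy to work out. A quick back-of-the-envelope calculation, using only the figures above and shown here in Python for illustration (not official Nvidia pricing guidance), looks like this:

    # Rough per-GPU cost implied by the published DGX Cloud starting price.
    instance_price_per_month = 36_999   # USD per DGX Cloud instance per month
    gpus_per_instance = 8               # each instance maps to eight H100 GPUs
    per_gpu_per_month = instance_price_per_month / gpus_per_instance
    print(f"~${per_gpu_per_month:,.0f} per H100 per month")  # roughly $4,625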

Nvidia also launched two new chips, one focused on enhancing AI video performance and the other an upgrade to the H100.

The latter GPU, called the H100 NVL, is designed specifically to improve the deployment of large language models like those used by ChatGPT. It can perform inference — that is, respond to real-life queries — 12 times faster than the prior generation of A100s at scale in data centers.
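For readers unfamiliar with the term, inference simply means running an already-trained model on new prompts, as opposed to training it. The sketch below illustrates the idea using the open-source Hugging Face transformers library and a small stand-in model; neither appears in the article, and production chatbots run far larger models spread across many GPUs, which is where H100-class hardware comes in:

    # Minimal illustration of LLM inference: a trained model answering a prompt.
    # GPT-2 is a small stand-in model chosen only so the example runs anywhere.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    result = generator("What do GPUs do for AI?", max_new_tokens=30)
    print(result[0]["generated_text"])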

Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, said it will help “democratize ChatGPT use cases and bring that capability to every server and every cloud.”

©2023 Bloomberg L.P.