Nvidia CEO Jensen Huang speaks during a press conference at The MGM during CES 2018 in Las Vegas on January 7, 2018.
Mandel Ngan | AFP | Getty Images
Software that can write passages of text or draw images that look like they were created by a human has started a gold rush in the tech industry.
Companies like Microsoft and Google are fighting to integrate cutting-edge artificial intelligence into their search engines, as billion-dollar competitors like OpenAI and Stable Diffusion race to release their software to the public.
Many of these applications are powered by a roughly $10,000 chip that has become one of the most important tools in the artificial intelligence industry: the Nvidia A100.
The A100 has become a “workhorse” for AI professionals at the moment, said Nathan Benaich, an investor who publishes a newsletter and report covering the AI industry, including a partial list of supercomputers using A100s. Nvidia holds 95% of the market for GPUs that can be used for machine learning, according to New Street Research.
The A100 is ideally suited to the kind of machine learning models that power tools like ChatGPT, Bing AI, or Stable Diffusion. It can perform many simple calculations simultaneously, which is important for training and using neural network models.
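To see why that kind of parallelism matters, consider that a single neural network layer boils down to a matrix multiplication, where every output element is an independent multiply-and-add that can be computed at the same time as all the others. A minimal sketch (using NumPy on the CPU purely as an illustration; the layer sizes are made up):

```python
import numpy as np

# One neural-network layer is a matrix multiplication. Each of the
# output elements is an independent dot product, so a GPU can compute
# huge numbers of them simultaneously. NumPy runs on the CPU; this
# only illustrates the shape of the arithmetic, not GPU execution.
batch = np.random.rand(512, 1024)      # 512 inputs, 1,024 features each
weights = np.random.rand(1024, 4096)   # a layer with 4,096 output units

activations = batch @ weights          # 512 * 4096 = ~2.1M independent dot products
print(activations.shape)               # (512, 4096)
```

Training repeats this kind of operation billions of times over terabytes of data, which is why chips built for massive parallelism dominate the workload.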
The technology behind the A100 was initially used to render sophisticated 3D graphics in games. It’s often called a graphics processor or GPU, but these days the Nvidia A100 is configured and focused on machine learning tasks and runs in data centers, not inside glowing gaming PCs.
Large companies or startups working on software like chatbots and image generators require hundreds or thousands of Nvidia chips and either buy them themselves or get access to computers from a cloud provider.
Hundreds of GPUs are required to train artificial intelligence models such as large language models. Chips must be powerful enough to quickly compress terabytes of data and recognize patterns. Then, GPUs like the A100 are also needed for “inference,” or using the model to generate text, make predictions, or identify objects inside photos.
This means that AI companies need access to many A100s. Some entrepreneurs in this space even see the number of A100s they have access to as a sign of progress.
“A year ago we had 32 A100s,” Stability AI CEO Emad Mostaque wrote on Twitter in January. “Dream big and pile on the GPUs kids. Brrr.” Stability AI is the company that helped develop Stable Diffusion, an image generator that attracted attention last fall, and it is reportedly valued at over $1 billion.
Stability AI now has access to over 5,400 A100 GPUs, according to one estimate from the State of AI report, which maps and tracks which companies and universities have the largest collections of A100 GPUs, although it doesn’t include cloud providers, which don’t publicly release their numbers.
Nvidia is riding the AI train
Nvidia is benefiting from the AI hype cycle. Although the company reported in Wednesday’s fiscal fourth-quarter earnings report that overall sales fell 21%, investors pushed the stock up about 14% on Thursday, mainly because the company’s AI chip business, reported as data centers, grew 11% to more than $3.6 billion in sales during the quarter, showing continued growth.
Nvidia shares are up 65% so far in 2023, outperforming the S&P 500 and other semiconductor stocks.
Nvidia CEO Jensen Huang couldn’t stop talking about AI on a call with analysts on Wednesday, suggesting that the recent boom in artificial intelligence is central to the company’s strategy.
“The activity around the AI infrastructure that we built, and the activity around inferencing using Hopper and Ampere to inference large language models, has just gone through the roof in the last 60 days,” Huang said. “There’s no doubt that whatever our views of this year as we enter it have been changed fairly dramatically as a result of the last 60, 90 days.”
Ampere is Nvidia’s codename for the A100 chip generation. Hopper is the codename for the new generation, including the H100, which recently started shipping.
More computers are needed
Nvidia A100 processor
Compared with other kinds of software, such as serving a web page, which uses processing power occasionally in bursts of microseconds, machine learning tasks can take up a computer’s entire processing power, sometimes for hours or days.
This means that companies that find themselves with a successful AI product often need to get more GPUs to handle spikes or improve their models.
These GPUs are not cheap. In addition to a single A100 on a card that can be plugged into an existing server, many data centers use a system that includes eight A100 GPUs working together.
This system, Nvidia’s DGX A100, has a suggested price of nearly $200,000, although it comes with the chips needed. On Wednesday, Nvidia said it would sell cloud access to DGX systems directly, which will likely reduce the entry cost for tinkerers and researchers.
It’s easy to see how the costs of the A100 can add up.
For example, an estimate from New Street Research found that the OpenAI-based ChatGPT model inside Bing’s search could require 8 GPUs to deliver a response to a question in less than one second.
At that rate, Microsoft would need more than 20,000 8-GPU servers just to deploy the model in Bing to everyone, suggesting the feature could cost Microsoft $4 billion in infrastructure spending.
“If you’re from Microsoft, and you want to scale that, at the scale of Bing, that’s maybe $4 billion. If you want to scale at the scale of Google, which serves 8 or 9 billion queries every day, you actually need to spend $80 billion on DGXs,” said Antoine Chkaiban, a technology analyst at New Street Research. “The numbers we came up with are huge. But they’re simply a reflection of the fact that any individual user working with such a large language model requires a massive supercomputer while they’re using it.”
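The arithmetic behind the $4 billion figure can be checked against numbers quoted earlier in the article: 20,000 8-GPU servers, each priced like the roughly $200,000 DGX A100 system. A back-of-the-envelope sketch (the per-server price is an assumption, since New Street Research has not published its exact inputs):

```python
# Back-of-the-envelope check of the New Street Research estimate,
# using only figures quoted in the article.
gpus_per_query = 8          # A100s needed to answer one Bing query in under a second
servers_needed = 20_000     # 8-GPU servers to serve the feature to all of Bing
dgx_price = 200_000         # assumed per-server price, from the DGX A100's ~$200,000 MSRP

infrastructure_cost = servers_needed * dgx_price
print(infrastructure_cost)  # 4000000000, i.e. the $4 billion figure
```

The $80 billion Google figure follows the same logic scaled to roughly 20 times Bing’s query volume.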
The latest version of Stable Diffusion, the image generator, was trained on 256 A100 GPUs, or 32 machines with 8 A100s each, according to information posted online by Stability AI, totaling 200,000 compute hours.
At the market price, training the model alone cost $600,000, Stability AI CEO Mostaque said on Twitter, suggesting in a tweet thread that the price was unusually inexpensive compared with rivals. That doesn’t count the cost of “inference,” or deploying the model.
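Those figures imply a rental rate of about $3 per A100-hour, which can be checked directly from the numbers in the article (a rough sketch; Stability AI has not published its exact rate):

```python
# Implied price per GPU-hour for Stable Diffusion's training run,
# from the article's figures: 200,000 A100-hours for about $600,000.
gpu_hours = 200_000
training_cost = 600_000

price_per_gpu_hour = training_cost / gpu_hours
print(price_per_gpu_hour)            # 3.0, i.e. about $3 per A100-hour

# Cross-check the hardware count: 32 machines with 8 A100s each.
assert 32 * 8 == 256
wall_clock_hours = gpu_hours / 256   # ~781 hours, roughly a month of continuous training
```

Spread across 256 GPUs running in parallel, the 200,000 GPU-hours correspond to roughly a month of wall-clock training time.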
Huang, Nvidia’s CEO, told CNBC’s Katie Tarasov that the company’s products are actually inexpensive for the amount of computation these kinds of models need.
“We took what would otherwise be a $1 billion data center with a CPU and scaled it down to a $100 million data center,” Huang said. “Now if you put it in the cloud and it’s shared by 100 companies, $100 million is next to nothing.”
Huang said Nvidia’s GPUs allow startups to train models at a much lower cost than if they used a traditional computer processor.
“Now you could build something like a large language model like GPT for something like $10, $20 million,” Huang said. “That’s really, really affordable.”
Nvidia isn’t the only company making GPUs for artificial intelligence uses. AMD and Intel have competing graphics processors, and big cloud companies like Google and Amazon are developing and deploying their own chips specially designed for AI workloads.
Still, “AI hardware remains strongly consolidated to Nvidia,” according to the State of AI compute report. As of December, more than 21,000 open-source AI papers said they used Nvidia chips.
Most researchers included in the State of AI Compute Index used the V100, an Nvidia chip that came out in 2017, but the A100 grew fast in 2022 to become the third-most-used Nvidia chip, just behind consumer graphics chips priced at $1,500 or less that were originally intended for gaming.
The A100 also has the distinction of being one of the few chips to have export controls placed on it for national defense. Last fall, Nvidia said in an SEC filing that the US government had imposed a licensing requirement barring exports of the A100 and H100 to China, Hong Kong and Russia.
“The USG indicated that the new license requirement will address the risk that the covered products may be used in, or diverted to, a ‘military end use’ or ‘military end user’ in China and Russia,” Nvidia said in the filing. Nvidia previously said it adapted some of its chips for the Chinese market to comply with US export restrictions.
The stiffest competition for the A100 may be its successor. The A100 was first introduced in 2020, an eternity ago in chip cycles. The H100, introduced in 2022, is starting to be produced in volume; in fact, Nvidia said Wednesday that it recorded more revenue from H100 chips in the quarter ending in January than from the A100, even though the H100 is more expensive per unit.
The H100, Nvidia says, is the first of its data center GPUs to be optimized for transformers, an increasingly important technique that many of the latest and most advanced AI applications use. Nvidia said Wednesday that it wants to make AI training over 1 million percent faster. That could mean that, eventually, AI companies wouldn’t need as many Nvidia chips.