Nvidia’s latest chips could help transform the commerce industry by enabling AI applications to run faster and more efficiently, experts say.
The B200 GPU, announced Monday (March 18) at Nvidia’s annual developers conference during a keynote by CEO Jensen Huang, is a computer chip that can deliver up to 20 petaflops of FP4 processing power, thanks to its 208 billion transistors.
The B200 is the debut chip of Nvidia’s Blackwell series of AI graphics processors. Nvidia also noted that the GB200, which pairs two B200 GPUs with a single Grace CPU, can deliver up to 30 times the performance of the previous H100 model for large language model (LLM) inference tasks, while potentially reducing cost and energy consumption by up to 25 times.
“With the B200’s ability to analyze vast amounts of data, businesses can more accurately predict customer demand,” Lars Nyman, chief marketing officer at CUDO Compute, told PYMNTS in an interview. “This allows for better inventory management, reducing the risk of stockouts and overstocking.”
According to Nvidia, training a 1.8 trillion-parameter model previously required 8,000 Hopper GPUs drawing 15 megawatts of power. With the new Blackwell architecture, only 2,000 GPUs are needed, consuming just 4 megawatts. In a GPT-3 benchmark with 175 billion parameters, the GB200 demonstrated a sevenfold performance increase over the H100 and a fourfold improvement in training speed.
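Taken at face value, those figures imply the savings come almost entirely from needing fewer chips rather than from each chip drawing less power. A quick back-of-the-envelope check in Python, using only the numbers quoted above (a sketch, not an official Nvidia calculation):

```python
# Sanity-check the quoted training figures; all inputs come from Nvidia's claims above.
hopper_gpus, hopper_mw = 8_000, 15
blackwell_gpus, blackwell_mw = 2_000, 4

print(f"GPU count reduction:   {hopper_gpus / blackwell_gpus:.2f}x")          # 4.00x
print(f"Total power reduction: {hopper_mw / blackwell_mw:.2f}x")              # 3.75x
print(f"kW per GPU, Hopper:    {hopper_mw * 1_000 / hopper_gpus:.2f}")        # 1.88
print(f"kW per GPU, Blackwell: {blackwell_mw * 1_000 / blackwell_gpus:.2f}")  # 2.00
```

Per-GPU power is roughly unchanged at about 2 kilowatts; the claimed gain comes from each Blackwell GPU doing roughly four times the work of its Hopper predecessor.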
Benjamin Lee, an engineering professor at the University of Pennsylvania, told PYMNTS in an interview that the B200 will change AI through its improved power efficiency. He noted that training the largest AI models is expensive because teams must pay for the GPUs and then pay for the power to run them.
“Improved power efficiency translates directly into lower operating costs,” he said.
The new chip can also perform twice as many calculations per second as previous generations by halving the precision of those calculations: Blackwell’s 4-bit FP4 format uses half the bits of the Hopper generation’s 8-bit FP8.
“Researchers have long studied how varying precision affects efficiency and performance — the B200 is a fairly significant demonstration of this idea,” Lee said.
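To make the trade-off concrete, here is a minimal Python sketch of uniform 4-bit quantization. It illustrates the precision-versus-accuracy idea Lee describes; it is not Nvidia’s actual FP4 format, which is a floating-point type implemented in hardware:

```python
import numpy as np

def quantize_4bit(weights: np.ndarray) -> np.ndarray:
    """Map float32 values onto 16 evenly spaced levels (4 bits per value)."""
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / 15                 # 16 levels span 15 intervals
    codes = np.round((weights - w_min) / scale)  # integer codes 0..15
    return codes * scale + w_min                 # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=1_000).astype(np.float32)
w4 = quantize_4bit(w)
print("mean absolute rounding error:", np.abs(w - w4).mean())
```

Halving the bits per value doubles how many values fit through the same memory and compute budget; the cost is the rounding error the script prints, which is why the lowest-precision formats are typically aimed at inference rather than the most sensitive training steps.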
According to Lee, the B200 takes the largest chips that can feasibly be manufactured and interconnects a pair of them with a high-speed link. That link lets the two dies coordinate their computations more efficiently, and how effectively the GPUs receive the data for those calculations depends heavily on its performance.
Lee said Nvidia’s primary edge over its rivals remains its software ecosystem, which is designed to execute AI workloads on its GPU hardware. Compilers within this ecosystem let researchers rapidly deploy their models on the latest generation of GPUs and take advantage of its energy efficiency improvements.
“Nvidia’s other significant advantage continues to be its high-performance network that permits GPUs within a large cluster to communicate quickly and efficiently with each other,” he added. “Together, these advantages permit efficient AI for the largest trillion-parameter models.”
The new chip’s capabilities could translate into real-world results.
Nyman said the B200 can provide a more personalized, efficient and secure shopping experience for both businesses and consumers. He noted that the processor’s speed enables real-time price optimization, in which prices adjust dynamically based on factors like demand, competition and customer behavior, letting businesses maximize profits while remaining competitive.
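As a rough illustration of the kind of dynamic pricing rule Nyman describes, consider the toy Python function below. Every signal and coefficient here is hypothetical; a production system would learn these adjustments from data rather than hard-code them:

```python
# Toy dynamic-pricing rule; coefficients and inputs are illustrative only.
def dynamic_price(base_price: float, demand_index: float,
                  competitor_price: float, alpha: float = 0.1) -> float:
    """Nudge a base price toward demand and competition signals.

    demand_index: 1.0 means normal demand; above 1.0 means elevated demand.
    alpha: how aggressively the price reacts to either signal.
    """
    demand_adjustment = base_price * alpha * (demand_index - 1.0)
    competitive_pull = alpha * (competitor_price - base_price)
    return round(base_price + demand_adjustment + competitive_pull, 2)

# Elevated demand pushes the price up; a cheaper competitor pulls it down.
print(dynamic_price(base_price=100.0, demand_index=1.3, competitor_price=95.0))  # 102.5
```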
The B200 could also be crucial in enhancing security and fraud detection, Nyman said. By analyzing transactions in real time, models running on the processor can identify suspicious patterns that may indicate fraudulent activity, helping to prevent financial losses for businesses and protect customers from scams.
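A minimal sketch of that kind of anomaly scoring, using scikit-learn’s IsolationForest on made-up transaction features. Nothing here is specific to the B200; faster hardware would simply let such models score more transactions at lower latency:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical features per transaction: [amount, seconds_since_last_txn, distance_km]
rng = np.random.default_rng(42)
normal_txns = rng.normal(loc=[50, 3_600, 5], scale=[20, 600, 2], size=(500, 3))

# Fit on historical "normal" behavior; ~1% of traffic assumed anomalous.
model = IsolationForest(contamination=0.01, random_state=0).fit(normal_txns)

# Score an incoming transaction: large amount, seconds after the last one, far away.
new_txn = np.array([[4_800, 30, 900]])
print("suspicious" if model.predict(new_txn)[0] == -1 else "ok")  # suspicious
```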
Nyman highlighted the B200’s high processing speed, which opens up new possibilities for personalization and customer profiling. The processor can also enable real-time analysis of a customer’s browsing behavior, offering merchants a window to engage with shoppers.
“The B200’s capabilities can be used to create highly targeted advertising assets that reach the right customers with the right message at the perfect time, adjusting based on their browsing behavior,” Nyman said.
Nyman said he also envisions the B200 powering virtual shopping companions. The processor may make it possible to create a virtual assistant that accompanies customers through an online store, helping them curate outfits and compare products and answering their questions, providing a more personalized and engaging shopping experience.
Thanks to the new chips, AI products could become cheaper and more widely available in the future. Abdullah Ahmed, founder of Serene Data Ops, told PYMNTS that this is because of “cheaper inferencing”: running a trained model to answer queries costs less per query on more efficient hardware.
Starting in 2024, more companies might begin using basic AI models, powered by chips like the B200, that work well enough for their needs, Ahmed said. They can then focus on improving their products and selling them.
This means that even small companies might be able to turn to AI products without needing as many costly GPUs. As a result, developing and using AI tools could cost less, making them more affordable for businesses and consumers.
“However, this depends on Nvidia’s ability to keep lead times reasonable in the face of overwhelming demand,” he said. “The B200 is laser-focused on the large language model craze, such as OpenAI’s GPT-4, which fuels ChatGPT.”