NVIDIA Unveils New Generative AI Platform for Trillion-Parameter Models

At its GPU Technology Conference (GTC) 2024, NVIDIA unveiled its new generative AI platform, NVIDIA Blackwell, built on a new GPU architecture along with new Tensor Core and NVLink technologies.

The platform is designed to enable organizations to build and run real-time generative AI on trillion-parameter large language models (LLMs) at up to 25 times lower cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, including:

  • World's Most Powerful Chip: Manufactured on a custom-built TSMC 4NP process, Blackwell GPUs pack 208 billion transistors across two reticle-limited dies, connected by a 10 TB/s chip-to-chip link into a single, unified GPU.
  • Second-Generation Transformer Engine: Powered by new micro-tensor scaling support and NVIDIA's advanced dynamic range management algorithms, Blackwell will support double the compute and model sizes with new 4-bit floating point AI inference capabilities.
  • Fifth-Generation NVLink: Delivering groundbreaking 1.8 TB/s bidirectional throughput per GPU, NVLink ensures seamless high-speed communication among up to 576 GPUs for the most complex LLMs.
  • RAS Engine: Blackwell-powered GPUs include a dedicated engine for reliability, availability, and serviceability. Additionally, the architecture adds capabilities at the chip level to utilize AI-based preventative maintenance to run diagnostics and forecast reliability issues.
  • Secure AI: Advanced confidential computing capabilities protect AI models and customer data without compromising performance, with support for new native interface encryption protocols.
  • Decompression Engine: A dedicated decompression engine supports the latest formats, accelerating database queries to deliver the highest performance in data analytics and data science.

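NVIDIA has not published the internals of its FP4 format or micro-tensor scaling algorithm, but the underlying idea (storing values in few bits and keeping one scale factor per small block so the dynamic range stays usable) can be illustrated with a simple signed 4-bit integer scheme. The function names and block size below are hypothetical, chosen only for this sketch:

```python
import numpy as np

def quantize_blockwise(x, block=32, bits=4):
    """Quantize a 1-D float array to signed `bits`-bit integers, with one
    scale factor per `block` of values. Illustrative of per-block scaling
    only; this is NOT NVIDIA's actual FP4 format or algorithm."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit signed
    pad = (-len(x)) % block                    # pad so length divides evenly
    xp = np.pad(x, (0, pad)).reshape(-1, block)
    scales = np.abs(xp).max(axis=1, keepdims=True) / qmax
    scales[scales == 0] = 1.0                  # avoid divide-by-zero on all-zero blocks
    q = np.round(xp / scales).astype(np.int8)  # 4-bit values stored in int8 for simplicity
    return q, scales, len(x)

def dequantize_blockwise(q, scales, n):
    """Reconstruct approximate float values from quantized blocks."""
    return (q * scales).reshape(-1)[:n]

x = np.random.randn(1000).astype(np.float32)
q, scales, n = quantize_blockwise(x)
err = np.abs(dequantize_blockwise(q, scales, n) - x).max()
```

Because each block carries its own scale, the worst-case rounding error is bounded by half a quantization step of that block's local range, which is what makes 4-bit storage tolerable for inference workloads.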
In addition to the Blackwell platform, NVIDIA also announced its new GB200 Grace Blackwell Superchip, which connects two B200 Tensor Core GPUs to the NVIDIA Grace CPU over a 900 GB/s ultra-low-power NVLink chip-to-chip interconnect.

Cloud service providers, including AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure, will offer Blackwell-powered instances, as will NVIDIA Cloud Partner program companies.

Sovereign AI clouds will also provide Blackwell-based cloud services and infrastructure.

The GB200 Superchip is a key component of the NVIDIA GB200 NVL72 system, a multi-node, liquid-cooled, rack-scale system for the most compute-intensive workloads. It combines 36 Grace Blackwell Superchips (72 Blackwell GPUs and 36 Grace CPUs in total) interconnected by fifth-generation NVLink.
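The rack-level figures follow directly from the per-superchip numbers quoted above. A back-of-the-envelope check (illustrative arithmetic only; the aggregate-bandwidth figure is simply the per-GPU NVLink throughput summed over all GPUs):

```python
# GB200 NVL72 composition, from the figures in the article.
SUPERCHIPS = 36
GPUS_PER_SUPERCHIP = 2      # two B200 GPUs per GB200 Superchip
CPUS_PER_SUPERCHIP = 1      # one Grace CPU per GB200 Superchip
NVLINK_BW_TB_S = 1.8        # bidirectional throughput per GPU, 5th-gen NVLink

gpus = SUPERCHIPS * GPUS_PER_SUPERCHIP    # 72 Blackwell GPUs
cpus = SUPERCHIPS * CPUS_PER_SUPERCHIP    # 36 Grace CPUs
aggregate_bw = gpus * NVLINK_BW_TB_S      # ~129.6 TB/s summed across the rack
```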

Additionally, NVIDIA offers the HGX B200, a server board that links eight B200 GPUs through NVLink to support x86-based generative AI platforms.

In the coming years, data processing, on which companies spend tens of billions of dollars annually, will be increasingly GPU-accelerated.

Blackwell-based products are expected to be available from partners starting later this year.