A Single Rack With More Compute Than Entire Data Centers
NVIDIA’s GB300 NVL72 represents the most aggressive consolidation of AI compute ever shipped. A single rack integrates 72 Blackwell Ultra GPUs and 36 Arm-based Grace CPUs into one fully liquid-cooled unit, delivering 1,440 petaflops (1.44 exaflops) of FP4 Tensor Core performance. To put that in perspective, the entire Summit supercomputer, the world’s fastest machine in 2018, delivered roughly 200 petaflops, albeit at 64-bit precision rather than FP4.
The system connects all 72 GPUs through a fifth-generation NVLink Switch fabric, providing 130 TB/s of all-to-all bandwidth within the rack. Each GPU gets 1.8 TB/s of NVLink bandwidth and 288 GB of HBM3e memory, and the full rack offers roughly 37 terabytes of fast memory once the Grace CPUs’ LPDDR5X is counted alongside the GPUs’ HBM3e. This allows trillion-parameter models to fit entirely within a single rack domain, eliminating the multi-rack communication overhead that has historically bottlenecked large-model training.
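The per-rack totals follow directly from the per-GPU figures. A minimal back-of-the-envelope sketch, assuming an illustrative 480 GB of LPDDR5X per Grace socket (a figure not quoted above), reproduces the headline numbers:

```python
# Back-of-the-envelope rack aggregates from the per-GPU figures quoted above.
# The ~37 TB "fast memory" figure combines GPU HBM3e with Grace CPU LPDDR5X;
# the per-socket LPDDR5X value below is an assumption for illustration only.

GPUS_PER_RACK = 72
CPUS_PER_RACK = 36

HBM3E_PER_GPU_GB = 288        # Blackwell Ultra HBM3e per GPU
NVLINK_PER_GPU_TBPS = 1.8     # fifth-generation NVLink bandwidth per GPU
LPDDR5X_PER_CPU_GB = 480      # assumed Grace memory per socket (illustrative)

hbm_total_tb = GPUS_PER_RACK * HBM3E_PER_GPU_GB / 1000
cpu_mem_total_tb = CPUS_PER_RACK * LPDDR5X_PER_CPU_GB / 1000
aggregate_nvlink_tbps = GPUS_PER_RACK * NVLINK_PER_GPU_TBPS

print(f"HBM3e total:        {hbm_total_tb:.1f} TB")
print(f"Grace memory total: {cpu_mem_total_tb:.1f} TB (assumed per-socket figure)")
print(f"Fast memory total:  {hbm_total_tb + cpu_mem_total_tb:.1f} TB")
print(f"Aggregate NVLink:   {aggregate_nvlink_tbps:.1f} TB/s")
```

Run as written, the sketch lands at roughly 21 TB of HBM3e, about 38 TB of combined fast memory, and 130 TB/s of aggregate NVLink bandwidth, in line with the rack-level figures above.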
What Changed From GB200 to GB300
The GB300 NVL72 succeeds the GB200 NVL72 with substantial upgrades across every dimension. Per-GPU memory increased from roughly 192 GB to 288 GB HBM3e. Attention acceleration doubled, directly benefiting transformer workloads. NVIDIA claims approximately 2x performance improvement in LLM training tasks versus the GB200, with even larger gains in inference through optimized FP4 and FP8 execution.
Power consumption increased modestly, from around 120 kW per rack for the GB200 to 132-140 kW for the GB300, with peaks up to 155 kW depending on workload. The performance-per-watt ratio improved significantly despite the higher absolute power draw.
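Taking NVIDIA’s roughly 2x training claim at face value and using the quoted power figures, a quick illustrative calculation shows why performance per watt still improves despite the higher draw; these are not measured numbers.

```python
# Rough performance-per-watt comparison from the figures cited above.
# GB200 training throughput is treated as a baseline of 1.0 and NVIDIA's
# claimed ~2x GB300 gain is applied; real speedups vary by workload.

gb200 = {"relative_perf": 1.0, "rack_power_kw": 120}
gb300 = {"relative_perf": 2.0, "rack_power_kw": 140}

def perf_per_kw(system):
    return system["relative_perf"] / system["rack_power_kw"]

improvement = perf_per_kw(gb300) / perf_per_kw(gb200)
print(f"GB300 performance per watt relative to GB200: {improvement:.2f}x")  # ~1.71x
```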
The Blackwell Ultra architecture also added native support for reasoning-heavy workloads. NVIDIA designed the GB300 specifically for the shift from simple prompt-response inference to multi-step agentic AI, where models chain together multiple reasoning passes before producing output.
The Liquid Cooling Mandate
Every GB300 NVL72 ships as a fully liquid-cooled system. There is no air-cooled option. The rack uses a hybrid architecture where GPUs, CPUs, and NVSwitch components receive direct-to-chip liquid cooling, while OSFP modules and storage drives are air-cooled. Approximately 90% of heat goes to liquid, 10% to air.
NVIDIA claims the liquid cooling system is 25x more energy-efficient and 300x more water-efficient than traditional air-cooled approaches. Because the coolant runs in a closed loop, no water evaporates during operation. For a 50 MW hyperscale data center, NVIDIA estimates annual savings exceeding $4 million from the cooling efficiency gains alone.
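Applied to a single rack at the upper end of the quoted steady-state range, the 90/10 split works out as follows; this is a simple illustration, not a thermal model.

```python
# Heat-path split for one GB300 NVL72 rack using the ~90/10 liquid-to-air
# ratio described above. Rack power is taken at the 140 kW upper end of
# the quoted steady-state range.

RACK_POWER_KW = 140
LIQUID_FRACTION = 0.90

liquid_kw = RACK_POWER_KW * LIQUID_FRACTION
air_kw = RACK_POWER_KW - liquid_kw

print(f"Direct-to-chip liquid loop: {liquid_kw:.0f} kW")
print(f"Residual air-cooled load:   {air_kw:.0f} kW")
```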
This design choice is forcing the data center industry through a generational transition. Facilities built for air cooling cannot house GB300 racks without retrofitting, creating a bottleneck in available deployment locations even as demand surges.
Who Is Building With GB300
Microsoft deployed the first large-scale production cluster of GB300 NVL72 systems, linking more than 4,600 Blackwell Ultra GPUs through NVIDIA’s InfiniBand networking for OpenAI workloads. CoreWeave was the first cloud provider to offer GB300 NVL72 instances, with AWS following with EC2 P6e-GB300 UltraServers.
HPE, Lenovo, and Supermicro have all launched their own GB300 NVL72 configurations. Cloud pricing ranges from $2.90 per GPU per hour for spot instances to $18 per GPU per hour for on-demand capacity. Full rack purchases are estimated above $5 million, while NVIDIA’s DGX Station desktop workstation variant starts at approximately $275,000.
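A rough rent-versus-buy comparison using those quoted prices, and ignoring power, cooling, staffing, and depreciation, shows how quickly on-demand rates approach the cost of a rack:

```python
# Illustrative break-even between renting GB300 capacity and buying a rack,
# using the pricing quoted above. Ignores power, cooling, staffing, and
# depreciation, so treat it as a rough orientation only.

GPUS_PER_RACK = 72
RACK_PURCHASE_USD = 5_000_000  # estimated lower bound per rack

for label, usd_per_gpu_hour in [("spot", 2.90), ("on-demand", 18.00)]:
    rack_hour_cost = GPUS_PER_RACK * usd_per_gpu_hour
    breakeven_hours = RACK_PURCHASE_USD / rack_hour_cost
    print(f"{label:>9}: ${rack_hour_cost:,.0f} per rack-hour, "
          f"break-even after about {breakeven_hours:,.0f} hours "
          f"({breakeven_hours / 24 / 365:.1f} years of continuous use)")
```

At on-demand rates, a full rack’s worth of GPUs costs about $1,300 per hour and crosses the estimated purchase price in under six months of continuous use; at spot rates the crossover stretches to roughly two and a half years.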
The customer base reveals where AI compute demand is concentrating. Hyperscalers are purchasing thousands of racks for foundation model training. Enterprises are evaluating DGX configurations for on-premises inference. Cloud providers are racing to offer GB300 instances before competitors, creating a supply chain pressure that extends from TSMC’s fabrication capacity all the way to liquid cooling infrastructure providers.
The Infrastructure Gap Widens
The GB300 NVL72 crystallizes a growing divide in the AI industry. Organizations with access to these systems can train and deploy models at scales that were physically impossible two years ago. Those without access are increasingly dependent on API providers who operate these racks.
The 140 kW power requirement per rack means a modest 100-rack deployment consumes 14 megawatts, equivalent to a small town’s electrical load. The liquid cooling mandate eliminates most existing data center facilities from consideration. And the estimated $500 million cost for a 100-rack cluster puts direct ownership beyond reach for all but the largest technology companies and sovereign wealth funds.
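The arithmetic behind those figures is straightforward; the sketch below simply makes it explicit using the per-rack numbers cited above.

```python
# The 100-rack arithmetic from the paragraph above, made explicit.
# Uses the 140 kW per-rack figure and the >$5 million per-rack estimate.

RACKS = 100
RACK_POWER_KW = 140
RACK_COST_USD = 5_000_000

total_power_mw = RACKS * RACK_POWER_KW / 1000
total_cost_usd = RACKS * RACK_COST_USD

print(f"IT load:      {total_power_mw:.0f} MW")         # 14 MW
print(f"Capital cost: ${total_cost_usd / 1e6:,.0f}M+")  # $500M+
```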
For the broader AI ecosystem, the GB300 represents both a capability leap and a concentration risk. The most powerful AI infrastructure is consolidating into fewer hands, at facilities that require purpose-built power and cooling infrastructure that takes years to construct.
Frequently Asked Questions
What makes the NVIDIA GB300 NVL72 different from previous GPU clusters?
The GB300 NVL72 integrates 72 Blackwell Ultra GPUs into a single liquid-cooled rack delivering 1.44 exaflops of FP4 performance and 37 TB of fast memory. Its 130 TB/s NVLink bandwidth allows trillion-parameter models to run within one rack, eliminating the multi-rack communication bottleneck that slowed previous systems by up to 40%.
How much does a GB300 NVL72 rack cost?
Industry estimates place the GB300 NVL72 above $5 million per rack. Cloud access is more affordable, with spot instances starting at $2.90 per GPU per hour and on-demand pricing up to $18 per GPU per hour. NVIDIA’s DGX Station desktop variant starts at approximately $275,000 for organizations wanting on-premises AI compute without a full rack deployment.
Why does the GB300 require liquid cooling and what does that mean for data centers?
Each GB300 NVL72 rack consumes 132-140 kW of power, far beyond what air cooling can handle efficiently. NVIDIA’s direct-to-chip liquid cooling captures 90% of heat through liquid, achieving 300x better water efficiency than traditional cooling. This mandate forces data centers to retrofit or build new facilities, creating a temporary bottleneck in available deployment locations.
Sources & Further Reading
- NVIDIA GB300 NVL72 Product Page — NVIDIA
- NVIDIA GB300 NVL72 on Azure — Microsoft Azure Blog
- Microsoft Azure Unveils World’s First GB300 NVL72 Cluster for OpenAI — NVIDIA Blog
- Blackwell Platform Water Efficiency — NVIDIA Blog
- How Much Power Does a GB300 NVL72 Need — Sunbird DCIM
- NVIDIA B300 Blackwell Ultra Specs and Pricing — Spheron






