A Single Rack With More Compute Than Entire Data Centers
NVIDIA’s GB300 NVL72 represents the most aggressive consolidation of AI compute ever shipped. A single rack integrates 72 Blackwell Ultra GPUs and 36 Arm-based Grace CPUs into one fully liquid-cooled unit, delivering 1,440 petaflops (1.44 exaflops) of FP4 Tensor Core performance. To put that in perspective, the entire Summit supercomputer, the world’s fastest machine in 2018, delivered roughly 200 petaflops, albeit at 64-bit precision rather than FP4.
The system connects all 72 GPUs through a fifth-generation NVLink Switch fabric, providing 130 TB/s of all-to-all bandwidth within the rack. Each GPU gets 1.8 TB/s of NVLink bandwidth and 288 GB of HBM3e memory, and the full rack offers roughly 37 terabytes of fast memory once the Grace CPUs’ LPDDR5X is counted alongside the GPUs’ HBM3e. This allows trillion-parameter models to fit entirely within a single rack domain, eliminating the multi-rack communication overhead that has historically bottlenecked large-model training.
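The per-rack totals follow directly from the per-GPU figures. A minimal back-of-the-envelope sketch, assuming an illustrative 480 GB of LPDDR5X per Grace socket (a figure not quoted above), reproduces the headline numbers:

```python
# Back-of-the-envelope rack aggregates from the per-GPU figures quoted above.
# The ~37 TB "fast memory" figure combines GPU HBM3e with Grace CPU LPDDR5X;
# the per-socket LPDDR5X value below is an assumption for illustration only.

GPUS_PER_RACK = 72
CPUS_PER_RACK = 36

HBM3E_PER_GPU_GB = 288        # Blackwell Ultra HBM3e per GPU
NVLINK_PER_GPU_TBPS = 1.8     # fifth-generation NVLink bandwidth per GPU
LPDDR5X_PER_CPU_GB = 480      # assumed Grace memory per socket (illustrative)

hbm_total_tb = GPUS_PER_RACK * HBM3E_PER_GPU_GB / 1000
cpu_mem_total_tb = CPUS_PER_RACK * LPDDR5X_PER_CPU_GB / 1000
aggregate_nvlink_tbps = GPUS_PER_RACK * NVLINK_PER_GPU_TBPS

print(f"HBM3e total:        {hbm_total_tb:.1f} TB")
print(f"Grace memory total: {cpu_mem_total_tb:.1f} TB (assumed per-socket figure)")
print(f"Fast memory total:  {hbm_total_tb + cpu_mem_total_tb:.1f} TB")
print(f"Aggregate NVLink:   {aggregate_nvlink_tbps:.1f} TB/s")
```

Run as written, the sketch lands at roughly 21 TB of HBM3e, about 38 TB of combined fast memory, and 130 TB/s of aggregate NVLink bandwidth, in line with the rack-level figures above.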
What Changed From GB200 to GB300
The GB300 NVL72 succeeds the GB200 NVL72 with substantial upgrades across every dimension. Per-GPU memory increased from roughly 192 GB to 288 GB HBM3e. Attention acceleration doubled, directly benefiting transformer workloads. NVIDIA claims approximately 2x performance improvement in LLM training tasks versus the GB200, with even larger gains in inference through optimized FP4 and FP8 execution.
Power consumption increased modestly, from around 120 kW per rack for the GB200 to 132-140 kW for the GB300, with peaks up to 155 kW depending on workload. The performance-per-watt ratio improved significantly despite the higher absolute power draw.
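Taking NVIDIA’s roughly 2x training claim at face value and using the quoted power figures, a quick illustrative calculation shows why performance per watt still improves despite the higher draw; these are not measured numbers.

```python
# Rough performance-per-watt comparison from the figures cited above.
# GB200 training throughput is treated as a baseline of 1.0 and NVIDIA's
# claimed ~2x GB300 gain is applied; real speedups vary by workload.

gb200 = {"relative_perf": 1.0, "rack_power_kw": 120}
gb300 = {"relative_perf": 2.0, "rack_power_kw": 140}

def perf_per_kw(system):
    return system["relative_perf"] / system["rack_power_kw"]

improvement = perf_per_kw(gb300) / perf_per_kw(gb200)
print(f"GB300 performance per watt relative to GB200: {improvement:.2f}x")  # ~1.71x
```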
The Blackwell Ultra architecture also added native support for reasoning-heavy workloads. NVIDIA designed the GB300 specifically for the shift from simple prompt-response inference to multi-step agentic AI, where models chain together multiple reasoning passes before producing output.
The Liquid Cooling Mandate
Every GB300 NVL72 ships as a fully liquid-cooled system. There is no air-cooled option. The rack uses a hybrid architecture where GPUs, CPUs, and NVSwitch components receive direct-to-chip liquid cooling, while OSFP modules and storage drives are air-cooled. Approximately 90% of heat goes to liquid, 10% to air.
NVIDIA claims the liquid cooling system is 25x more energy-efficient and 300x more water-efficient than traditional air-cooled approaches. Because the coolant runs in a closed loop, no water evaporates during operation. For a 50 MW hyperscale data center, NVIDIA estimates annual savings exceeding $4 million from the cooling efficiency gains alone.
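Applied to a single rack at the upper end of the quoted steady-state range, the 90/10 split works out as follows; this is a simple illustration, not a thermal model.

```python
# Heat-path split for one GB300 NVL72 rack using the ~90/10 liquid-to-air
# ratio described above. Rack power is taken at the 140 kW upper end of
# the quoted steady-state range.

RACK_POWER_KW = 140
LIQUID_FRACTION = 0.90

liquid_kw = RACK_POWER_KW * LIQUID_FRACTION
air_kw = RACK_POWER_KW - liquid_kw

print(f"Direct-to-chip liquid loop: {liquid_kw:.0f} kW")
print(f"Residual air-cooled load:   {air_kw:.0f} kW")
```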
This design choice is forcing the data center industry through a generational transition. Facilities built for air cooling cannot house GB300 racks without retrofitting, creating a bottleneck in available deployment locations even as demand surges.
Who Is Building With GB300
Microsoft deployed the first large-scale production cluster of GB300 NVL72 systems, linking more than 4,600 Blackwell Ultra GPUs through NVIDIA’s InfiniBand networking for OpenAI workloads. CoreWeave was the first cloud provider to offer GB300 NVL72 instances, with AWS following with EC2 P6e-GB300 UltraServers.
HPE, Lenovo, and Supermicro have all launched their own GB300 NVL72 configurations. Cloud pricing ranges from $2.90 per GPU per hour for spot instances to $18 per GPU per hour for on-demand capacity. Full rack purchases are estimated above $5 million, while NVIDIA’s DGX Station desktop workstation variant starts at approximately $275,000.
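A rough rent-versus-buy comparison using those quoted prices, and ignoring power, cooling, staffing, and depreciation, shows how quickly on-demand rates approach the cost of a rack:

```python
# Illustrative break-even between renting GB300 capacity and buying a rack,
# using the pricing quoted above. Ignores power, cooling, staffing, and
# depreciation, so treat it as a rough orientation only.

GPUS_PER_RACK = 72
RACK_PURCHASE_USD = 5_000_000  # estimated lower bound per rack

for label, usd_per_gpu_hour in [("spot", 2.90), ("on-demand", 18.00)]:
    rack_hour_cost = GPUS_PER_RACK * usd_per_gpu_hour
    breakeven_hours = RACK_PURCHASE_USD / rack_hour_cost
    print(f"{label:>9}: ${rack_hour_cost:,.0f} per rack-hour, "
          f"break-even after about {breakeven_hours:,.0f} hours "
          f"({breakeven_hours / 24 / 365:.1f} years of continuous use)")
```

At on-demand rates, a full rack’s worth of GPUs costs about $1,300 per hour and crosses the estimated purchase price in under six months of continuous use; at spot rates the crossover stretches to roughly two and a half years.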
The customer base reveals where AI compute demand is concentrating. Hyperscalers are purchasing thousands of racks for foundation model training. Enterprises are evaluating DGX configurations for on-premises inference. Cloud providers are racing to offer GB300 instances before competitors, creating a supply chain pressure that extends from TSMC’s fabrication capacity all the way to liquid cooling infrastructure providers.
The Infrastructure Gap Widens
The GB300 NVL72 crystallizes a growing divide in the AI industry. Organizations with access to these systems can train and deploy models at scales that were physically impossible two years ago. Those without access are increasingly dependent on API providers who operate these racks.
The 140 kW power requirement per rack means a modest 100-rack deployment consumes 14 megawatts, equivalent to a small town’s electrical load. The liquid cooling mandate eliminates most existing data center facilities from consideration. And the estimated $500 million cost for a 100-rack cluster puts direct ownership beyond reach for all but the largest technology companies and sovereign wealth funds.
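The arithmetic behind those figures is straightforward; the sketch below simply makes it explicit using the per-rack numbers cited above.

```python
# The 100-rack arithmetic from the paragraph above, made explicit.
# Uses the 140 kW per-rack figure and the >$5 million per-rack estimate.

RACKS = 100
RACK_POWER_KW = 140
RACK_COST_USD = 5_000_000

total_power_mw = RACKS * RACK_POWER_KW / 1000
total_cost_usd = RACKS * RACK_COST_USD

print(f"IT load:      {total_power_mw:.0f} MW")         # 14 MW
print(f"Capital cost: ${total_cost_usd / 1e6:,.0f}M+")  # $500M+
```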
For the broader AI ecosystem, the GB300 represents both a capability leap and a concentration risk. The most powerful AI infrastructure is consolidating into fewer hands, at facilities that require purpose-built power and cooling infrastructure that takes years to construct.
Frequently Asked Questions
What makes the NVIDIA GB300 NVL72 different from previous GPU clusters?
The GB300 NVL72 integrates 72 Blackwell Ultra GPUs into a single liquid-cooled rack delivering 1.44 exaflops of FP4 performance and 37 TB of fast memory. Its 130 TB/s NVLink bandwidth allows trillion-parameter models to run within one rack, eliminating the multi-rack communication bottleneck that slowed previous systems by up to 40%.
How much does a GB300 NVL72 rack cost?
Industry estimates place the GB300 NVL72 above $5 million per rack. Cloud access is more affordable, with spot instances starting at $2.90 per GPU per hour and on-demand pricing up to $18 per GPU per hour. NVIDIA’s DGX Station desktop variant starts at approximately $275,000 for organizations wanting on-premises AI compute without a full rack deployment.
Why does the GB300 require liquid cooling and what does that mean for data centers?
Each GB300 NVL72 rack consumes 132-140 kW of power, far beyond what air cooling can handle efficiently. NVIDIA’s direct-to-chip liquid cooling captures 90% of heat through liquid, achieving 300x better water efficiency than traditional cooling. This mandate forces data centers to retrofit or build new facilities, creating a temporary bottleneck in available deployment locations.
Sources & Further Reading
- NVIDIA GB300 NVL72 Product Page — NVIDIA
- NVIDIA GB300 NVL72 on Azure — Microsoft Azure Blog
- Microsoft Azure Unveils World’s First GB300 NVL72 Cluster for OpenAI — NVIDIA Blog
- Blackwell Platform Water Efficiency — NVIDIA Blog
- How Much Power Does a GB300 NVL72 Need — Sunbird DCIM
- NVIDIA B300 Blackwell Ultra Specs and Pricing — Spheron






