
Nvidia CEO Jensen Huang took the stage at CES 2026 in Las Vegas on Monday to announce that the Vera Rubin platform is now in full production, with customer deployments beginning in the second half of 2026. The announcement arrives amid mounting concerns about AI infrastructure economics, as enterprises demand proof that massive capital expenditures will deliver returns.
The Rubin platform represents what Nvidia calls "extreme co-design" across six chip types working together as a unified system. At its core sits the Vera Rubin superchip, which combines one Vera CPU and two Rubin GPUs on a single module. The Vera CPU introduces 88 custom Olympus cores and packs 227 billion transistors, while each Rubin GPU contains 336 billion transistors.
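Those figures imply a combined budget of roughly 899 billion transistors per superchip. A quick tally, using only the per-chip numbers quoted above rather than any official Nvidia total:

```python
# Back-of-the-envelope transistor tally for one Vera Rubin superchip,
# based solely on the per-chip figures quoted on stage.
vera_cpu = 227e9           # Vera CPU: 227 billion transistors
rubin_gpu = 336e9          # each Rubin GPU: 336 billion transistors

superchip_total = vera_cpu + 2 * rubin_gpu   # 1 CPU + 2 GPUs per superchip
print(f"Transistors per superchip: {superchip_total / 1e9:.0f} billion")
# -> Transistors per superchip: 899 billion
```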
Huang's most striking claim addressed the AI bubble debate directly. The Vera Rubin NVL72 rack-scale system promises to cut token generation costs to one-tenth those of the previous Blackwell platform while delivering five times the inference performance. If accurate, that cost reduction could fundamentally alter enterprise AI economics and accelerate adoption of resource-intensive applications such as autonomous agents and advanced reasoning models.
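To make the scale of that claim concrete, here is a minimal sketch of what a tenfold cost cut means for a sustained inference workload. The baseline price and token volume are hypothetical placeholders, not Nvidia or cloud-provider figures:

```python
# Illustrative only: how a 10x reduction in cost per token compounds
# over a month of heavy inference traffic. Both inputs are made up.
blackwell_cost_per_m_tokens = 1.00      # hypothetical $ per 1M tokens
rubin_cost_per_m_tokens = blackwell_cost_per_m_tokens / 10

monthly_tokens = 500e9                  # hypothetical agent workload
for name, cost in [("Blackwell", blackwell_cost_per_m_tokens),
                   ("Rubin", rubin_cost_per_m_tokens)]:
    monthly_bill = monthly_tokens / 1e6 * cost
    print(f"{name}: ${monthly_bill:,.0f}/month")
# Blackwell: $500,000/month
# Rubin:     $50,000/month
```

At that scale, the same budget buys ten times the token volume, which is what makes always-on agentic workloads start to pencil out.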
The platform's technical specifications underpin these ambitious claims. NVLink 6 scale-up networking boosts per-GPU fabric bandwidth to 3.6 terabytes per second bidirectional. Each NVLink 6 switch provides 28 terabytes per second of bandwidth, with nine switches per Vera Rubin NVL72 rack delivering 260 terabytes per second of total scale-up bandwidth. Nvidia claims the system "provides more bandwidth than the entire internet."
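The 260-terabyte headline figure follows directly from the per-GPU number: an NVL72 rack houses 72 GPUs, and 72 times 3.6 TB/s works out to just under 260 TB/s. A quick arithmetic check:

```python
# Sanity check of the quoted aggregate scale-up bandwidth, using the
# per-GPU figure (3.6 TB/s) and the 72 GPUs implied by the NVL72 name.
gpus_per_rack = 72
per_gpu_tb_s = 3.6

aggregate_tb_s = gpus_per_rack * per_gpu_tb_s
print(f"Total scale-up bandwidth: {aggregate_tb_s:.1f} TB/s")
# -> 259.2 TB/s, which Nvidia rounds to 260 TB/s
```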
Huang emphasized that the platform arrives as AI computing demand surges for both training and inference. The timing matters because investors increasingly question whether current infrastructure spending levels are sustainable. Oracle and Broadcom shares recently dropped on warnings about margin pressure from AI infrastructure costs, while concerns about circular funding arrangements have intensified.
Major cloud providers will deploy Vera Rubin systems first. AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will offer Rubin-based instances in the second half of 2026. Microsoft plans to integrate Vera Rubin NVL72 rack-scale systems into next-generation AI data centers including future Fairwater AI superfactory sites. AI-focused cloud providers CoreWeave, Lambda, Nebius, and Nscale will also deploy the platform.
The platform introduces AI-native storage through the Nvidia Inference Context Memory Storage Platform, designed specifically for the key-value cache tier that enables long-context inference. Nvidia claims this delivers five times higher tokens per second, five times better performance per total cost of ownership dollar, and five times better power efficiency compared to previous approaches.
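The motivation for a dedicated tier is visible in the arithmetic of long-context serving: the key-value (KV) cache grows linearly with context length and can quickly outgrow GPU memory. A rough sizing sketch, with all model dimensions hypothetical rather than tied to any specific model in this announcement:

```python
# Rough estimate of KV-cache size for a single long-context request.
# All model dimensions below are hypothetical, chosen to resemble a
# large transformer with grouped-query attention.
layers = 80
kv_heads = 8
head_dim = 128
bytes_per_value = 2          # FP16/BF16
context_tokens = 1_000_000   # a 1M-token context

# Keys and values (hence the factor of 2), per layer, per token.
kv_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * context_tokens
print(f"KV cache for one 1M-token request: {kv_bytes / 2**30:.0f} GiB")
# -> ~305 GiB for a single request, on the order of (or beyond) one
#    GPU's HBM capacity, which is why a dedicated cache tier helps.
```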
Huang addressed funding concerns in his opening remarks, countering AI bubble narratives by explaining that companies are shifting research and development budgets from classical computing methods to artificial intelligence. "People ask, where is the money coming from? That's where the money is coming from," he stated, referring to the estimated ten trillion dollars of computing infrastructure being modernized.
The announcement extends beyond data centers. Nvidia revealed Alpamayo, a family of open reasoning models for autonomous vehicles, demonstrating the company's push into physical AI applications. The first passenger car to feature Alpamayo, built on the Nvidia DRIVE platform, will be the all-new Mercedes-Benz CLA, bringing AI-defined driving to the United States market.
Competition is intensifying. AMD launched its Instinct MI500 series, claiming 1,000 times the performance of the MI300X, and unveiled its Helios rack-scale AI architecture to compete directly with Nvidia's NVLink systems. Meanwhile, Google and Amazon continue developing custom AI processors through partnerships with Anthropic, which already uses their chips to power Claude.
Nvidia maintains advantages through its CUDA software ecosystem and integrated approach spanning chips, networking, storage, and software. The company's ability to deliver annual platform updates while maintaining backward compatibility has proven difficult for competitors to match.