
NVIDIA Unveils Rubin Platform and Vera CPU to Power Next-Gen Agentic AI

NVIDIA's new six-chip architecture slashes AI training times and inference costs to accelerate mainstream adoption.

Decoded

Published Jan 6, 2026


Image by NVIDIA

A Giant Leap for AI Infrastructure

NVIDIA officially launched the Rubin platform at CES 2026, introducing a six-chip architecture designed to serve as the backbone for future AI supercomputers. This next-generation system features the high-performance Rubin GPU and the energy-efficient Vera CPU, specifically engineered to handle the growing demands of agentic reasoning. By integrating hardware and software through extreme co-design, the platform achieves a ten-fold reduction in inference token costs compared to the previous Blackwell generation.
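To put the ten-fold cost reduction in perspective, here is a back-of-envelope sketch. The baseline price and monthly token volume are hypothetical placeholders for illustration, not figures NVIDIA has published:

```python
# Hypothetical baseline; NVIDIA has not disclosed these dollar figures.
BLACKWELL_COST_PER_M_TOKENS = 2.00   # assumed USD per million tokens
COST_REDUCTION_FACTOR = 10           # ten-fold reduction cited at launch

rubin_cost = BLACKWELL_COST_PER_M_TOKENS / COST_REDUCTION_FACTOR
print(f"Rubin cost per million tokens: ${rubin_cost:.2f}")

# Savings at an assumed volume of 500 billion tokens served per month
monthly_tokens = 500e9
savings = (BLACKWELL_COST_PER_M_TOKENS - rubin_cost) * monthly_tokens / 1e6
print(f"Monthly savings: ${savings:,.0f}")
```

At any assumed baseline, the arithmetic is the same: a 10x cut turns per-token inference from a gating cost into a rounding error for many workloads.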

Powering Logic with the Vera CPU

The Vera CPU stands out as a core innovation, utilizing 88 custom Olympus cores to provide industry-leading power efficiency for large-scale data centers. It works alongside the Rubin GPU, which delivers 50 petaflops of compute power via its third-generation Transformer Engine. This synergy allows organizations to train massive mixture-of-experts models with a quarter of the GPUs previously required, significantly lowering the barrier to advanced AI research.

High-Performance Networking and Storage

High-speed communication is maintained through the sixth-generation NVLink interconnect, offering 3.6 terabytes per second of bandwidth per GPU. The architecture also incorporates the BlueField-4 DPU and Spectrum-6 Ethernet switches to create a seamless, resilient fabric for AI factories. These components work together to ensure that data flows between processing units with minimal latency and maximum reliability across massive clusters.
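For a sense of scale, the per-GPU NVLink figure implies the following aggregate bandwidth inside a single NVL72 rack. The 72-GPU count is inferred from the NVL72 naming, and simple multiplication is assumed; real topologies split this across many links:

```python
NVLINK_BW_PER_GPU_TBS = 3.6   # TB/s per GPU, sixth-generation NVLink
GPUS_PER_RACK = 72            # Vera Rubin NVL72 rack-scale system

aggregate_tbs = NVLINK_BW_PER_GPU_TBS * GPUS_PER_RACK
print(f"Aggregate NVLink bandwidth per rack: {aggregate_tbs:.1f} TB/s")
```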

Enabling Long-Term Memory for AI Agents

A major addition to the ecosystem is the Inference Context Memory Storage Platform, which addresses the specific needs of agentic AI. This new class of storage allows AI systems to share and reuse context data efficiently, improving the speed and accuracy of multi-step reasoning tasks. It effectively gives AI models a persistent memory that scales to meet the requirements of complex, long-duration interactions without clogging high-bandwidth memory.
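The core idea, reusing processed context rather than recomputing it on every agent step, can be sketched with a toy cache. The `ContextStore` class and its API below are invented for illustration and are not NVIDIA's actual interface:

```python
import hashlib

class ContextStore:
    """Toy cache: store expensive processed context keyed by prompt prefix."""

    def __init__(self):
        self._cache = {}

    def _key(self, prefix: str) -> str:
        return hashlib.sha256(prefix.encode()).hexdigest()

    def get_or_compute(self, prefix: str, compute):
        key = self._key(prefix)
        if key not in self._cache:            # miss: pay full processing cost once
            self._cache[key] = compute(prefix)
        return self._cache[key]               # hit: reuse the stored context

store = ContextStore()
calls = []

def expensive_processing(prefix):
    calls.append(prefix)                      # track how often the slow path runs
    return f"processed-context({len(prefix)})"

store.get_or_compute("system prompt + tool docs", expensive_processing)
store.get_or_compute("system prompt + tool docs", expensive_processing)
print(len(calls))  # the expensive step ran only once
```

A production system layers this same pattern over tiered storage so shared context survives beyond a single GPU's high-bandwidth memory, which is the gap the Inference Context Memory Storage Platform targets.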

Industry Adoption and Future Deployment

Tech giants including Microsoft, AWS, and Google Cloud have already announced plans to integrate the Rubin platform into their upcoming infrastructure. Microsoft will deploy the Vera Rubin NVL72 rack-scale systems as part of its Fairwater AI superfactories to serve enterprise and consumer applications. Other early adopters such as CoreWeave and Oracle Cloud Infrastructure expect to offer Rubin-based instances starting in the second half of 2026.

Building the Standard for AI Factories

The Rubin platform is currently in full production, marking NVIDIA's third-generation rack-scale architecture. With support from over 80 ecosystem partners, the launch sets a new standard for how modern AI systems are built, deployed, and secured. As AI moves from simple queries to autonomous agents, this infrastructure provides the necessary performance to support the next frontier of digital intelligence.

Decoded Take

This announcement marks the end of the raw compute race and the beginning of the era of efficiency and context management. By prioritizing a ten-fold reduction in inference costs and introducing specialized context memory, NVIDIA is pivoting toward a future where AI agents are integrated into everyday business processes rather than just operating as standalone tools. The extreme co-design of the Rubin platform suggests that the next generation of machine learning breakthroughs will depend as much on how chips communicate as on the raw power of any individual chip.
