NVIDIA Launches Dynamo 1.0 as the Distributed Operating System for Global AI Factories

NVIDIA has officially launched Dynamo 1.0, an open-source inference operating system designed to orchestrate large-scale AI workloads across data center clusters.

Decoded

Published Mar 17, 2026

Image by NVIDIA

The Distributed Foundation for AI Factories

NVIDIA announced the production release of Dynamo 1.0 at GTC 2026, marking a significant shift in how artificial intelligence is deployed at scale. This open-source software functions as a distributed operating system, managing GPU and memory resources to power complex generative and agentic AI workloads. By orchestrating these resources across clusters, Dynamo enables cloud providers and enterprises to deliver high-performance inference with unprecedented efficiency.

Performance Gains on Blackwell Architecture

The platform addresses the growing complexity of scaling AI in data centers, where unpredictable bursts of traffic and varying request sizes create resource bottlenecks. In recent industry benchmarks, Dynamo boosted the inference performance of NVIDIA Blackwell GPUs by up to 7x. This optimization drastically lowers the cost per token and increases the revenue potential for millions of GPUs already deployed in the field.
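
The cost-per-token arithmetic behind that claim is straightforward. With hypothetical numbers (none of these figures come from NVIDIA's benchmarks), a toy calculation shows how a 7x throughput gain at fixed hardware cost translates directly into a 7x lower cost per token:

```python
# Toy cost-per-token calculation. All figures are hypothetical and
# are NOT taken from NVIDIA's published benchmarks.

gpu_hour_cost = 4.00               # hypothetical $/GPU-hour
baseline_tokens_per_sec = 1_000    # hypothetical per-GPU throughput
speedup = 7                        # the reported "up to 7x" gain

def cost_per_million_tokens(tokens_per_sec: float) -> float:
    """Dollars to generate one million tokens on one GPU."""
    tokens_per_hour = tokens_per_sec * 3600
    return gpu_hour_cost / tokens_per_hour * 1_000_000

baseline = cost_per_million_tokens(baseline_tokens_per_sec)
optimized = cost_per_million_tokens(baseline_tokens_per_sec * speedup)

print(f"baseline:  ${baseline:.3f} per 1M tokens")
print(f"optimized: ${optimized:.3f} per 1M tokens")
```

Because hardware cost is held constant while throughput scales, cost per token falls by exactly the speedup factor; this is why the same gain can be framed as higher revenue potential for GPUs already deployed.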

Intelligent Resource Management

Dynamo achieves these gains by splitting inference work across multiple GPUs and implementing smarter traffic control mechanisms. It can move data seamlessly between active GPU memory and lower-cost storage, which is critical for long-running agentic AI systems that require persistent context. This ability to route requests based on existing memory caches reduces wasted computation and helps operators bypass traditional hardware limits.
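
The cache-aware routing idea can be illustrated with a minimal sketch. This is not Dynamo's API; it is a hypothetical router that scores workers by how much of a request's token prefix they already hold in KV cache, so the request lands where the least recomputation is needed:

```python
# Minimal sketch of KV-cache-aware routing (hypothetical; not Dynamo's API).
# Each worker tracks which token-prefix blocks it has cached; the router
# sends a request to the worker with the longest matching cached prefix,
# so fewer prompt tokens must be recomputed.

from dataclasses import dataclass, field

BLOCK = 4  # tokens per cache block (hypothetical granularity)

@dataclass
class Worker:
    name: str
    cached_blocks: set = field(default_factory=set)  # hashes of cached blocks

    def cached_prefix_len(self, tokens: list) -> int:
        """Count leading tokens already covered by cached blocks."""
        n = 0
        for i in range(0, len(tokens) - BLOCK + 1, BLOCK):
            block = tuple(tokens[i:i + BLOCK])
            if hash(block) not in self.cached_blocks:
                break
            n += BLOCK
        return n

def route(workers, tokens):
    """Pick the worker that can reuse the most cached prefix tokens."""
    return max(workers, key=lambda w: w.cached_prefix_len(tokens))

# Example: worker "a" already holds the first two blocks of this prompt.
prompt = list(range(12))
a = Worker("a", {hash(tuple(prompt[0:4])), hash(tuple(prompt[4:8]))})
b = Worker("b")
print(route([a, b], prompt).name)  # "a": 8 cached tokens vs 0
```

A real system would also weigh current load and memory pressure, but the core trade-off is the same: reusing an existing cache beats recomputing a prompt from scratch.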

Widespread Industry Adoption

Major cloud service providers including AWS, Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure have already integrated the platform into their managed environments. NVIDIA has also ensured compatibility with popular open-source frameworks like LangChain, vLLM, SGLang, and LMCache to accelerate community development. Global enterprises such as PayPal, Pinterest, and ByteDance are currently using the software to power real-time, multimodal AI experiences for hundreds of millions of users.

Leadership Perspectives

Jensen Huang, founder and CEO of NVIDIA, described Dynamo as the engine of intelligence that powers every query and application in the modern data center. Partners like CoreWeave and Together AI noted that as AI moves into continuous production, the software orchestration layer becomes as vital as the hardware itself. They emphasized that Dynamo provides the durability and high-performance routing required for the next wave of agentic AI workloads.

Availability and Integration

Developers can access Dynamo 1.0 today to begin building more resilient AI infrastructure using standalone modules like KVBM for memory management. The software includes NVIDIA NIXL for rapid data movement and NVIDIA Grove for simplified Kubernetes scaling across complex topologies. More information and technical guides for implementation are available on the official NVIDIA Dynamo webpage.
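
The role a memory manager like KVBM plays, keeping hot cache blocks on the GPU while demoting cold ones to cheaper tiers rather than discarding them, can be approximated with a toy two-tier cache. This is a conceptual sketch only, not KVBM's actual interface:

```python
# Toy two-tier KV-cache store (conceptual sketch; not KVBM's interface).
# A small "GPU" tier holds hot blocks; when it fills, the least recently
# used block is demoted to a larger "host/storage" tier instead of being
# discarded, so long-running agents keep their context.

from collections import OrderedDict

class TieredKVCache:
    def __init__(self, gpu_capacity: int):
        self.gpu = OrderedDict()   # hot tier, kept in LRU order
        self.host = {}             # cold tier (unbounded in this toy)
        self.gpu_capacity = gpu_capacity

    def put(self, block_id, kv_data):
        self.gpu[block_id] = kv_data
        self.gpu.move_to_end(block_id)
        if len(self.gpu) > self.gpu_capacity:
            victim, data = self.gpu.popitem(last=False)  # evict LRU block
            self.host[victim] = data                     # demote, don't drop

    def get(self, block_id):
        if block_id in self.gpu:
            self.gpu.move_to_end(block_id)
            return self.gpu[block_id]
        if block_id in self.host:                        # promote on reuse
            self.put(block_id, self.host.pop(block_id))
            return self.gpu[block_id]
        return None                                      # must be recomputed

cache = TieredKVCache(gpu_capacity=2)
for i in range(3):
    cache.put(i, f"kv{i}")
print(sorted(cache.gpu), sorted(cache.host))  # [1, 2] [0]
```

In production the cold tier would be host RAM or NVMe and transfers would go over a fast data-movement layer such as NIXL, but the policy shape (evict-and-demote, promote-on-reuse) is the essence of keeping persistent context affordable.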

Decoded Take

NVIDIA is effectively redefining the hardware-software stack by treating inference not as a simple task, but as a complex resource management problem analogous to a computer operating system. By abstracting the orchestration of memory and compute across massive GPU clusters, NVIDIA is ensuring that its hardware remains the indispensable foundation for the agentic era of AI. This move commoditizes the orchestration layer via open source while simultaneously locking users into the high-performance optimizations only possible on NVIDIA silicon. It signals that the future of AI profitability lies in squeezing maximum efficiency out of every watt and byte across the entire data center.
