News
Mar 6, 2026
NewDecoded

Image by CoreWeave
CoreWeave, Inc. has entered a multi-year strategic partnership with Perplexity to support Perplexity’s inference workloads on the specialized CoreWeave Cloud. Announced on March 4, 2026, the deal moves Perplexity’s search engine traffic to dedicated clusters designed for high-performance AI. The collaboration marks a shift for both companies as they scale production-grade AI services globally.
Under the new agreement, Perplexity will power its next-generation workloads using NVIDIA GB200 NVL72-powered clusters. These rack-scale systems are optimized for trillion-parameter model inference, providing the memory bandwidth required for real-time search responses. This ensures that Perplexity can maintain low latency for its Sonar and Search API ecosystem.
The partnership includes a reciprocal arrangement where CoreWeave will deploy Perplexity Enterprise Max across its own organization. CoreWeave employees will gain access to advanced research tools and internal knowledge search capabilities within a single platform. This internal rollout highlights the utility of Perplexity’s AI agents for enterprise data analysis and deep research.
Operationally, Perplexity has already begun running workloads on the CoreWeave Kubernetes Service. The deployment also leverages W&B Models to manage the lifecycle of AI models from experimentation to production. This technical stack allows Perplexity to scale resources dynamically to serve its 1.5 billion monthly queries.
Max Hjelm, senior vice president of revenue at CoreWeave, noted that production AI requires a cloud platform designed specifically to simplify compute operations. Perplexity’s Chief Business Officer, Dmitry Shevelenko, praised CoreWeave’s technical aptitude and partner-first mindset. Both leaders emphasized that the partnership aims to accelerate the growth of AI-native companies.
This agreement follows CoreWeave’s successful public listing on the Nasdaq in early 2025. It also builds on recent partnerships with other AI leaders like Runway and the Department of Energy. By securing Perplexity’s high-demand workloads, CoreWeave continues to differentiate itself from traditional hyperscalers through AI-first infrastructure.
This partnership signifies a pivot in the AI economy from model training to large-scale, production-ready inference. While much of the early AI boom focused on the race to train massive models, the current challenge lies in delivering those models to millions of users with sub-second latency and cost efficiency. By choosing specialized GB200 NVL72 clusters over generic cloud instances, Perplexity is betting that hardware specificity is the only way to sustain a consumer-grade search experience. For CoreWeave, securing one of the most demanding inference workloads in the world serves as a critical validation of its post-IPO strategy, positioning the company as the primary alternative to traditional hyperscalers for AI-native enterprises.