CoreWeave debuts ‘Sandboxes’ to accelerate AI reinforcement learning and agent development

CoreWeave, Inc., the Roseland-headquartered AI hyperscaler known as “The Essential Cloud for AI,” on Thursday announced the launch of CoreWeave Sandboxes. This new execution layer provides AI researchers and platform teams with secure, isolated environments specifically designed for high-stakes workflows like reinforcement learning (RL), agent tool use, and model evaluation.

As AI transitions from simple text generation to “agentic” systems that take real-world actions, the need for safe, scalable code execution has become a critical bottleneck. CoreWeave Sandboxes addresses this by integrating these isolated environments directly into a customer’s existing compute infrastructure.

Traditionally, organizations have relied on fragmented, custom-built systems or third-party products to run code safely during the training process. These disconnected approaches often lead to reliability issues and operational sprawl as workloads scale.

CoreWeave Sandboxes provides a unified solution through two distinct models:

On-Cluster: For platform teams using the CoreWeave Kubernetes Service (CKS) to run sandboxes alongside their primary AI training jobs.
Serverless: In partnership with Weights & Biases (W&B), providing researchers with enterprise-grade isolation via a simple Python client, requiring no cluster provisioning or infrastructure management.

“CoreWeave Sandboxes closes the execution gap in reinforcement learning and agent workflows without requiring teams to build custom systems,” said Chen Goldberg, EVP of Product and Engineering at CoreWeave. “It behaves like the rest of their infrastructure—governed, observable, and close to the workflows already running on CoreWeave.”

The platform is already seeing adoption from global leaders in the AI space, including IBM Research and Mistral.

“Our reinforcement learning workflows spin up thousands of sandboxes in parallel per training step,” said Brian Belgodere, senior technical staff member at IBM Research. “CoreWeave Sandboxes solves a real gap in our stack: secure, isolated code execution at scale directly in our existing compute.”

Similarly, Roman Soletskyi, an AI scientist at Mistral, noted that the platform eliminated the need to manage separate clusters for different node types. “We now run hundreds of concurrent sandboxes on CPU nodes alongside Slurm training jobs on GPU nodes, all through a single setup,” he said.

The launch comes at a pivotal moment for the industry. According to Holger Mueller, VP and principal analyst at Constellation Research, enterprises are under immense pressure to build agentic automation quickly. “Purpose-built execution that stays inside existing training infrastructure reduces operational sprawl and removes the fragility of homegrown systems,” Mueller observed.

By keeping each sandbox isolated inside the core infrastructure, CoreWeave says, failures or memory spikes in one environment cannot affect other processes, providing a more resilient path for developing advanced AI agents.
