Cohere Launches Transcribe: The New Global Standard in Open-Source Speech Recognition Accuracy

Cohere has released Transcribe, a state-of-the-art 2B-parameter speech-to-text model that outperforms industry giants.

NewDecoded

Published Mar 27, 2026

2 min read

Image by Cohere


Cohere today announced the release of Transcribe, an open-weights automatic speech recognition model that sets a new performance benchmark. Currently ranking first on the Hugging Face Open ASR Leaderboard, the model achieves a record-low word error rate of 5.42%. This launch marks the company's significant expansion into the audio modality for enterprise applications.
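For readers unfamiliar with the metric, word error rate is the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The sketch below is a minimal illustration of that definition, assuming simple lowercase whitespace tokenization; the function name `word_error_rate` is hypothetical, and the leaderboard's own text normalization and per-dataset averaging are not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace after lowercasing.
    Real benchmarks apply a more careful text normalizer first.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

On this scale, a score of 5.42% corresponds to roughly five or six word errors per hundred reference words.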

The system utilizes a 2-billion-parameter Conformer-based architecture designed specifically for high-efficiency processing. By dedicating over 90% of its parameters to the encoder, Cohere ensures that Transcribe remains fast enough for real-time production environments. The model is available under the permissive Apache 2.0 license for immediate local or cloud deployment.

In rigorous testing, Transcribe consistently outperformed established competitors such as OpenAI's Whisper Large v3 and ElevenLabs Scribe v2. It demonstrates particular strength in challenging environments like noisy boardrooms and diverse global accents. Human evaluators preferred Transcribe's output in the majority of head-to-head comparisons across multiple languages.

Beyond accuracy, the model is built for massive throughput, capable of processing up to 525 minutes of audio per minute. Its compact size allows it to run effectively on consumer-grade hardware or at the edge. Developers can access the technology via Hugging Face or through Cohere’s managed Model Vault platform.

This release serves as the foundation for deeper integration with North, Cohere's agent orchestration platform. The company aims to move beyond simple transcription toward comprehensive enterprise speech intelligence. This shift allows businesses to embed high-fidelity audio processing directly into their private infrastructure.
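To put the throughput figure quoted above in concrete terms, the following back-of-the-envelope sketch converts "525 minutes of audio per minute" into wall-clock processing time. It treats 525x as a best-case speedup (the article says "up to") and makes no assumption about hardware or batching; the name `processing_seconds` is hypothetical.

```python
# Peak throughput quoted in the article: audio minutes per wall-clock minute.
PEAK_SPEEDUP = 525.0


def processing_seconds(audio_minutes: float, speedup: float = PEAK_SPEEDUP) -> float:
    """Estimated wall-clock seconds to transcribe `audio_minutes` of audio.

    Assumption: throughput is constant at `speedup` times real time,
    which is an upper bound rather than a typical figure.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_minutes / speedup * 60.0


if __name__ == "__main__":
    for label, minutes in [("1-hour meeting", 60), ("8-hour archive", 480)]:
        print(f"{label}: about {processing_seconds(minutes):.1f} s at peak throughput")
```

At that peak rate, an hour of audio would take just under seven seconds to transcribe; real deployments should expect slower results depending on hardware and audio conditions.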


Decoded Take

The release of Transcribe represents a pivotal shift where high-performance speech recognition is no longer gatekept by closed-source providers. By providing a model that is both more accurate than Whisper and permissive in its licensing, Cohere is commoditizing the entry point for multimodal AI agents. This move pressures legacy providers to justify their costs while empowering enterprises to keep sensitive audio data within their own secure perimeters. As speech becomes the primary interface for AI interaction, this launch positions Cohere as a critical infrastructure layer for the next generation of voice-activated business automation.
