Cohere's Open-Weight ASR Model Hits 5.4% WER, Disrupting Production
Cohere has launched Transcribe, an open-weight ASR model with a remarkable 5.42% word error rate. This model offers enterprises state-of-the-art accuracy, comparable to closed APIs, while allowing on-premise deployment to address data residency, control, and latency concerns. Transcribe currently leads the Hugging Face ASR leaderboard, outperforming Whisper and other industry leaders.

Cohere has unveiled Transcribe, an open-weight automatic speech recognition (ASR) model with an average word error rate (WER) of 5.42%. Announced on March 30, 2026, the model is positioned as a replacement for closed-source speech APIs in demanding enterprise production pipelines.
Enterprises previously faced a difficult choice: highly accurate but proprietary APIs with potential data residency issues, or open models that often sacrificed accuracy for deployability and control. Cohere's Transcribe, licensed under Apache-2.0, aims to eliminate this compromise by offering state-of-the-art accuracy alongside the flexibility and control of an open-weight model.
Setting a New Standard for ASR Accuracy
Transcribe, accessible via Cohere’s API or within its Model Vault as cohere-transcribe-03-2026, has 2 billion parameters. Its average WER of 5.42% means roughly 5.4 word-level errors (substitutions, deletions, or insertions) for every 100 words in the reference transcripts, fewer than the competing models it was benchmarked against. This focus on minimizing WER was deliberate, with Cohere prioritizing production readiness from the outset.
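For readers unfamiliar with the metric, WER is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below illustrates that standard definition only; it is not Cohere's or the leaderboard's evaluation code. The function name and the sample sentences are made up for illustration, and real benchmarks typically normalize casing and punctuation before scoring, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name); whitespace tokenization only,
    with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("jumps" -> "jumped") and one deletion ("the").
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

In this example two errors against nine reference words give a WER of about 22%. A 5.42% average corresponds to far fewer errors per sentence than that.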
The model's training spans 14 languages, including English, French, German, Italian, Spanish, Greek, Dutch, Polish, Portuguese, Chinese, Japanese, Korean, Vietnamese, and Arabic. While Cohere did not specify the particular Chinese dialect, the broad linguistic coverage suggests a wide applicability for global enterprises.
Empowering Enterprise Self-Hosting and Control
A key differentiator for Transcribe is its open-weight nature, enabling organizations to deploy the model directly on their own local GPU infrastructure. This capability addresses critical concerns such as data residency, latency, and cost, which are often associated with routing sensitive audio data through external, closed APIs.
Unlike OpenAI's Whisper, which launched as a research model under an MIT license, Transcribe is positioned for commercial use from its initial release. Early adopters have highlighted the significance of this commercial-ready, open-weight approach for enterprise deployments, particularly for teams seeking to bring audio data workloads in-house. Cohere also says Transcribe has a more manageable inference footprint on local GPUs, describing the model as extending the “Pareto frontier” of accuracy versus throughput within the 1B+ parameter model cohort.
Outperforming Industry Stalwarts
Cohere’s Transcribe has quickly risen to prominence and currently tops the Hugging Face ASR leaderboard. Its 5.42% average WER beats several established models, including OpenAI’s Whisper Large v3, which powers ChatGPT’s voice features and recorded a 7.44% WER.
Other notable competitors like ElevenLabs Scribe v2 logged a 5.83% WER, and Qwen3-ASR-1.7B stood at 5.76%, both trailing Transcribe’s accuracy. Beyond the leaderboard, Transcribe demonstrated strong performance on specific datasets: 8.15% on the AMI dataset (for meeting understanding) and 5.87% on the Voxpopuli dataset (for diverse accent understanding), a score only narrowly beaten by Zoom Scribe.
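To put the leaderboard gaps in perspective, the short sketch below converts the reported averages into absolute and relative differences. The WER figures are the ones quoted in this article; the helper name is illustrative, and the calculation is simple arithmetic rather than anything published by Cohere or Hugging Face.

```python
# Average WER figures (in percent) as reported in this article.
TRANSCRIBE_WER = 5.42
COMPETITOR_WER = {
    "Whisper Large v3": 7.44,
    "ElevenLabs Scribe v2": 5.83,
    "Qwen3-ASR-1.7B": 5.76,
}


def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` lowers the error rate of `baseline`."""
    return (baseline - candidate) / baseline * 100.0


for name, wer in COMPETITOR_WER.items():
    absolute_gap = wer - TRANSCRIBE_WER
    relative_gap = relative_reduction(wer, TRANSCRIBE_WER)
    print(f"{name}: {absolute_gap:.2f} points absolute, {relative_gap:.1f}% relative")
```

By this arithmetic, Transcribe makes roughly 27% fewer errors than Whisper Large v3, but only about 7% and 6% fewer than ElevenLabs Scribe v2 and Qwen3-ASR-1.7B respectively, so the lead over its closest rivals is narrow.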
Implications for Modern Workflows
For engineering teams building AI applications such as retrieval-augmented generation (RAG) pipelines or agent workflows that take audio input, Transcribe offers a path to production-grade transcription without the data residency and latency penalties of closed APIs. Deploying on-premises keeps both the audio data and its processing under the organization's own control.
The model's launch marks a significant shift in the ASR landscape, providing enterprises with a powerful, flexible, and accurate tool to integrate voice capabilities deeply into their operations, ultimately driving new levels of automation and insight from audio data.
FAQ
Q: What is Cohere's Transcribe model and why is it significant?
A: Transcribe is Cohere's new open-weight Automatic Speech Recognition (ASR) model, notable for achieving a low average word error rate (WER) of 5.42%. Its significance lies in offering state-of-the-art accuracy alongside the ability for enterprises to self-host the model, addressing data residency and control issues often associated with closed-source speech APIs.
Q: How does Transcribe compare in performance to other leading ASR models?
A: Transcribe currently leads the Hugging Face ASR leaderboard with its 5.42% average WER. It outperforms prominent models such as OpenAI’s Whisper Large v3 (7.44% WER), ElevenLabs Scribe v2 (5.83% WER), and Qwen3-ASR-1.7B (5.76% WER).
Q: What are the main benefits for enterprises adopting Cohere's Transcribe?
A: Enterprises can benefit from Transcribe’s high accuracy for critical voice-enabled workflows, alongside the flexibility of local deployment on their own GPU infrastructure. This allows for greater control over data residency, reduced latency, and potentially lower costs compared to relying on external closed APIs, making it ideal for RAG pipelines and agent workflows.