Cohere Transcribe

Open-source ASR model topping HuggingFace leaderboard — free API, 14 languages, enterprise-ready

Cohere launched Transcribe on March 26, 2026. It is a 2B-parameter open-source (Apache 2.0) automatic speech recognition (ASR) model that currently ranks #1 on the HuggingFace Open ASR Leaderboard with a 5.42% word error rate (WER), ahead of OpenAI Whisper Large v3 and ElevenLabs Scribe v2. It supports 14 languages and is built for enterprise production: small enough to run on consumer GPUs and fast enough for real-time transcription pipelines. The free API is available now with rate limits; Model Vault offers managed inference for production workloads. A planned integration into Cohere's North enterprise orchestration platform is intended to bring speech intelligence into agentic workflows.

Panel Reviews

The Builder

Developer Perspective

Ship

A leaderboard-topping ASR model with Apache 2.0 weights and a free API is a no-brainer for any project that needs transcription. The 2B size means I can self-host it on a single A10 without tears. Cohere finally entering audio is a big deal — they've been credible on text and this looks equally rigorous.
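The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates the memory needed for the weights alone, using the 2B parameter count from the review. The precision options are assumptions (the review does not say which formats the released weights use), and the estimate leaves out activations, audio buffers, and framework overhead, so treat it as a lower bound.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


# Parameter count taken from the review; the precisions are illustrative assumptions.
NUM_PARAMS = 2e9
PRECISIONS = {"fp32": 4, "fp16/bf16": 2, "int8": 1}

if __name__ == "__main__":
    for name, nbytes in PRECISIONS.items():
        gib = weight_memory_gib(NUM_PARAMS, nbytes)
        print(f"{name:>10}: {gib:.2f} GiB for weights")
```

At 16-bit precision the weights come to under 4 GiB, which leaves plenty of room on a 24 GB card such as the NVIDIA A10 for activations and batching.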

The Skeptic

Reality Check

Skip

A 5.42% WER on benchmark data is good, but benchmarks measure clean, lab-quality audio. Real enterprise audio (phone calls, meeting rooms, accented speakers, domain jargon) is a different world. I'd want to see numbers on domain-specific test sets before migrating any production workload off Whisper or Deepgram.
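For anyone who wants to run that domain-specific check themselves, WER is simple to compute: it is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch, not the leaderboard's scoring code. It only lowercases and splits on whitespace, which is an assumption; published numbers usually rely on heavier text normalization, so scores from this sketch are comparable only with each other.

```python
from typing import Iterable, List, Tuple


def _words(text: str) -> List[str]:
    # Simplistic normalization (assumption): lowercase and split on whitespace.
    return text.lower().split()


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Total word edits divided by total reference words over (reference, hypothesis) pairs."""
    total_edits = 0
    total_ref_words = 0
    for reference, hypothesis in pairs:
        ref, hyp = _words(reference), _words(hypothesis)
        total_edits += word_edit_distance(ref, hyp)
        total_ref_words += len(ref)
    if total_ref_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_ref_words


if __name__ == "__main__":
    # Hypothetical sample pairs; replace with human-verified transcripts of your own audio.
    sample = [
        ("the cat sat on the mat", "the cat sit on mat"),
        ("good morning", "good morning"),
    ]
    print(f"WER: {corpus_wer(sample):.2%}")
```

Running the same script over transcripts from each candidate model, on the same recordings with the same normalization, gives a like-for-like comparison on your own audio.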

The Futurist

Big Picture

Ship

This is Cohere planting a flag in the full enterprise AI stack — text, code, and now audio under one roof. When Transcribe plugs into North's orchestration platform, you have a fully sovereign enterprise AI pipeline. That's a genuinely compelling alternative to stitching together APIs from three different vendors.

The Creator

Content & Design

Ship

For content creators this is a proper Whisper upgrade — free to start, better accuracy, and downloadable for offline use. Podcast transcription, video captioning, voice-memo summaries — all suddenly cheaper or free. The 14-language support is also real, not just English-centric with degraded performance elsewhere.

Community Sentiment

Overall: 2,300 mentions
75% positive, 17% neutral, 8% negative

Hacker News: 610 mentions

Beating Whisper on WER with a smaller model called out as the headline result

Reddit: 740 mentions

r/MachineLearning thread debating whether leaderboard WER translates to real-world use

Twitter/X: 950 mentions

Apache 2.0 license praised for allowing commercial self-hosting without royalties