
ElevenLabs’ AI-powered transcription tool, Scribe, combines industry-leading accuracy, support for 99 languages, advanced speaker tracking, and recognition of non-verbal audio cues, all at an affordable price. (Source: Image by RR)
Recognizes Laughter, Sound Effects, and Background Noise for Richer Transcriptions
ElevenLabs has launched Scribe, an advanced speech-to-text model that goes beyond traditional transcription by accurately interpreting speaker roles, non-verbal sounds, and extreme speaking speeds. The company, previously known for its work in voice synthesis and cloning, is now moving into speech recognition. Scribe supports 99 languages, including underrepresented ones like Serbian and Malayalam, and outperforms competitors like Google and OpenAI in benchmarks, achieving the lowest word error rates (WER) across multiple datasets.
What sets Scribe apart is its ability to analyze full audio context, recognizing background sounds, music, and non-verbal cues like laughter. As noted in the-decoder.com, one of Scribe’s most impressive features is diarization, which can accurately assign speech to specific speakers, tracking up to 32 distinct voices in a single recording. This makes Scribe particularly valuable for industries that rely on precise documentation, such as automated transcription services, subtitles for media content, and call center analytics.
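To make the idea of diarization concrete, here is a minimal sketch of how speaker-labeled, timestamped words could be merged into speaker turns. The `Word` record and its field names (`text`, `start`, `end`, `speaker`) are assumptions chosen for illustration and are not taken from Scribe's actual output format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical word record; the field names are illustrative assumptions.
    text: str
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording
    speaker: str   # diarization label, e.g. "speaker_0"


def group_into_turns(words):
    """Merge consecutive words from the same speaker into one turn."""
    turns = []
    for w in words:
        if turns and turns[-1]["speaker"] == w.speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + w.text
            turns[-1]["end"] = w.end
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append(
                {"speaker": w.speaker, "start": w.start, "end": w.end, "text": w.text}
            )
    return turns


words = [
    Word("Hello", 0.0, 0.4, "speaker_0"),
    Word("there", 0.5, 0.8, "speaker_0"),
    Word("Hi", 1.0, 1.2, "speaker_1"),
    Word("again", 1.5, 1.9, "speaker_0"),
]
turns = group_into_turns(words)
for t in turns:
    print(f'{t["speaker"]} [{t["start"]:.1f}-{t["end"]:.1f}]: {t["text"]}')
```

Turns of this shape are the kind of structure a subtitle generator or a call center analytics pipeline would typically consume.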
Independent tests confirm that Scribe v1 achieves a 7.7% WER, surpassing competing systems in accuracy. Its structured data output, which includes detailed word-level timestamps, enhances its versatility for various applications, from legal proceedings to customer service optimization. The system’s capability to operate effectively in noisy environments further solidifies its competitive advantage in the speech recognition market.
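For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance. The normalization step (lowercasing and splitting on whitespace) is a simplifying assumption; published benchmarks usually apply more careful text normalization, and this is not the evaluation code behind the figure quoted above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    # Simplified normalization (assumption): lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER: {wer:.1%}")
```

On this scale, a 7.7% WER corresponds to roughly one word-level error per 13 reference words.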
ElevenLabs has also introduced a cost-effective pricing model, charging only $0.40 per hour of transcription—matching OpenAI’s Whisper—while offering a 50% discount for early adopters. Additionally, a low-latency, real-time version of Scribe is in development, which could revolutionize industries requiring live transcription services. With its groundbreaking features and affordability, Scribe positions ElevenLabs as a formidable force in AI-driven speech technology.
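As a quick worked example of what that pricing means, the sketch below estimates cost from the two figures quoted above: $0.40 per hour and a 50% early-adopter discount. The function and constant names are illustrative, and the calculation assumes simple per-hour billing with no minimums or plan tiers, which the article does not specify.

```python
from decimal import Decimal

RATE_PER_HOUR = Decimal("0.40")   # USD per hour of audio, as quoted in the article
EARLY_DISCOUNT = Decimal("0.50")  # 50% early-adopter discount, as quoted


def transcription_cost(hours, early_adopter=False):
    """Estimated cost in USD, assuming plain per-hour billing (an assumption)."""
    cost = Decimal(str(hours)) * RATE_PER_HOUR
    if early_adopter:
        cost *= 1 - EARLY_DISCOUNT
    return cost.quantize(Decimal("0.01"))


print(transcription_cost(100))                      # 100 hours at list price
print(transcription_cost(100, early_adopter=True))  # same volume with the discount
```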
Read more at the-decoder.com