
The Most Accurate Speech to Text AI

Transcribe audio and video with 99% accuracy. Support for 22 languages, real-time streaming, and bulk processing. Perfect for podcasts, meetings, and content creation.


Experience real-time transcription with industry-leading accuracy

Realtime Performance

Transcribe with zero latency

Scribe v2 Realtime is built for conversational AI. With latency as low as 150ms, it enables fluid, natural voice interactions for any application.

Ultra-low Latency

Built for the speed of conversation, faster than human reaction time.

State-of-the-Art Accuracy

Industry-leading WER (Word Error Rate) for real-time transcription.
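
For readers new to the metric: WER is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below computes that standard definition in Python. The function name and the simple lowercase, whitespace-split normalization are assumptions made for illustration; they are not a description of how 60db evaluates Scribe v2.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(round(word_error_rate("the cat sat on the mat",
                                "the cat sat on mat"), 3))  # 0.167
```

Lower is better: a WER of 0.0 means the transcript matches the reference word for word. Real evaluations usually also normalize punctuation and numerals before scoring, which this sketch does not attempt.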

Transcription Speed (ms)
Scribe v2 Realtime: 150 ms
Competitor A: 450 ms
Competitor B: 850 ms
3x faster than industry average

"The speed of 60db is a game changer for our AI agents."

Scribe v2 standard

Transcribe, tag, and caption

Perfect for long-form content. Scribe v2 provides the highest standard of accuracy for audio files, complete with speaker diarization and automated captioning.

Speaker Diarization

Detect and label multiple speakers automatically with high precision.

Automated Captions

Generate SRT and VTT files for video content in seconds.

Keyterm Prompting

Provide rare words or technical terms to guide the transcription model.

Global Languages

Support for 22 languages with localized accents and context.
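
To give a sense of what a caption file in SRT format contains, here is a minimal Python sketch that turns timed transcript segments into SRT text. The segment tuples, speaker labels, and function names are hypothetical placeholders; they do not reflect the actual Scribe v2 response shape, which this page does not describe.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: sequence number, time range, caption text, blank line.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical diarized segments, invented for illustration only.
    sample = [
        (2.0, 7.5, "SPEAKER 1: Welcome to the show."),
        (8.0, 11.25, "SPEAKER 2: Thanks for having me."),
    ]
    print(to_srt(sample))
```

VTT files follow the same cue structure but start with a `WEBVTT` header line and use a period instead of a comma before the milliseconds.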

"Our goal at 60db is to make audio content accessible globally."


Built for ultimate creativity

Highly accurate, performant and secure Speech to Text models designed to power the next generation of audio apps.

Keyterm Prompting

Guide the model with rare words, acronyms, or technical jargon to ensure perfect transcription.

Dynamic Audio Tagging

Automatically detect and tag the type of audio, whether it's speech, background music, or noise.

Speaker Detection

Accurately separate and label different speakers in an audio file, and detect entity types.

Enterprise Grade

Secure infrastructure, compliant with SOC 2 and ISO 27001, built for critical business workflows.

Timestamp Accuracy

Get word-level timestamps that are perfectly synchronized with your audio input.

Multilingual Support

One model for the whole world. Scribe v2 supports 22 languages and 70+ accents.

Frequently Asked Questions

Everything you need to know about Scribe v2 and transcription.

Scribe v2 is our most accurate Speech to Text model yet, with industry-leading word error rates (WER). It's trained on over 1 million hours of diverse audio content to handle various accents, background noise, and overlapping speech.
