Real-Time Speech to Text
With Ultra-Low Latency

Scribe v2 Realtime is the most accurate real-time transcription model with 150ms latency across 90+ languages. Available via API.

Introducing Scribe v2 Realtime,
built for speed and accuracy

Ultra-fast, ultra-accurate, and built for live speech. Scribe v2 Realtime delivers instant transcription for agents, meetings, and conversational AI.

High Accuracy

Trained on diverse global data and fine-tuned for natural speech, Scribe achieves industry-best Word Error Rates across major languages and accents.

Hey, I booked an appointment for Friday afternoon, but I need to reschedule it to next Tuesday instead. If that's possible please.

~150ms Median Latency

Ultra-low Latency

Stream audio and receive transcripts in ~150ms, enabling real-time understanding for live agents, meetings, and conversational AI.
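
Streaming clients usually send audio in small fixed-duration chunks rather than as one file. The sketch below only shows the chunk-size arithmetic for raw PCM. The function name `chunk_pcm`, the 100 ms chunk length, and the 16 kHz, 16-bit mono defaults are illustrative assumptions, not documented parameters of the Scribe v2 Realtime API.

```python
from typing import Iterator


def chunk_pcm(
    audio: bytes,
    sample_rate: int = 16000,   # assumed rate; the page lists 8-48 kHz as supported
    chunk_ms: int = 100,        # assumed chunk duration, chosen for illustration
    bytes_per_sample: int = 2,  # 16-bit mono PCM (assumption)
) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM, each covering chunk_ms of audio."""
    chunk_bytes = sample_rate * bytes_per_sample * chunk_ms // 1000
    if chunk_bytes <= 0:
        raise ValueError("chunk size must be positive")
    for offset in range(0, len(audio), chunk_bytes):
        # The final chunk may be shorter than chunk_bytes.
        yield audio[offset:offset + chunk_bytes]
```

With these defaults, one second of audio (32,000 bytes) splits into ten chunks of 3,200 bytes each. Smaller chunks reduce the delay before the first audio is sent, at the cost of more messages.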

Purpose-built for Agents and voice apps

Scribe v2 Realtime is purpose-built for developers creating conversational agents...

Polish, Japanese, Mandarin
Capture speech accurately in 90+ languages

Scribe v2 Realtime ensures consistent understanding everywhere...

Multiple audio formats

Supports PCM (8–48 kHz) and μ-law encoding for compatibility with telephony systems.
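
For readers unfamiliar with μ-law, the sketch below shows how 16-bit linear PCM maps to 8-bit μ-law bytes using the standard G.711 companding scheme. It is a self-contained illustration of the encoding itself. The function names are hypothetical, and nothing here reflects how Scribe v2 Realtime handles audio internally.

```python
import struct

BIAS = 0x84    # bias added before the segment search (standard G.711 value)
CLIP = 32635   # largest magnitude that still fits after adding the bias


def linear_to_mulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as one mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): the highest set bit between bit 14 and bit 7.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def pcm16_to_mulaw(pcm: bytes) -> bytes:
    """Convert little-endian 16-bit PCM bytes to mu-law bytes (half the size)."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return bytes(linear_to_mulaw(s) for s in samples)
```

Silence (a sample value of 0) encodes to `0xFF`, and the output is half the size of the 16-bit input, which is one reason the format is common on telephony links.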

Voice Activity Detection

Detects when speech starts and stops, segmenting audio precisely.
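
To make the idea of segmenting by speech start and stop concrete, here is a minimal energy-threshold sketch. It is a simple textbook approach, not the detection method Scribe v2 Realtime uses. The names `frame_rms` and `detect_segments`, the threshold, the frame size, and the hangover count are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_segments(
    samples: Sequence[int],
    frame_size: int = 160,      # 20 ms at 8 kHz (assumed)
    threshold: float = 500.0,   # assumed RMS level separating speech from silence
    hangover_frames: int = 3,   # silent frames required before closing a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) sample offsets of regions whose energy exceeds threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    silent_run = 0
    n_frames = len(samples) // frame_size
    for i in range(n_frames):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= hangover_frames:
                # End the segment at the first silent frame of the run.
                segments.append((start * frame_size, (i - silent_run + 1) * frame_size))
                start = None
                silent_run = 0
    if start is not None:
        # Audio ended while a segment was still open.
        segments.append((start * frame_size, (n_frames - silent_run) * frame_size))
    return segments
```

The hangover count keeps short pauses inside a sentence from splitting it into separate segments. Production detectors typically use trained models rather than a fixed energy threshold.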

Manual Commit control

Gives developers control over when to finalize transcripts.
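
One way to picture manual commit is as a client-side buffer that holds a changing partial transcript until the application decides to finalize it. The sketch below is a hypothetical in-memory model of that pattern. `TranscriptBuffer` and its methods are invented for illustration and are not part of any documented API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptBuffer:
    """Hypothetical holder for partial and committed transcript text."""

    committed: List[str] = field(default_factory=list)
    partial: str = ""

    def update_partial(self, text: str) -> None:
        # Each partial result replaces the previous one; nothing is final yet.
        self.partial = text

    def commit(self) -> str:
        """Finalize the current partial text and return it."""
        text = self.partial.strip()
        if text:
            self.committed.append(text)
        self.partial = ""
        return text

    def full_text(self) -> str:
        """All committed text so far, joined with spaces."""
        return " ".join(self.committed)
```

An application might call `commit()` when the user releases a push-to-talk button or when a form field is complete, instead of waiting for a detected pause.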

Frequently asked questions

The Creative Platform is a comprehensive suite of tools designed for generating, editing, and localizing premium audio and video content at scale.

You can create lifelike voiceovers for articles, books, video dubbing across languages, and custom AI voices for your brand.

Yes, our Enterprise and Business plans include commercial rights and indemnification for generated content.

Our localization tools automatically transcribe, translate, and re-dub your video or audio content while preserving the original speaker's voice.

Absolutely. Creating a Professional Voice Clone (PVC) allows you to create a unique, brand-consistent voice for all your applications.

Studio is our all-in-one editor that gives you precise control over pronunciation, timing, emotion, and emphasis in your generated audio.

Yes, for enterprise clients, we offer managed services to help with large-scale implementation, custom training, and support.

The Creative Platform focuses on content generation (audio/video files), whereas the Agents Platform is for building interactive conversational AI agents.

Yes, we offer team features like shared workspaces, role-based access control, and credit pooling.

Our platform is SOC 2 compliant and GDPR-ready, and it supports SSO, making it ready for enterprise deployment.

Yes, on our Team and Enterprise plans, usage credits are pooled and can be shared across all members.

Yes, our comprehensive API allows you to integrate all our generation and editing capabilities directly into your own applications.