Synthszr Charts: the big AI brands competing for the podium

Sonic-3.5

#2 in Text-to-Speech (TTS)

cartesia · v3.5 · since 2026-06-16 · 9× · most recently June 30, 2026

Momentum

Cartesia Sonic-3.5 is a real-time text-to-speech model released on June 16, 2026, alongside the speech-recognition model Ink-2. Built on State Space Models (SSMs), it achieves a time-to-first-audio latency of under 90 ms according to the manufacturer. Sonic-3.5 ranks #1 on the Artificial Analysis Speech Arena Leaderboard with an Elo score of 1,218 and natively supports 42 languages including 9 Indian languages. The platform supports cloud, on-premise, and on-device deployment.
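Time-to-first-audio is the delay between sending a synthesis request and receiving the first playable audio chunk. The following is a minimal sketch of how such a figure could be measured against any streaming source. It does not use Cartesia's API: `fake_tts_stream` is a hypothetical stand-in that simply sleeps and then yields bytes, and `time_to_first_chunk` is an illustrative helper name, not part of any SDK.

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first_chunk(stream: Iterable[bytes]) -> Tuple[float, bytes]:
    """Return (elapsed milliseconds, first non-empty chunk) for a streaming source."""
    start = time.perf_counter()
    for chunk in stream:
        if chunk:  # ignore empty chunks; only real audio counts
            return (time.perf_counter() - start) * 1000.0, chunk
    raise ValueError("stream ended before any audio arrived")


def fake_tts_stream(delay_s: float = 0.05) -> Iterator[bytes]:
    """Hypothetical stand-in for a TTS client: waits, then yields audio bytes."""
    time.sleep(delay_s)       # simulated model and network latency
    yield b"\x00\x01" * 160   # first audio chunk (320 bytes)
    yield b"\x00\x01" * 160   # a later chunk, never reached by the timer


if __name__ == "__main__":
    ms, first = time_to_first_chunk(fake_tts_stream())
    print(f"time to first audio: {ms:.1f} ms ({len(first)} bytes)")
```

Because the generator body only runs once iteration begins, the simulated delay falls inside the timed region. With a real client, the timer would need to start before the request is sent so that network round-trip time is included.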

Momentum trend (chart)

Features

Latency (ms)	< 90 ms time-to-first-audio (standard); approx. 82 ms end-to-end per Cartesia/Artificial Analysis; Turbo variant approx. 40 ms time-to-first-byte (TTFB)
Multilingualism (Dialects)	Accent localization available (e.g., Irish, New Zealand, South African, Belgian); 2026 changelog lists 94 new voices across 17 locales; automatic language adaptation to input text
On-Device Execution	Yes – Cartesia supports cloud, on-premise, and on-device deployment; Sonic On-Device (private beta) for real-time streaming on mobile devices and embedded hardware via SSM architecture
Languages	42 languages natively (English, Hindi, Spanish, French, German, Japanese, Hebrew, and 35 more), including 9 Indian languages
TTS/STT Quality (Score)	Elo 1,218 on the Artificial Analysis Speech Arena Leaderboard (rank 1; based on 1,144 arena comparisons, ±16)
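To give the Elo figure some intuition, the sketch below converts a rating gap into an expected head-to-head preference rate. It assumes the conventional Elo logistic formula on a 400-point scale; the page does not say which formula or scale the leaderboard uses. The opponent rating of 1,150 is hypothetical and chosen only for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of comparisons A wins against B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


SONIC_ELO = 1218.0     # Elo score reported in the table
MARGIN = 16.0          # the ±16 reported in the table
OPPONENT_ELO = 1150.0  # hypothetical competitor rating, not from the source

if __name__ == "__main__":
    # Evaluate at the low end, the point estimate, and the high end of the margin.
    for label, rating in (
        ("low", SONIC_ELO - MARGIN),
        ("point", SONIC_ELO),
        ("high", SONIC_ELO + MARGIN),
    ):
        share = elo_expected_score(rating, OPPONENT_ELO)
        print(f"{label:>5}: rating {rating:.0f} -> expected win share {share:.1%}")
```

Under this assumption, equal ratings give an expected share of exactly 50%, and a 400-point lead corresponds to roughly 10-to-1 odds.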
