Fish Audio

Freemium

Hanabi AI · Audio & Voice

Expressive TTS and 15-second voice cloning with emotion control

4.5 · $15/mo

Supports Arabic · API


Overview

Fish Audio (by Hanabi AI) is an AI voice platform built around expressive, real-time text-to-speech models. Emotion tags such as [angry], [sad] and [whispering] control how a line is delivered, voice cloning needs only 15 seconds of audio, and speech-to-text includes multi-speaker detection. A community library offers 2,000,000+ voices across 30+ languages, including Arabic. The S1 and S2 research models are open-sourced, and a low-latency streaming API serves developers. Plans: Free (8,000 credits ≈ 7 minutes/month, personal use only), Plus at $15/month (250K credits ≈ 200 minutes, commercial use included), Pro at $100/month (2M credits, 3 team seats) and Max at $999/month. Each generated minute is quoted at roughly 600–625 credits.
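The credit figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and the inputs are the numbers quoted in the overview, not values taken from Fish Audio's billing system. Note that the Free and Plus allowances imply a higher per-minute rate than the quoted 600–625 credits, so all of these figures are best treated as approximate.

```python
# Plan allowances as quoted in the overview: (credits, approximate minutes).
# These are listing figures, not official billing data.
QUOTED_PLANS = {
    "Free": (8_000, 7),
    "Plus": (250_000, 200),
}


def implied_credits_per_minute(credits: int, minutes: float) -> float:
    """Credits consumed per generated minute, implied by a plan allowance."""
    return credits / minutes


def minutes_for(credits: int, credits_per_minute: float) -> float:
    """Minutes of audio a credit balance buys at a given per-minute rate."""
    return credits / credits_per_minute


if __name__ == "__main__":
    for name, (credits, minutes) in QUOTED_PLANS.items():
        rate = implied_credits_per_minute(credits, minutes)
        print(f"{name}: about {rate:.0f} credits per minute")

    # Pro allowance (2M credits) at the quoted 600-625 credits per minute.
    pro_credits = 2_000_000
    low = minutes_for(pro_credits, 625)
    high = minutes_for(pro_credits, 600)
    print(f"Pro: roughly {low:.0f} to {high:.0f} minutes at 600-625 credits/minute")
```

If the per-minute rate implied by the Plus plan also applied to Pro, the 2M-credit allowance would cover fewer minutes than the range printed above, so it is worth confirming the current rate on the pricing page before budgeting.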