Are there any free/open-source TTS options out there that are on the same level as Google Cloud’s? I tried a lot of free ones, but they are absolutely awful and still sound like my Amiga did 30 years ago. With LLMs being available as open source, I am hoping there’s also a good TTS offering I just haven’t found yet.

  • tal@lemmy.today
    8 months ago

    Festival – not cutting edge – will definitely be better than your Amiga, and can handle long text. Last time I set it up, IIRC I wanted some voices generated by Tokyo University or something, which took some setting up. It’ll probably be packaged in your Linux distro.

    You can listen to a demo here.

    https://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html

    It’s not LLM-based.
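
    If you want to script it, here is a minimal sketch, assuming Festival is installed and its `text2wave` helper is on your PATH. The function names (`text2wave_command`, `speak_to_wav`) and the default output filename are made up for illustration; only `text2wave` and its `-o` option come from Festival itself.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def text2wave_command(text_path: str, wav_path: str) -> list[str]:
    # text2wave ships with Festival; "-o" names the output audio file.
    return ["text2wave", text_path, "-o", wav_path]


def speak_to_wav(text: str, wav_path: str = "out.wav") -> bool:
    """Render text to a WAV file. Returns False if text2wave is not installed."""
    if shutil.which("text2wave") is None:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        # text2wave reads its input from a file, so write the text out first.
        text_path = Path(tmp) / "input.txt"
        text_path.write_text(text, encoding="utf-8")
        subprocess.run(text2wave_command(str(text_path), wav_path), check=True)
    return True
```

    Calling `speak_to_wav("Hello from Festival.")` should leave an `out.wav` in the current directory, using whichever voice your Festival install defaults to.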

    For short snippets, offline, one can use Tortoise TTS, which is LLM based. But it’s slow and can only generate clips of a limited length. Whether it’s reasonable for you will depend a lot on your application. It can also clone a voice, or at least produce one that sounds more or less similar, from a few sound samples of the person speaking.

    https://github.com/neonbjb/tortoise-tts

    Examples at:

    https://nonint.com/static/tortoise_v2_examples.html
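
    One way to work around the clip-length limit is to split longer text into sentence-sized chunks and synthesize each chunk separately. A rough sketch of the splitting step follows; it does not call Tortoise at all, and both the `split_for_tts` name and the 200-character default are assumptions picked for illustration, not limits documented by the project.

```python
import re


def split_for_tts(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The next sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

    Each chunk would then go to the TTS engine on its own, and the resulting clips get concatenated afterwards.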

    I haven’t used Google’s, but I’d assume, given that Google is paying people to work on it full time, that whatever they’ve done probably sounds nicer. But then it’s not open source, so… shrugs