Are there any free/open-source TTS options out there that are on the same level as Google Cloud’s? I tried a lot of free ones, but they are absolutely awful and still sound like my Amiga did 30 years ago. With LLMs being available as open source, I am hoping there’s also a good TTS offering I just haven’t found yet.
Festival isn’t cutting edge, but it will definitely be better than your Amiga, and it can handle long text. Last time I set it up, IIRC I wanted some voices generated by Tokyo University or something, which took some setting up. It’s probably packaged in your Linux distro.
You can listen to a demo here.
https://www.cstr.ed.ac.uk/projects/festival/onlinedemo.html
It’s not LLM-based.
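If you’d rather script it than use the interactive prompt, here is a minimal sketch of rendering a text file to a WAV with the `text2wave` script that ships with Festival. It assumes `text2wave` is on your PATH; the helper names and the file names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def text2wave_command(text_file, wav_file):
    """Build the argument list for Festival's text2wave script."""
    return ["text2wave", str(text_file), "-o", str(wav_file)]


def synthesize(text_file, wav_file):
    """Run text2wave if it is installed; return True if a WAV was written."""
    if shutil.which("text2wave") is None:
        # Assumption: Festival is not installed (or not on PATH), so do nothing.
        return False
    subprocess.run(text2wave_command(text_file, wav_file), check=True)
    return Path(wav_file).exists()
```

Something like `synthesize("chapter1.txt", "chapter1.wav")` should then give you the whole file as one recording, using whatever default voice your install has.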
For short snippets, offline, one can use Tortoise TTS, which is LLM-based. But it’s slow and can only generate clips of a limited length, so whether it’s reasonable for you will depend a lot on your application. It can also clone a voice, or at least produce one that sounds more or less similar, from some sound samples of the person speaking.
https://github.com/neonbjb/tortoise-tts
Examples at:
https://nonint.com/static/tortoise_v2_examples.html
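One way people work around the clip-length limit is to split the text into sentence-sized chunks, synthesize each chunk separately, and join the audio afterwards. Below is a rough sketch of just the splitting step. The function name is hypothetical, and the 200-character default is an arbitrary placeholder, not a documented Tortoise limit.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then go through the TTS one at a time. Expect it to still be slow, and the joins between clips may be audible.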
I haven’t used Google’s, but I’d assume that, since Google is paying people to work on it full time, whatever they’ve done probably sounds nicer. But then it’s not open source, so…shrugs