Why isn't everyone talking about AI generated audiobooks?

PumpkinDrama@reddthat.com · 11 months ago

Turun@feddit.de · 11 months ago

I expect the data size to be a problem. Stable Diffusion defaults to 512x512 px because generating an image simply requires a lot of resources, and training a model requires even more. Now multiply that by 30 frames to generate even one second of video. I think we need something that scales better.
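To put rough numbers on that, here is a back-of-the-envelope sketch. It only counts raw output pixels and assumes 30 frames per second; `pixels_per_second` is a made-up helper name, and real generation cost depends on the model, not just on the pixel count.

```python
def pixels_per_second(width: int, height: int, fps: int = 30) -> int:
    """Raw pixels that must be produced for one second of video."""
    return width * height * fps


# One 512x512 image versus one second of 512x512 video at 30 fps.
single_image = 512 * 512
one_second = pixels_per_second(512, 512)

print(f"single image:    {single_image:,} pixels")
print(f"1 s of video:    {one_second:,} pixels")
print(f"ratio:           {one_second // single_image}x")
```

So even before worrying about keeping frames consistent with each other, one second of video is 30 images' worth of output.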

I fully expect this to work decently in a few years though. No matter how hard the challenge is, AI is moving really fast.

Hexarei@programming.dev · 11 months ago

Stable Diffusion can do arbitrary sizes now, as long as you have the VRAM for it, iirc.

Turun@feddit.de · 11 months ago

Of course, but that is precisely the problem. It gets expensive really, really fast.
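A rough sketch of how fast, under the simplifying assumption that cost grows in proportion to the number of pixels (in practice memory use can grow faster than that, so treat this as a lower bound). `relative_pixel_cost` is a hypothetical helper, and 512x512 is used as the baseline.

```python
def relative_pixel_cost(width: int, height: int, base: int = 512) -> float:
    """Pixel count relative to a base x base image (assumed proxy for cost)."""
    return (width * height) / (base * base)


# Doubling each side quadruples the pixel count.
for side in (512, 1024, 2048):
    factor = relative_pixel_cost(side, side)
    print(f"{side}x{side}: {factor:g}x the pixels of 512x512")
```

Going from 512 to 2048 on each side is only a 4x larger edge, but 16x the pixels.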