Parler-TTS: Natural language guidance of high-fidelity TTS

column · on April 11, 2024

I've tried it on my laptop and it is about as slow/fast as xtts. But as far as I can there's no way of keeping a consistent voice from generation to generation. If so, I don't really get the appeal. If there was a way to get consistent, then that's great for NPCs.

yjftsjthsd-h · on April 11, 2024

If we're talking about https://huggingface.co/coqui/XTTS-v2 , that appears to be explicitly licensed only for non-commercial use, while this appears to be pure Apache 2.0; that's a good enough reason to prefer this for some people.

IronWolve · on April 11, 2024

Lots of these are nice sounding, but still far from quality of simply importing a text file ebook and getting a nice sounding audiobook.

josephh · on April 11, 2024

Does anyone know of a good text normalization (?) library that converts symbols and initialisms into plain English before feeding them into a TTS model? All the models that I've used so far do a horrible job at synthesizing speech for them and I'm wondering whether this is the missing piece in the pipeline.

jamifsud · on April 11, 2024

I’ve found GPT 3.5 to do a good job of this, not perfect but I bet with some more prompt engineering it could get really good.

josephh · on April 11, 2024

Thanks! It never occurred to me that I can just tweak the system prompt to make sure the LL model never outputs symbols and initialisms as is.

mdrzn · on April 12, 2024

All the "Voice cloner" TTS I tried only work in English language, whenever tried with Italian language it doesn't mimic the original voice at all.

Y_Y · on April 11, 2024

There are two hard problems in computer science; naming things.

Unfortunate that this shares a name with a much-maligned microblogging site. Probably it's not a good idea to take unmodified everyday words[0] from a widely spoken language as your product name, see also e.g. "Triton".

[0] In this case "parler" is French for "to speak"

zarmin · on April 11, 2024

wait til you hear about Parler