Update on that #rust #tts #grpc service. TTS is far more complicated than I imagined, even using #ai (machine learning) models. I assumed I'd have to process the text for the model, but it turns out I need a lot more processing than expected:
1. Split it up into sentences
2. Pass it through a phonemizer (phonetic/sound versions of the text)
3. Process the phonemes for the model
4. Run the model to actually generate the speech
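Rough sketch of how those four steps could chain together in Rust. Everything in it is a placeholder I made up for illustration: `ToyPhonemizer` just lowercases letters instead of producing real phonemes, `build_vocab` is an invented ID table, and `SilentModel` returns silence where a real model would run inference. The actual service would put a real phonemizer and model runtime behind the two traits.

```rust
use std::collections::HashMap;

/// Step 1: naive sentence split on '.', '!' and '?'.
/// A real splitter also has to handle abbreviations, numbers, etc.
fn split_sentences(text: &str) -> Vec<String> {
    let mut sentences = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        current.push(ch);
        if matches!(ch, '.' | '!' | '?') {
            let trimmed = current.trim();
            if !trimmed.is_empty() {
                sentences.push(trimmed.to_string());
            }
            current.clear();
        }
    }
    // Keep any trailing text that has no closing punctuation.
    let trimmed = current.trim();
    if !trimmed.is_empty() {
        sentences.push(trimmed.to_string());
    }
    sentences
}

/// Step 2: anything that turns a sentence into phoneme symbols.
trait Phonemizer {
    fn phonemize(&self, sentence: &str) -> Vec<String>;
}

/// Hypothetical stand-in: one lowercase "phoneme" per letter, spaces kept,
/// punctuation dropped. NOT real phonemization.
struct ToyPhonemizer;

impl Phonemizer for ToyPhonemizer {
    fn phonemize(&self, sentence: &str) -> Vec<String> {
        sentence
            .chars()
            .filter(|c| c.is_alphabetic() || c.is_whitespace())
            .map(|c| {
                if c.is_whitespace() {
                    " ".to_string()
                } else {
                    c.to_lowercase().to_string()
                }
            })
            .collect()
    }
}

/// Invented symbol table: 0 = unknown, 1 = space, 2.. = 'a'..='z'.
fn build_vocab() -> HashMap<String, i64> {
    let mut vocab = HashMap::new();
    vocab.insert(" ".to_string(), 1);
    for (i, c) in ('a'..='z').enumerate() {
        vocab.insert(c.to_string(), i as i64 + 2);
    }
    vocab
}

/// Step 3: map phoneme symbols to the integer IDs a model would consume.
fn phonemes_to_ids(phonemes: &[String], vocab: &HashMap<String, i64>) -> Vec<i64> {
    phonemes
        .iter()
        .map(|p| *vocab.get(p).unwrap_or(&0))
        .collect()
}

/// Step 4: anything that turns phoneme IDs into audio samples.
trait SpeechModel {
    fn synthesize(&self, ids: &[i64]) -> Vec<f32>;
}

/// Hypothetical stand-in: emits silence, a fixed number of samples per ID.
struct SilentModel {
    samples_per_id: usize,
}

impl SpeechModel for SilentModel {
    fn synthesize(&self, ids: &[i64]) -> Vec<f32> {
        vec![0.0; ids.len() * self.samples_per_id]
    }
}

/// The whole pipeline: sentences -> phonemes -> IDs -> audio, concatenated.
fn text_to_speech(
    text: &str,
    phonemizer: &dyn Phonemizer,
    vocab: &HashMap<String, i64>,
    model: &dyn SpeechModel,
) -> Vec<f32> {
    let mut audio = Vec::new();
    for sentence in split_sentences(text) {
        let phonemes = phonemizer.phonemize(&sentence);
        let ids = phonemes_to_ids(&phonemes, vocab);
        audio.extend(model.synthesize(&ids));
    }
    audio
}
```

Keeping the phonemizer and model behind traits means either one can be swapped out without touching the rest of the pipeline, and the stubs make it testable without loading a model.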
I'm gonna have to write a blog post about this when I'm done.