Pocket #TTS: A high quality #TTS that gives your CPU a voice | #pockettts #voicecloning https://kyutai.org/blog/2026-01-13-pocket-tts
Pocket TTS: A high quality TTS that gives your CPU a voice
https://kyutai.org/blog/2026-01-13-pocket-tts
#HackerNews #PocketTTS #TTS #Voice #Technology #HighQuality #CPU #VoiceAssistant
I create my teaching materials as #OER with #EmacsReveal [1]. For my course on IT Systems [2] in the summer term of 2025, I switched to #Kokoro [3] as my #TextToSpeech model, and students generally liked the quality (see the emacs-reveal README for evaluation results). The teaching resources are video-like, interactive HTML presentations with audio, generated from #OrgMode text files using GitLab CI/CD pipelines.
This holiday season, I found the time to release #EmacsReveal 9.54.0, which includes the settings I used for IT Systems. I also updated the #TTS Howto [4] to use Kokoro.
Feel free to reuse my course materials and emacs-reveal! All the best for 2026!
[1] https://gitlab.com/oer/emacs-reveal/
[2] https://oer.gitlab.io/oer-courses/it-systems/
[3] https://github.com/hexgrad/kokoro
[4] https://oer.gitlab.io/emacs-reveal-howto/tts-howto.html
#Emacs #Org #RevealJS #CICD #FLOSS #FOSS #FreeSoftware #Education
@daltux that's not quite what I wanted; my problem isn't the meaning of the word but its pronunciation
That's exactly what I meant: the #TTS (text-to-speech) engine can be used for just that!
Here is an example with this selected and shared with #TranslateYou: 猫
I don't have the Japanese language downloaded and configured here, otherwise it would probably read "neko" to me when I press the 🔊 button
TextAudio is now open source!
A privacy-first text-to-speech platform that converts documents into audiobooks without Big Tech surveillance. Features voice cloning, 23 languages, and a production-ready microservices architecture.
Your documents, your data, your control. Built on sovereignty principles, released under the MIT license.
Feel free to continue.
🔗 https://github.com/Pariatorn/textaudio
#OpenSource #Privacy #TTS #TextToSpeech #SelfHosted #Sovereignty #VoiceCloning #Python #FastAPI #SvelteKit #FOSS
Update on that #rust #tts #grpc service. TTS is far more complicated than I imagined, even using #ai (machine learning) models. I assumed I'd have to process the text for the model, but it turns out I need more processing than expected.
1. Split it up into sentences
2. Pass it through a phonemizer (phonetic/sound versions of the text)
3. Process the phonemes for the model
4. Run the model to actually generate the speech
I'm gonna have to write a blog post about this when I get done
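The four steps listed above can be sketched roughly as follows. This is only an illustration of the pipeline's shape, not the poster's code: the function names, the tiny lexicon-based phonemizer, the "unknown symbol maps to 0" rule, and the silent stand-in for the model are all assumptions made up for the example.

```rust
use std::collections::HashMap;

/// Step 1: split text into sentences on '.', '!' or '?'.
/// This is a naive rule; abbreviations such as "e.g." are split wrongly.
fn split_sentences(text: &str) -> Vec<String> {
    let mut sentences = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        current.push(ch);
        if matches!(ch, '.' | '!' | '?') {
            let trimmed = current.trim();
            if !trimmed.is_empty() {
                sentences.push(trimmed.to_string());
            }
            current.clear();
        }
    }
    let trimmed = current.trim();
    if !trimmed.is_empty() {
        sentences.push(trimmed.to_string());
    }
    sentences
}

/// Step 2: toy phonemizer backed by a hypothetical word lexicon.
/// Words missing from the lexicon are passed through in lowercase.
fn phonemize(sentence: &str, lexicon: &HashMap<&str, &str>) -> String {
    sentence
        .split_whitespace()
        .map(|word| {
            // Strip punctuation and normalize case before the lookup.
            let key = word
                .chars()
                .filter(|c| c.is_alphanumeric())
                .collect::<String>()
                .to_lowercase();
            let phonemes = lexicon.get(key.as_str()).map(|p| p.to_string());
            phonemes.unwrap_or(key)
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Step 3: map each phoneme symbol to an integer id for the model.
/// Unknown symbols map to 0 here (an assumption; real models define
/// their own vocabulary and padding rules).
fn to_token_ids(phonemes: &str, vocab: &HashMap<char, i64>) -> Vec<i64> {
    phonemes.chars().map(|c| *vocab.get(&c).unwrap_or(&0)).collect()
}

/// Step 4: stand-in for model inference. It returns silence, 100 samples
/// per token, so the pipeline can be exercised without a real model.
fn run_model(token_ids: &[i64]) -> Vec<f32> {
    vec![0.0; token_ids.len() * 100]
}

/// Run all four steps and return one audio buffer per sentence.
fn synthesize(
    text: &str,
    lexicon: &HashMap<&str, &str>,
    vocab: &HashMap<char, i64>,
) -> Vec<Vec<f32>> {
    split_sentences(text)
        .iter()
        .map(|sentence| {
            let phonemes = phonemize(sentence, lexicon);
            let ids = to_token_ids(&phonemes, vocab);
            run_model(&ids)
        })
        .collect()
}
```

Calling `synthesize` with some text, a lexicon and a vocabulary yields one buffer per sentence, which is also a natural unit for sending audio back piece by piece. In a real service, steps 2 to 4 would be replaced by an actual phonemizer and the model's own tokenizer and inference runtime.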
I've started a new #rust project that may or may not end up seeing use at work. I'm trying to do it on my own as #foss if possible.
Idea is to have a simple containerized service that accepts text and streams back the audio using #tts models like KittenNanoTTS.
Anyone have specific rust advice on using grpc to stream, or on consuming models in rust?
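As a rough illustration of the "accept text, stream back audio" idea, here is a minimal sketch that uses a plain `std::sync::mpsc` channel in place of gRPC, so it needs no external crates. The function name `stream_audio` and the chunking scheme are assumptions for the example; in an actual gRPC service the receiving end of the channel would typically be wrapped into the server-streaming response instead of being read directly.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Send audio to the consumer chunk by chunk from a background thread.
/// The channel is a stand-in for a gRPC server-streaming response.
fn stream_audio(samples: Vec<f32>, chunk_size: usize) -> Receiver<Vec<f32>> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for chunk in samples.chunks(chunk_size) {
            // Stop early if the receiver was dropped (client went away).
            if tx.send(chunk.to_vec()).is_err() {
                break;
            }
        }
        // `tx` is dropped when the thread ends, which closes the stream.
    });
    rx
}
```

Iterating over the returned receiver (`for chunk in rx.iter() { ... }`) yields chunks as they are produced and ends once the producer finishes, which lets a client start playback before the whole utterance has been generated.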
Handy, https://handy.computer/.
A free, open source, and extensible speech-to-text application that works completely offline.
> Handy is a cross-platform desktop application built with Tauri (Rust + React/TypeScript) that provides simple, privacy-focused speech transcription. Press a shortcut, speak, and have your words appear in any text field—all without sending your voice to the cloud.