Google's new Gemini TTS model is amazing!
You can control the speaker's gender, age, tone, intonation, and even the pronunciation of specific words directly through prompts.
It can control almost anything you can think of, without needing to switch to a separate speech model!
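As a rough illustration of what steering a voice through prompts can look like, here is a minimal Python sketch that only assembles the prompt text. The names `VoiceDirection` and `build_tts_prompt` and the prompt wording are hypothetical, not part of any Gemini API, and the actual TTS request is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceDirection:
    """Hypothetical container for the traits the post says a prompt can steer."""
    gender: str = ""
    age: str = ""
    tone: str = ""
    intonation: str = ""
    # Maps a word to how it should be pronounced (illustrative only).
    pronunciations: dict[str, str] = field(default_factory=dict)


def build_tts_prompt(direction: VoiceDirection, line: str) -> str:
    """Compose a plain-language voice instruction followed by the line to speak."""
    traits = [
        t
        for t in (direction.gender, direction.age, direction.tone, direction.intonation)
        if t
    ]
    parts = []
    if traits:
        parts.append("Voice: " + ", ".join(traits) + ".")
    for word, spoken in direction.pronunciations.items():
        parts.append(f'Pronounce "{word}" as "{spoken}".')
    parts.append("Say: " + line)
    return "\n".join(parts)


# Assumed example character; the wording of the traits is free-form text.
villain = VoiceDirection(
    gender="male",
    age="elderly",
    tone="cold and menacing",
    intonation="slow, falling at the end of each sentence",
)
prompt = build_tts_prompt(villain, "You are late again.")
print(prompt)
```

The resulting string would be sent as the text input of a TTS request. Giving each character its own `VoiceDirection` is one way to keep every voice distinct while still using a single model.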
My AI interactive comic book app is finally complete!
It uses Nano Banana Pro to dynamically generate images for each scene, and the new TTS to generate unique voices for each character.
twitter.com/op7418/status/1999...
Some old problems remain when it speaks Chinese, such as a slight foreign accent.
Still, it's much better at Chinese than previous TTS systems. Hopefully they can fix the odd intonation in Mandarin.
I've adjusted the prompts; try it again if it sounded wrong before:

歸藏(guizang.ai)
@op7418
12-12
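After adjusting the prompts, the characters' voices sound much better!
If something sounded off to you earlier, give it another listen. If Gemini's TTS gets a bit better at Chinese, it's really going to take off again.
AI comic drama: Nano Banana Pro + Gemini 2.5 TTS + Gemini 3.0 Pro x.com/op7418/status/…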
From Twitter