Musk's xAI launches "rapid voice cloning": create your own Grok voice from about one minute of natural speech.


In generative AI, and in voice technology specifically, Elon Musk's xAI has gone on the offensive against competitors such as OpenAI.

On April 30, 2026, xAI officially announced a major update to its AI platform: the full launch of "Custom Voices" and a new "Voice Library." Together they let individuals and businesses bring "their own voices" into a range of AI applications with a very low barrier to entry.

Record in less than 1 minute and instantly generate your own AI voice.

According to xAI, creating a personalized AI voice model is now simpler than ever. Users record a short clip of natural speech, anywhere from a few seconds to a minute long, in the xAI console, and the entire model creation process finishes in less than two minutes.

Once generated, this customized voice can be immediately invoked in Grok's Text-to-Speech (TTS) service and Voice Agent API. xAI officially outlines five core application scenarios for this technology:

  • Brand Customer Service Agent: Businesses can enable AI customer service to use a brand-specific, consistent voice to enhance their corporate image.
  • Content creators and podcasts: Creators can narrate videos or produce audiobooks at scale in their own voice without returning to the recording studio each time.
  • Cross-language speaking: Multinational CEOs can deliver key speeches in their own voice while switching between languages (such as Chinese, English, Japanese, and French).
  • Games and Entertainment: Quickly voice NPC characters in the metaverse or games.
  • Accessibility assistance: Permanently preserve the original voice characteristics of patients with rare diseases such as ALS who are about to lose their ability to speak.

Beware of deepfakes! Uploading audio files is prohibited, and a two-part verification is required.

As voice cloning technology spreads, deepfakes that impersonate celebrity voices or are used to commit phone fraud have become increasingly common. To prevent malicious abuse, xAI has put strict safeguards in place.

xAI emphasizes that the system "absolutely cannot use existing recordings to copy audio." Users must record themselves live, and the system asks them to read aloud a randomly generated "passphrase." It then checks the spoken content through speech-to-text conversion and compares speaker embedding vectors to confirm that the person reading the passphrase is the same person who recorded the original voice sample. According to xAI, this two-part check prevents attackers from cloning a voice from someone else's audio files.
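The two checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not xAI's implementation: the word list, the function names, the cosine-similarity comparison, and the 0.75 threshold are all hypothetical, and the transcript and embeddings are assumed to come from a separate speech-to-text model and speaker encoder that are not shown.

```python
import math
import re
import secrets

# Hypothetical word list; a real system would draw from a much larger vocabulary.
WORDS = ["amber", "river", "copper", "meadow", "lantern", "harbor", "violet", "summit"]


def make_passphrase(n_words=4):
    """Build a random passphrase so a pre-recorded clip cannot anticipate the text."""
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))


def normalize(text):
    """Lowercase, drop punctuation, and split into words for comparison."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).split()


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_recording(passphrase, transcript, sample_embedding,
                     passphrase_embedding, threshold=0.75):
    """Pass only if the right words were spoken AND the speaker matches.

    `transcript` is assumed to be speech-to-text output for the passphrase clip;
    the embeddings are assumed to come from a speaker encoder (not shown).
    The threshold is an illustrative value, not one published by xAI.
    """
    content_ok = normalize(transcript) == normalize(passphrase)
    speaker_ok = cosine_similarity(sample_embedding, passphrase_embedding) >= threshold
    return content_ok and speaker_ok
```

Requiring both conditions is the point of the design: a replayed recording of the target fails the content check because it cannot contain the random passphrase, and an attacker reading the passphrase in their own voice fails the speaker check.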

The Voice Library is now online; use your own custom voice at no extra charge.

Alongside the customization features, xAI has launched a "Voice Library" that lets development teams manage all custom and built-in voices in one place. The Voice Library currently includes over 80 high-quality voices and supports up to 28 languages, all of which users can preview freely.

What excites developers and businesses most is xAI's announcement that custom voices will be "completely free of charge" to use and will support all advanced features of the original TTS system (such as voice tagging and real-time streaming). To use a custom voice, a developer only needs to specify its unique voice_id in the API call, which should significantly lower the cost barrier for enterprises adopting custom voice AI.
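As a rough illustration of what "specifying a voice_id" could look like, the sketch below assembles a JSON request body in Python. Only the voice_id parameter comes from the article; the function name, the "text" and "stream" fields, and the example ID are hypothetical placeholders, and no endpoint is shown because the article does not give one. Consult xAI's API documentation for the actual request format.

```python
import json


def build_tts_request(text, voice_id, stream=False):
    """Assemble a hypothetical TTS request body that selects a custom voice.

    Only `voice_id` is named in the article; the `text` and `stream` field
    names are assumptions made for illustration.
    """
    if not voice_id:
        raise ValueError("voice_id is required to select a custom voice")
    return {"text": text, "voice_id": voice_id, "stream": stream}


# "my-custom-voice" is a placeholder, not a real voice ID.
body = json.dumps(build_tts_request("Welcome back!", "my-custom-voice"))
```

The practical consequence is that switching from a built-in voice to a custom one would be a one-field change in the request rather than a separate integration.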


