Voice Cloning tool from OpenAI

• OpenAI has built a voice cloning tool called Voice Engine that can generate synthetic speech matching any voice from a 15-second sample.

• Voice Engine powers the voice capabilities in ChatGPT and OpenAI's text-to-speech API, and has been used by Spotify for podcast dubbing.

• The model was trained on a mix of licensed and publicly available speech data, though OpenAI has not disclosed details.

• Voice Engine generates speech on the fly without building a custom model for each voice, which keeps pricing low, at around $1 per hour of audio.

• It lacks controls to adjust characteristics like pitch and tone, though it aims to mimic the expressiveness of the sample voice.

• The tool could commoditize voice acting work, though OpenAI is exploring actor compensation models.

• Voice cloning carries risks like harassment, fraud, and election interference via deepfakes.

• OpenAI is limiting initial Voice Engine access and use cases while exploring mitigations like watermarking.

• Planned safeguards include a verification step in which speakers read randomly generated text aloud to prove they consent to having their voice cloned.

• OpenAI is reluctant to commit to a general release until safety issues from the pilot are understood.
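
The "read randomized text to prove consent" idea can be pictured with a small sketch. This is not OpenAI's implementation, just one way such a check might look. The word list and the function names `make_challenge` and `matches_challenge` are made up for illustration. The sketch assumes the recording has already been turned into text by some speech-to-text step that is not shown.

```python
import re
import secrets

# Hypothetical word list for illustration; a real system would need a far larger one.
WORDS = ["amber", "river", "candle", "orbit", "meadow",
         "copper", "lantern", "harbor", "velvet", "summit"]


def make_challenge(num_words: int = 6) -> str:
    """Return a random phrase the speaker is asked to read aloud."""
    # secrets.choice draws from a cryptographically strong source,
    # so the phrase is hard to predict and pre-record.
    return " ".join(secrets.choice(WORDS) for _ in range(num_words))


def normalize(text: str) -> list:
    """Lowercase the text and drop punctuation, keeping the words in order."""
    return re.findall(r"[a-z']+", text.lower())


def matches_challenge(challenge: str, transcript: str) -> bool:
    """True if the transcript has exactly the challenge words, in the same order."""
    return normalize(challenge) == normalize(transcript)


if __name__ == "__main__":
    phrase = make_challenge()
    print("Please read aloud:", phrase)
    # Assumption: `transcript` would come from a speech-to-text step (not shown).
    transcript = phrase.upper() + "."
    print("Consent phrase matched:", matches_challenge(phrase, transcript))
```

Because the phrase is new each time, an old recording of the person cannot be reused to pass the check. A matching phrase alone does not prove the speaker is the same person as in the voice sample, so a real system would also need to compare the two voices.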
