OpenAI has introduced ‘Voice Engine’, a voice-cloning tool that it intends to keep under strict control to avoid the spread of audio forgeries designed to deceive listeners.
According to an OpenAI blog post detailing the findings of a small-scale trial of the tool, Voice Engine can replicate a person’s speech from just a 15-second audio sample.
Acknowledging the significant risks of generating speech that mimics real individuals’ voices, particularly in an election year, the San Francisco-based company emphasized its commitment to collaborating with stakeholders, including governmental bodies, media outlets, and civil society, to address concerns and incorporate their feedback into the tool’s development.
Disinformation researchers have widely warned that AI-driven technologies could be misused during critical electoral periods. Citing those concerns, OpenAI emphasized its cautious and deliberate approach to introducing Voice Engine to a broader audience, mindful of the potential for synthetic voice manipulation.
This cautious approach follows an incident a few months earlier, in which a political consultant associated with a Democratic presidential campaign admitted to orchestrating a robocall impersonating then-US President Joe Biden during the New Hampshire primary. The incident underscored the growing threat of AI-generated deepfake disinformation campaigns, prompting heightened vigilance among experts ahead of the 2024 White House race and other pivotal elections worldwide.
To mitigate the risks associated with Voice Engine, OpenAI outlined safety measures, including a requirement for explicit and informed consent from individuals whose voices are replicated. The company also emphasized the importance of transparently informing audiences when they are listening to AI-generated voices, and said it has implemented watermarking to trace the origin of generated content.
