OpenAI, the company behind ChatGPT and the image generator DALL-E, revealed it is experimenting with a text-to-video model named Sora that lets users create lifelike videos from simple text prompts.
The Microsoft-backed company said Sora is still being tested and previewed its capabilities by releasing sample videos generated from written prompts.
According to OpenAI’s blog post, Sora can produce videos up to a minute long while maintaining high visual quality and sticking closely to the user’s prompt. The company also highlighted the model’s ability to turn still images into moving video.
OpenAI’s CEO, Sam Altman, shared that the platform was being offered to a select group of creators for testing purposes.
He also invited users to submit prompts and shared the convincing results shortly afterward.
Examples of these prompts included a short clip featuring two golden retrievers podcasting atop a mountain and another depicting a hybrid creature, part duck and part dragon, soaring through a picturesque sunset with an adventurous hamster riding on its back.
Acknowledging the model’s current limitations, such as occasionally confusing spatial directions and struggling to maintain visual coherence across a video, the San Francisco-based startup emphasized the importance of safety. It announced plans for rigorous adversarial testing, in which dedicated testers will attempt to uncover vulnerabilities, produce inappropriate content, or otherwise break the system.
OpenAI underscored its commitment to engaging with policymakers, educators, and artists worldwide to address concerns and explore constructive applications for this cutting-edge technology.
Other companies, including Meta, Google, and the startup Runway AI, are also developing text-to-video technology and have shown similar glimpses of their progress.
