OpenAI unveiled revolutionary new software that can produce high-caliber video in response to a few simple text queries — a dazzling breakthrough from the ChatGPT maker that could also take concerns about deepfakes and rip-offs of licensed content to a new level.
The technology, called Sora, uses its “deep understanding of language” to create clips up to one minute long that include “compelling characters” and “multiple shots within a single generated video,” the company said on a website dedicated to the new tech.
“Sora is able to generate complex scenes with multiple characters, specific types of motion and accurate details of the subject and background,” OpenAI said. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”
The Sam Altman-led firm provided a few stunning examples from prompts that were seemingly written for a Hollywood script, according to tech outlet Wired, which was given a sneak peek at Sora’s capabilities.
“Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes,” the prompt read.
Sora turned the three sentences into a vibrant 17-second video — well short of the one-minute limit — that rendered a nondescript couple holding hands while walking along a snow-covered street lined with pagoda-topped shops with the Tokyo skyline in the distance.
Cherry blossoms (sakura) were in full bloom as snow fell from the overcast sky.
There were a few bugs, like the sidewalk coming to a dead end, but overall it was “a mind-blowing exercise in world-building,” Wired wrote.
“The current model has weaknesses. It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect,” OpenAI said.
“For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark.”
However, another jaw-dropping example came from a prompt requesting “an animated scene of a short fluffy monster kneeling beside a red candle,” a creature with “wide eyes and open mouth.”
The result was a mashup of a Furby and a gremlin, a cuddly creature that would fit right into Pixar’s “Monsters, Inc.” franchise. The ease with which Sora rendered the character belied the painstaking work such animation usually demands of experienced animators, raising concerns about the impact the technology will have on the movie industry.
A future enhancement will be the ability to generate video from a still image, the company said.
“This will be another really cool way to improve storytelling capabilities,” Bill Peebles, a researcher on the project, told Wired.
“You can draw exactly what you have on your mind and then animate it to life.”
It wasn’t immediately clear when Sora will become available to the general public, or if it will be free for users.
Representatives for OpenAI did not immediately respond to The Post’s request for comment.
For now, the software has been released only to select creators and security experts who will “red-team” the product for security issues.
Red-teaming is a process in which a group poses as an adversary and attempts a physical or digital intrusion against an organization.
Sora’s generative power not only threatens to upend Hollywood down the road; in the near term, its short-form videos risk spreading misinformation, bias and hate speech on popular social media platforms like Instagram Reels and TikTok.
The company has vowed to prevent the software from rendering violent scenes or deepfake porn, like the graphic images of a nude Taylor Swift that went viral last month.
Sora also won’t reproduce the likeness of real people or the style of a named artist, but its use of “publicly available” content for AI training could lead to the type of legal headaches OpenAI has already faced from media companies, actors and authors over copyright infringement.
“The training data is from content we’ve licensed and also publicly available content,” the company said.
OpenAI said it was developing tools that can discern whether a video was generated by Sora, a move aimed at allaying growing concerns about threats like generative AI’s potential influence on the 2024 election.
The company, which has a $10 billion “multiyear” agreement with Microsoft that expanded a partnership begun in 2019 with an initial $1 billion from the Big Tech firm, also said it is taking “several important safety steps ahead of making Sora available in OpenAI’s products.”
Concerns about AI’s ability to meddle in elections have ramped up since the company released ChatGPT, which can convincingly mimic human writing, and DALL-E, whose technology can be used to create “deepfakes,” or realistic-looking images that are fabricated.
Altman testified in Congress last May that he was “nervous” about generative AI’s ability to compromise election integrity through “one-on-one interactive disinformation.”
The San Francisco-based company said it is working with the National Association of Secretaries of State, an organization that focuses on promoting effective democratic processes such as elections.
ChatGPT will direct users to CanIVote.org when asked certain election-related questions, it added.
News of Sora’s forthcoming deployment follows rival Meta’s move to beef up its image generation model Emu last year, when it added two AI-based features that can edit and generate videos from text prompts.
Google and startups like Runway have also launched text-to-video AI projects.
With Post wires.