ZDNET’s principal highlights
- Gemini Omni aims to revolutionize video creation, mirroring Nano Banana’s impact on image generation.
- Users can craft videos from diverse inputs: text, stills, sound, or existing footage.
- AI avatars offer creators powerful tools, yet trigger concerns regarding trust.
In its recent announcement, Google unveiled an AI-driven video enhancement poised to streamline high-quality content production for creatives while potentially flooding YouTube with AI-generated material. The outcome will likely be a blend of both scenarios.
Gemini Omni, a novel AI-powered video creation tool, promises a significant leap in video generation capabilities for creators. Google positioned this innovation as a milestone comparable to Nano Banana’s transformative impact on AI imagery.
Also: Google I/O 2026: Everything announced
Nano Banana dramatically elevated the standards for AI-generated visuals. Omni aims to achieve similar advancements for video content. While the rollout has started, I haven’t yet had the opportunity to experiment with it.
Google characterized Omni as “the fusion of Gemini’s reasoning prowess with enhanced creative abilities.” Notably, the company highlights that Omni can “seamlessly integrate visual, audio, textual, and video elements to produce high-quality content rooted in Gemini’s comprehensive world knowledge.”
Though Omni initially “focuses on video,” Google indicated its capacity to “generate anything from multiple input formats,” suggesting future expansion into other creative mediums.
Also: 6 Android Auto apps I wish I found sooner, because they make every drive easier
Omni will be accessible in various tiers, commencing with Gemini Omni Flash. The feature will integrate into the Gemini application, Google Flow, and YouTube Shorts. It remains unclear whether Omni will be included in the browser-based version of Gemini or offered on the web only through Flow.
Several distinguishing features make Omni an exceptionally compelling tool.
Replicate your persona
This feature is exceptionally versatile yet raises significant privacy concerns. Google enables users to “craft videos utilizing your own voice through Avatars, generating a digital representation of you for producing authentic-sounding videos.”
Also: I used Nano Banana 2 to make perfect sketchnotes: 5 lessons learned
As a frequent YouTube creator, I’m intrigued. Occasionally, when facing poor audio quality, feeling self-conscious about my appearance, or simply having an off day, I’ve preferred not to record.
Could I potentially delegate narration to a digital doppelgänger? Would viewers notice, react negatively, or accept it? While this warrants exploration, I likely won’t rely on it extensively.
Part of my motivation for maintaining this channel is refining my presentation skills. Shifting responsibilities to an avatar might ease my workload but diminish my practice.
Google emphasizes its integration of SynthID watermarking technology, ensuring Omni-generated content is identifiable. The company further notes, “Beyond avatars, we’re evaluating audio modification capabilities for responsible implementation.”
Realistic physics simulations
In early video games, characters and falling objects moved in ways that bore little resemblance to real physics. Modern games integrate physics engines so that interactions follow natural laws.
Omni incorporates sophisticated physics modeling, featuring “enhanced comprehension of forces such as gravity, momentum, and fluid behavior.” Leveraging Gemini’s extensive knowledge, it “interconnects linguistic, visual, and contextual cues transcending basic pattern recognition.”
Also: OpenAI’s new image watermarks make it easier to spot AI fakes – here’s how
Google claims Omni excels at producing detailed videos from brief prompts, and is particularly effective for explanatory content that clarifies complex concepts. I don’t doubt it. NotebookLM’s audio and video summaries are remarkably effective at distilling intricate information into accessible explanations.
I supplied marketing materials and technical documents to NotebookLM, and it generated concise visual explanations of various security product features faster than I could have created them manually. Although the initial visuals were rudimentary, the ability to quickly condense complex information into polished content accelerated my product launches.
Versatile input compatibility
Nano Banana’s early success stemmed from image recontextualization. For example, I requested it to transform my park stroll into a military command scenario aboard an aircraft carrier.
Also: I turned casual selfies into professional headshots with Gemini
Omni extends these capabilities to video, converting diverse inputs, such as still images, scripts, and recordings, “into cohesive outputs.” Audio input is currently limited to voice recordings, but Google plans “to expand supported audio formats shortly.”
Additionally, users can design scenes, define stylistic preferences, articulate specifications using everyday language, and maintain character consistency throughout productions.
Intuitive conversational editing
Video editing is often tedious. Omni transforms this process with natural language instructions. “Gemini Omni empowers effortless video refinement through verbal commands. Each directive builds sequentially. Characters remain consistent, physics stay accurate, and narrative coherence persists.”
Google also confirms support for comprehensive alterations. Importing footage to eliminate distractions or swap elements or backgrounds seems highly advantageous. The exact scope of edits supported within subscription plans remains to be clarified.
Also: Are Sora 2 and other AI video tools risky to use? Here’s what a legal scholar says
Two further transformations offered include:
- Modify details or fully recreate scenes. Your existing footage becomes the starting point for productions that would otherwise be impossible to film.
- Capture real-world moments and request Omni to redefine events, introducing unexpected twists or additional subjects.
Google hasn’t disclosed video specifications or output resolutions. Will Omni support professional formats like 16:9 at 4K or 8K, or is it targeting short-form content?
When OpenAI debuted Sora, it seemed novel. Users manipulated it extensively (resulting in humorous outcomes like Sam Altman with blue hair, and even more
This AI platform left countless users applauding, yet it failed to genuinely support a professional’s workflow.
While crafting AI avatar clones and swapping objects may be entertaining, I’m eager to see these features become functional within industry-standard software like Final Cut Pro, Premiere Pro, or DaVinci Resolve. At the very least, seamless integration for sharing Omni-generated projects would be a massive boon.
There is potential. Omni’s capabilities are scheduled to be released to business clients and developers through a Google API.
I’m also wondering whether Omni will feature the subtle diamond watermark in the corner of its videos, similar to Nano Banana’s generated images. While it’s positive to identify AI-generated content, these watermarks hinder the software’s use as a professional-grade tool.
Will there be premium tiers that allow for watermark removal? Or will external tools appear to eliminate the watermark, regardless of Google’s stance? Only time will tell.
Would you use Gemini Omni to create a digital avatar of yourself for videos you prefer not to record in person? Share your thoughts in the comments below.
You can follow my daily updates on social media. Make sure to sign up for my weekly newsletter, and catch me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.



