Google has launched Gemini Omni Flash, its first Gemini Omni model, bringing AI video generation and conversational editing to the Gemini app, Google Flow and YouTube Shorts.
The company said Gemini Omni can combine images, audio, video and text as input to generate high-quality videos grounded in Gemini’s real-world knowledge. The model also lets users edit videos through natural-language prompts.
Google said Omni can preserve character consistency, scene memory and physical logic across multiple editing turns. Users can change environments, angles, styles, objects or actions in a video through conversation.
The model is designed to reason about motion, context and physics, including gravity, kinetic energy and fluid dynamics, according to Google. The company said the system can also generate visual explainers from short prompts.
Gemini Omni supports video creation from mixed references, including images, text, video and limited audio inputs. Google said voice references will be supported first, with more audio input types planned later.
The rollout also includes Avatars, a feature that lets users create videos with a digital version of themselves using their own voice. However, Google said broader audio and speech editing remains under testing because of safety considerations.
All videos created with Omni include Google’s imperceptible SynthID digital watermark. The company said users can verify Omni-generated videos through the Gemini app, Gemini in Chrome and Google Search.
Gemini Omni Flash is rolling out globally to Google AI Plus, Pro and Ultra subscribers through the Gemini app and Google Flow. It is also rolling out at no cost to users on YouTube Shorts and the YouTube Create App this week. Developer and enterprise API access is planned in the coming weeks.