Google launches Gemini Omni Flash, a conversational video-generation model with avatar mode held back

May 20, 2026

Google has launched Gemini Omni Flash, a multimodal video-generation model with avatar mode and default SynthID watermarking. Speech-editing features are being held back for further development.

Google unveiled Gemini Omni Flash, the first model in DeepMind’s new Omni family, at its I/O 2026 developer conference. The multimodal model generates and edits video from a combination of text, image, audio, and video inputs. The announcement extends Google’s AI efforts further into creative and interactive media.

Advanced Video Generation with Avatar Mode

Gemini Omni Flash can process several media types and synthesize them into a coherent video output. The model supports an avatar mode, which lets users generate video featuring lifelike digital avatars. Google is holding back the full speech-editing feature, however, which suggests it is still in development or under evaluation for safety and ethical implications.

Watermarking and Ethical Considerations

Gemini Omni Flash also applies SynthID watermarking by default. SynthID embeds digital identifiers into generated media to help detect AI-created content. The move reflects growing industry and regulatory concern about deepfakes and other misuse of AI-generated media. By turning watermarking on by default, Google is positioning itself as a responsible player in the AI landscape, with an emphasis on transparency and accountability.

The introduction of the Omni family signals an expansion of Google’s AI ambitions toward multimodal interaction and content creation. Although the family is still in its early stages, Gemini Omni Flash sets the stage for further advances in video generation and editing, with potential applications in entertainment, education, and digital marketing.

Source: TNW Neural
