Now you can test Gemini 2.0 Flash native image output

Today’s Google AI news follows the recent launches of Gemma 3 and Gemini Robotics.

Google is continuing to widen access to native image output in Gemini 2.0 Flash, which allows for conversational editing of images alongside other capabilities.

When Gemini 2.0 Flash launched in December, Google said the model would be able to output audio and images in addition to text, part of an effort to make Gemini natively multimodal across both inputs and outputs.

Native image output lets you edit images through a series of natural-language turns in a conversation.
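To make the conversational flow concrete, here is a minimal sketch of what a multi-turn request body might look like. The top-level field names (`contents`, `role`, `parts`, `inlineData`, `generationConfig`, `responseModalities`) follow the public Gemini REST API; the prompts are invented for illustration and the base64 image data from the model's earlier turn is elided. No request is actually sent.

```python
import json

# Hypothetical multi-turn request body for conversational image editing.
# The history includes the model's earlier image turn, so the follow-up
# instruction edits that image rather than starting from scratch.
request_body = {
    "contents": [
        {"role": "user",
         "parts": [{"text": "Generate a picture of a red bicycle."}]},
        # The model's prior response carried the image inline (data elided).
        {"role": "model",
         "parts": [{"inlineData": {"mimeType": "image/png", "data": "..."}}]},
        # A follow-up user turn edits the previous image conversationally.
        {"role": "user",
         "parts": [{"text": "Now make the bicycle blue and add a basket."}]},
    ],
    "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
}

print(json.dumps(request_body, indent=2))
```

The key point is that each edit is just another user turn appended to the same `contents` history, with image output enabled in `generationConfig`.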

2.0 Flash is also better at rendering long sequences of text inside images, something today’s models have struggled with.

Compared with standalone image-generation models, this capability in 2.0 Flash “leverages enhanced reasoning and world knowledge to create the right picture.”

This makes it well suited to realistic imagery, such as illustrating a cooking recipe. It aims for accuracy, but, as with all large language models, the output is not always perfect or complete.

The example below asks: “Give a recipe for chocolate chip cookies. Please include an image for each step.”

Another example of combined text-and-image output is asking 2.0 Flash to tell a story using images that keep “characters and settings constant throughout.”

In December, Gemini’s native image output was available only to trusted testers. All developers and users can now try it in Google AI Studio with the updated experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp), or through the Gemini API.

In AI Studio on desktop, go to the “preview section” in the right-hand model selector and set “Output Format” to Images + Text. Daily limits are in effect.
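For the API route, a single-shot request might look like the sketch below. The endpoint shape and the `responseModalities` field follow the public Gemini REST API, and the model name is the experimental build mentioned above; the prompt reuses the cookie-recipe example. This only builds and prints the request locally, it does not send it.

```python
import json

# Assumed model name and endpoint, matching the article's gemini-2.0-flash-exp
# and the Gemini REST API's generateContent method.
MODEL = "gemini-2.0-flash-exp"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)

body = {
    "contents": [
        {"parts": [{"text": "Give a recipe for chocolate chip cookies. "
                            "Please include an image for each step."}]}
    ],
    # Request both text and images in the response.
    "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
}

print(ENDPOINT)
print(json.dumps(body, indent=2))
```

In a real call you would POST this body to the endpoint with your API key; image parts come back inline in the response alongside the text.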



