An image of an egg carton created by Imagen 4.
Google is launching a new version of its image generation model, called Imagen 4, and the company says that it offers “stunning quality” and “superior typography.”
“Our latest Imagen model combines speed with precision to create stunning images,” Eli Collins, VP of product at Google DeepMind, says in a blog post. “Imagen 4 has remarkable clarity in fine details like intricate fabrics, water droplets, and animal fur, and excels in both photorealistic and abstract styles.” Sample images from Google do show some impressive, realistic detail, like one showing a whale jumping out of the water and another of a chameleon.
The AI model is also “significantly better at spelling and typography,” which Collins says makes it easier to create greeting cards, posters, and comics. (When OpenAI recently added image generation to ChatGPT, the company also touted its text rendering improvements, but ChatGPT is still susceptible to typos.)
In some images provided by Google, the text does look good — it’s perfectly legible in a short comic, for example, and even a tiny font in a mock stamp is readable. But we’ll have to see how the model’s text rendering capabilities hold up in the hands of regular users.
Imagen 4 will be available on May 20th in the Gemini app, Whisk, and Vertex AI, as well as in Slides, Vids, Docs, “and more in Workspace,” Collins says. Also, Google plans to launch a “fast variant” of Imagen 4 sometime “soon,” which it says is “up to 10x faster than Imagen 3.”