ChatGPT's image processing capabilities have been significantly upgraded.

The battle between OpenAI and Google over top-tier AI applications has entered a new round of attack and counterattack.

Early Wednesday morning Beijing time, OpenAI announced a new version of ChatGPT's image feature. Besides higher image quality and faster generation, the new Images model is also significantly more accurate at image editing. In short, OpenAI is not only launching a counterattack against Google's highly acclaimed Nano Banana series of models, but also taking aim at Photoshop's very foundation.

OpenAI states that its "ChatGPT Images" feature, built on its latest flagship image generation model, enables precise editing while preserving detail, making it more likely to deliver the desired result. It is also up to four times faster than before.

Of course, the effectiveness of image generation models must be illustrated with images.

OpenAI describes precise editing as the most important improvement in this upgrade: the new model handles adding, deleting, merging, blending, and transposing elements while preserving the original characteristics of the image, so the retouched result matches what the user asked for.

For example, the following demo starts from a late-1990s Los Angeles street photograph generated by ChatGPT's new image model and applies a series of edits:

→ Change the character's shirt to red, the hat to yellow, the speed limit to 15, and the truck to a fire truck;

→ Add a group of onlookers on the left, an eagle perched on the sidewalk on the right, and a spaceship flying overhead in the distance;

→ Show a T-shirt with an all-over print of this image hanging on a clothesline;

→ Put that T-shirt on the skateboarder.

ChatGPT Images has also improved at creatively transforming existing images, an important use case for AI image generation. For example, it can turn a personal photo of OpenAI CEO Sam Altman into an image of a 1980s American aerobics instructor, or place his face into the world-famous painting "Girl with a Pearl Earring".

In addition, ChatGPT is challenging one of Google's traditional strengths: generating text-rich diagrams. OpenAI states that the model goes a step further in text rendering and can handle denser and smaller text.

It should be noted that although ChatGPT can render English text so realistically that it is hard to tell from real typesetting, OpenAI acknowledges that the new model still has limitations when generating Chinese, Arabic, and Hebrew.

Therefore, at least for images containing Chinese text, Nano Banana still clearly outperforms ChatGPT.

It's worth noting that the new image generation model is both more powerful and cheaper. Compared to GPT Image 1, the upgraded GPT Image 1.5 reduces both image input and output costs by 20%.
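
To make the 20% figure concrete, here is a minimal Python sketch of the arithmetic. The base prices are hypothetical placeholders chosen for easy math, not OpenAI's actual rates, and the function name `apply_reduction` is introduced only for illustration.

```python
def apply_reduction(price: float, reduction: float = 0.20) -> float:
    """Return the price after a fractional reduction (0.20 means 20% cheaper)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return price * (1.0 - reduction)


# Hypothetical placeholder prices (arbitrary units), NOT OpenAI's published rates.
old_prices = {"image_input": 10.0, "image_output": 40.0}

# Apply the same 20% cut to both input and output, as the article describes.
new_prices = {name: round(apply_reduction(p), 2) for name, p in old_prices.items()}

for name, old in old_prices.items():
    print(f"{name}: {old} -> {new_prices[name]}")
```

Under these assumed numbers, a workload that cost 100 units before the upgrade would cost 80 units after it, since both input and output prices fall by the same fraction.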

This article is from the WeChat official account "Science and Technology Innovation Daily", author: Shi Zhengcheng, and published with authorization from 36Kr.
