OpenAI Strengthens Image Generation in ChatGPT: A New Era of Multimodal Interaction

OpenAI has just pushed the boundary between text conversation and visual creation. New updates within ChatGPT are no longer just about "commanding and generating," but about deep, multimodal collaboration, where the model understands not only your text but also the visual context you share with it. For users, this means that creating visual content becomes a natural part of the dialogue.

For a long time, ChatGPT worked primarily as a text "brain" that, when needed, would "switch" to the DALL-E tool to create an image. According to WIRED, however, OpenAI is now pursuing much deeper integration. The goal is for image generation to be an organic part of a multimodal model such as GPT-4o, rather than an isolated function.

From Text to Visual Understanding: What's Changing?

The main benefit of the new update is a significant improvement in prompt adherence, i.e., the model's ability to follow user instructions precisely. Previous versions of image generation often suffered from "hallucinations" in details: they might omit a specific object, misinterpret colors, or struggle with text placed directly in the image. The update relies on better semantic understanding, which also lets users modify existing images using natural language.

Instead of having to write a completely new, complex prompt, you can now say: "Now move that character a bit to the left and change the color of their coat to dark blue." Thanks to integration with multimodal layers, the model understands the spatial relationships and visual properties of the original output. This is a fundamental shift in how people will collaborate with AI in creative work.

Technical Comparison: OpenAI vs. Competition

To understand where OpenAI stands, it is necessary to compare its approach with other players in the market. In the field of image generation, a battle of three main forces is currently unfolding:

  • OpenAI (ChatGPT/DALL-E): Their biggest advantage is the conversational interface. You don't deal with parameters like in professional tools, but rather "chat" with the model. It's ideal for rapid prototyping and brainstorming.
  • Midjourney: Still holds an edge in pure aesthetic quality and artistic expression. It is a tool for artists who want precise control over parameters. However, OpenAI is trying to close this gap through better understanding of complex instructions.
  • Google Gemini (Imagen): Google relies on deep integration into its ecosystem (Google Docs, Slides). Gemini excels in speed and data integration, but in direct creative interaction of the "chat-to-edit" type, ChatGPT currently leads due to its architecture.

In benchmark tests focused on text rendering (the ability to insert readable text into an image), new OpenAI versions surpassed previous models by approx.% and now match the best models from Adobe (Firefly), which is crucial for creating marketing materials.

Practical Impact for Czech Users and Businesses

What does this mean for you if you're sitting in an office in Prague or Brno? For small and medium-sized enterprises in the Czech Republic, this update represents a significant reduction in the cost of visual content. Marketing agencies can now create concepts for social networks, visuals for websites, or illustrations for blog articles much faster, directly within a single tool they likely already use for writing copy.

Availability and Language: The good news is that ChatGPT and its generative capabilities are fully available in Czech. You can enter instructions in our native language, and the model interprets them correctly. This is a huge advantage for the Czech market compared to some specialized tools that still require English.

Regulation and EU AI Act: For readers in the EU, the legal aspect matters too. OpenAI implements mechanisms for digitally marking generated images with provenance metadata (C2PA). This is in line with the EU AI Act, which requires generated content to be clearly identifiable as created by artificial intelligence. For Czech companies, this means greater legal certainty when using these images commercially.

Pricing Policy: How Much Will It Cost You?

OpenAI maintains its subscription structure, which is relatively straightforward for the Czech market:

  • Free Tier: Basic access to ChatGPT with a limited number of image generations using DALL-E. Ideal for testing.
  • ChatGPT Plus: Costs 20 USD per month (approximately 470 CZK according to the current exchange rate). Provides higher limits, priority access to new models, and full image generation integration.
  • ChatGPT Team/Enterprise: For companies with higher data security requirements and higher limits. Prices range from 25 to 30 USD per user per month.

For a Czech freelancer or small studio, the Plus subscription pays for itself after just a few visuals that would otherwise have to be commissioned from an external graphic designer.
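To make the break-even claim concrete, here is a minimal sketch of the arithmetic. The subscription figures (20 USD, roughly 470 CZK) come from the list above; the designer fee of 300 CZK per visual is a purely hypothetical value chosen for illustration, as are the function names, so substitute your own numbers.

```python
import math

# Figures quoted in the article: 20 USD per month, roughly 470 CZK.
PLUS_PRICE_USD = 20.0
PLUS_PRICE_CZK = 470.0

# Hypothetical value, NOT from the article: what an external designer
# might charge for one simple visual.
ASSUMED_DESIGNER_FEE_CZK = 300.0


def implied_exchange_rate(price_czk: float, price_usd: float) -> float:
    """Return the CZK-per-USD rate implied by the two quoted prices."""
    return price_czk / price_usd


def break_even_visuals(subscription_czk: float, fee_per_visual_czk: float) -> int:
    """Return the smallest whole number of outsourced visuals whose total
    cost is at least one month of subscription."""
    if fee_per_visual_czk <= 0:
        raise ValueError("fee_per_visual_czk must be positive")
    return math.ceil(subscription_czk / fee_per_visual_czk)


if __name__ == "__main__":
    rate = implied_exchange_rate(PLUS_PRICE_CZK, PLUS_PRICE_USD)
    visuals = break_even_visuals(PLUS_PRICE_CZK, ASSUMED_DESIGNER_FEE_CZK)
    print(f"Implied rate: {rate:.1f} CZK per USD")
    print(f"Break-even after {visuals} visual(s) per month")
```

With the assumed fee, the subscription is covered by the second outsourced visual in a given month; a higher real-world fee would bring the break-even point down to a single visual.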

Can I use images generated in ChatGPT for commercial purposes (e.g., for advertising in the Czech Republic)?

Yes, according to current OpenAI terms, you own the rights to outputs created using ChatGPT, which allows for their commercial use. However, it is always advisable to monitor the current terms of service, which may change, and to respect third-party copyrights if the prompt contains names of specific living artists.

How accurate is image generation in Czech compared to English?

Thanks to advancements in multimodal models (like GPT-4o), semantic understanding of Czech is very high. The difference in quality between an English and Czech prompt is currently minimal, although for extremely specific technical terms, English might still be slightly more precise.

How do I know if an image was created by AI to comply with EU regulations?

OpenAI embeds metadata into its images according to the C2PA standard. This information is not visible in the image itself but allows the image's origin to be verified. For transparent business in the EU, it is also advisable to label the content where appropriate (e.g., with a short note such as "Image generated by AI").
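As a rough illustration of what "checking the metadata" can mean in practice, the sketch below scans an image file's raw bytes for the `c2pa` label that C2PA manifest stores carry in their container. This is only a presence heuristic under that assumption, and the function names are made up for this example. It does not validate signatures or prove authenticity, which requires a dedicated C2PA verification tool.

```python
from pathlib import Path

# Label that C2PA manifest stores are assumed to carry in their container.
C2PA_LABEL = b"c2pa"


def may_contain_c2pa(data: bytes) -> bool:
    """Heuristic: report whether the raw bytes contain the C2PA label.

    True only hints that a manifest may be embedded; nothing is validated.
    False does not prove the image is human-made, because metadata is
    often stripped when files are re-encoded or uploaded.
    """
    return C2PA_LABEL in data


def file_may_contain_c2pa(path: str) -> bool:
    """Read a file from disk and apply the same byte-level heuristic."""
    return may_contain_c2pa(Path(path).read_bytes())
```

A call such as `file_may_contain_c2pa("visual.png")` (the file name is a placeholder) returns a quick yes/no hint. Because the metadata can be lost along the way, the visible "Image generated by AI" note remains a sensible complement.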