Why do we talk like ChatGPT? Algorithms and language models are quietly rewriting our vocabulary and thinking

Write the word delve in English today and it almost sounds as if ChatGPT wrote the sentence for you. That is no coincidence. The frequency of this word in English texts has jumped by tens of percent over the past two years, and artificial intelligence is to blame. Not because it is a genius, but because OpenAI outsourced part of its model's training to workers in Nigeria, where delve is used significantly more often than in Europe or the US. The word suddenly appeared in thousands of AI-generated texts and from there seeped into human communication. But this is not about a single word. Algorithms, whether they sit behind social media or behind language models, are beginning to rewrite not only how we speak but, above all, how we think.

The Delve Effect: When AI outsources your vocabulary

American linguist Adam Viktor Aleksic, known online as Etymology Nerd, described in his TED Talk a phenomenon that few people are aware of: human communication increasingly resembles the outputs of large language models (LLMs).

The word delve is a textbook example. During ChatGPT's training, OpenAI used workers from Nigeria, where this word is a common part of formal English. The model "learned" to use it and began generating it in its responses, and users, often unknowingly, started adopting it in their own writing and speech. The feedback loop closed: AI shapes human language, and humans feed that language back into the training data.

And this isn't a fringe phenomenon. A study published in the journal Science in 2024 showed that LLM-generated texts exhibit significantly lower lexical diversity than human writing: they reuse a smaller set of "favorite" words. When these texts flood the internet, models trained on AI-generated content become more monotonous themselves. Scientists call this effect model collapse, the degradation of model quality that results from training on a model's own outputs.
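To make "lexical diversity" concrete, here is a minimal sketch of one common rough measure, the type-token ratio (distinct words divided by total words). It is not the method of the study mentioned above. The function names and both sample sentences are made up for illustration, and the ratio is sensitive to text length, so it only makes sense for texts of similar size.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Distinct words divided by total words; 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Illustrative samples (invented for this sketch, not real model output).
varied = "The river bends past old mills, quiet farms and a crumbling stone bridge."
repetitive = "We delve into the topic, then delve into the topic, then delve into the topic."

print(round(type_token_ratio(varied), 2))      # 13 distinct / 13 total -> 1.0
print(round(type_token_ratio(repetitive), 2))  # 6 distinct / 15 total -> 0.4
```

A text that keeps returning to the same few words scores lower, which is the pattern the paragraph above describes.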

Algorithm as a filter on reality

But the problem isn't just in the language itself. The same mechanism operates at the level of what we even see and what we think about. Social media algorithms — from TikTok to Instagram to X (formerly Twitter) — systematically present us with the most extreme slice of reality. Why? Because controversial, shocking, or outrage-inducing content generates the most engagement. The more extreme the claim, the higher the chance it goes viral.

A typical example from the Czech context: a loss by the national hockey team to Slovenia generates many times more activity on social media than a win against Italy. The algorithm learns this, and next time it shows you more losses, more controversies, more reasons for outrage. Neutral reality has no place in this system.
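The incentive can be sketched as a toy feed ranker that sorts posts purely by engagement. Everything here is an assumption for illustration: the `Post` fields, the weights in `engagement_score`, and the interaction counts are invented, and no real platform's ranking formula is implied.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    likes: int
    comments: int
    shares: int


def engagement_score(post: Post) -> int:
    # Hypothetical weights: comments and shares count for more than likes.
    return post.likes + 2 * post.comments + 3 * post.shares


def rank_feed(posts: list[Post]) -> list[Post]:
    """Order posts from highest to lowest engagement score."""
    return sorted(posts, key=engagement_score, reverse=True)


# Invented numbers: the loss draws fewer likes but far more comments and shares.
posts = [
    Post("Win against Italy", likes=900, comments=40, shares=20),
    Post("Loss to Slovenia", likes=700, comments=650, shares=300),
    Post("Training schedule announced", likes=120, comments=5, shares=1),
]

feed = rank_feed(posts)
print([p.title for p in feed])
# ['Loss to Slovenia', 'Win against Italy', 'Training schedule announced']
```

Nothing in the ranker looks at whether a post is accurate or representative. It only sees reactions, so the post that provokes the most of them ends up on top.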

As the book Factfulness by Hans Rosling shows, people's estimates on key issues, from global poverty to education levels, are systematically more pessimistic than reality. Algorithms deepen this effect even further. The problem arises when people start mistaking a distorted version of reality for the real one.

AI chatbots aren't neutral — and they can't be

A similar mechanism operates with AI chatbots. The information models share is subject to distortion — their training involved humans with their biases, cultural context, and linguistic habits. No model is "objective."

"Research has found that ChatGPT responds more conservatively when users interact with it in Persian," Aleksic said in his TED Talk. There are two reasons: the model is less well trained in Persian than in English, and Iran has long trended toward more conservative attitudes, which is reflected in the training data.

Likewise, Grok by Elon Musk is not neutral. Musk adjusts the model daily to his liking, so it interacts exactly the way he wants. Users of the X platform (where Grok is integrated) thus unknowingly adopt Musk's worldview — simply because they get no other answer from the AI.

This raises a fundamental question: will Iranians come to think more conservatively as a result of daily AI use? Will X users unknowingly adopt Musk's perspective? The answer isn't simple, but research suggests yes: people tend to trust AI outputs and adapt their opinions to them.

The reality algorithms themselves create

The most insidious aspect of the whole phenomenon is that algorithms actively create the reality they merely appear to describe. Platforms feed people trends that would never have emerged without algorithmic amplification.

Think of it like this: TikTok's algorithm notices that a few users have started using a certain phrase. It starts recommending it to others. They adopt it. The phrase becomes a "trend." The algorithm amplifies it further. Eventually it reaches mainstream communication, from where language models pick it up and start using it in their responses. And the circle closes.
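The loop described above can be sketched as a deliberately simple simulation. It assumes that exposure to a phrase is proportional to the share of users already using it, multiplied by a platform "amplification" factor, and that only people who have not yet adopted it can still do so. The function name and both factors are hypothetical and are not measured from any real platform.

```python
def simulate_adoption(initial_share: float, amplification: float, rounds: int) -> list[float]:
    """Return the adoption share before the first round and after each round.

    Each round, new adopters = amplification * current share * remaining
    non-adopters (a basic logistic-growth update).
    """
    share = initial_share
    history = [share]
    for _ in range(rounds):
        new_adopters = amplification * share * (1.0 - share)
        share = min(1.0, share + new_adopters)
        history.append(share)
    return history


# Same starting point (1% of users), two hypothetical amplification levels.
organic = simulate_adoption(0.01, amplification=0.05, rounds=20)
amplified = simulate_adoption(0.01, amplification=0.6, rounds=20)

print(f"weak amplification after 20 rounds:   {organic[-1]:.3f}")
print(f"strong amplification after 20 rounds: {amplified[-1]:.3f}")
```

With weak amplification the phrase stays marginal over the same period. With strong amplification it saturates the population. In this toy model the only difference between a phrase nobody notices and a "trend" is how hard the recommender pushes it.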

This isn't science fiction. It's exactly the mechanism behind the rise of the word delve. And there's no reason to think delve will be the last.

What does this mean for Czechia?

For Czech users, the situation is specific. ChatGPT, Claude, Gemini, and other models do support Czech, but their training data is predominantly English. When you communicate in Czech, the model internally "thinks" in English and translates the result. This means that not only English language patterns penetrate Czech, but also the cultural context and value frameworks of the English-speaking world.

The European Union is responding through the AI Act, which introduces requirements for transparency and labeling of AI-generated content. Starting in 2026, there will be a mandatory requirement to label deepfakes and AI-generated texts — precisely so that people know what they are interacting with. The Czech National Cyber and Information Security Agency (NÚKIB) is actively involved in implementing these rules.

At the same time, the importance of Czech language models is growing — for example, the model from the team at the Faculty of Mathematics and Physics at Charles University. The more Czechs use models trained on Czech data, the lower the risk of unconsciously adopting foreign linguistic and thought patterns.

The only effective defense: ask why

The conclusion isn't complicated, but it is hard to put into practice. Everything the internet presents to us, whether through social media or AI, has passed through a filter. Social media filters the feed. AI filters the answers. Neither is a neutral mirror of reality.

The most effective defense tool is a single word: why. Why am I seeing exactly this when I open Reels on Instagram? Why am I using this particular word? Why did ChatGPT answer me this way? What does the platform gain from me reacting in precisely this manner?

As Aleksic says: "The moment we stop asking these questions, their version of reality becomes ours."

Is using ChatGPT or other AI chatbots harmful?

Not in itself. The problem arises when we accept AI outputs uncritically and fail to realize they have passed through a filter — linguistic, cultural, and value-based. A healthy approach is to use AI as a tool, but to maintain awareness that the answers are not objective truth, but the result of a statistical model trained on data with specific biases.

How can I tell if a text was written by artificial intelligence?

A 100% reliable detector doesn't exist, but there are telltale signs: repetition of certain words (in English typically "delve," "tapestry," "crucial"), a uniform style without a personal tone, overly perfect paragraph structure, and an absence of concrete personal experience. The EU AI Act requires AI system providers to label generated content starting in 2026 — though in practice this mainly concerns deepfakes and visual content for now.
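As a rough illustration of the first sign, the sketch below counts how often a few marker words occur per 1,000 words. It is a crude heuristic, not a detector: the word list comes from the examples in the answer above, the function name and sample sentence are invented, inflected forms such as "delving" are not matched, and plenty of human writers use these words too.

```python
import re

# Marker words taken from the examples above; the list is illustrative only.
MARKER_WORDS = {"delve", "tapestry", "crucial"}


def marker_rate(text: str, markers: set[str] = MARKER_WORDS) -> float:
    """Marker-word occurrences per 1,000 words (0.0 for empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in markers)
    return 1000 * hits / len(tokens)


sample = "Let us delve into the rich tapestry of this crucial topic."
print(round(marker_rate(sample), 1))                 # 3 hits in 11 words -> 272.7
print(marker_rate("The cat sat on the mat."))        # no marker words -> 0.0
```

A high rate is at most a reason to read more carefully. It says nothing certain about who or what wrote the text.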

Does AI affect the Czech language too, or only English?

It affects both, but the mechanisms differ. In English, it involves directly adopting words and phrases from AI outputs. In Czech, the effect is indirect — Czech users consume AI-generated content primarily in English and unknowingly adopt English language patterns, calques, and stylistic techniques. Moreover, models like ChatGPT draw from English training data when generating Czech and "translate" thought patterns — the result is Czech that occasionally sounds suspiciously like a translation from English.
