AeroLogis | AI Product Showcase

If you ask a modern chatbot to spin you a yarn, there is a remarkably high chance you will be introduced to Elias Thorne. Depending on the AI's mood, Elias might be a solitary lighthouse keeper, a meticulous clockmaker, or a quiet librarian. But across almost all major large language models—including ChatGPT, Claude, and Gemini—Elias has become the undisputed default protagonist of the artificial intelligence era.

This quirk of machine behavior isn't just a coincidence; it's a symptom of how today's AI models are built. Researchers from Cornell University recently analyzed 20,000 stories generated by various top-tier chatbots. Their findings were startling: more than 88% of the stories relied on the exact same pool of 11 words, repeatedly leaning on names like Elias and Elara, alongside occupations like lighthouse keeper.
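To make that kind of measurement concrete, here is a minimal sketch of how one might compute the share of stories that draw on a fixed word pool. It is not the researchers' method. The function names are hypothetical, the sample stories are invented, and the pool holds only the three terms named above, since the study's full 11-word list is not given here.

```python
import re

# Hypothetical word pool: only "elias", "elara", and "lighthouse" come from
# the article; the study's full 11-word list is not reproduced here.
TROPE_WORDS = {"elias", "elara", "lighthouse"}


def uses_trope(story: str, pool: set[str] = TROPE_WORDS) -> bool:
    """Return True if the story contains at least one word from the pool."""
    tokens = set(re.findall(r"[a-z']+", story.lower()))
    return not tokens.isdisjoint(pool)


def trope_share(stories: list[str], pool: set[str] = TROPE_WORDS) -> float:
    """Return the fraction of stories that use at least one pooled word."""
    if not stories:
        return 0.0
    return sum(uses_trope(s, pool) for s in stories) / len(stories)


# Toy stories, invented purely for illustration.
sample_stories = [
    "Elias Thorne kept the lighthouse burning through the storm.",
    "Elara wound the old clock in her grandfather's shop.",
    "A fox crossed the frozen river at dawn.",
    "The lighthouse stood empty for a decade.",
]

share = trope_share(sample_stories)
print(f"{share:.0%} of sample stories use a pooled word")
```

Run over a large batch of chatbot output, a check like this would show how heavily the stories lean on a handful of names and settings, though a real analysis would need a more careful definition of what counts as "relying on" a word.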

So, why does an AI with access to virtually all of human literature obsess over a fictional lighthouse keeper?

The answer lies in the incestuous nature of AI training data. The development of language models functions much like a family tree. OpenAI’s older GPT-3.5 model was used to help create "WildChat," a massive dataset of a million real ChatGPT conversations. Within that vast ocean of text, researchers found just 166 conversations featuring the name "Elias" written in a distinct, melancholic "lighthouse" style.

Because building new AI models is expensive and data-hungry, developers frequently use datasets like WildChat to train their own systems. As a result, the Elias trope was unwittingly replicated across the industry. It spread like a digital virus, passed down from one generation of models to the next.

But there is a second, more structural reason for Elias's omnipresence: safety alignment. Tech companies spend enormous resources ensuring their chatbots don't generate offensive, violent, or unsafe content. As algorithms aggressively filter out anything remotely controversial, they create a bottleneck. What survives the filter? The most innocuous, universally safe concepts imaginable. A lonely clockmaker or a lighthouse keeper gazing out at the sea fits the bill perfectly. Elias isn't chosen because he is the most creative option; he is chosen because he is the safest.

The phenomenon has already spilled over into the real world. The name Elias Thorne has "escaped" the chat interface, increasingly appearing as a fabricated author on Amazon for AI-generated books ranging from Greek mythology to potentially harmful alternative medicine guides. He also regularly features as a tragic figure in low-quality, AI-generated YouTube videos.

The ubiquitous lighthouse keeper serves as a fascinating reminder for anyone using generative AI. While these tools project an illusion of boundless imagination, their creative horizons are tightly confined by the recycled, heavily sanitized data they are fed. The AI isn't dreaming up a new world; it's just replaying the safest tape it knows.

Key Points

In a Cornell study of 20,000 AI-generated stories, more than 88% relied on the same pool of 11 words, including names such as Elias and Elara.
The repetition stems from AI developers recycling older datasets such as WildChat, which lets specific narrative styles spread from one generation of models to the next.
Strict safety filters force AI models to default to universally inoffensive concepts, like a lonely lighthouse keeper.
The fictional 'Elias Thorne' is now appearing on Amazon and YouTube as a fake author and protagonist of AI-generated content.

Why It Matters

Recognizing why AI repeats certain tropes helps users understand that AI creativity is not infinite, but rather strictly bounded by recycled data and safety guardrails.

Sources:

Chatbots Keep Telling Stories About Lighthouse Keeper 'Elias Thorne'. We Might Know Why — 404 Media
