What Gives AI Text Away (And How to Fix It)
By Avasdream (@avasdream_)
You can usually tell when something was written by AI. Not always, but often enough that it matters. The text feels assembled rather than written. It has a strange smoothness to it, like someone ran sandpaper over all the interesting edges.
I spent time digging into the research on this—Wikipedia's "Signs of AI writing" guide (maintained by editors who've seen thousands of AI-generated articles), academic papers on LLM detection, practitioner writeups. Then I built a skill for Claude Code that catches and fixes these patterns.
This is what I found.
Why AI text sounds like AI text
The short version: LLMs predict the most likely next token, and likelihood is learned from frequency in the training data. This pulls everything toward the average.
A passage about Tampere, Finland shouldn't read like it could describe any mid-sized European city. But that's exactly what happens when you ask an AI to write about Tampere without specific instructions. It produces "rich history," "vibrant culture," "mix of industries"—phrases that could apply to hundreds of places.
The same averaging happens with vocabulary, sentence structure, and rhetorical patterns. AI gravitates toward whatever showed up most often in training data. And what showed up most often? Marketing copy, press releases, tourism brochures, corporate communications. That's why AI text often reads like an advertisement for itself.
The tells, ranked by confidence
Not all AI patterns are equal. Some are dead giveaways; others are only suspicious when they cluster. I organized them into three tiers.
High confidence (always fix these)
Significance inflation. This is the single most reliable content-level tell. AI can't resist explaining why everything matters. Every town becomes "pivotal," every event "marks a turning point," every person "leaves an indelible mark."
Before:
The Statistical Institute of Catalonia was officially established in 1989, marking a pivotal moment in the evolution of regional statistics in Spain.
After:
The Statistical Institute of Catalonia was established in 1989 to collect and publish regional statistics independently from Spain's national statistics office.
The -ing tailing clause. Models tack trailing present-participle clauses onto sentences to add fake depth. "...highlighting its importance," "...underscoring the significance," "...reflecting the community's commitment." These clauses almost always restate what the sentence already said.
Before:
Consumers benefit from the flexibility to use their preferred mobile wallet, improving convenience.
The tailing clause adds nothing. Delete it.
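This tell is mechanical enough to catch with a regex. Here's a minimal sketch; the verb list is illustrative, not exhaustive, and a real check would use a part-of-speech tagger rather than a fixed list:

```python
import re

# Flag sentences that end with a comma followed by a present-participle
# clause ("..., highlighting its importance."). Verb list is illustrative.
TRAILING_ING = re.compile(
    r",\s+(?:highlighting|underscoring|reflecting|showcasing|"
    r"demonstrating|emphasizing|improving)\b[^.]*\.",
    re.IGNORECASE,
)

def flag_trailing_clauses(text: str) -> list[str]:
    """Return every trailing -ing clause found in the text."""
    return [m.group(0) for m in TRAILING_ING.finditer(text)]

sentence = ("Consumers benefit from the flexibility to use their "
            "preferred mobile wallet, improving convenience.")
print(flag_trailing_clauses(sentence))  # → [', improving convenience.']
```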
AI vocabulary clusters. One "delve" is coincidence. "Delve into the tapestry of this vibrant landscape" is ChatGPT. The specific words vary by model—Claude hedges more, Gemini has its own quirks—but a cluster of 3+ AI-typical words in one passage is a strong signal.
Words to watch: delve, tapestry, landscape (figurative), pivotal, underscores, multifaceted, nuanced (without actual nuance), testament, realm, leverage, foster, embark, beacon, showcasing, groundbreaking, transformative, seamlessly.
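The cluster rule is easy to operationalize: count hits against a word list and only flag at three or more. A rough sketch, using a subset of the list above:

```python
import re

# Subset of the watch list above; "landscape" only counts when figurative,
# which this naive version can't distinguish.
AI_WORDS = {
    "delve", "tapestry", "landscape", "pivotal", "underscores",
    "multifaceted", "testament", "realm", "leverage", "foster",
    "embark", "beacon", "showcasing", "groundbreaking",
    "transformative", "seamlessly",
}

def ai_word_hits(passage: str) -> list[str]:
    """Return AI-typical words found in the passage, in order."""
    tokens = re.findall(r"[a-z]+", passage.lower())
    return [t for t in tokens if t in AI_WORDS]

passage = "Delve into the tapestry of this vibrant landscape."
hits = ai_word_hits(passage)
print(hits, "— strong signal" if len(hits) >= 3 else "— weak signal")
```

One hit stays a "weak signal"; the threshold is what keeps this from flagging ordinary prose that happens to use "foster" once.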
Genericness. After reading a passage, ask: "Could I swap in a different subject and the text would still make sense?" If yes, it's too generic. AI smooths over specifics because specifics are statistically uncommon. The fix is to add real details—names, numbers, dates, sources.
Hallucinated citations. AI fabricates plausible-sounding sources. DOIs that don't resolve, ISBNs with invalid checksums, journal articles that don't exist. Always verify citations from AI-generated text.
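The ISBN check at least is fully automatable, since ISBN-13 has a built-in check digit (digits weighted alternately 1 and 3 must sum to a multiple of 10). A quick validator:

```python
def isbn13_valid(isbn: str) -> bool:
    """ISBN-13 checksum: alternating 1/3-weighted digit sum must be % 10 == 0."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_valid("978-0-306-40615-7"))  # True — a known-valid ISBN
print(isbn13_valid("978-0-306-40615-8"))  # False — check digit is wrong
```

A valid checksum doesn't prove the book exists, of course; an invalid one proves the citation is fabricated or mistyped.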
Medium confidence (fix when they cluster)
Promotional tone. "Nestled in the breathtaking region," "boasts a vibrant culture," "captivates residents and visitors alike." AI was trained on marketing copy, so it writes like marketing copy. The fix is aggressive de-puffing.
Em dash overuse. Human writers use em dashes. AI uses them more frequently, in formulaic positions, mimicking "punchy" sales writing. OpenAI tried suppressing this in GPT-5.1, but the tell just shifted to other patterns.
Rule of three. AI defaults to triplets when listing anything. "Innovation, inspiration, and industry insights." One triplet is normal. When every list has exactly three items, it's a pattern.
Synonym cycling. AI has repetition-penalty code that produces unnatural synonym rotation. A character is introduced by name, then becomes "the protagonist," then "the central figure," then "the key player." Humans either repeat the name or use pronouns.
Uniform sentence structure. Human writing has irregular rhythm—short punchy sentences mixed with longer complex ones. AI text reads like a metronome. Research calls this "low burstiness." It's detectable both computationally and by reading aloud.
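"Low burstiness" is just low variance in sentence length, so you can measure it in a few lines. A rough sketch (the sentence splitter is naive and will trip on abbreviations):

```python
import re
import statistics

def sentence_lengths(text: str) -> list[int]:
    """Word count per sentence, using a naive terminator-based split."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text: str) -> float:
    """Std dev of sentence lengths; values near zero read metronomic."""
    lengths = sentence_lengths(text)
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

uniform = ("The city has a rich history. The town has a long past. "
           "The area has a proud story.")
varied = ("Short. But then a much longer sentence that wanders a bit "
          "before it lands. See?")
print(burstiness(uniform), burstiness(varied))  # 0.0 vs ~5.66
```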
Compulsive summarizing. "In conclusion," "To sum up," "Overall." AI reflexively restates what was just said, even in short passages that don't need it.
Low confidence (not standalone tells)
These only matter when they cluster with other patterns:
- Excessive bold formatting
- Inline-header bullet lists (bolded label + colon + sentence)
- Vague attributions ("experts say," "studies show")
- "From X to Y" constructions that aren't real ranges
- Excessive hedging ("could potentially," "might arguably")
- Generic positive conclusions ("the future looks bright")
- Chatbot artifacts ("Great question!", "I hope this helps!")
The soul problem
Avoiding AI patterns is only half the job. Sterile, voiceless writing is just as obvious as slop. You can strip out all the "delves" and "tapestries" and still end up with text that feels assembled by committee.
Signs of soulless writing:
- Every sentence is the same length and structure
- No opinions, just neutral reporting
- No acknowledgment of uncertainty or mixed feelings
- No first-person perspective when it would be natural
- No humor, no edge, no personality
You have to add voice, not just remove bad patterns.
Have opinions. "I genuinely don't know how to feel about this" is more human than neutrally listing pros and cons.
Vary your rhythm. Short sentences. Then longer ones that take their time. Let a fragment stand alone. Throw in a question. Research finds that uniform sentence length is one of the most reliable statistical tells of AI writing.
Acknowledge complexity. Real humans have mixed feelings. "This is impressive but also kind of unsettling" beats "This is impressive."
Be specific. Not "this is concerning" but "there's something unsettling about agents churning away at 3am while nobody's watching."
Compare:
Before (clean but soulless):
The experiment produced interesting results. The agents generated 3 million lines of code. Some developers were impressed while others were skeptical. The implications remain unclear.
After (has a pulse):
I genuinely don't know how to feel about this one. 3 million lines of code, generated while the humans presumably slept. Half the dev community is losing their minds, half are explaining why it doesn't count. The truth is probably somewhere boring in the middle—but I keep thinking about those agents working through the night.
The humanizer skill
I packaged all of this into a Claude Code skill. When you run it on AI-generated text, it scans for high-confidence patterns first, then medium, then low. It rewrites the problematic sections, but it also tries to add voice—not just removing bad patterns but putting some personality back in. At the end it runs a two-pass audit: "What still makes this obviously AI-generated?" Then it fixes whatever's left.
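The high-then-medium-then-low scan can be pictured as a tiered lookup. This is a toy sketch of the control flow only; the tier names are from this post, but the pattern lists here are illustrative stand-ins, not the skill's actual rules:

```python
# Illustrative pattern lists — the real skill uses the 28 patterns
# described in this post, not these few strings.
TIERS = {
    "high": ["pivotal", "indelible", "marking a"],
    "medium": ["nestled", "boasts", "in conclusion"],
    "low": ["experts say", "studies show"],
}

def scan(text: str) -> dict[str, list[str]]:
    """Return matched patterns per tier, checked high-confidence first."""
    lower = text.lower()
    report: dict[str, list[str]] = {}
    for tier in ("high", "medium", "low"):
        hits = [p for p in TIERS[tier] if p in lower]
        if hits:
            report[tier] = hits
    return report

print(scan("Nestled in the hills, the town marks a pivotal moment."))
```

Ordering by tier matters: high-confidence hits get fixed unconditionally, while medium and low hits only matter when several cluster together.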
The skill draws from Wikipedia's "Signs of AI writing" page (maintained by WikiProject AI Cleanup), academic papers on LLM detection, and practitioner writeups from places like TechCrunch and Beutler Ink.
Installing the skill
Drop it into your Claude Code skills directory:
~/.claude/skills/humanizer/SKILL.md
Then invoke it with /humanizer or let Claude load it automatically when you ask to humanize text.
Example transformation
Here's an actual before/after from a blog post I ran through it:
Before:
I've been running this pattern across multiple projects (video generation, helpdesk apps, experiment platforms) and it's remarkably effective for well-defined tasks.
After:
I've been running this across a few projects—video generation, a helpdesk app, some experiment platforms—and honestly, it works better than I expected for well-scoped tasks.
Changes: "remarkably effective" → "works better than I expected" (less puffed up, more honest). Added "honestly" for voice. "Multiple projects" → "a few projects" (looser, less corporate).
Before:
The quality of your issues directly impacts agent success.
After:
How you write issues matters more than you'd think.
Changes: "directly impacts" is promotional. "Matters more than you'd think" is conversational.
Before:
Tips:
- Be specific about endpoints, file locations, and expected behavior
- Include acceptance criteria the agent can verify
- Reference existing code patterns to follow
- Break large features into smaller issues
After:
What helps:
- Be specific about endpoints, file locations, expected behavior
- Include acceptance criteria the agent can actually verify
- Reference existing code patterns
- Break large features into smaller issues (the agent handles focused tasks better than sprawling ones)
Changes: "Tips:" feels instructional/AI. "What helps:" feels like a human sharing experience. Added "(the agent handles focused tasks better than sprawling ones)" for voice—it's an aside, a human observation.
Why this matters
If you're using AI to write anything that will be read by humans—blog posts, documentation, emails, reports—the AI tells undermine your credibility. Even readers who can't articulate what's wrong will feel that something is off.
The patterns exist because of how LLMs work. They predict statistically likely text, which pulls toward the average, which sounds like everything and nothing. Fighting this requires deliberate effort.
The humanizer skill turns that into a process. Instead of hoping you catch all the "delves" and "tapestries," you have a checklist of 28 patterns organized by confidence level, with specific fixes for each.
The meta-irony
I ran this post through the humanizer. The first draft had "systematically identifies and fixes" (too stiff), a numbered list with bolded inline headers (the exact pattern the skill warns about), and "systematizes that effort" (corporate-speak). Fixed now.
Writing about AI detection while trying not to sound like AI is a strange exercise. But that's the whole point. The patterns are so ingrained that you produce them without noticing. Having a second pass that specifically looks for the tells—that's what makes the difference between text that feels assembled and text that feels written.
The skill is in my research repo if you want to try it on your own writing.