arXiv Is Making Researchers Own Their AI Mistakes

Somewhere inside a paper, a fake citation survived.

That is the nightmare.

Not because one reference is sacred. Not because researchers should be banned from using AI tools. The problem is simpler and uglier: if an author cannot be trusted to check the references, why should anyone trust the experiment, the math, the benchmark, or the conclusion?

That is the line arXiv is trying to draw.

According to TechCrunch, Thomas Dietterich, chair of arXiv’s computer science section, said that submissions showing clear evidence that the authors failed to check LLM-generated output can trigger a one-year arXiv ban. After the ban, those authors' future submissions would need to be accepted by a reputable peer-reviewed venue before arXiv hosts them again.

The examples are not subtle: hallucinated references, leftover assistant comments, placeholder text, or other obvious signs that a model generated content and nobody bothered to verify it.

Good.

This is not an AI ban

The important part is what arXiv is not saying.

It is not saying researchers cannot use LLMs. That would be unrealistic and probably impossible to enforce. Researchers already use these tools for editing, translation, literature search, code help, summarization, and draft cleanup.

The rule is about responsibility.

If your name is on the paper, the work is yours. If an LLM helped generate a paragraph, a table, a citation, or a summary, that does not move responsibility to the model. It stays with the human authors.

That sounds obvious until you look at how fast the garbage can spread.

A recent arXiv paper audited 111 million references across 2.5 million papers and found a sharp rise in non-existent references after widespread LLM adoption, including a conservative estimate of 146,932 hallucinated citations in 2025 alone.

That is not a cute formatting mistake.

That is pollution in the knowledge graph.

Why citations matter

A fake citation is not just a bad link.

It can send credit to the wrong place, waste reviewer time, distort literature maps, and leave future AI systems training on a more broken version of the record. The damage compounds because science is recursive: each paper leans on older papers, each review leans on the citation trail, and each new model consumes the mess later.

This is why the “just let the model write it” mindset is dangerous in research.

The issue is not whether an LLM can produce fluent academic prose. Of course it can. The issue is whether the authors still did the slow, boring, necessary work of checking that the words point to reality.

That is where trust lives.
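Part of that boring work can be scripted. What follows is a minimal sketch of a pre-submission check, not anything arXiv runs or endorses. The phrase lists, the function names scan_manuscript and references_needing_manual_check, and the identifier pattern are all assumptions made up for illustration. It flags lines that look like leftover assistant chatter or placeholder text, and it lists reference strings that carry no DOI or arXiv identifier, so a human knows where to look first.

```python
import re

# Hypothetical phrase lists for illustration only; these are not arXiv's criteria.
ASSISTANT_PHRASES = [
    r"as an ai language model",
    r"i hope this helps",
    r"here is (?:a|the) (?:revised|rewritten|updated) (?:version|paragraph|draft)",
    r"certainly[!,] here",
]
PLACEHOLDERS = [
    r"\[citation needed\]",
    r"\[insert [^\]]*\]",
    r"lorem ipsum",
    r"\bTODO\b",
    r"\bXXX\b",
]

# Assumed identifier shapes: a DOI such as 10.1234/abc, or arXiv:2401.12345.
IDENTIFIER = re.compile(r"10\.\d{4,9}/\S+|arXiv:\d{4}\.\d{4,5}", re.IGNORECASE)


def scan_manuscript(text):
    """Return (line_number, kind, line) for every line matching a suspicious pattern."""
    findings = []
    groups = (("assistant", ASSISTANT_PHRASES), ("placeholder", PLACEHOLDERS))
    for number, line in enumerate(text.splitlines(), start=1):
        for kind, patterns in groups:
            # Matching is case-insensitive, so "todo" is flagged as well as "TODO".
            if any(re.search(p, line, re.IGNORECASE) for p in patterns):
                findings.append((number, kind, line.strip()))
    return findings


def references_needing_manual_check(references):
    """Return the reference strings that contain no DOI or arXiv identifier."""
    return [ref for ref in references if not IDENTIFIER.search(ref)]
```

Run scan_manuscript on the full text of the draft and references_needing_manual_check on the bibliography entries, then read every hit yourself. A clean result proves nothing: a fabricated reference can carry a plausible-looking identifier, so each entry still has to be looked up and matched against the real paper by a person.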

The real signal

arXiv is not solving AI-generated research abuse with one rule.

But it is creating a useful norm: tools can assist, humans remain accountable, and obvious negligence has consequences.

That matters because every professional field is about to face the same problem. Law, medicine, finance, journalism, engineering, and education will all need their version of this rule. Not a theatrical ban on AI, but a plain statement that signing your name still means something.

AI makes it easier to produce plausible work.

That makes verification more valuable, not less.

If this rule annoys careless authors, fine. That is the point.

Sources: TechCrunch, arXiv hallucinated citations paper