AI Conference Flooded with Reviews Made by Artificial Intelligence

Hello HaWkers, recent news has raised serious questions about the integrity of scientific research. One of the world's largest artificial intelligence conferences discovered that a significant portion of its paper reviews were written with AI tools like ChatGPT.

The irony is impossible to ignore: AI researchers using AI to review research about AI. But what are the real implications of this?

What Happened

NeurIPS (Conference on Neural Information Processing Systems), one of the most prestigious machine learning and artificial intelligence conferences, identified a concerning pattern in reviews of papers submitted in 2025.

Investigation Findings

Methods used:

Analysis of thousands of submitted reviews
Detection of linguistic patterns typical of LLMs
Comparison with reviews from previous years
Use of AI-generated text detection tools

Identified signs:

Repetitive formulaic phrases
Standardized feedback structure
Absence of context-specific criticism
Generic comments that could apply to any paper
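
One way to picture the "repetitive formulaic phrases" sign is a simple phrase counter. The sketch below is a toy heuristic in Python, not the method used in the investigation: the phrase list and the function name `formulaic_phrase_ratio` are assumptions made up for illustration, and a high score on its own proves nothing about who wrote a review.

```python
import re

# Hypothetical stock phrases, chosen for illustration only.
FORMULAIC_PHRASES = [
    "interesting contribution",
    "well written",
    "well organized",
    "results are promising",
    "future work",
]


def formulaic_phrase_ratio(review: str) -> float:
    """Return the fraction of sentences containing at least one stock phrase."""
    # Naive sentence split on ., ! and ?; good enough for a sketch.
    sentences = [s for s in re.split(r"[.!?]+\s*", review.lower()) if s.strip()]
    if not sentences:
        return 0.0
    hits = sum(any(phrase in s for phrase in FORMULAIC_PHRASES) for s in sentences)
    return hits / len(sentences)


sample = "This paper presents an interesting contribution. The proof in Section 2 is wrong."
print(formulaic_phrase_ratio(sample))  # 0.5
```

A real detector would need far more than keyword matching, since careful human reviewers also write sentences like these.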

Scale of the Problem

Impact estimates:

Significant percentage of reviews with signs of AI use
Sharp increase compared to 2024
Problem identified in multiple conference tracks
Reviewers from different countries and institutions

Why This Is Concerning

Integrity of the Peer Review Process

Peer review is the foundation of modern science. Specialized researchers evaluate colleagues' work to ensure quality, identify errors and validate contributions.

Peer review functions:

Verify methodology: Was the research conducted correctly?
Assess originality: Does the work bring new contributions?
Identify errors: Are there logical or mathematical flaws?
Suggest improvements: How can the work be improved?

Problems with AI reviews:

LLMs don't deeply understand content
Cannot verify experiments or reproduce results
Generate generic feedback without specialized insights
May accept or reject papers arbitrarily

Impact on Academic Career

Consequences for researchers:

Papers unfairly rejected by superficial reviews
Papers accepted without adequate rigor
Inequality between those who receive human vs AI reviews
Loss of confidence in the publication system

How AI Is Being Used

Identified Scenarios

Problematic use:

Complete AI review: Reviewer copies the paper into ChatGPT and requests a review
Review editing: Human review improved by AI (gray area)
Multiple reviews: A reviewer uses AI to take on more papers than they could review by hand

Example of generic review (typical of AI):

"This paper presents an interesting contribution to the field. The experiments are well conducted and the results are promising. I suggest the authors expand the discussion on limitations and future work. The paper is well written and well organized."

Compare with a specialized human review:

"The proof of Theorem 3.2 on page 5 assumes the distribution is i.i.d., but this contradicts the problem formulation in Section 2. Furthermore, the baselines used in Table 2 are from 2019 and there are more recent methods that should be compared, specifically [X] and [Y]. The complexity analysis also seems to ignore preprocessing cost."

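The difference between the two example reviews above can be roughly quantified by counting references to concrete parts of the paper. This is a minimal Python sketch under stated assumptions: the regular expression and the name `count_specific_anchors` are hypothetical, and the count is only a rough proxy for how closely a reviewer engaged with the text.

```python
import re

# Assumed pattern: a structural keyword followed by a number, e.g. "Theorem 3.2".
ANCHOR_PATTERN = re.compile(
    r"\b(?:theorem|lemma|section|table|figure|equation|page)\s+\d+(?:\.\d+)*",
    re.IGNORECASE,
)


def count_specific_anchors(review: str) -> int:
    """Count mentions of numbered theorems, sections, tables, pages, and so on."""
    return len(ANCHOR_PATTERN.findall(review))


GENERIC_REVIEW = (
    "This paper presents an interesting contribution to the field. "
    "The experiments are well conducted and the results are promising. "
    "I suggest the authors expand the discussion on limitations and future work. "
    "The paper is well written and well organized."
)

SPECIFIC_REVIEW = (
    "The proof of Theorem 3.2 on page 5 assumes the distribution is i.i.d., "
    "but this contradicts the problem formulation in Section 2. Furthermore, "
    "the baselines used in Table 2 are from 2019 and there are more recent "
    "methods that should be compared, specifically [X] and [Y]. The complexity "
    "analysis also seems to ignore preprocessing cost."
)

print(count_specific_anchors(GENERIC_REVIEW))   # 0
print(count_specific_anchors(SPECIFIC_REVIEW))  # 4
```

The generic review never points at a specific theorem, table, or section, which is exactly why it could be pasted under any paper.
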
Pressure on Reviewers

Why reviewers resort to AI:

Growing volume of submissions (NeurIPS receives 10,000+ papers)
Tight deadlines for review delivery
Lack of compensation for review work
Pressure to accept multiple review invitations
Review fatigue in senior researchers

Scientific Community Response

NeurIPS Measures

Actions taken:

Detection: Implementation of tools to identify AI reviews
Policies: Updated guidelines for reviewers
Consequences: Removal of reviewers who violated rules
Transparency: Public disclosure of the problem

Community Debate

Divergent positions:

Against any AI use:

Reviewing is a professional responsibility
AI does not replace human expertise
Compromises scientific integrity

Favorable to partial use:

AI can help identify grammatical problems
Can assist in organizing thoughts
Human reviewers still make the final evaluation

Gray area:

Using AI to summarize long papers
Checking references and formatting
Translating papers written in languages the reviewer does not read

The Paradox of AI Reviewing AI

The Fundamental Irony

We are in a situation where:

Problematic cycle:

Researchers use AI to write papers
Reviewers use AI to evaluate papers
Editors use AI to make decisions
The "science" produced is a conversation between LLMs

Risks of this cycle:

Loss of human critical thinking
Research homogenization
Model biases propagated
Stagnation of real innovation

Philosophical Questions

Questions without easy answers:

If an AI can review papers, are human reviewers necessary?
What is the value of a degree if AI does the intellectual work?
How to distinguish human contribution from machine contribution?
Is science still "made by humans, for humans"?

What This Means For Developers

Impact on Research Quality

For those who consume research:

Accepted papers may have less rigor
Results may not be reproducible
Tool recommendations may be biased
Benchmarks may be questionable

How to Evaluate Papers Now

Tips for critical readers:

Check reproducibility: Is the code available? Is the data open?
Read methodology: Do experiments make sense?
Compare baselines: Are they using recent methods?
Seek second opinion: What do other researchers say?
Trust but verify: Implement yourself when possible
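
The tips above can be kept as a small checklist you fill in while reading a paper. This Python sketch is only one possible shape for it: the class `PaperChecklist`, its field names, and the helper `open_questions` are hypothetical names invented here, not part of any existing tool.

```python
from dataclasses import dataclass, fields


@dataclass
class PaperChecklist:
    """One flag per tip; set it to True once you have checked it yourself."""
    code_available: bool
    data_open: bool
    methodology_sound: bool
    baselines_recent: bool
    independent_opinions: bool
    reproduced_locally: bool


def open_questions(check: PaperChecklist) -> list[str]:
    """Return the names of the checks that are still unmet."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


review = PaperChecklist(
    code_available=True,
    data_open=False,
    methodology_sound=True,
    baselines_recent=False,
    independent_opinions=True,
    reproduced_locally=False,
)
print(open_questions(review))
# ['data_open', 'baselines_recent', 'reproduced_locally']
```

The more items that remain open, the more caution a paper's recommendations deserve before you build on them.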

Possible Solutions

Structural Changes

Proposals under discussion:

Open review: Public reviews signed with the reviewer's name
Compensation: Pay reviewers for their work
Invitation limit: Restrict number of reviews per person
Expertise verification: Ensure reviewer knows the area
Detection tools: AI to detect AI

Technology As Solution

Tools being developed:

AI-generated text detectors tailored to academic writing
Reviewer expertise verification systems
Review platforms with auditing
Blockchain to track review process

Cultural Change

What needs to change:

Value quality over quantity of publications
Recognize review as valuable work
Reduce pressure to publish at any cost
Educate about ethical use of AI tools

Conclusion

The discovery that a major AI conference was flooded with reviews made by artificial intelligence is a warning to the entire scientific community. The peer review system, built over centuries, faces its greatest challenge in the LLM era.

For developers who consume academic research, this means being more critical and careful when evaluating papers and their recommendations. Quality science still depends on humans committed to rigor and integrity.

If you want to understand more about how AI is transforming developers' work, I recommend checking out the article "85% of Developers Use AI", where we analyze data from JetBrains research.

Let's go! 🦅
