AI Conference Flooded with Reviews Made by Artificial Intelligence

Hello HaWkers, recent news has raised serious questions about the integrity of scientific research. One of the world's largest artificial intelligence conferences discovered that a significant portion of its paper reviews was written with AI tools like ChatGPT.

The irony is impossible to ignore: AI researchers using AI to review research about AI. But what are the real implications of this?

What Happened

NeurIPS (Conference on Neural Information Processing Systems), one of the most prestigious machine learning and artificial intelligence conferences, identified a concerning pattern in reviews of papers submitted in 2025.

Investigation Findings

Methods used:

  • Analysis of thousands of submitted reviews
  • Detection of linguistic patterns typical of LLMs
  • Comparison with reviews from previous years
  • Use of AI-generated text detection tools

Identified signs:

  1. Repetitive formulaic phrases
  2. Standardized feedback structure
  3. Absence of context-specific criticism
  4. Generic comments that could apply to any paper
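
To make the fourth sign concrete, here is a minimal Python sketch of one way to flag reviews that could apply to any paper: compare reviews written for different papers and report pairs with nearly identical wording. This is not the method used in the investigation. The function names, the Jaccard word-overlap measure, the 0.6 threshold and the sample reviews are all assumptions chosen for illustration.

```python
import re
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercase set of distinct words in a review (hypothetical helper)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct words two texts have in common, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def near_duplicates(reviews: dict[str, str], threshold: float = 0.6):
    """Return pairs of paper ids whose reviews overlap at or above the threshold."""
    return [
        (p, q)
        for p, q in combinations(sorted(reviews), 2)
        if jaccard(reviews[p], reviews[q]) >= threshold
    ]


# Made-up sample reviews, keyed by a made-up paper id.
reviews = {
    "paper-a": "The paper is well written and the results are promising.",
    "paper-b": "The paper is well written and the experiments are promising.",
    "paper-c": "Theorem 3.2 assumes i.i.d. samples, contradicting Section 2.",
}
flagged = near_duplicates(reviews)
```

Word overlap is a blunt signal: it catches copy-paste style reviews, but a human reviewer who reuses a template would be flagged too, so a result like this is a reason to look closer, not proof of AI use.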

Scale of the Problem

Impact estimates:

  • Significant percentage of reviews with signs of AI use
  • Sharp increase compared to 2024
  • Problem identified in multiple conference tracks
  • Reviewers from different countries and institutions

Why This Is Concerning

Integrity of the Peer Review Process

Peer review is the foundation of modern science. Specialized researchers evaluate colleagues' work to ensure quality, identify errors and validate contributions.

Peer review functions:

  • Verify methodology: Was the research conducted correctly?
  • Assess originality: Does the work bring new contributions?
  • Identify errors: Are there logical or mathematical flaws?
  • Suggest improvements: How can the work be improved?

Problems with AI reviews:

  • LLMs don't deeply understand the content
  • They cannot rerun experiments or verify reproducibility
  • They generate generic feedback without specialized insight
  • They may recommend acceptance or rejection arbitrarily

Impact on Academic Career

Consequences for researchers:

  • Papers unfairly rejected by superficial reviews
  • Papers accepted without adequate rigor
  • Inequality between those who receive human vs AI reviews
  • Loss of confidence in the publication system

How AI Is Being Used

Identified Scenarios

Problematic use:

  1. Complete AI review: The reviewer pastes the paper into ChatGPT and asks for a review
  2. Review editing: A human-written review polished by AI (gray area)
  3. Multiple reviews: A reviewer uses AI to take on more review assignments than they could handle manually

Example of generic review (typical of AI):

"This paper presents an interesting contribution to the field. The experiments are well conducted and the results are promising. I suggest the authors expand the discussion on limitations and future work. The paper is well written and well organized."

Compare with a specialized human review:

"The proof of Theorem 3.2 on page 5 assumes the distribution is i.i.d., but this contradicts the problem formulation in Section 2. Furthermore, the baselines used in Table 2 are from 2019 and there are more recent methods that should be compared, specifically [X] and [Y]. The complexity analysis also seems to ignore preprocessing cost."
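
The gap between these two reviews can be approximated, very crudely, by counting concrete anchors: references to a theorem, section, table, page or year that a reviewer can only cite after reading the paper. The sketch below is an illustration under that assumption, not an established detector. The pattern list and the `count_anchors` name are hypothetical.

```python
import re

# Patterns for concrete anchors in a review. This list is an assumption
# made for the sketch and is far from exhaustive.
ANCHOR_PATTERNS = [
    r"\b(?:Theorem|Lemma|Section|Table|Figure|Equation)\s+\d+(?:\.\d+)?",
    r"\bpage\s+\d+",
    r"\b(?:19|20)\d{2}\b",  # four-digit years, e.g. dated baselines
]


def count_anchors(review: str) -> int:
    """Count references to specific parts of the paper in a review."""
    return sum(
        len(re.findall(pattern, review, flags=re.IGNORECASE))
        for pattern in ANCHOR_PATTERNS
    )


generic = (
    "This paper presents an interesting contribution to the field. "
    "The experiments are well conducted and the results are promising. "
    "I suggest the authors expand the discussion on limitations and future work. "
    "The paper is well written and well organized."
)
specific = (
    "The proof of Theorem 3.2 on page 5 assumes the distribution is i.i.d., "
    "but this contradicts the problem formulation in Section 2. Furthermore, "
    "the baselines used in Table 2 are from 2019 and there are more recent "
    "methods that should be compared, specifically [X] and [Y]. The complexity "
    "analysis also seems to ignore preprocessing cost."
)

generic_score = count_anchors(generic)
specific_score = count_anchors(specific)
```

For the two reviews quoted above, `generic_score` is 0 and `specific_score` is 5. A low count does not prove a review was machine-written, and an LLM can be prompted to cite sections, so treat the score as one weak signal among several.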

Pressure on Reviewers

Why reviewers resort to AI:

  • Growing volume of submissions (NeurIPS receives 10,000+ papers)
  • Tight deadlines for review delivery
  • Lack of compensation for review work
  • Pressure to accept multiple review invitations
  • Review fatigue in senior researchers

Scientific Community Response

NeurIPS Measures

Actions taken:

  1. Detection: Implementation of tools to identify AI reviews
  2. Policies: Updated guidelines for reviewers
  3. Consequences: Removal of reviewers who violated rules
  4. Transparency: Public disclosure of the problem

Community Debate

Divergent positions:

Against any AI use:

  • Reviewing is a professional responsibility
  • AI does not replace human expertise
  • Compromises scientific integrity

Favorable to partial use:

  • AI can help identify grammatical problems
  • Can assist in organizing thoughts
  • Human reviewers still make the final evaluation

Gray area:

  • Using AI to summarize long papers
  • Checking references and formatting
  • Translating papers written in languages the reviewer doesn't read

The Paradox of AI Reviewing AI

The Fundamental Irony

We are in a situation where a problematic cycle is forming:

  1. Researchers use AI to write papers
  2. Reviewers use AI to evaluate papers
  3. Editors use AI to make decisions
  4. The "science" produced is a conversation between LLMs

Risks of this cycle:

  • Loss of human critical thinking
  • Research homogenization
  • Model biases propagated
  • Stagnation of real innovation

Philosophical Questions

Questions without easy answers:

  • If an AI can review papers, are human reviewers necessary?
  • What is the value of a degree if AI does intellectual work?
  • How to distinguish human contribution from machine contribution?
  • Is science still "made by humans, for humans"?

What This Means For Developers

Impact on Research Quality

For those who consume research:

  • Accepted papers may have less rigor
  • Results may not be reproducible
  • Tool recommendations may be biased
  • Benchmarks may be questionable

How to Evaluate Papers Now

Tips for critical readers:

  1. Check reproducibility: Is the code available? Is the data open?
  2. Read methodology: Do experiments make sense?
  3. Compare baselines: Are they using recent methods?
  4. Seek second opinion: What do other researchers say?
  5. Trust but verify: Implement yourself when possible
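
If you evaluate papers regularly, the tips above can be kept as a small checklist in code. This is only a sketch of one way to record the answers. The `PaperCheck` fields and the `open_concerns` helper are hypothetical names, and the example answers are made up.

```python
from dataclasses import dataclass


@dataclass
class PaperCheck:
    """Answers a reader records while evaluating a paper (hypothetical fields)."""
    code_available: bool
    data_open: bool
    methodology_sound: bool
    recent_baselines: bool
    second_opinion_found: bool
    reproduced_locally: bool = False


def open_concerns(check: PaperCheck) -> list[str]:
    """List the checklist items that failed or are still unanswered."""
    labels = {
        "code_available": "no public code",
        "data_open": "data is not open",
        "methodology_sound": "methodology is unclear",
        "recent_baselines": "baselines are outdated",
        "second_opinion_found": "no second opinion found",
        "reproduced_locally": "results not reproduced locally",
    }
    return [message for name, message in labels.items() if not getattr(check, name)]


# Made-up answers for an imaginary paper.
check = PaperCheck(
    code_available=True,
    data_open=False,
    methodology_sound=True,
    recent_baselines=True,
    second_opinion_found=False,
)
concerns = open_concerns(check)
```

The list of open concerns is a reminder of what still needs checking before you rely on a paper's recommendations, not a quality score.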

Possible Solutions

Structural Changes

Proposals under discussion:

  1. Open review: Public reviews with reviewer name
  2. Compensation: Pay reviewers for their work
  3. Invitation limit: Restrict number of reviews per person
  4. Expertise verification: Ensure reviewer knows the area
  5. Detection tools: AI to detect AI

Technology As Solution

Tools being developed:

  • AI-generated text detectors tuned for academic writing
  • Reviewer expertise verification systems
  • Review platforms with auditing
  • Blockchain to track the review process

Cultural Change

What needs to change:

  • Value quality over quantity of publications
  • Recognize review as valuable work
  • Reduce pressure to publish at any cost
  • Educate about ethical use of AI tools

Conclusion

The discovery that a major AI conference was flooded with reviews made by artificial intelligence is a warning to the entire scientific community. The peer review system, built over centuries, faces its greatest challenge in the LLM era.

For developers who consume academic research, this means being more critical and careful when evaluating papers and their recommendations. Quality science still depends on humans committed to rigor and integrity.

If you want to understand more about how AI is transforming developers' work, I recommend checking out the article "85% of Developers Use AI", where we analyze data from JetBrains research.

Let's go! 🦅
