AI Conference Flooded with Reviews Made by Artificial Intelligence
Hello HaWkers, recent news has raised serious discussions about the integrity of scientific research. One of the world's largest artificial intelligence conferences discovered that a significant portion of its paper reviews was written with AI tools like ChatGPT.
The irony is impossible to ignore: AI researchers using AI to review research about AI. But what are the real implications of this?
What Happened
NeurIPS (Conference on Neural Information Processing Systems), one of the most prestigious machine learning and artificial intelligence conferences, identified a concerning pattern in reviews of papers submitted in 2025.
Investigation Findings
Data collected:
- Analysis of thousands of submitted reviews
- Detection of linguistic patterns typical of LLMs
- Comparison with reviews from previous years
- Use of AI-generated text detection tools
Identified signs:
- Repetitive formulaic phrases
- Standardized feedback structure
- Absence of context-specific criticism
- Generic comments that could apply to any paper
Scale of the Problem
Impact estimates:
- Significant percentage of reviews with signs of AI use
- Sharp increase compared to 2024
- Problem identified in multiple conference tracks
- Reviewers from different countries and institutions
Why This Is Concerning
Integrity of the Peer Review Process
Peer review is the foundation of modern science. Specialized researchers evaluate colleagues' work to ensure quality, identify errors and validate contributions.
Peer review functions:
- Verify methodology: Was the research conducted correctly?
- Assess originality: Does the work bring new contributions?
- Identify errors: Are there logical or mathematical flaws?
- Suggest improvements: How can the work be improved?
Problems with AI reviews:
- LLMs don't deeply understand content
- Cannot rerun experiments or check whether results reproduce
- Generate generic feedback without specialized insights
- May accept or reject papers arbitrarily
Impact on Academic Career
Consequences for researchers:
- Papers unfairly rejected by superficial reviews
- Papers accepted without adequate rigor
- Inequality between those who receive human vs AI reviews
- Loss of confidence in the publication system
How AI Is Being Used
Identified Scenarios
Problematic use:
- Complete AI review: Reviewer copies the paper into ChatGPT and requests a review
- Review editing: Human review improved by AI (gray area)
- Multiple reviews: A reviewer uses AI to take on more review assignments than they could handle manually
Example of generic review (typical of AI):
"This paper presents an interesting contribution to the field. The experiments are well conducted and the results are promising. I suggest the authors expand the discussion on limitations and future work. The paper is well written and well organized."
Compare with a specialized human review:
"The proof of Theorem 3.2 on page 5 assumes the distribution is i.i.d., but this contradicts the problem formulation in Section 2. Furthermore, the baselines used in Table 2 are from 2019 and there are more recent methods that should be compared, specifically [X] and [Y]. The complexity analysis also seems to ignore preprocessing cost."
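The contrast between these two reviews can be approximated with a rough heuristic: count stock phrases and count concrete references to parts of the paper. The sketch below is only an illustration of that idea. The phrase list, the regular expression and the `looks_generic` threshold are assumptions made for this example, not what NeurIPS or any real detection tool uses, and a heuristic like this would produce false positives on real reviews.

```python
import re

# Hypothetical stock phrases, chosen for illustration only.
GENERIC_PHRASES = (
    "interesting contribution",
    "well conducted",
    "results are promising",
    "well written",
    "well organized",
    "limitations and future work",
)

# Matches concrete anchors such as "Theorem 3.2", "Table 2", "Section 2", "page 5".
ANCHOR_PATTERN = re.compile(
    r"\b(?:theorem|lemma|table|figure|section|equation|page)\s+\d+(?:\.\d+)*",
    re.IGNORECASE,
)


def review_signals(review: str) -> dict:
    """Count generic stock phrases and concrete paper references in a review."""
    lowered = review.lower()
    generic_hits = sum(lowered.count(phrase) for phrase in GENERIC_PHRASES)
    anchors = ANCHOR_PATTERN.findall(review)
    return {
        "generic_phrases": generic_hits,
        "specific_anchors": len(anchors),
        # Assumed rule of thumb: several stock phrases and no concrete reference.
        "looks_generic": generic_hits >= 2 and not anchors,
    }


generic_review = (
    "This paper presents an interesting contribution to the field. "
    "The experiments are well conducted and the results are promising. "
    "I suggest the authors expand the discussion on limitations and future work. "
    "The paper is well written and well organized."
)

specific_review = (
    "The proof of Theorem 3.2 on page 5 assumes the distribution is i.i.d., "
    "but this contradicts the problem formulation in Section 2. Furthermore, "
    "the baselines used in Table 2 are from 2019 and there are more recent "
    "methods that should be compared, specifically [X] and [Y]. The complexity "
    "analysis also seems to ignore preprocessing cost."
)

print(review_signals(generic_review))
print(review_signals(specific_review))
```

On these two examples the first review is flagged and the second is not, because the second one points at a theorem, a page, a section and a table. That is the same signal a human editor looks for: criticism that could only have been written about this particular paper.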
Pressure on Reviewers
Why reviewers resort to AI:
- Growing volume of submissions (NeurIPS receives 10,000+ papers)
- Tight deadlines for review delivery
- Lack of compensation for review work
- Pressure to accept multiple review invitations
- Review fatigue in senior researchers
Scientific Community Response
NeurIPS Measures
Actions taken:
- Detection: Implementation of tools to identify AI reviews
- Policies: Updated guidelines for reviewers
- Consequences: Removal of reviewers who violated rules
- Transparency: Public disclosure of the problem
Community Debate
Divergent positions:
Against any AI use:
- Review is professional responsibility
- AI does not replace human expertise
- Compromises scientific integrity
Favorable to partial use:
- AI can help identify grammatical problems
- Can assist in organizing thoughts
- Human reviewers still make final evaluation
Gray area:
- Using AI to summarize long papers
- Checking references and formatting
- Translating papers written in languages the reviewer does not read
The Paradox of AI Reviewing AI
The Fundamental Irony
We are in a situation that forms a problematic cycle:
- Researchers use AI to write papers
- Reviewers use AI to evaluate papers
- Editors use AI to make decisions
- The "science" produced is a conversation between LLMs
Risks of this cycle:
- Loss of human critical thinking
- Research homogenization
- Model biases propagated
- Stagnation of real innovation
Philosophical Questions
Questions without easy answers:
- If an AI can review papers, are human reviewers necessary?
- What is the value of a degree if AI does intellectual work?
- How to distinguish human contribution from machine contribution?
- Is science still "made by humans, for humans"?
What This Means For Developers
Impact on Research Quality
For those who consume research:
- Accepted papers may have less rigor
- Results may not be reproducible
- Tool recommendations may be biased
- Benchmarks may be questionable
How to Evaluate Papers Now
Tips for critical readers:
- Check reproducibility: Is the code available? Is the data open?
- Read methodology: Do experiments make sense?
- Compare baselines: Are they using recent methods?
- Seek second opinion: What do other researchers say?
- Trust but verify: Implement yourself when possible
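If you read papers regularly, the tips above can be turned into a small checklist you fill in per paper. The sketch below is one possible way to do that. Every field name, the wording of the concerns and the `max_baseline_age` default of 2 years are assumptions made for illustration, not an established standard.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PaperChecklist:
    """Answers to the reading tips; all field names are illustrative."""
    code_available: bool
    data_open: bool
    methodology_sound: bool
    newest_baseline_year: int
    publication_year: int
    independent_discussion: bool
    reproduced_locally: bool


def open_concerns(check: PaperChecklist, max_baseline_age: int = 2) -> list[str]:
    """Return the checklist items that still need attention."""
    concerns = []
    if not check.code_available:
        concerns.append("no public code")
    if not check.data_open:
        concerns.append("data not open")
    if not check.methodology_sound:
        concerns.append("methodology unclear")
    # Assumed threshold: baselines older than max_baseline_age years are suspect.
    if check.publication_year - check.newest_baseline_year > max_baseline_age:
        concerns.append("baselines look outdated")
    if not check.independent_discussion:
        concerns.append("no second opinion found")
    if not check.reproduced_locally:
        concerns.append("results not reproduced locally")
    return concerns


example = PaperChecklist(
    code_available=True,
    data_open=False,
    methodology_sound=True,
    newest_baseline_year=2019,
    publication_year=2025,
    independent_discussion=True,
    reproduced_locally=False,
)
print(open_concerns(example))
```

An empty list does not prove a paper is right. It only means none of the quick checks raised a flag, so the list is best used to decide where to spend your reading time.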
Possible Solutions
Structural Changes
Proposals under discussion:
- Open review: Public reviews with reviewer name
- Compensation: Pay reviewers for their work
- Invitation limit: Restrict number of reviews per person
- Expertise verification: Ensure reviewer knows the area
- Detection tools: AI to detect AI
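Two of these proposals, the invitation limit and expertise verification, are easy to picture as an assignment rule: a paper only goes to a reviewer who lists its area, and nobody receives more than a fixed number of papers. The sketch below is a simplified greedy version of that rule. The function name, the data shapes and the cap value are hypothetical, and real conference systems use far more elaborate matching.

```python
from collections import Counter


def assign_reviewers(papers, expertise, max_reviews=3):
    """Assign each paper to the least-loaded reviewer who lists its area.

    papers: dict of paper id -> research area
    expertise: dict of reviewer name -> set of research areas
    Returns (assignments, unassigned).
    """
    load = Counter()
    assignments = {}
    unassigned = []
    for paper_id, area in papers.items():
        # Only reviewers who know the area and are still under the cap qualify.
        candidates = [
            name
            for name, areas in expertise.items()
            if area in areas and load[name] < max_reviews
        ]
        if not candidates:
            unassigned.append(paper_id)
            continue
        # Prefer the reviewer with the fewest papers; break ties by name.
        reviewer = min(candidates, key=lambda name: (load[name], name))
        assignments[paper_id] = reviewer
        load[reviewer] += 1
    return assignments, unassigned


papers = {
    "P1": "optimization",
    "P2": "optimization",
    "P3": "optimization",
    "P4": "vision",
}
expertise = {
    "ana": {"optimization"},
    "bo": {"optimization", "vision"},
}

assignments, unassigned = assign_reviewers(papers, expertise, max_reviews=1)
print(assignments)
print(unassigned)
```

With a cap of 1, two papers are left without a reviewer. That is the trade-off behind the proposal: a strict limit protects review quality, but it only works if the pool of qualified reviewers grows with the number of submissions.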
Technology As Solution
Tools being developed:
- AI-generated text detectors tailored to academic writing
- Reviewer expertise verification systems
- Review platforms with auditing
- Blockchain to track review process
Cultural Change
What needs to change:
- Value quality over quantity of publications
- Recognize review as valuable work
- Reduce pressure to publish at any cost
- Educate about ethical use of AI tools
Conclusion
The discovery that a major AI conference was flooded with reviews made by artificial intelligence is a warning to the entire scientific community. The peer review system, built over centuries, faces its greatest challenge in the LLM era.
For developers who consume academic research, this means being more critical and careful when evaluating papers and their recommendations. Quality science still depends on humans committed to rigor and integrity.
If you want to understand more about how AI is transforming developers' work, I recommend checking out the article "85% of Developers Use AI", where we analyze data from JetBrains research.
Let's go! 🦅

