#statistical-bias News & Analysis

3 articles tagged with #statistical-bias. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Neutral · arXiv – CS AI · Jun 23 · 7/10

Beyond Simpson's Paradox: A Cascade of Confounders in AI Agent Pull-Request Co-Authorship

A rigorous analysis of AI coding agents reveals that apparent benefits of human co-authorship in pull requests disappear under proper statistical controls, demonstrating how Simpson's Paradox and confounding variables can mask true causal relationships in AI agent research.

🏢 Microsoft · 🧠 Claude
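To illustrate the kind of reversal the summary describes, here is a minimal sketch of Simpson's paradox in Python. The numbers are invented for illustration and are not taken from the paper. The stratum names (`easy`, `hard`), the group names (`co_authored`, `agent_only`), and the helper `pooled_rate` are all hypothetical. The sketch assumes task difficulty is the confounder: co-authored pull requests cluster in easy tasks, so they look better in aggregate even though they do slightly worse within each difficulty level.

```python
# Hypothetical (merged, total) pull-request counts per difficulty stratum.
# These numbers are made up to demonstrate the effect; they are not from the paper.
strata = {
    "easy": {"co_authored": (72, 80), "agent_only": (19, 20)},
    "hard": {"co_authored": (10, 20), "agent_only": (44, 80)},
}


def stratum_rate(stratum: str, group: str) -> float:
    """Merge rate for one group inside one difficulty stratum."""
    merged, total = strata[stratum][group]
    return merged / total


def pooled_rate(group: str) -> float:
    """Merge rate for one group with all strata lumped together."""
    merged = sum(strata[s][group][0] for s in strata)
    total = sum(strata[s][group][1] for s in strata)
    return merged / total


for group in ("co_authored", "agent_only"):
    print(f"pooled {group}: {pooled_rate(group):.2f}")

for stratum in strata:
    for group in ("co_authored", "agent_only"):
        print(f"{stratum} {group}: {stratum_rate(stratum, group):.2f}")
```

Pooled, the co-authored group appears to merge more often, but within each stratum the agent-only group has the higher rate. Controlling for the confounder (here, difficulty) is what makes the apparent benefit disappear.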

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Log-Likelihood, Simpson's Paradox, and the Detection of Machine-Generated Text

Researchers identify a critical flaw in machine-generated text detection: token-level likelihood signals vary inconsistently across a detector model's hidden space, producing a Simpson's paradox that undermines existing detectors. They propose a learned local calibration method that substantially improves detection performance, with calibrated variants raising AUROC from 0.63 to 0.85 on GPT-5.4 text.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

On the Step Length Confounding in LLM Reasoning Data Selection

Researchers identify a critical flaw in naturalness-based data selection methods for large language model reasoning datasets: the selection algorithms systematically favor longer reasoning steps rather than higher-quality reasoning. The study proposes two corrective methods (ASLEC-DROP and ASLEC-CASL) that mitigate this "step length confounding" bias across multiple LLM benchmarks.