🧠 AI | ⚪ Neutral | Importance 6/10

When Eyes Betray AI: Social Gaze Consistency as a Semantic Cue for AI-Generated Image Detection

arXiv – CS AI | Kim Jihyeon, Sohee Kim, Soosan Lee, Souhwan Jung, James Matthew Rehg, Hyesong Choi | May 27, 2026 at 04:00 AM

🤖 AI Summary

Researchers introduce Social Gaze Consistency, a semantic cue for detecting AI-generated images that analyzes the coherence of eye direction and head-eye alignment between people. The technique improves detection accuracy across multiple vision models (by 3.7 percentage points on interaction subsets and 1.3 points on person subsets), suggesting that high-level semantic features offer advantages over traditional low-level artifact detection as generative models become more sophisticated.

Analysis

This research addresses a growing gap in AI-generated image detection: as generative models improve, they leave fewer of the pixel-level fingerprints that detectors have traditionally relied on. The team finds that gaze consistency (how plausibly eye direction and head-eye alignment cohere between people in a scene) can serve as a detection axis. Rather than chasing low-level artifacts that models can learn to suppress, the researchers target a semantic property that remains difficult to synthesize convincingly, particularly in multi-person interactions.
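To make the idea of gaze consistency concrete, here is a minimal geometric sketch. It is not the paper's method: it assumes a simplified 2D scene in which each person's eye position, head direction, and gaze direction are already known, and the function names and the 10-degree tolerance are hypothetical choices for illustration only.

```python
import math


def unit(v):
    """Return v scaled to length 1; a zero vector has no direction."""
    n = math.hypot(*v)
    if n == 0:
        raise ValueError("direction vector must be non-zero")
    return tuple(c / n for c in v)


def angle_deg(a, b):
    """Angle in degrees between two direction vectors."""
    a, b = unit(a), unit(b)
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against floating-point drift outside [-1, 1].
    dot = max(-1.0, min(1.0, dot))
    return math.degrees(math.acos(dot))


def head_eye_misalignment(head_dir, gaze_dir):
    """How far the eyes point away from where the head is facing."""
    return angle_deg(head_dir, gaze_dir)


def looks_at(eye_pos, gaze_dir, target_pos, tol_deg=10.0):
    """True if the gaze ray points at the target within tol_deg (assumed tolerance)."""
    to_target = tuple(t - e for t, e in zip(target_pos, eye_pos))
    return angle_deg(gaze_dir, to_target) <= tol_deg


def mutual_gaze(pos_a, gaze_a, pos_b, gaze_b, tol_deg=10.0):
    """True if two people are plausibly looking at each other."""
    return (looks_at(pos_a, gaze_a, pos_b, tol_deg)
            and looks_at(pos_b, gaze_b, pos_a, tol_deg))


# Two people facing each other along the x-axis.
facing = mutual_gaze((0, 0), (1, 0), (2, 0), (-1, 0))
# The second person looks straight up instead of at the first.
averted = mutual_gaze((0, 0), (1, 0), (2, 0), (0, 1))
```

In a scene meant to depict two people in conversation, a result like `averted` or a large `head_eye_misalignment` is the kind of inconsistency a gaze-based cue would pick up on; a real detector would have to estimate these quantities from pixels rather than receive them as inputs.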

The methodology rests on three design choices: controlled datasets that prevent memorization shortcuts, caption supervision that keeps the model's reasoning consistent, and validation across different model architectures. The reported gains, 3.7 percentage points on interaction subsets and 1.3 points on person subsets, hold across backbones rather than depending on one detector architecture. This backbone-agnostic property matters because it suggests the detection principle captures genuine constraints in how diffusion models handle interpersonal dynamics.

The implications extend beyond academic interest. As deepfakes and manipulated media become increasingly difficult to distinguish through traditional means, high-level semantic detection axes become more important for content verification platforms, social media companies, and regulatory bodies. The finding that training on a single inpainter (FLUX.1-Fill) transfers to multiple generator suites suggests the method captures limitations shared across generators rather than generator-specific quirks.
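A transfer claim of this kind is typically checked by reporting accuracy separately for each generator the detector never saw during training. The sketch below shows only that bookkeeping; the generator names, labels, and predictions are made-up placeholders, not results from the paper.

```python
from collections import defaultdict


def accuracy_by_generator(records):
    """Group (generator, true_label, predicted_label) records and
    return the fraction of correct predictions per generator."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for generator, label, pred in records:
        total[generator] += 1
        correct[generator] += int(label == pred)
    return {g: correct[g] / total[g] for g in total}


# Hypothetical evaluation records: 1 = AI-generated, 0 = real.
records = [
    ("generator_a", 1, 1),
    ("generator_a", 1, 0),
    ("generator_a", 0, 0),
    ("generator_a", 0, 0),
    ("generator_b", 1, 1),
    ("generator_b", 0, 0),
]
per_generator = accuracy_by_generator(records)
```

Breaking results out this way makes it visible when an average hides one generator on which the detector does poorly, which is the failure a generator-specific shortcut would produce.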

Looking forward, adversarial researchers will likely attempt to incorporate gaze consistency constraints into generative models. The field's evolution toward semantic-level detection suggests an escalating arms race where both detection and generation technology must address increasingly sophisticated behavioral coherence. The promised code release will accelerate community testing and refinement of these detection principles.

Key Takeaways

→ Social gaze consistency between interacting people provides a semantic cue for detecting AI-generated images that is orthogonal to traditional low-level artifact detection.
→ The technique improves detection accuracy across different vision model architectures, demonstrating backbone-agnostic applicability rather than reliance on generator-specific fingerprints.
→ Block-Compositional Caption Supervision and controlled pair-level dataset design prevent models from memorizing generator shortcuts while they learn genuine semantic inconsistencies.
→ Detection trained on a single inpainter (FLUX.1-Fill) transfers effectively to multiple generator suites, suggesting the method captures limitations shared across diffusion models.
→ The research reflects a move toward high-level semantic detection axes as generative models eliminate traditional pixel-level forensic artifacts.

#ai-detection #deepfake-detection #generative-models #image-forensics #computer-vision #semantic-detection #diffusion-models

Read Original → via arXiv – CS AI
