What Your Posts Reveal: A Benchmark and Agentic Framework for User-Level Privacy Leakage on Social Media
Researchers introduce SopriBench, a synthetic benchmark, and Argus, an agentic framework, for detecting cumulative privacy leakage from social media posts. The work addresses gaps in multimodal privacy research by analyzing how scattered cues across text, images, and metadata can collectively expose sensitive user information such as location and routines.
This research tackles a critical vulnerability in social media privacy that extends beyond the sensitivity of any individual post. The study shows that harmless fragments of information (a coffee shop mention, a sunset photo, metadata timestamps) aggregate into comprehensive user dossiers when analyzed together. This cumulative leakage is a significant gap in current privacy protection mechanisms, which typically evaluate posts in isolation.
The introduction of SopriBench with 50 synthetic user profiles and 1,569 images provides the first unified benchmark for multimodal privacy leakage evaluation. More importantly, the Privacy Exposure Score (PES) metric moves beyond crude binary accuracy measures to quantify exposure severity with contextual weighting. This framework acknowledges that revealing an exact home address carries different privacy implications than revealing a general neighborhood.
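To make the contrast with binary accuracy concrete, here is a minimal sketch of a granularity-and-sensitivity weighted score. It is not the paper's PES definition, which this summary does not give: the `GRANULARITY` and `SENSITIVITY` tables, their values, and the `exposure_score` function are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass

# Hypothetical granularity levels: finer granularity means higher exposure.
GRANULARITY = {"none": 0.0, "city": 0.3, "neighborhood": 0.6, "exact_address": 1.0}

# Hypothetical sensitivity weights per attribute type.
SENSITIVITY = {"home_location": 1.0, "daily_routine": 0.7, "workplace": 0.5}


@dataclass(frozen=True)
class Inference:
    attribute: str    # which attribute an attacker inferred
    granularity: str  # how precisely it was inferred


def exposure_score(inferences):
    """Sensitivity-weighted average of granularity, in the range [0, 1]."""
    total_weight = sum(SENSITIVITY[i.attribute] for i in inferences)
    if total_weight == 0:
        return 0.0
    weighted = sum(
        SENSITIVITY[i.attribute] * GRANULARITY[i.granularity] for i in inferences
    )
    return weighted / total_weight


# A binary metric would count both of these as "location leaked".
exact = exposure_score([Inference("home_location", "exact_address")])
coarse = exposure_score([Inference("home_location", "neighborhood")])
print(exact, coarse)
```

Under these assumed weights, the exact address scores higher than the neighborhood, which is the distinction a binary leaked-or-not measure cannot express.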
Argus, the agentic inference framework, shows that the threat is practical: by aggregating evidence across posts, it achieves a 25% improvement over existing baselines. Its ability to form and verify hypotheses mirrors the attack patterns that sophisticated threat actors could employ against any social media user.
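The idea of cross-post aggregation can be illustrated with a toy sketch. This is not Argus's algorithm, which the summary does not detail: the cue format, the `aggregate_hypotheses` function, the `min_posts` threshold, and the sample data are all assumptions made for illustration. Each cue proposes a candidate value for an attribute, and a hypothesis is kept only when distinct posts corroborate it.

```python
from collections import defaultdict


def aggregate_hypotheses(cues, min_posts=2):
    """Return {attribute: value} for candidates backed by enough distinct posts.

    cues: iterable of (post_id, attribute, candidate_value) tuples.
    """
    # Form hypotheses: record which posts support each (attribute, value) pair.
    support = defaultdict(set)
    for post_id, attribute, value in cues:
        support[(attribute, value)].add(post_id)

    # Verify hypotheses: keep the best-supported value per attribute,
    # discarding anything that rests on too few posts.
    best = {}
    for (attribute, value), posts in support.items():
        if len(posts) < min_posts:
            continue  # a single post is treated as too weak on its own
        current = best.get(attribute)
        if current is None or len(posts) > len(support[(attribute, current)]):
            best[attribute] = value
    return best


# Hypothetical cues extracted from four posts by the same user.
cues = [
    ("p1", "home_area", "Riverside"),   # text: coffee shop mention
    ("p2", "home_area", "Riverside"),   # image: sunset over a known bridge
    ("p3", "home_area", "Downtown"),    # one-off check-in
    ("p4", "routine", "gym at 7am"),    # metadata timestamp
]
result = aggregate_hypotheses(cues)
print(result)
```

In this toy data no single post settles the user's home area, but two independent posts agree on one candidate, so only that hypothesis survives verification.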
For platform developers and privacy-conscious users, this research underscores the inadequacy of current privacy controls, which operate at the level of the individual post. The findings suggest that real privacy protection requires assessing a user's profile as a whole rather than evaluating each piece of content in isolation. Users should recognize that seemingly innocuous details across multiple posts create exploitable patterns, while platform engineers face pressure to integrate cross-post privacy analysis into their safety systems.
- Cumulative privacy leakage from scattered social media cues can expose sensitive user information, such as home addresses and routines, more effectively than any single post
- The new Privacy Exposure Score metric weights data granularity by contextual sensitivity, giving a more nuanced measure of privacy impact than binary accuracy
- The Argus framework achieves a 25% improvement over baseline methods by aggregating cross-post evidence through abductive reasoning
- Current social media privacy protections fail to account for multimodal inference across combinations of images, text, and metadata
- The research shows that adversaries can reconstruct detailed user profiles from publicly available posts when they are analyzed collectively rather than individually