AI · Bearish · arXiv – CS AI · 5d ago · 7/10
🧠Researchers present a comprehensive OS-centered privacy framework arguing that local AI processing alone does not guarantee privacy, as on-device models can still aggregate sensitive data, retain embeddings, invoke cloud services, and emit telemetry. The framework provides a threat model, risk taxonomy, and audit rubric, demonstrating that meaningful privacy depends on constrained information flow, bounded authority, and auditable governance rather than deployment location.
🧠 Gemini
AI · Bearish · arXiv – CS AI · Jun 5 · 7/10
🧠Researchers propose a bilayer SIR epidemic model to analyze how synthetic data contamination spreads across AI systems when models train on each other's outputs. Through theoretical analysis, simulations, and GPT-2 experiments, they demonstrate that cross-contamination can sustain itself (R₀ > 1) and identify detection-based filtering as the most effective intervention strategy.
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠GuidaPA is a privacy-preserving chatbot for Italian public administration that uses federated learning to train on sensitive documentation without centralizing data. The system achieves comparable performance to traditional centralized fine-tuning while keeping sensitive data distributed across agency servers, demonstrating federated learning's viability for regulated institutional deployments.
AI · Bullish · arXiv – CS AI · May 29 · 7/10
🧠Researchers introduce LLUMI, an open-source LLM system for mental health support that uses community feedback from Reddit to improve response quality without relying on proprietary cloud models. The approach achieves comparable performance to GPT models while offering better privacy protection for sensitive health contexts.
AI · Bullish · arXiv – CS AI · May 9 · 7/10
🧠Researchers present a layered security architecture for multitenant enterprise AI systems that isolates data and controls access in retrieval-augmented generation (RAG) and agentic AI deployments. The approach confines security-critical operations to the server and prevents cross-tenant data leakage; it is validated through an open-source OGX framework with negligible performance overhead.
🏢 OpenAI
AI · Bullish · arXiv – CS AI · Apr 14 · 7/10
🧠Researchers introduce Context Kubernetes, an architecture that applies container orchestration principles to managing enterprise knowledge in AI agent systems. The system addresses critical governance, freshness, and security challenges, demonstrating that without proper controls, AI agents leak data in over 26% of queries and serve stale content silently.
AI · Bullish · Crypto Briefing · 4d ago · 6/10
🧠OpenAI has acquired Ona, a company specializing in secure cloud execution technology, to integrate its capabilities into Codex. This acquisition aims to address enterprise concerns around security and data governance, potentially accelerating Codex adoption in corporate environments where these considerations are critical.
🏢 OpenAI
AI · Bearish · The Verge – AI · 5d ago · 6/10
🧠Microsoft has restricted employee access to Anthropic's newly released Claude Fable 5 model due to data retention concerns, while making it available to external GitHub Copilot and Azure customers. The restriction stems from Anthropic's new data retention requirements conflicting with Microsoft's Zero Data Retention (ZDR) policy for internal tools.
🏢 Anthropic 🏢 Microsoft 🧠 Claude
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce SlideCheck, a data guidance tool for pathology foundation models that uses frozen model features to score and curate pretraining datasets. The system provides abnormality and malignancy scores to help organize and audit WSI-derived patch data, demonstrating that controlled dataset composition significantly influences downstream self-supervised learning outcomes.
AI · Neutral · arXiv – CS AI · Jun 8 · 6/10
🧠Researchers introduce REMEDI, a benchmark for evaluating machine unlearning methods in clinical disease inference using real patient data from MIMIC-III. The study reveals fundamental trade-offs between model utility and data removal effectiveness, with existing unlearning techniques proving poorly suited for multi-label medical classification tasks.
General · Neutral · Crypto Briefing · Jun 1 · 6/10
📰European cloud providers are rallying behind the EU's cloud sovereignty initiative, which aims to reduce the continent's dependence on US technology giants like AWS, Microsoft Azure, and Google Cloud. The push could fundamentally reshape Europe's tech market by strengthening local competitors and limiting American tech dominance in the region.
AI · Neutral · arXiv – CS AI · Jun 1 · 6/10
🧠Researchers propose Gap-K%, a novel method for detecting whether text was part of an LLM's pretraining data by analyzing the probability gap between a model's top prediction and the actual target token. The technique outperforms existing approaches on standard benchmarks and addresses critical privacy and copyright concerns surrounding the opaque datasets used to train large language models.
AI · Neutral · Decrypt – AI · May 25 · 6/10
🧠Pope Leo released the Catholic Church's first AI encyclical, a 245-paragraph document asserting that data constitutes a common good and rejecting the notion that technology is morally neutral. The document was presented alongside Anthropic co-founder Christopher Olah, whose AI company is currently engaged in litigation against the Trump administration over military AI applications.
🏢 Anthropic
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠Researchers have developed GLiNER2-PII, a compact 0.3B-parameter multilingual model for detecting personally identifiable information across 42 entity types at character-level precision. Trained on a synthetic corpus of 4,910 annotated texts to overcome privacy constraints in real data collection, the model outperforms existing systems including OpenAI's Privacy Filter on benchmark evaluations and is now publicly available on Hugging Face.
🏢 OpenAI 🏢 Hugging Face
AI · Bullish · OpenAI News · May 6 · 6/10
🧠OpenAI has implemented privacy safeguards in ChatGPT's training process, allowing users to control whether their conversations contribute to model improvement while minimizing personal data retention. The approach addresses growing privacy concerns around AI model training without compromising the system's ability to learn from diverse data sources.
🧠 ChatGPT
AI · Bullish · MIT Technology Review · May 1 · 6/10
🧠Companies are increasingly taking control of their own data to customize AI systems for specific needs, creating a new paradigm of data sovereignty. The challenge involves balancing proprietary data ownership with the requirement for safe, high-quality data flows that enable reliable AI insights. MIT Technology Review's EmTech AI conference explores how AI factories achieve scalability while maintaining governance standards.
AI · Neutral · arXiv – CS AI · Apr 15 · 6/10
🧠Researchers introduce PrivacyReasoner, an LLM-based agent architecture that reconstructs individual privacy perspectives from online comment history to predict how specific people would perceive data practices. The system outperforms baseline models in predicting privacy concerns across AI, e-commerce, and healthcare domains by contextually activating relevant privacy beliefs.
AI · Bullish · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers introduce AdaQE-CG, a framework that automatically generates model and data cards for AI systems with improved accuracy and completeness. The approach combines dynamic query expansion to extract information from papers with cross-card knowledge transfer to fill gaps, accompanied by MetaGAI-Bench, a new benchmark for evaluating documentation quality.
🏢 Meta 🏢 Hugging Face
AI · Bullish · arXiv – CS AI · Apr 14 · 6/10
🧠A research paper proposes a comprehensive policy framework for India to address fragmentation in biomedical data sharing by aligning institutional incentives around AI and digital health. The framework recommends recognizing data curation in academic promotions, incorporating open data metrics into institutional rankings, and implementing Shapley Value-based revenue sharing in federated learning—while navigating India's 2023 data protection regulations.
AI · Bullish · OpenAI News · Feb 5 · 6/10
🧠OpenAI announces the introduction of data residency capabilities in Europe, expanding their enterprise-grade data privacy and security offerings. This development builds upon their existing compliance programs designed to support customers globally with enhanced data governance requirements.
AI · Bullish · Fortune Crypto · Mar 10 · 5/10
🧠Financial software company Datarails is launching a new FinanceOS product to proactively disrupt its own business model with AI before competitors do. The company is positioning data and financial model governance as its key competitive advantage in an AI-driven financial analysis landscape.