#data-privacy News & Analysis

36 articles tagged with #data-privacy. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 2d ago · 7/10

Finding DoRI: Discovery of Retained Images in Diffusion Models

Researchers challenge the assumption that memorization in text-to-image diffusion models can be localized to specific weights, demonstrating that pruning-based mitigations can be bypassed through minor perturbations of the text embedding. The study finds that memorization is distributed throughout embedding space, which suggests that current mitigation strategies are fundamentally fragile and that new approaches are needed to protect training data privacy.

AI · Neutral · arXiv – CS AI · 4d ago · 7/10

ICCU: In-Context Continual Unlearning via Pattern-Induced Refusal Rules

Researchers introduce ICCU, an in-context continual unlearning framework that removes specific data influence from language models without modifying parameters. The method uses pattern-induced refusal rules applied at inference time, addressing the inefficiency of sequential unlearning requests in production deployments.

AI · Bearish · arXiv – CS AI · 4d ago · 7/10

Pretraining Data Exposure in Large Language Models: A Survey of Membership Inference, Data Contamination, and Security Implications

A comprehensive survey examines Pretraining Data Exposure (PDE) in large language models, unifying two previously isolated research areas—membership inference and data contamination—to assess whether specific data appeared in LLM training datasets. The work formalizes exposure levels, reviews attack and defense mechanisms, and highlights privacy and evaluation integrity risks as model sizes and training data scales continue to grow.

AI · Bearish · arXiv – CS AI · May 1 · 7/10

Secret Stealing Attacks on Local LLM Fine-Tuning through Supply-Chain Model Code Backdoors

Researchers demonstrate a novel attack that steals sensitive secrets (API keys, personal identifiers, financial records) from locally fine-tuned language models by embedding malicious code in model architectures. The attack achieves a success rate above 98% and bypasses current defense mechanisms, including differential privacy and code auditing, exposing a critical supply-chain vulnerability in AI model development.

AI · Bearish · arXiv – CS AI · Apr 14 · 7/10

Powerful Training-Free Membership Inference Against Autoregressive Language Models

Researchers have developed EZ-MIA, a training-free membership inference attack that dramatically improves detection of memorized data in fine-tuned language models by analyzing probability shifts at error positions. The method achieves 3.8x higher detection rates than previous approaches on GPT-2 and demonstrates that privacy risks in fine-tuned models are substantially greater than previously understood.

🧠 Llama

AI × Crypto · Bullish · Crypto Briefing · Apr 10 · 7/10

Illia Polosukhin: Traditional AI services expose sensitive data, crypto simplifies global payments, and AI will redefine computing interfaces | Bankless

Illia Polosukhin argues that AI will fundamentally reshape computing interfaces, potentially making traditional operating systems obsolete, while blockchain technology provides the security layer necessary for this integration. He contends that traditional AI services expose sensitive user data, whereas cryptocurrency enables more secure global payments and decentralized infrastructure.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

Why Do Unlearnable Examples Work: A Novel Perspective of Mutual Information

Researchers propose a new method called Mutual Information Unlearnable Examples (MI-UE) to protect data privacy by preventing unauthorized AI models from learning from scraped data. The approach uses mutual information theory to create more effective data poisoning techniques that impede deep learning model generalization.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

PRIVATEEDIT: A Privacy-Preserving Pipeline for Face-Centric Generative Image Editing

Researchers have developed PRIVATEEDIT, a privacy-preserving pipeline for face-centric image editing that keeps biometric data on-device rather than uploading to third-party services. The system uses local segmentation and masking to separate identity-sensitive regions from editable content, allowing high-quality editing while maintaining user control over facial data.

AI · Bearish · Decrypt – AI · Mar 5 · 7/10

Inside the Ray-Ban Smart Glasses Controversy Plaguing Meta

Meta's Ray-Ban smart glasses are under investigation over how sensitive footage is collected and used. Regulators and privacy advocates warn that data captured through the wearable could be misused.

AI · Bearish · Ars Technica – AI · Feb 23 · 7/10 · 6

AIs can generate near-verbatim copies of novels from training data

Research reveals that large language models (LLMs) can reproduce near-exact copies of novels and other content from their training datasets, indicating these AI systems memorize significantly more training data than previously understood. This discovery raises important concerns about copyright infringement, data privacy, and the extent of memorization in AI training processes.

AI · Neutral · Blockonomi · 1d ago · 6/10

OpenAI’s ChatGPT Now Offers Bank Account Integration — Is It Safe?

OpenAI has introduced bank account integration for ChatGPT Pro users through Plaid, enabling budget tracking and financial advice features. The development raises critical questions about data security and privacy implications when AI systems gain access to sensitive financial information.

🏢 OpenAI · 🧠 ChatGPT

AI · Neutral · arXiv – CS AI · May 11 · 6/10

SHRED: Retain-Set-Free Unlearning via Self-Distillation with Logit Demotion

Researchers introduce SHRED, a machine unlearning method for large language models that removes memorized private or copyrighted data without requiring a curated retain set of examples. By selectively demoting logits of high-information tokens while preserving model utility through self-distillation, SHRED achieves superior trade-offs between forgetting efficacy and performance compared to existing retain-set-dependent approaches.

AI · Bullish · Crypto Briefing · May 9 · 6/10

David Moscatelli: Organizations are hesitant about public AI due to privacy concerns, local AI solutions are preferred in banking and healthcare, and the Go One device enhances on-premises AI scalability | TWIST

Go Abacus introduces the Go One device, a $250,000 on-premises AI solution designed to address privacy concerns in regulated industries like banking and healthcare. The device enables organizations to deploy and scale AI locally rather than relying on public cloud services, reflecting a broader market shift toward data sovereignty in sensitive sectors.

AI · Bullish · Crypto Briefing · Apr 21 · 6/10

Josh Sirota: AI models must update frequently for business effectiveness, local hardware enhances data privacy, and proprietary solutions address task inefficiencies | TWIST

Josh Sirota discusses three critical trends in enterprise AI: the necessity for frequent model updates to maintain business relevance, the privacy advantages of deploying AI on local hardware rather than cloud infrastructure, and the value of proprietary solutions in solving specific task inefficiencies. These insights highlight a shift toward decentralized, privacy-first AI deployments in enterprise environments.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

TRU: Targeted Reverse Update for Efficient Multimodal Recommendation Unlearning

Researchers propose TRU (Targeted Reverse Update), a machine unlearning framework designed to efficiently remove user data from multimodal recommendation systems without full retraining. The method addresses non-uniform data influence across ranking behavior, modality branches, and network layers through coordinated interventions, achieving better performance than existing approximate unlearning approaches.

AI · Bearish · Crypto Briefing · Apr 10 · 7/10

Mark Suman: AI systems can understand human thought patterns better than we do, the rapid pace of AI development outstrips ethical considerations, and the opacity of AI companies raises serious privacy concerns | The Peter McCormack Show

Mark Suman raises the concern that AI systems may understand human thought patterns better than people understand themselves, and that the rapid pace of AI development is outrunning ethical frameworks and regulation. He argues that the opacity of AI companies creates significant privacy risks that demand urgent attention from policymakers and industry stakeholders.

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

Machine Unlearning in the Era of Quantum Machine Learning: An Empirical Study

Researchers present the first empirical study of machine unlearning in hybrid quantum-classical neural networks, adapting classical unlearning methods to quantum settings and introducing quantum-specific strategies. The study reveals that quantum models can effectively support unlearning, with performance varying based on circuit depth and entanglement structure, establishing baseline insights for privacy-preserving quantum machine learning systems.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

PLACID: Privacy-preserving Large language models for Acronym Clinical Inference and Disambiguation

Researchers developed PLACID, a privacy-preserving system using small on-device AI models (2B-10B parameters) for clinical acronym disambiguation in healthcare settings. The cascaded approach combines general-purpose models for detection with domain-specific biomedical models, achieving 81% expansion accuracy while keeping sensitive health data local.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Computation and Communication Efficient Federated Unlearning via On-server Gradient Conflict Mitigation and Expression

Researchers propose FOUL (Federated On-server Unlearning), a new framework for efficiently removing specific participants' data from federated learning models without accessing client data. The approach reduces computational and communication costs while maintaining privacy compliance through a two-stage process that performs unlearning operations on the server side.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

PMAx: An Agentic Framework for AI-Driven Process Mining

Researchers have developed PMAx, an autonomous AI framework that democratizes process mining by allowing business users to analyze organizational workflows through natural language queries. The system uses a multi-agent architecture with local execution to ensure data privacy and mathematical accuracy while eliminating the need for specialized technical expertise.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

Stake the Points: Structure-Faithful Instance Unlearning

Researchers propose a new "structure-faithful" framework for machine unlearning that preserves semantic relationships in AI models while removing specific data. The method uses semantic anchors to maintain knowledge structure, showing significant performance improvements of 19-33% across image classification, retrieval, and face recognition tasks.

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

Federated Learning: A Survey on Privacy-Preserving Collaborative Intelligence

This research survey examines Federated Learning (FL), a distributed machine learning approach that enables collaborative AI model training without centralizing sensitive data. The paper covers FL's technical challenges, privacy mechanisms, and applications across healthcare, finance, and IoT systems.
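
To make the "training without centralizing data" idea concrete, here is a minimal sketch of the server-side aggregation step in federated averaging (FedAvg), a commonly used FL aggregation rule. The survey summary does not say which algorithms it covers, so treat FedAvg as an illustrative choice; the function name `federated_average` and the toy weight vectors are hypothetical.

```python
from typing import List


def federated_average(
    client_weights: List[List[float]], client_sizes: List[int]
) -> List[float]:
    """Combine client model weights into one global model.

    Each client trains locally and sends only its weight vector and its
    number of training examples; raw data never leaves the client.
    The server averages the weights, weighted by client dataset size.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client weight vector")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: one with 1 example, one with 3 examples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

In a full round, the server would send `global_weights` back to the clients for another pass of local training. Sharing weights alone does not guarantee privacy, which is why FL is typically paired with additional privacy mechanisms.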

AI · Bearish · Decrypt – AI · Mar 4 · 6/10 · 1

Before You Quit ChatGPT, Do This to Take Your Data With You

The 'QuitGPT' movement has reached 2.5 million pledges as users move away from ChatGPT. The article provides guidance on how users can export and preserve their data before deleting their ChatGPT accounts.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7

Challenges in Enabling Private Data Valuation

Researchers identify fundamental conflicts between data privacy and data valuation methods used in AI training. The study shows that differential privacy requirements often destroy the fine-grained distinctions needed for effective data valuation, particularly for rare or influential examples.
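
The tension can be illustrated with the standard Laplace mechanism, which adds noise of scale sensitivity/epsilon to each released value. This is a rough sketch of the general principle, not the paper's analysis: the valuation scores, the sensitivity of 1.0, and the epsilon of 0.5 are all assumed values, and the helper names are hypothetical.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale b used by the Laplace mechanism: b = sensitivity / epsilon."""
    return sensitivity / epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ranking_flip_rate(
    value_a: float, value_b: float, scale: float, trials: int, seed: int = 0
) -> float:
    """Fraction of trials in which noise reverses the order of two values.

    Assumes value_a > value_b before noise is added.
    """
    rng = random.Random(seed)
    flips = 0
    for _ in range(trials):
        noisy_a = value_a + laplace_noise(scale, rng)
        noisy_b = value_b + laplace_noise(scale, rng)
        if noisy_a < noisy_b:
            flips += 1
    return flips / trials


# Assumed setup: two data points whose valuations differ by only 0.01.
scale = laplace_scale(sensitivity=1.0, epsilon=0.5)
rate = ranking_flip_rate(0.51, 0.50, scale, trials=2000)
print(f"noise scale = {scale}, ranking flipped in {rate:.0%} of trials")
```

When the noise scale is far larger than the gap between two valuations, the noisy ranking is close to a coin flip, which is one way the fine-grained distinctions mentioned above can be lost.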

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7

ROKA: Robust Knowledge Unlearning against Adversaries

Researchers introduce ROKA, a new machine unlearning method that prevents knowledge contamination and indirect attacks on AI models. The approach uses 'Neural Healing' to preserve important knowledge while forgetting targeted data, providing theoretical guarantees for knowledge preservation during unlearning.
