#model-protection News & Analysis

7 articles tagged with #model-protection. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 19 · 7/10

From Construction to Injection: Edit-Based Fingerprints for Large Language Models

Researchers propose a fingerprinting framework for large language models that combines Code-mixing Fingerprints (CF) with Multi-Candidate Editing (MCEdit) to protect against unauthorized redistribution and commercial misuse. The approach targets a key weakness of existing fingerprinting methods: it aims to keep fingerprints imperceptible while remaining robust to defensive filtering and downstream model modifications.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint

Researchers introduce FIT-Print, a model fingerprinting technique that defends against false ownership claims on AI models by using targeted signatures rather than arbitrary outputs. The method is reported to prevent 100% of fraudulent ownership assertions while still verifying every legitimate claim, addressing a critical vulnerability in existing intellectual property protection mechanisms for machine learning models.

AI · Bearish · arXiv – CS AI · May 28 · 7/10

Position: Retire the "Positive Backdoor" Label -- Secret Alignment Requires Strict and Systematic Evaluation

A position paper argues that the AI/ML community should abandon the "positive backdoor" terminology and instead rigorously evaluate trigger-activated hidden behaviors as "Secret Alignment." The authors find that existing implementations are brittle in their security properties, particularly confidentiality, integrity, and availability, and that protective claims are made without a standardized evaluation framework.

AI · Bearish · arXiv – CS AI · Apr 14 · 7/10

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

Researchers have developed Adaptive Stealing (AS), a watermark-stealing algorithm that exploits vulnerabilities in LLM watermarking systems by dynamically selecting the attack strategy best suited to the current contextual token state. The result suggests that existing fixed-strategy watermark defenses are insufficient, exposing security gaps in the protection of proprietary LLM services and raising questions about watermark robustness.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

Protecting Language Models Against Unauthorized Distillation through Trace Rewriting

Researchers propose trace rewriting techniques to protect language models from unauthorized knowledge distillation, in which a smaller student model is trained on a larger model's outputs without permission. The methods preserve the protected model's accuracy while making its outputs less useful for distillation and embedding detectable watermarks in student models.

AI · Neutral · arXiv – CS AI · Mar 12 · 6/10

RandMark: On Random Watermarking of Visual Foundation Models

Researchers propose RandMark, a method for watermarking visual foundation models to protect intellectual property rights. The approach uses a small encoder-decoder network to embed random digital watermarks into the model's internal representations, enabling ownership verification with a low false detection rate.

AI · Bullish · arXiv – CS AI · Mar 6 · 6/10

Authorize-on-Demand: Dynamic Authorization with Legality-Aware Intellectual Property Protection for VLMs

Researchers propose AoD-IP, a framework for protecting intellectual property in vision-language models (VLMs) through dynamic authorization and legality-aware assessment. The system supports flexible, user-controlled authorization that adapts to changing deployment scenarios while preventing unauthorized use of the protected model.