AIBullisharXiv – CS AI · 5h ago7/10
🧠
Zero-Shot Embedding Drift Detection: A Lightweight Defense Against Prompt Injections in LLMs
Researchers introduce Zero-Shot Embedding Drift Detection (ZEDD), a lightweight defense mechanism that detects prompt injection attacks on large language models by measuring semantic shifts in embedding space. The method achieves over 93% accuracy with less than 3% false positives across multiple LLM architectures without requiring model access or task-specific training.
🧠 Llama