AIBullisharXiv โ CS AI ยท 5h ago7/10
๐ง
Sentra-Guard: A Real-Time Multilingual Defense Against Adversarial LLM Prompts
Researchers introduce Sentra-Guard, a real-time defense system that detects and mitigates jailbreak and prompt injection attacks on large language models with 99.96% accuracy. The multilingual framework combines FAISS-indexed semantic embeddings with fine-tuned transformers and human-in-the-loop feedback, significantly outperforming existing defenses like LlamaGuard-2 and OpenAI Moderation.
๐ข OpenAI