SAGE: An LLM-driven Self Reflective Agentic Framework for Fraud Detection
SAGE is an LLM-driven multi-agent framework that combines large language models with a Data Diagnostic Tree and reinforcement learning to detect fraud in payment and e-commerce systems. The framework reports a 40.86% F1 improvement over baselines while keeping its decisions interpretable for risk managers, addressing key limitations of existing machine learning and graph neural network approaches.
SAGE applies large language models to the practical problem of fraud detection, where traditional machine learning systems struggle to balance accuracy, interpretability, and fraud-specific optimization. The framework's core innovation is the coordination of three specialized agents that operate on a Data Diagnostic Tree structure, which lets the system make interpretable decisions while optimizing for fraud-specific metrics such as precision and recall rather than generic accuracy. This matters because fraud detection operates under extreme class imbalance: fraudulent transactions are a tiny fraction of overall traffic, so a model that labels every transaction as legitimate can score near-perfect accuracy while catching no fraud at all.
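The effect of class imbalance on metrics can be illustrated with a small sketch. The numbers below are invented for illustration (1,000 transactions, 10 of them fraudulent) and are not taken from the SAGE experiments; the function name `binary_metrics` is likewise hypothetical. The sketch compares a trivial "always legitimate" baseline against a detector that catches most of the fraud at the cost of some false alarms.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted or present as fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative data (an assumption, not from the paper): 10 fraud cases in 1,000.
y_true = [1] * 10 + [0] * 990

# Baseline: predict "legitimate" for every transaction.
always_legit = [0] * 1000

# Hypothetical detector: catches 8 of 10 fraud cases, raises 12 false alarms.
detector_pred = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

baseline = binary_metrics(y_true, always_legit)
detector = binary_metrics(y_true, detector_pred)

print("always-legit:", baseline)
print("detector:    ", detector)
```

In this toy setup the baseline has the higher accuracy (0.99 versus 0.986) yet an F1 of zero, while the detector reaches an F1 of about 0.53. This is why fraud systems are usually compared on precision, recall, and F1 rather than accuracy.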
The research builds on growing recognition that LLMs offer advantages beyond text processing, including semantic reasoning about complex patterns and the ability to generate explanations for individual decisions. Previous approaches each gave something up: automated ML systems sacrificed accuracy, graph neural networks sacrificed interpretability, and general-purpose LLM agents lacked fraud-specific optimization. SAGE's multi-agent architecture, built on natural-language guided Markov decision processes, points to a hybrid approach in which language models serve as reasoning engines rather than just classifiers.
The experimental results, in which SAGE wins 96% of method-dataset comparisons across five fraud datasets and five LLM backbones, indicate that the framework generalizes across different fraud types and model architectures. For financial institutions and e-commerce platforms, this combination of accuracy, robustness under class imbalance, and human-interpretable decisions addresses real operational requirements: risk managers need to understand why a transaction is flagged, not just receive a black-box score. The open-source release lets practitioners test the framework on their own data, which could accelerate the deployment of LLM-based fraud systems across the industry.
- SAGE achieves a 40.86% F1 improvement over baselines while maintaining explainability through natural-language reasoning.
- The multi-agent architecture with a Data Diagnostic Tree enables optimization for fraud-specific metrics rather than generic accuracy.
- The framework demonstrates a 96% win rate across five fraud datasets and five different LLM backbones.
- Interpretability for risk managers addresses a critical gap: existing systems sacrifice explainability for accuracy.
- The open-source code release could accelerate real-world deployment in payment and e-commerce fraud detection.