SHERLOCK: Towards Dynamic Knowledge Adaptation in LLM-enhanced E-commerce Risk Management
Sherlock is an AI framework that combines Large Language Models with structured domain knowledge to automate e-commerce fraud investigation and risk management. Deployed at JD.com, it achieved an 82% expert acceptance rate and a 386.7% throughput increase while continuously adapting to evolving fraud tactics through a self-improving data flywheel.
Sherlock addresses a critical operational bottleneck in e-commerce: the manual, labor-intensive process of investigating fraud across fragmented data sources. By engineering LLMs to reason over a structured knowledge base rather than relying on general-purpose capabilities, the framework demonstrates how domain-specific augmentation can unlock practical AI deployment at scale. The architecture has three notable components: retrieval-augmented generation tailored for investigative workflows, two-stage reasoning pipelines, and mechanisms for handling the long-tail knowledge gaps that pure LLMs struggle with.
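As a rough illustration of how retrieval over a structured knowledge base can feed a two-stage reasoning pipeline, consider the minimal sketch below. It is not Sherlock's implementation. The `KnowledgeEntry` schema, the keyword-overlap retriever, the prompt wording, and the `llm` callable (any function that maps a prompt string to a response string) are all assumptions made for illustration, and treating the second stage as a reflect-and-refine pass is one plausible reading of the design.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KnowledgeEntry:
    """One hypothetical knowledge-base record describing a fraud pattern."""
    pattern_id: str
    description: str
    keywords: frozenset


def retrieve(case_text: str, kb: List[KnowledgeEntry], top_k: int = 2) -> List[KnowledgeEntry]:
    """Toy retriever: rank entries by keyword overlap with the case text."""
    tokens = set(case_text.lower().split())
    scored = [(len(entry.keywords & tokens), entry) for entry in kb]
    scored = [pair for pair in scored if pair[0] > 0]  # drop entries with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def draft_report(case_text: str, evidence: List[KnowledgeEntry], llm: Callable[[str], str]) -> str:
    """Stage one: draft an assessment grounded in the retrieved entries."""
    knowledge = "\n".join(entry.description for entry in evidence)
    prompt = f"Case:\n{case_text}\nRelevant knowledge:\n{knowledge}\nDraft an assessment."
    return llm(prompt)


def reflect_and_refine(draft: str, evidence: List[KnowledgeEntry], llm: Callable[[str], str]) -> str:
    """Stage two: ask the model to check the draft against the same evidence."""
    knowledge = "\n".join(entry.description for entry in evidence)
    prompt = f"Draft:\n{draft}\nKnowledge:\n{knowledge}\nFix unsupported claims and return the final report."
    return llm(prompt)


def investigate(case_text: str, kb: List[KnowledgeEntry], llm: Callable[[str], str]) -> str:
    """Run retrieval, then the two reasoning stages in order."""
    evidence = retrieve(case_text, kb)
    draft = draft_report(case_text, evidence, llm)
    return reflect_and_refine(draft, evidence, llm)
```

Passing the retrieved entries to both stages keeps the second pass anchored to the same evidence as the first, so the refinement step can only tighten the draft against the knowledge base rather than introduce new, unretrieved claims.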
The results published from JD.com's production deployment carry weight because they reflect actual operational metrics, not theoretical improvements. An 82% expert acceptance rate signals genuine utility in a high-stakes domain where false positives waste investigator time and false negatives create liability. The 386.7% throughput increase suggests the framework substantially reduces human workload, freeing analysts for higher-value judgment calls.
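To put the throughput figure in concrete terms, the short calculation below converts a percentage increase into a multiplier. It assumes the 386.7% is an increase over the baseline (rather than throughput expressed as a percentage of baseline), and the baseline of 100 cases per day is a made-up number used only for illustration.

```python
def throughput_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a multiple of the baseline."""
    return 1.0 + percent_increase / 100.0


multiplier = throughput_multiplier(386.7)
hypothetical_baseline = 100  # cases per day; illustrative, not from the source
cases_after = hypothetical_baseline * multiplier

print(round(multiplier, 3))   # 4.867
print(round(cases_after, 1))  # 486.7
```

Under that reading, the same team handles nearly five times as many cases as before.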
What distinguishes Sherlock is its self-evolution mechanism: the ability to recover from performance degradation as fraud tactics shift. Traditional ML systems degrade when patterns change. Sherlock's flywheel, which combines real-time knowledge base (KB) updates with periodic retraining, recovered from two separate adversarial drifts and raised the acceptance rate ceiling by 3.5%. This adaptive capacity addresses the reason pure LLM systems struggle in adversarial environments, where attackers actively evolve their techniques.
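The two feedback loops can be sketched as a fast path that patches the knowledge base whenever an expert rejects a report, and a slow path that requests retraining when the rolling acceptance rate falls. This is a simplified illustration rather than the deployed design: the `FlywheelMonitor` class, the window size, and the 0.75 threshold are hypothetical, and retraining here is triggered by a threshold instead of running on a fixed schedule.

```python
from collections import deque


class FlywheelMonitor:
    """Tracks expert verdicts and decides when to patch the KB or request retraining."""

    def __init__(self, window: int = 50, retrain_threshold: float = 0.75):
        self.verdicts = deque(maxlen=window)  # rolling window of accept/reject flags
        self.retrain_threshold = retrain_threshold
        self.knowledge_base = {}              # pattern_id -> latest expert correction
        self.retrain_requests = 0

    def acceptance_rate(self) -> float:
        """Share of accepted reports in the current window (1.0 when empty)."""
        if not self.verdicts:
            return 1.0
        return sum(self.verdicts) / len(self.verdicts)

    def record(self, accepted: bool, pattern_id: str = "", correction: str = "") -> None:
        """Log one expert verdict and run both feedback loops."""
        self.verdicts.append(accepted)
        if not accepted and pattern_id:
            # Fast loop: store the expert's correction immediately.
            self.knowledge_base[pattern_id] = correction
        window_full = len(self.verdicts) == self.verdicts.maxlen
        if window_full and self.acceptance_rate() < self.retrain_threshold:
            # Slow loop: flag the model for retraining and start a fresh window.
            self.retrain_requests += 1
            self.verdicts.clear()
```

Separating the loops reflects the trade-off described above: a knowledge base edit takes effect on the next case, while retraining is slower but can absorb a drift that individual corrections do not cover.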
For the broader AI-in-commerce landscape, Sherlock validates that structured knowledge integration and continuous feedback loops are essential for production AI systems facing intelligent adversaries. Organizations deploying LLMs in fraud, compliance, and risk contexts should expect similar architectural requirements to achieve reliable, defensible automation.
- Sherlock combines LLMs with structured domain knowledge bases to automate fraud investigation, achieving 82% expert acceptance at JD.com with 386.7% throughput gains.
- The framework uses retrieval-augmented generation and reflect-refine modules specifically engineered for multi-source case analysis rather than generic LLM prompting.
- A self-evolving flywheel combining real-time knowledge base updates and periodic retraining enables the system to counteract adversarial drifts that typically degrade ML performance.
- Production deployment data shows the system successfully recovered from performance decay twice over 90 days, demonstrating practical resilience in adversarial e-commerce environments.
- This architecture pattern (structured knowledge, LLM reasoning, and feedback loops) appears critical for deploying AI in high-stakes domains with intelligent, evolving threats.