AI · Bullish · arXiv – CS AI · Jun 5 · 7/10
🧠Researchers introduce RARO, a new training method that enables Large Language Models to develop strong reasoning capabilities using only expert demonstrations, without requiring task-specific verifiers. The approach uses adversarial learning between a policy and critic to achieve significant performance improvements across multiple reasoning tasks.
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠Sherlock is an AI framework that combines Large Language Models with structured domain knowledge to automate e-commerce fraud investigation and risk management. Deployed at JD.com, it achieved an 82% expert acceptance rate and 386.7% throughput increase while continuously adapting to evolving fraud tactics through a self-improving data flywheel.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers have developed a new approach called Model Predictive Adversarial Imitation Learning that combines inverse reinforcement learning with model predictive control to enable AI agents to learn from incomplete human demonstrations. The method shows significant improvements in sample efficiency, generalization, and robustness compared to traditional imitation learning approaches.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6
🧠Researchers have developed DAIL (Discovered Adversarial Imitation Learning), the first meta-learned AI algorithm that uses LLM-guided evolutionary methods to automatically discover reward assignment functions for training AI agents. The method addresses stability issues in adversarial imitation learning and outperforms human-designed approaches across different environments.
AI · Neutral · arXiv – CS AI · Jun 5 · 6/10
🧠Researchers propose an adversarial framework for developing safer robot systems by simulating hazardous scenarios through competing AI agents: one creates dangerous situations and the other refines safety policies to prevent them. This approach aims to efficiently identify edge cases and high-risk failures that traditional random testing misses, advancing safety standards for physical AI systems in real-world environments.
AI · Neutral · arXiv – CS AI · Jun 2 · 5/10
🧠TabChange is a new machine learning approach for modifying individual attributes in tabular datasets while maintaining data naturalness and minimizing unintended changes. The method analyzes attribute relationships and uses adversarial techniques to remove latent information about target attributes, producing more valid counterfactuals than existing generative models.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers have resolved a longstanding open problem in robust dynamic pricing by developing a binary search variant that achieves decoupled regret bounds of O(C + log T) when corruption is known and O(C + log² T) when unknown, significantly improving upon the previous O(C log log T) bound from 2025.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers introduce Repeated Deceptive Path Planning (RDPP), a framework addressing how agents can conceal destinations from learning adversaries who adapt over time. The proposed Deceptive Meta Planning (DeMP) algorithm uses two-level optimization to sustain deception against evolving observers, outperforming existing static-observer approaches while maintaining reasonable path costs.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers propose FedRio, a federated learning framework that enables social media platforms to collaboratively detect bot accounts without sharing raw user data. The system uses graph neural networks, adversarial learning, and reinforcement learning to improve bot detection accuracy while maintaining privacy across heterogeneous platform architectures.