AI · Bearish · arXiv – CS AI · 3d ago · 7/10
🧠Researchers discovered that reflexive AI agents systematically store confident but false interpretations of tasks in their memory, a phenomenon called memory confabulation, causing them to repeat incorrect behaviors even when environments reset. The study introduces a metric to detect this failure mode and proposes programmatic solutions that significantly improve agent performance and reduce reliance on false reflective content.
AI · Bullish · arXiv – CS AI · 4d ago · 7/10
🧠Researchers introduce MemCog, a new memory system for conversational AI agents that integrates memory access into the reasoning process rather than treating it as a separate tool. The system uses associative link graphs and proactive reasoning to enable agents to autonomously explore relevant information, achieving state-of-the-art performance on multiple benchmarks including a newly created ProactiveMemBench.
AI · Bullish · arXiv – CS AI · 5d ago · 7/10
🧠Researchers conducted a 4-month case study embedding a persistent AI agent into a real academic research environment, tracking 75,671 telemetry records across 96 active days. The study reveals that persistent agents shift computational economics from cost-per-token to cost-per-artifact, with cache-dominant workflows achieving 82.9% token reuse efficiency.
AI · Bullish · arXiv – CS AI · May 9 · 7/10
🧠Researchers introduce BeliefMem, a novel memory architecture for LLM agents that retains multiple candidate conclusions with associated probabilities instead of committing to single deterministic interpretations. This probabilistic approach preserves uncertainty, allows agents to update confidence as new evidence arrives, and demonstrates superior performance on LoCoMo and ALFWorld benchmarks compared to existing memory methods.
AI · Bullish · arXiv – CS AI · May 7 · 7/10
🧠Researchers introduce Lossless Context Management (LCM), a deterministic architecture for LLM memory that outperforms Claude Code on long-context tasks up to 1M tokens. LCM combines recursive context compression with engine-managed task partitioning, representing an evolution of recursive language models that prioritizes reliability and state retrievability over flexibility.
🧠 Claude · 🧠 Opus
AI · Bearish · arXiv – CS AI · Apr 14 · 7/10
🧠Researchers have developed ADAM, a novel privacy attack that exploits vulnerabilities in Large Language Model agents' memory systems through adaptive querying, achieving up to 100% success rates in extracting sensitive information. The attack highlights critical security gaps in modern LLM-based systems that rely on memory modules and retrieval-augmented generation, underscoring the urgent need for privacy-preserving safeguards.
AI · Neutral · arXiv – CS AI · Apr 10 · 7/10
🧠Researchers introduce ATANT, an open evaluation framework designed to measure whether AI systems can maintain coherent context and continuity across time without confusing information across different narratives. The framework achieves up to 100% accuracy in isolated scenarios but drops to 96% when managing 250 simultaneous narratives, revealing practical limitations in current AI memory architectures.
AI · Bullish · arXiv – CS AI · Apr 7 · 7/10
🧠MemMachine is an open-source memory system for AI agents that preserves conversational ground truth and achieves superior accuracy-efficiency tradeoffs compared to existing solutions. The system integrates short-term, long-term episodic, and profile memory while using 80% fewer input tokens than comparable systems like Mem0.
🧠 GPT-4 · 🧠 GPT-5
AI · Bullish · arXiv – CS AI · Apr 6 · 7/10
🧠Researchers present Opal, a private memory system for personal AI that uses trusted hardware enclaves and oblivious RAM to protect user data while maintaining query accuracy. The system achieves a 13-percentage-point improvement in retrieval accuracy over semantic search, along with 29x higher throughput and 15x lower costs than secure baselines.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduce SuperLocalMemory V3, a new mathematical framework for AI agent memory systems using information geometry and sheaf theory. The system achieves 87.7% accuracy with cloud augmentation and offers a zero-LLM configuration that complies with EU AI Act data sovereignty requirements.
AI · Neutral · arXiv – CS AI · Mar 12 · 7/10
🧠Researchers propose treating multi-agent AI memory as a computer architecture problem, introducing a three-layer memory hierarchy and identifying critical protocol gaps. The paper highlights multi-agent memory consistency as the most pressing challenge for building scalable collaborative AI systems.
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers have developed AriadneMem, a new memory system for long-horizon LLM agents that addresses challenges in maintaining accurate memory under fixed context budgets. The system uses a two-phase pipeline with entropy-aware gating and conflict-aware coarsening to improve multi-hop reasoning while reducing runtime by 77.8% and using only 497 context tokens.
🧠 GPT-4
AI · Neutral · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers introduce LifeBench, a new AI benchmark that tests long-term memory systems by requiring integration of both declarative and non-declarative memory across extended timeframes. Current state-of-the-art memory systems achieve only 55.2% accuracy on this challenging benchmark, highlighting significant gaps in AI's ability to handle complex, multi-source memory tasks.
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠PRAM-R is a new AI framework for autonomous driving that uses LLM-guided modality routing to adaptively select sensors based on environmental conditions. The system achieves a 6.22% modality reduction while maintaining trajectory accuracy, demonstrating efficient resource management in multimodal perception systems.
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers propose PlugMem, a task-agnostic plugin memory module for LLM agents that structures episodic memories into knowledge-centric graphs for efficient retrieval. The system consistently outperforms existing memory designs across multiple benchmarks while maintaining transferability between different tasks.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers developed ELMUR, a new AI architecture that uses external memory to help robots make better decisions over extremely long time periods. The system achieved 100% success on tasks requiring memory of up to one million steps and nearly doubled performance on robotic manipulation tasks compared to existing methods.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 6
🧠SuperLocalMemory is a new privacy-preserving memory system for multi-agent AI that defends against memory poisoning attacks through local-first architecture and Bayesian trust scoring. The open-source system eliminates cloud dependencies while providing personalized retrieval through adaptive learning-to-rank, demonstrating strong performance metrics including 10.6ms search latency and 72% trust degradation for sleeper attacks.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 5
🧠Researchers introduce 'agentic unlearning' through Synchronized Backflow Unlearning (SBU), a framework that removes sensitive information from both AI model parameters and persistent memory systems. The method addresses critical gaps in existing unlearning techniques by preventing cross-pathway recontamination between memory and parameters.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers introduce LightMem, a new memory system for Large Language Models that mimics human memory structure with three stages: sensory, short-term, and long-term memory. The system achieves up to 7.7% better QA accuracy while reducing token usage by up to 106x and API calls by up to 159x compared to existing methods.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7
🧠Researchers have released LLMServingSim 2.0, a unified simulator that models the complex interactions between heterogeneous hardware and disaggregated software in large language model serving infrastructures. The simulator achieves an average error of 0.97% relative to real deployments while keeping simulation times to about 10 minutes for complex configurations.
$NEAR
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 5
🧠Researchers introduce U-Mem, an autonomous memory agent system that actively acquires and validates knowledge for large language models. The system uses cost-aware knowledge extraction and semantic Thompson sampling to improve performance, showing significant gains on benchmarks like HotpotQA and AIME25.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce S3MEM, a structured memory framework that improves how AI agents retrieve and answer questions about long trajectory histories. The system outperforms standard retrieval-augmented generation by organizing trajectories into scene-event units and using anchor-sensitive retrieval, achieving better accuracy with fewer tokens across multiple interactive environments.
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce PersonaAgent, a personalized LLM agent framework that moves beyond one-size-fits-all AI systems by integrating personalized memory and action modules. The system uses individual user personas as prompts that dynamically adapt through real-time preference alignment, demonstrating improved performance in delivering tailored user experiences.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce a benchmark for evaluating how AI systems handle conflicting information across multiple memory sources, addressing a critical gap in testing personal AI agents. The study compares several approaches, including fusion methods and LLMs, and finds that trained fusion models outperform prompt-based LLMs by more than 10 percentage points in accuracy, with selective abstention improving performance further.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce 'Behavioral Specification,' a compressed interpretive layer that captures user preferences more accurately than raw data or extracted facts, achieving a 25x context reduction while improving AI alignment on interpretation-heavy tasks. The work establishes 'representational accuracy' as a metric distinct from recall, demonstrating that faithful user representation is critical for human-AI alignment across diverse populations.