AI · Bullish · arXiv – CS AI · Apr 13 · 6/10
🧠Researchers introduce RecaLLM, a post-trained language model that addresses the "lost-in-thought" phenomenon, in which retrieval performance degrades during extended reasoning chains. The model interleaves explicit in-context retrieval with reasoning steps and achieves strong performance on long-context benchmarks while training on data significantly shorter than that used by existing approaches.
AI · Bullish · arXiv – CS AI · Mar 12 · 6/10
🧠Researchers have developed LookaheadKV, a new framework that significantly improves memory efficiency in large language models by intelligently evicting less important cached data. The method achieves superior accuracy while reducing computational costs by up to 14.5x compared to existing approaches, making long-context AI tasks more practical.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 6
🧠Researchers developed a new token reduction method for hybrid vision-language models that process long videos, achieving 3.8-4.2x speedup while retaining only 25% of visual tokens. The approach uses progressive reduction and unified scoring for both attention and Mamba blocks, maintaining near-baseline accuracy on long-context video benchmarks.
AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠Researchers introduced SciTrek, a new benchmark for testing large language models' ability to perform numerical reasoning across long scientific documents. The benchmark reveals significant challenges for current LLMs, with the best model achieving only 46.5% accuracy at 128K tokens, and performance declining as context length increases.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠Researchers introduce ReMemR1, a new approach to improve large language models' ability to handle long-context question answering by integrating memory retrieval into the memory update process. The system enables non-linear reasoning through selective callback of historical memories and uses multi-level reward design to strengthen training.
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 12
🧠Researchers developed a new discriminative AI model based on Qwen3-0.6B that efficiently segments ultra-long documents of up to 13k tokens for better information retrieval. On the Wikipedia WIKI-727K dataset, the model outperforms generative alternatives while delivering inference two orders of magnitude faster.
AI · Bullish · Hugging Face Blog · Jul 8 · 6/10 · 5
🧠SmolLM3 is a new compact language model that combines multilingual capabilities with long-context reasoning. The model appears to be designed for efficiency while maintaining strong performance across multiple languages and complex reasoning tasks.
AI · Neutral · Hugging Face Blog · Apr 16 · 6/10 · 8
🧠HELMET is a new holistic evaluation framework for assessing long-context language models across multiple dimensions and use cases. The framework aims to provide comprehensive benchmarking capabilities for AI models that can process extended text sequences.
AI · Neutral · Hugging Face Blog · Jan 23 · 4/10 · 5
🧠The article title suggests coverage of KVPress, a technique for managing long contexts in Large Language Models. However, the article body appears to be empty or unavailable, preventing detailed analysis of the content.