y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#question-answering News & Analysis

32 articles tagged with #question-answering. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

32 articles
AIBullisharXiv – CS AI · Mar 36/1010
🧠

DeepResearch-9K: A Challenging Benchmark Dataset of Deep-Research Agent

Researchers have released DeepResearch-9K, a large-scale dataset with 9,000 questions across three difficulty levels designed to train and benchmark AI research agents. The accompanying open-source framework DeepResearch-R1 supports multi-turn web interactions and reinforcement learning approaches for developing more sophisticated AI research capabilities.

AIBullisharXiv – CS AI · Mar 36/103
🧠

Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents

Researchers introduce ReMemR1, a new approach to improve large language models' ability to handle long-context question answering by integrating memory retrieval into the memory update process. The system enables non-linear reasoning through selective callback of historical memories and uses multi-level reward design to strengthen training.

AIBullisharXiv – CS AI · Mar 36/103
🧠

HIMM: Human-Inspired Long-Term Memory Modeling for Embodied Exploration and Question Answering

Researchers propose HIMM, a new memory framework for AI embodied agents that separates episodic and semantic memory to improve long-term performance. The system achieves significant gains on benchmarks, with 7.3% improvement in LLM-Match and 11.4% in LLM MatchXSPL, addressing key challenges in deploying multimodal language models as embodied agent brains.

AINeutralarXiv – CS AI · Feb 276/107
🧠

SPARTA: Scalable and Principled Benchmark of Tree-Structured Multi-hop QA over Text and Tables

Researchers introduce SPARTA, an automated framework for generating large-scale Table-Text question answering benchmarks that require complex multi-hop reasoning across structured and unstructured data. The benchmark exposes significant weaknesses in current AI models, with state-of-the-art systems experiencing over 30 F1 point performance drops compared to existing simpler datasets.

AIBullisharXiv – CS AI · Feb 276/107
🧠

RELOOP: Recursive Retrieval with Multi-Hop Reasoner and Planners for Heterogeneous QA

Researchers introduce RELOOP, a new retrieval-augmented generation framework that improves multi-step question answering across text, tables, and knowledge graphs. The system uses hierarchical sequences and structure-aware iteration to achieve better accuracy while reducing computational costs compared to existing RAG methods.

AINeutralarXiv – CS AI · Mar 115/10
🧠

MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents

Researchers introduce MA-EgoQA, a benchmark for evaluating AI models' ability to understand multiple egocentric video streams from embodied agents simultaneously. The benchmark includes 1.7k questions across five categories and reveals current approaches struggle with multi-agent system-level understanding.

AINeutralarXiv – CS AI · Mar 44/104
🧠

ConEQsA: Concurrent and Asynchronous Embodied Questions Scheduling and Answering

Researchers introduce ConEQsA, an AI framework that enables embodied agents to handle multiple questions simultaneously in 3D environments with urgency-aware scheduling. The system uses shared memory to reduce redundant exploration and includes a new benchmark with 200 questions across 40 indoor scenes.

← PrevPage 2 of 2