y0news

#grounding News & Analysis

4 articles tagged with #grounding. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Molmo2 is a new open-source family of vision-language models that achieves state-of-the-art performance among open models, excelling in particular at video understanding and pixel-level grounding tasks. The research introduces 7 new video datasets and 2 multi-image datasets collected without using proprietary VLMs, along with an 8B-parameter model that outperforms existing open-weight models and even some proprietary models on specific tasks.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

GROUNDEDKG-RAG: Grounded Knowledge Graph Index for Long-document Question Answering

Researchers introduced GroundedKG-RAG, a new retrieval-augmented generation system that creates knowledge graphs directly grounded in source documents to improve long-document question answering. The system reduces resource consumption and hallucinations while maintaining accuracy comparable to state-of-the-art models at lower cost.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective

Researchers propose a hierarchical planning framework to analyze why LLM-based web agents fail at complex navigation tasks. The study reveals that while structured PDDL plans outperform natural language plans, low-level execution and perceptual grounding remain the primary bottlenecks rather than high-level reasoning.

AI · Bullish · Google DeepMind Blog · Dec 17 · 6/10 · 3

FACTS Grounding: A new benchmark for evaluating the factuality of large language models

Researchers have introduced FACTS Grounding, a new benchmark designed to evaluate how accurately large language models ground their responses in source material and avoid hallucinations. The benchmark includes a comprehensive evaluation system and online leaderboard to measure LLM factuality performance.