y0news

#temporal-reasoning News & Analysis

9 articles tagged with #temporal-reasoning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents

Researchers introduce dual-trace memory encoding for LLM agents, pairing factual records with narrative scene reconstructions to improve cross-session recall by 20+ percentage points. The method significantly enhances temporal reasoning and multi-session knowledge aggregation without increasing computational costs, advancing the capability of persistent AI agent systems.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Researchers introduce Audio Flamingo Next (AF-Next), an advanced open-source audio-language model that processes speech, sound, and music with support for inputs up to 30 minutes. The model incorporates a new temporal reasoning approach and demonstrates competitive or superior performance compared to larger proprietary alternatives across 20 benchmarks.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond?

Researchers developed a new training method combining Chain-of-Thought supervision with reinforcement learning to teach large language models when to abstain from answering temporal questions they're uncertain about. Their approach enabled a smaller Qwen2.5-1.5B model to outperform GPT-4o on temporal question answering tasks while improving reliability by 20% on unanswerable questions.

GPT-4
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 2

Chain of World: World Model Thinking in Latent Motion

Researchers introduce CoWVLA (Chain-of-World VLA), a new Vision-Language-Action model paradigm that combines world-model temporal reasoning with latent motion representation for embodied AI. The approach outperforms existing methods in robotic simulation benchmarks while maintaining computational efficiency through a unified autoregressive decoder that models both keyframes and action sequences.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

TimeSeriesExamAgent: Creating Time Series Reasoning Benchmarks at Scale

Researchers introduce TimeSeriesExamAgent, a scalable framework for automatically generating time series reasoning benchmarks using LLM agents and templates. The study reveals that while large language models show promise in time series tasks, they significantly underperform in abstract reasoning and domain-specific applications across healthcare, finance, and weather domains.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

LiveFact: A Dynamic, Time-Aware Benchmark for LLM-Driven Fake News Detection

Researchers have developed LiveFact, a new dynamic benchmark for evaluating Large Language Models' ability to detect fake news and misinformation in real-time conditions. The benchmark addresses limitations of static testing by using temporal evidence sets and finds that open-source models like Qwen3-235B-A22B now match proprietary systems in performance.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Dynamic Theory of Mind as a Temporal Memory Problem: Evidence from Large Language Models

Research reveals that Large Language Models struggle with dynamic Theory of Mind tasks, particularly tracking how others' beliefs change over time. While LLMs can infer current beliefs effectively, they fail to maintain and retrieve prior belief states after updates occur, showing patterns consistent with human cognitive biases.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

HEARTS: Benchmarking LLM Reasoning on Health Time Series

Researchers introduce HEARTS, a comprehensive benchmark for evaluating large language models' ability to reason over health time series data across 16 datasets and 12 health domains. The study reveals that current LLMs significantly underperform compared to specialized models and struggle with multi-step temporal reasoning in healthcare applications.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

Egocentric Co-Pilot: Web-Native Smart-Glasses Agents for Assistive Egocentric AI

Researchers have developed Egocentric Co-Pilot, a web-native AI framework that runs on smart glasses and uses Large Language Models to provide assistive AI without requiring screens or free hands. The system combines perception, reasoning, and web tools to support accessibility for people with vision impairments or cognitive overload, showing superior performance compared to commercial baselines.