y0news

#test-time-adaptation News & Analysis

16 articles tagged with #test-time-adaptation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

StableTTA: Training-Free Test-Time Adaptation that Improves Model Accuracy on ImageNet1K to 96%

Researchers developed StableTTA, a training-free method that significantly improves model accuracy on ImageNet-1K, with 33 models achieving over 95% accuracy and several surpassing 96%. The method lets lightweight architectures outperform Vision Transformers while using 95% fewer parameters and incurring 89% lower computational cost.

AI · Bullish · arXiv – CS AI · Mar 26 · 7/10

You only need 4 extra tokens: Synergistic Test-time Adaptation for LLMs

Researchers developed SyTTA, a test-time adaptation framework that improves large language models' performance in specialized domains without requiring additional labeled data. The method achieved over 120% improvement on agricultural question answering tasks using just 4 extra tokens per query, addressing the challenge of deploying LLMs in domains with limited training data.

๐Ÿข Perplexity
AIBullisharXiv โ€“ CS AI ยท Mar 56/10
๐Ÿง 

Test-Time Meta-Adaptation with Self-Synthesis

Researchers introduce MASS, a meta-learning framework that enables large language models to self-adapt at test time by generating synthetic training data and performing targeted self-updates. The system uses bilevel optimization to meta-learn data-attribution signals and optimize synthetic data through scalable meta-gradients, showing effectiveness in mathematical reasoning tasks.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

Learning Physical Principles from Interaction: Self-Evolving Planning via Test-Time Memory

Researchers introduce PhysMem, a memory framework that enables vision-language model robot planners to learn physical principles through real-time interaction without updating model parameters. The system records experiences, generates hypotheses, and verifies them before application, achieving 76% success on brick insertion tasks compared to 23% for direct experience retrieval.
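The record-hypothesize-verify loop can be pictured in a few lines. The sketch below is an illustration of the control flow only: the function names, the memory schema, and the `propose`/`verify` callbacks are stand-ins, not PhysMem's actual interfaces.

```python
def plan_with_memory(task, memory, propose, verify):
    """Test-time memory loop (sketch): retrieve past experience for
    this task, hypothesize a physical rule, verify it before acting,
    and record the outcome. No model weights are ever updated."""
    experience = [m for m in memory if m["task"] == task]
    hypothesis = propose(task, experience)
    if verify(hypothesis):
        memory.append({"task": task, "rule": hypothesis, "ok": True})
        return hypothesis  # safe to hand to the planner
    memory.append({"task": task, "rule": hypothesis, "ok": False})
    return None  # rejected hypotheses are remembered, not applied

# Toy run: the proposer suggests a rule and the verifier accepts it.
memory = []
rule = plan_with_memory("insert_brick", memory,
                        lambda t, e: "align_then_push",
                        lambda h: True)
print(rule)  # align_then_push
```

The point of the loop is that failed hypotheses are also written to memory, so later retrievals can avoid repeating them.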

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models

Researchers introduce VITA, a zero-shot value function learning method that enhances Vision-Language Models through test-time adaptation for robotic manipulation tasks. The system updates parameters sequentially over trajectories to improve temporal reasoning and generalizes across diverse environments, outperforming existing autoregressive VLM methods.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning

Researchers introduce Self-Harmony, a new test-time reinforcement learning framework that improves AI model accuracy by having models solve problems and rephrase questions simultaneously. The method uses harmonic mean aggregation instead of majority voting to select stable answers, achieving state-of-the-art results across 28 of 30 reasoning benchmarks without requiring human supervision.
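The aggregation step can be sketched in a few lines; the function names and toy vote sets below are illustrative, not from the paper. An answer scores by the harmonic mean of its vote counts under the original question and its rephrasing, so an answer popular in only one view scores zero:

```python
from collections import Counter

def harmonic_mean(a, b):
    # Harmonic mean of two vote counts; zero if the answer is
    # missing from either view.
    return 2 * a * b / (a + b) if a and b else 0.0

def select_answer(original_votes, rephrased_votes):
    """Pick the answer with the highest harmonic-mean frequency
    across the two views of the same question."""
    c1, c2 = Counter(original_votes), Counter(rephrased_votes)
    return max(set(c1) | set(c2),
               key=lambda ans: harmonic_mean(c1[ans], c2[ans]))

# "42" is well supported in both views; "7" is weakly supported.
print(select_answer(["42", "42", "42", "7"], ["42", "7", "42"]))  # 42
```

Unlike a majority vote over the pooled answers, the harmonic mean rewards answers that are stable under rephrasing, which is the self-supervision signal the framework exploits.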

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10

AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning

Researchers propose AgentDropoutV2, a test-time framework that optimizes multi-agent systems by dynamically correcting or removing erroneous outputs without requiring retraining. The system acts as an active firewall with retrieval-augmented rectification, achieving 6.3 percentage point accuracy gains on math benchmarks while preventing error propagation between AI agents.
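The rectify-or-reject behavior amounts to a message filter sitting between agents. The sketch below shows only that control flow; the verifier and retrieval-based rectifier are placeholder callbacks, not AgentDropoutV2's actual components:

```python
def rectify_or_reject(outputs, verify, rectify):
    """Test-time filter on inter-agent messages: pass verified
    outputs through, try to repair failing ones (e.g. via
    retrieval-augmented rectification), and drop the rest so
    errors never propagate downstream."""
    kept = []
    for msg in outputs:
        if verify(msg):
            kept.append(msg)
            continue
        fixed = rectify(msg)
        if fixed is not None and verify(fixed):
            kept.append(fixed)
        # else: reject; the message never reaches the next agent
    return kept

# Toy check: messages are ints, "valid" means even, rectification
# nudges an odd message down by one.
print(rectify_or_reject([2, 3, 5],
                        lambda m: m % 2 == 0,
                        lambda m: m - 1))  # [2, 2, 4]
```

No retraining is involved: the filter operates purely on agent outputs at inference time, which is what makes it deployable as an "active firewall."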

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Context is All You Need

Researchers introduce CONTXT, a lightweight neural network adaptation method that improves AI model performance when deployed on data different from training data. The technique uses simple additive and multiplicative transforms to modulate internal representations, providing consistent gains across both discriminative and generative models including LLMs.
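The "additive and multiplicative transforms" resemble FiLM-style feature modulation. A minimal sketch, where the names and shapes are assumptions rather than the paper's API: only the scale and shift vectors would be adapted at test time while the backbone producing the features stays frozen.

```python
import numpy as np

def modulate(h, gamma, beta):
    # Elementwise scale-and-shift of a hidden representation h.
    # gamma (multiplicative) and beta (additive) are the only
    # parameters touched during adaptation; the model is frozen.
    return gamma * h + beta

h = np.array([1.0, -2.0, 0.5])
gamma, beta = np.ones_like(h), np.zeros_like(h)  # identity init
assert np.allclose(modulate(h, gamma, beta), h)  # starts as a no-op
```

Initializing at the identity means the adapted model coincides with the source model until the test distribution gives a reason to move, which is one way such lightweight schemes stay safe across both discriminative and generative backbones.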

AI · Bullish · arXiv – CS AI · Mar 11 · 6/10

PRECEPT: Planning Resilience via Experience, Context Engineering & Probing Trajectories: A Unified Framework for Test-Time Adaptation with Compositional Rule Learning and Pareto-Guided Prompt Evolution

Researchers introduce PRECEPT, a new framework for AI language model agents that improves knowledge retrieval and adaptation through structured rule learning and conflict-aware memory systems. The framework shows significant performance improvements over existing methods, with 41% better first-try accuracy and enhanced compositional reasoning capabilities.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Words & Weights: Streamlining Multi-Turn Interactions via Co-Adaptation

Researchers introduce ROSA2, a framework that improves Large Language Model interactions by simultaneously optimizing both prompts and model parameters during test-time adaptation. The approach outperformed baselines by 30% on mathematical tasks while reducing interaction turns by 40%.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Tool Verification for Test-Time Reinforcement Learning

Researchers introduce T³RL (Tool-Verification for Test-Time Reinforcement Learning), a new method that improves self-evolving AI reasoning models by using external tool verification to prevent incorrect learning from biased consensus. The approach shows significant improvements on mathematical problem-solving tasks, with larger gains on harder problems.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

TARSE: Test-Time Adaptation via Retrieval of Skills and Experience for Reasoning Agents

Researchers developed TARSE, a new AI system for clinical decision-making that retrieves relevant medical skills and experiences from curated libraries to improve reasoning accuracy. The system performs test-time adaptation to align language models with clinically valid logic, showing improvements over existing medical AI baselines in question-answering benchmarks.

AI · Neutral · arXiv – CS AI · Mar 17 · 5/10

Preconditioned Test-Time Adaptation for Out-of-Distribution Debiasing in Narrative Generation

Researchers propose CAP-TTA, a test-time adaptation framework that helps debiased large language models better handle unfamiliar toxic prompts that cause distribution shifts. The method uses context-aware LoRA updates triggered by bias-risk thresholds to reduce toxic outputs while maintaining narrative fluency and reducing computational latency.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

When and Where to Reset Matters for Long-Term Test-Time Adaptation

Researchers propose an Adaptive and Selective Reset (ASR) scheme to address model collapse in long-term test-time adaptation, where AI models gradually degrade and predict only a few classes. The solution dynamically determines when and where to reset models while preserving beneficial knowledge through importance-aware regularization.
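The "when" (detect collapse) and "where" (protect important weights) decisions might combine roughly as follows. Everything here is a placeholder: the entropy signal, thresholds, and importance scores are illustrative stand-ins, not the paper's actual criteria.

```python
import numpy as np

def maybe_reset(params, source_params, importance, pred_entropy,
                entropy_floor=0.1, keep_quantile=0.8):
    """Adaptive and selective reset (sketch). If predictions have
    collapsed (entropy below a floor), reset only the least-important
    parameters back to the source model; high-importance parameters
    keep the knowledge accumulated during adaptation."""
    if pred_entropy >= entropy_floor:
        return params  # 'when': no collapse detected, keep adapting
    thresh = np.quantile(importance, keep_quantile)
    keep = importance >= thresh  # 'where': protect important weights
    return np.where(keep, params, source_params)
```

A full reset would discard everything the model learned during the stream; resetting selectively is what lets the scheme preserve beneficial knowledge, in the spirit of the importance-aware regularization the paper describes.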

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10

Decoupling Stability and Plasticity for Multi-Modal Test-Time Adaptation

Researchers propose DASP (Decoupling Adaptation for Stability and Plasticity), a novel framework for adapting multi-modal AI models to changing test environments. The method addresses key challenges of negative transfer and catastrophic forgetting by using asymmetric adaptation strategies that treat biased and unbiased modalities differently.