y0news

#model-adaptation News & Analysis

15 articles tagged with #model-adaptation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

Efficiently Aligning Draft Models via Parameter- and Data-Efficient Adaptation

Researchers introduce Efficient Draft Adaptation (EDA), a framework that significantly reduces the cost of adapting draft models for speculative decoding when target LLMs are fine-tuned. EDA achieves superior performance through a decoupled architecture, data regeneration, and smart sample selection while requiring substantially fewer training resources than full retraining.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

Researchers introduce TTSR, a new framework that enables AI models to improve their reasoning abilities during test time by having a single model alternate between student and teacher roles. The system allows models to learn from their mistakes by analyzing failed reasoning attempts and generating targeted practice questions for continuous improvement.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters

PlaneCycle introduces a training-free method to convert 2D AI foundation models to 3D without requiring retraining or architectural changes. The technique enables pretrained 2D models like DINOv3 to process 3D volumetric data by cyclically distributing spatial aggregation across orthogonal planes, achieving competitive performance on 3D classification and segmentation tasks.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

On the Structural Limitations of Weight-Based Neural Adaptation and the Role of Reversible Behavioral Learning

Researchers introduce reversible behavioral learning for AI models, addressing the problem of structural irreversibility in neural network adaptation. The study demonstrates that traditional fine-tuning methods cause permanent changes to model behavior that cannot be deterministically reversed, while their new approach allows models to return to original behavior within numerical precision.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation

Researchers propose IMPRINT, a transfer-learning framework that improves foundation-model adaptation to new tasks without parameter optimization. The framework identifies three key components of weight imprinting and introduces a clustering-based variant that outperforms existing methods by 4%.
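Classic weight imprinting, the starting point such frameworks build on, sets each new class's classifier weight to the normalized mean of that class's normalized embeddings; no gradient steps are needed. A minimal sketch of generic imprinting (not the paper's clustering-based variant; the example embeddings are made up):

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors pass through)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def imprint(embeddings_by_class):
    """Imprint one weight vector per class: the normalized mean of that
    class's normalized embeddings, with no parameter optimization."""
    weights = {}
    for label, embs in embeddings_by_class.items():
        cols = zip(*(normalize(e) for e in embs))
        mean = [sum(col) / len(embs) for col in cols]
        weights[label] = normalize(mean)
    return weights

def classify(weights, emb):
    """Predict the class whose imprinted weight is most aligned (cosine)."""
    e = normalize(emb)
    return max(weights, key=lambda c: sum(w * x for w, x in zip(weights[c], e)))

protos = imprint({"cat": [[1.0, 0.1], [0.9, 0.0]],
                  "dog": [[0.0, 1.0], [0.1, 0.8]]})
print(classify(protos, [0.95, 0.05]))  # closest to the "cat" prototype
```

Because the imprinted weights live in the same space as the frozen encoder's embeddings, adding a class is a single averaging step rather than a training run.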

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4

Distribution-Aligned Decoding for Efficient LLM Task Adaptation

Researchers introduce SVDecode, a new method for adapting large language models to specific tasks without extensive fine-tuning. The technique uses steering vectors during decoding to align output distributions with task requirements, improving accuracy by up to 5 percentage points while adding minimal computational overhead.
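The generic steering-vector mechanism such methods build on adds a precomputed task direction to the model's hidden state at each decoding step, shifting the output distribution without touching the weights. A minimal sketch with made-up numbers (the vector extraction itself, and SVDecode's specific alignment objective, are not shown):

```python
# Generic activation-steering sketch: a fixed task vector, typically obtained
# offline by contrasting task-specific and generic activations, is added to
# the hidden state before the output projection at every decoding step.

def steer(hidden, steering_vector, alpha=1.0):
    """Shift a hidden state along a task-specific direction, scaled by alpha."""
    return [h + alpha * s for h, s in zip(hidden, steering_vector)]

hidden = [0.2, -0.5, 1.0]            # hidden state at one decoding step
task_direction = [0.1, 0.4, -0.2]    # hypothetical steering vector
print(steer(hidden, task_direction, alpha=2.0))
```

The only added cost per token is one vector addition, which is why steering-style adaptation carries minimal computational overhead compared with fine-tuning.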

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

Rethinking Cross-Modal Fine-Tuning: Optimizing the Interaction between Feature Alignment and Target Fitting

Researchers developed a theoretical framework to optimize cross-modal fine-tuning of pre-trained AI models, addressing the challenge of aligning new feature modalities with existing representation spaces. The approach introduces a novel concept of feature-label distortion and demonstrates improved performance over state-of-the-art methods across benchmark datasets.

AI · Neutral · arXiv – CS AI · 6d ago · 6/10

Towards Privacy-Preserving Large Language Model: Text-free Inference Through Alignment and Adaptation

Researchers introduce Privacy-Preserving Fine-Tuning (PPFT), a novel training approach that enables LLM services to process user queries without receiving raw text, addressing privacy vulnerabilities in current deployments. The method uses client-side encoders and noise-injected embeddings to maintain competitive model performance while eliminating exposure of sensitive personal, medical, or legal information.

AI · Bullish · arXiv – CS AI · 6d ago · 6/10

LoRA-DA: Data-Aware Initialization for Low-Rank Adaptation via Asymptotic Analysis

Researchers introduce LoRA-DA, a new initialization method for Low-Rank Adaptation that leverages target-domain data and theoretical optimization principles to improve fine-tuning performance. The method outperforms existing initialization approaches across multiple benchmarks while maintaining computational efficiency.
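For reference, standard LoRA arithmetic, which LoRA-DA changes only at initialization, augments a frozen weight with a trainable low-rank product. The sketch below shows the common zero-initialized up-projection so the adapter starts as a no-op; LoRA-DA would instead derive the starting values from target-domain data (the matrices here are toy values):

```python
# Minimal LoRA forward pass: y = W @ x + (alpha / r) * B @ (A @ x).
# W is frozen; only the small A (r x d_in) and B (d_out x r) are trained.

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0, r=1):
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))   # rank-r path through the adapter
    return [b + (alpha / r) * u for b, u in zip(base, update)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[0.5, 0.5]]               # r=1 down-projection (trainable)
B = [[0.0], [0.0]]             # up-projection; zero init makes the
                               # adapter a no-op before training
print(lora_forward(W, A, B, [2.0, 4.0]))  # -> [2.0, 4.0], base output unchanged
```

A data-aware initialization replaces the zero/Gaussian starting point with values chosen so the first gradient steps already move the adapter in a useful direction, which is the gap LoRA-DA's asymptotic analysis targets.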

AI · Neutral · arXiv – CS AI · Mar 16 · 6/10

Continual Learning in Large Language Models: Methods, Challenges, and Opportunities

This comprehensive survey examines continual learning methodologies for large language models, focusing on three core training stages and methods to mitigate catastrophic forgetting. The research reveals that while current approaches show promise in specific domains, fundamental challenges remain in achieving seamless knowledge integration across diverse tasks and temporal scales.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7

DeLo: Dual Decomposed Low-Rank Experts Collaboration for Continual Missing Modality Learning

Researchers propose DeLo, a new framework using dual-decomposed low-rank expert architecture to help Large Multimodal Models adapt to real-world scenarios with incomplete data. The system addresses continual missing modality learning by preventing interference between different data types and tasks through specialized routing and memory mechanisms.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3

Hyperparameter Trajectory Inference with Conditional Lagrangian Optimal Transport

Researchers introduce Hyperparameter Trajectory Inference (HTI), a method to predict how neural networks behave with different hyperparameter settings without expensive retraining. The approach uses conditional Lagrangian optimal transport to create surrogate models that approximate neural network outputs across various hyperparameter configurations.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14

From Generator to Embedder: Harnessing Innate Abilities of Multimodal LLMs via Building Zero-Shot Discriminative Embedding Model

Researchers propose a data-efficient framework to convert generative Multimodal Large Language Models into universal embedding models without extensive pre-training. The method uses hierarchical embedding prompts and Self-aware Hard Negative Sampling to achieve competitive performance on embedding benchmarks using minimal training data.

AI · Bullish · Hugging Face Blog · Feb 10 · 5/10 · 4

Parameter-Efficient Fine-Tuning using 🤗 PEFT

The article discusses parameter-efficient fine-tuning methods using Hugging Face's PEFT library. PEFT enables efficient adaptation of large language models by updating only a small subset of parameters rather than full model retraining.
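In the library itself this is a few lines: wrap a pretrained model with `get_peft_model` and a `LoraConfig`, then train as usual. The parameter savings are easy to verify with back-of-envelope arithmetic; for example, a rank-8 LoRA adapter on a 4096×4096 attention weight (typical of 7B-class models; the figures below are assumptions, not from the article) trains r·(d_in + d_out) parameters instead of d_in·d_out:

```python
# Back-of-envelope illustration of why adapter-style PEFT is cheap
# (plain arithmetic, not the PEFT library API).

def full_params(d_in, d_out):
    """Parameters updated by full fine-tuning of one weight matrix."""
    return d_in * d_out

def lora_params(d_in, d_out, r):
    """Parameters in a rank-r LoRA adapter for the same matrix."""
    return r * (d_in + d_out)

d = 4096                        # assumed hidden size
full = full_params(d, d)
lora = lora_params(d, d, r=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
# full: 16,777,216  lora: 65,536  ratio: 0.3906%
```

Updating well under 1% of a layer's parameters is what lets PEFT-style methods fit billion-scale fine-tuning into modest GPU memory.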

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 5

Decoupling Stability and Plasticity for Multi-Modal Test-Time Adaptation

Researchers propose DASP (Decoupling Adaptation for Stability and Plasticity), a novel framework for adapting multi-modal AI models to changing test environments. The method addresses key challenges of negative transfer and catastrophic forgetting by using asymmetric adaptation strategies that treat biased and unbiased modalities differently.