#model-adaptation News & Analysis

36 articles tagged with #model-adaptation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 11 · 7/10

Federated continual learning: A comprehensive survey on lifelong and privacy-preserving learning over distributed and non-stationary data

A comprehensive survey examines Federated Continual Learning (FCL), which combines federated learning's privacy-preserving distributed training with continual learning's ability to adapt to evolving data. The research addresses a critical gap in current FL systems that assume static data, proposing frameworks for real-world applications like healthcare and IoT where data streams continuously shift, causing performance degradation and catastrophic forgetting.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

DRIFT: A Residual Flow Adapter for Decoding Continuous Outputs in Vision-Language Models

Researchers introduce DRIFT, a framework that adapts pretrained vision-language models to handle continuous numerical outputs rather than discrete tokens. By combining a base predictor with a flow-matching refinement module, DRIFT improves performance on tasks like temporal localization and robotic control across multiple model architectures.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting

Researchers introduce RAFT, a framework that addresses catastrophic forgetting in domain-specific fine-tuning of language models. By combining data refinement with answer-conditioned distillation, RAFT achieves a 23.2% improvement in domain accuracy while recovering 10-18% of the general capability typically lost during fine-tuning.

AI · Bullish · Crypto Briefing · May 29 · 7/10

MIT’s MeMo boosts LLM performance by 26% without retraining

MIT researchers have developed MeMo, a technique that improves large language model performance by 26% without requiring model retraining. This approach reduces computational costs and enables efficient adaptation across multiple domains, addressing a major pain point in AI deployment.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

PromptEmbedder: Efficient and Transferable Text Embedding via Dual-LLM Soft Prompting

PromptEmbedder introduces a dual-LLM framework that decouples text embedding from specific model architectures, achieving comparable performance to LoRA while reducing GPU memory by 40% and accelerating training 3.7x. The innovation enables efficient transfer across different LLM backbones by retraining only a lightweight alignment matrix rather than entire models.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

Echo-LoRA: Parameter-Efficient Fine-Tuning via Cross-Layer Representation Injection

Echo-LoRA introduces a parameter-efficient fine-tuning method that injects cross-layer representations from deeper neural network layers into shallow LoRA modules during training, achieving 3-5.7% performance improvements on reasoning tasks without adding inference costs. The technique discards its auxiliary training path post-deployment, maintaining the efficiency benefits of standard LoRA while delivering measurable capability gains.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

Learning Multi-Indicator Weights for Data Selection: A Joint Task-Model Adaptation Framework with Efficient Proxies

Researchers propose a framework for optimizing data selection in large language model instruction tuning by learning task-specific and model-specific weights for multiple quality indicators. Using efficient in-context learning signals on small validation sets, the method achieves comparable performance to full-dataset training with only 30% of samples, revealing important trade-offs between semantic diversity and logical complexity.

🧠 Llama

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Proximal Supervised Fine-Tuning

Researchers propose Proximal Supervised Fine-Tuning (PSFT), a new method that applies trust-region constraints from reinforcement learning to improve how foundation models adapt to new tasks. The technique maintains model capabilities while fine-tuning, outperforming standard supervised fine-tuning on out-of-domain generalization tasks.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Pioneer Agent: Continual Improvement of Small Language Models in Production

Researchers introduce Pioneer Agent, an automated system that continuously improves small language models in production by diagnosing failures, curating training data, and retraining under regression constraints. The system demonstrates significant performance gains across benchmarks, with real-world deployments achieving improvements from 84.9% to 99.3% in intent classification.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

Efficiently Aligning Draft Models via Parameter- and Data-Efficient Adaptation

Researchers introduce Efficient Draft Adaptation (EDA), a framework that significantly reduces the cost of adapting draft models for speculative decoding when target LLMs are fine-tuned. EDA achieves superior performance through decoupled architecture, data regeneration, and smart sample selection while requiring substantially less training resources than full retraining.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters

PlaneCycle introduces a training-free method to convert 2D AI foundation models to 3D without requiring retraining or architectural changes. The technique enables pretrained 2D models like DINOv3 to process 3D volumetric data by cyclically distributing spatial aggregation across orthogonal planes, achieving competitive performance on 3D classification and segmentation tasks.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

Researchers introduce TTSR, a new framework that enables AI models to improve their reasoning abilities during test time by having a single model alternate between student and teacher roles. The system allows models to learn from their mistakes by analyzing failed reasoning attempts and generating targeted practice questions for continuous improvement.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation

Researchers propose a new IMPRINT framework for transfer learning that improves foundation model adaptation to new tasks without parameter optimization. The framework identifies three key components and introduces a clustering-based variant that outperforms existing methods by 4%.
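
For readers unfamiliar with the term, the basic idea behind weight imprinting can be sketched in a few lines: classifier weights for new classes are set directly from averaged, normalized embeddings, with no gradient-based optimization. The sketch below illustrates only this generic baseline, not the IMPRINT framework or its clustering-based variant; the function names (`l2_normalize`, `imprint_weights`, `predict`) and the toy embeddings are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def imprint_weights(embeddings, labels):
    """Build one weight vector per class from normalized embeddings.

    No optimizer is involved: each class weight is the normalized sum
    (equivalently, the normalized mean) of that class's unit embeddings.
    """
    sums = {}
    for embedding, label in zip(embeddings, labels):
        unit = l2_normalize(embedding)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
    return {label: l2_normalize(total) for label, total in sums.items()}


def predict(weights, embedding):
    """Return the class whose imprinted weight has the highest cosine similarity."""
    unit = l2_normalize(embedding)
    return max(
        weights,
        key=lambda label: sum(w * u for w, u in zip(weights[label], unit)),
    )


# Toy embeddings standing in for features from a frozen foundation model.
train_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = ["a", "a", "b", "b"]
class_weights = imprint_weights(train_embeddings, train_labels)
```

Calling `predict(class_weights, [0.8, 0.2])` then picks the class whose imprinted vector points in the most similar direction. Because the backbone stays frozen, adding a class amounts to computing one more averaged vector.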

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

On the Structural Limitations of Weight-Based Neural Adaptation and the Role of Reversible Behavioral Learning

Researchers introduce reversible behavioral learning for AI models to address structural irreversibility in neural network adaptation. The study demonstrates that traditional fine-tuning permanently changes model behavior in ways that cannot be deterministically reversed, whereas the proposed approach lets a model return to its original behavior within numerical precision.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4

Distribution-Aligned Decoding for Efficient LLM Task Adaptation

Researchers introduce SVDecode, a new method for adapting large language models to specific tasks without extensive fine-tuning. The technique uses steering vectors during decoding to align output distributions with task requirements, improving accuracy by up to 5 percentage points while adding minimal computational overhead.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

Rethinking Cross-Modal Fine-Tuning: Optimizing the Interaction between Feature Alignment and Target Fitting

Researchers developed a theoretical framework to optimize cross-modal fine-tuning of pre-trained AI models, addressing the challenge of aligning new feature modalities with existing representation spaces. The approach introduces a novel concept of feature-label distortion and demonstrates improved performance over state-of-the-art methods across benchmark datasets.

AI · Bullish · arXiv – CS AI · Jun 25 · 6/10

Supervised Post-training of Speech Foundation Models for Robust Adaptation in Speech Deepfake Detection

Researchers propose a supervised post-training method for speech foundation models that improves deepfake detection by addressing the mismatch between self-supervised learning objectives and spoof-detection requirements. The approach achieves state-of-the-art results on multiple benchmarks, demonstrating that targeted adaptation strategies can enhance AI model robustness for security applications.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Cluster-Specific Localized Drift Detection for Efficient Batch Model Adaptation under Controlled Distribution Shift

Researchers propose a framework for simulating controlled distribution shifts in static datasets to evaluate how machine learning models adapt to nonstationary data environments. The study benchmarks six adaptation strategies across multiple model families, addressing a critical gap in reproducible evaluation of drift detection methods for real-world deployment scenarios.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Reliability-Guided Adaptive Ensembling for Robust Test-Time Adaptation

Researchers propose SAFER, a training-free framework that enhances the robustness of test-time adaptation (TTA) methods against adversarial attacks on contaminated data streams. The method uses stochastic augmentation and reliability-guided prediction pooling to maintain performance while mitigating domain shift without requiring source data access.

AI · Bullish · arXiv – CS AI · Jun 19 · 6/10

FlowEdit: Associative Memory for Lifelong Pronunciation Adaptation in Flow-Matching TTS

Researchers introduce FlowEdit, a lifelong adaptation framework for text-to-speech systems that corrects pronunciation errors without retraining the underlying model. Using associative memory and latent conditioning edits, FlowEdit achieves 92.7% error reduction on multilingual proper nouns while maintaining speech quality and completing corrections in ~15 seconds.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Two to Tango: Coupled Task-Reference Selection for Safe LLM Fine-tuning

Researchers propose DualSelect, a framework for fine-tuning large language models that simultaneously selects relevant safety references and compatible task samples to preserve safety alignment while improving task performance. The method achieves significant safety improvements (5.10+ points) across models from 1B to 8B parameters without sacrificing utility.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Fisher-Guided Progressive Parameter Selection for Adaptive Fine-Tuning

Researchers introduce FisherAdapTune, a machine learning framework that dynamically selects which parameters to fine-tune in pretrained models by monitoring Fisher information geometry rather than relying on fixed architectural rules. The method demonstrates improved performance and zero-shot transfer capabilities on segmentation tasks while reducing computational overhead.

AI · Bullish · arXiv – CS AI · Jun 4 · 6/10

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models

Researchers introduce ADAPTOOD, a framework that uses data uncertainty to improve machine learning model performance on out-of-distribution time series data, particularly for ECG analysis. The method achieves up to 7% higher accuracy than existing approaches by quantifying distribution shift severity and adapting hyperparameters accordingly, addressing a critical challenge in deploying medical AI models across diverse real-world settings.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

Researchers introduce AdvCL, a novel framework that repurposes adversarial perturbations to improve continual learning in large language models by addressing forgetting, limited transfer, and adversarial vulnerability. The approach combines three modules—Intra-Smooth, Proto-Clip, and Inter-Align—to provide geometric control signals that stabilize model adaptation across sequential tasks while maintaining robustness.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

What changes after deployment? A survey on On-device Learning in TinyML

This survey examines on-device learning (ODL) in TinyML systems, analyzing how 70 existing solutions address the challenge of distribution shift in deployed machine learning models on microcontrollers. The research identifies a critical gap between academic benchmarks and real-world deployment scenarios, emphasizing that different types of distribution change require tailored technical approaches.
