#transfer-learning News & Analysis

99 articles tagged with #transfer-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

KITE: Decoupling Kinematics and Interaction for Zero-Shot Cross-Embodiment Manipulation

Researchers introduce KITE, a machine learning framework that decouples task reasoning from embodiment-specific motor control to enable robot manipulation policies trained on one robot type to transfer zero-shot to structurally different robots. The approach uses learned latent representations of interaction intent based on contact patterns, requiring only kinematic model training for new embodiments without collecting new demonstration data.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

The Unreasonable Effectiveness of VLMs for Zero-shot Procedural Mistake Detection

Researchers introduce ZeProM, a zero-shot framework using Video-Language Models to detect procedural mistakes without task-specific training. The approach matches or exceeds supervised methods on standard benchmarks, suggesting a shift toward more generalizable AI solutions for quality control across industries.

AI · Bullish · arXiv – CS AI · Jun 11 · 7/10

LUCID: Learning Embodiment-Agnostic Intent Models from Unstructured Human Videos for Scalable Dexterous Robot Skill Acquisition

LUCID is a machine learning framework that learns robot manipulation skills from unstructured internet videos and human demonstrations, then transfers this knowledge to different robot embodiments through a shared intent model. The approach eliminates the need for expensive, embodiment-specific robot training data and demonstrates zero-shot transfer capabilities across multiple real-world tasks.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

Ego-Pi: VLA Fine-Tuning for Ego-Centric Human and Robot Data

Ego-Pi introduces a fine-tuning approach for the π₀.₅ foundation model that leverages egocentric human manipulation data to train humanoid robots with dexterous hands. The research demonstrates that human demonstrations enable robots to learn new task semantics and compose skills into novel behaviors without requiring robot-specific training data, addressing robotics' persistent data scarcity challenge.

AI · Bullish · arXiv – CS AI · Jun 8 · 7/10

OpenSkill: Open-World Self-Evolution for LLM Agents

OpenSkill introduces a framework enabling LLM agents to self-evolve in open-world environments without task-specific supervision, bootstrapping both skills and verification signals from public documentation and web resources. The approach demonstrates superior performance across benchmarks while maintaining transferability across different models, addressing a critical gap in autonomous agent deployment.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

Harnessing Structural Context for Entity Alignment Foundation Models

Researchers introduce ContextEA, an advanced foundation model for entity alignment across knowledge graphs that significantly improves upon existing approaches by better leveraging structural context. The model demonstrates superior transfer capabilities to unseen knowledge graph pairs, outperforming finetuned baselines without requiring task-specific adaptation.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

Do Models Share Safety Representations? Cross-Model Steering for Safe Visual Generation

Researchers demonstrate that safety behaviors in generative AI models can be represented as portable latent directions that transfer across different architectures without requiring unsafe training data on target models. This framework enables cross-model safety steering for text-to-image and text-to-video generation, suggesting safety is a shared property rather than model-specific.

AI · Bullish · arXiv – CS AI · Jun 4 · 7/10

Building The Ph(ysical)AI Layer Of Machine Intelligence

Researchers propose principle-driven foundation models that encode physics-based principles rather than learn statistical correlations, achieving cross-modal transfer from radio-frequency data to audio, images, text, and video without fine-tuning. A 1.99M parameter frozen encoder reaches 77.7% average accuracy across 15 tasks, with performance varying systematically between physically-grounded (84.5%) and semantic tasks (70.0%), suggesting complementary approaches to AI generalization.

AI · Neutral · arXiv – CS AI · Jun 2 · 7/10

VLM4VLA: Revisiting Vision-Language-Models in Vision-Language-Action Models

Researchers introduce VLM4VLA, a minimal adaptation pipeline converting Vision-Language Models into Vision-Language-Action policies for robotic control. The study reveals that strong general VLM performance doesn't reliably predict downstream task success, and that visual encoders—not language components—represent the primary bottleneck for embodied AI applications.

🏢 Meta

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

From Human Videos to Robot Manipulation: A Survey on Scalable Vision-Language-Action Learning with Human-Centric Data

A comprehensive survey examines how human videos can be leveraged to train Vision-Language-Action (VLA) models for robot manipulation, addressing the limitation that robot demonstrations are expensive and embodiment-specific. The research categorizes four approaches for extracting actionable knowledge from human videos and identifies critical open challenges in video structuring, embodiment transfer, and real-world evaluation.

AI · Bearish · arXiv – CS AI · May 29 · 7/10

Do Physics Foundation Models Learn Generalizable Physics? A Bias-Aware Benchmark Across Physical Regimes and Distribution Shifts

Researchers benchmarked five physics foundation models across 8 physical dynamics and 25 test regimes, revealing that current models function as conditional rather than universal generalists. The study demonstrates that model performance heavily depends on physical regime, temporal scale, and distribution shifts, with pretraining and scaling unable to reliably overcome these limitations.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

PromptEmbedder: Efficient and Transferable Text Embedding via Dual-LLM Soft Prompting

PromptEmbedder introduces a dual-LLM framework that decouples text embedding from specific model architectures, achieving comparable performance to LoRA while reducing GPU memory by 40% and accelerating training 3.7x. The innovation enables efficient transfer across different LLM backbones by retraining only a lightweight alignment matrix rather than entire models.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

Forge: Quality-Aware Reinforcement Learning for NP-Hard Optimization in LLMs

Researchers introduce OPT-BENCH, a framework for training LLMs on NP-hard optimization problems using quality-aware reinforcement learning. Testing on Qwen2.5-7B achieves 93.1% success rate and 46.6% quality ratio, substantially outperforming GPT-4o, with demonstrated transfer benefits across mathematics, logic, and reasoning tasks.

🧠 GPT-4

AI · Bullish · arXiv – CS AI · May 11 · 7/10

Rubric-Grounded RL: Structured Judge Rewards for Generalizable Reasoning

Researchers introduce rubric-grounded reinforcement learning, a framework that trains AI models using structured, multi-criterion rewards from an LLM judge rather than binary outcomes. Training Llama-3.1-8B on scientific documents achieved 71.7% normalized reward and demonstrated improved performance on multiple reasoning benchmarks, suggesting that document-grounded training signals can produce generalizable reasoning capabilities.

🧠 Llama

AI · Bullish · arXiv – CS AI · May 9 · 7/10

LANTERN: LLM-Augmented Neurosymbolic Transfer with Experience-Gated Reasoning Networks

Researchers introduce LANTERN, a framework that uses large language models to automatically generate task descriptions and intelligently aggregate knowledge from multiple source tasks for reinforcement learning. The system achieves 40-60% improvements in sample efficiency by adaptively weighting source policies based on task similarity and managing teacher-student knowledge transfer through uncertainty-aware gating.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key

Researchers introduce ScaleLogic, a synthetic reasoning framework that systematically studies how reinforcement learning improves LLM reasoning across varying task difficulty and logical complexity. The study reveals that RL training compute follows a power law with reasoning depth, with scaling efficiency improving when models train on more expressively complex logic, suggesting that training content quality matters as much as training volume.

AI · Bullish · arXiv – CS AI · Apr 20 · 7/10

Exascale Multi-Task Graph Foundation Models for Imbalanced, Multi-Fidelity Atomistic Data

Researchers have developed an exascale workflow using graph foundation models trained on more than 544 million atomistic structures to accelerate materials discovery. The system can screen 1.1 billion structures in 50 seconds, a task that would take years with traditional computation, and demonstrates strong transfer learning capabilities across diverse chemical applications.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Adapting 2D Multi-Modal Large Language Model for 3D CT Image Analysis

Researchers propose a method to adapt 2D multimodal large language models for 3D medical imaging analysis, introducing a Text-Guided Hierarchical Mixture of Experts framework that enables task-specific feature extraction. The approach demonstrates improved performance on medical report generation and visual question answering tasks while reusing pre-trained parameters from 2D models.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

Can Computational Reducibility Lead to Transferable Models for Graph Combinatorial Optimization?

Researchers developed a new neural solver model using GCON modules and energy-based loss functions that achieves state-of-the-art performance across multiple graph combinatorial optimization tasks. The study demonstrates effective transfer learning between related optimization problems through computational reducibility-informed pretraining strategies, representing progress toward foundational AI models for combinatorial optimization.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Researchers developed D2E (Desktop to Embodied AI), a framework that uses desktop gaming data to pretrain AI models for robotics tasks. Their 1B-parameter model achieved 96.6% success on manipulation tasks and 83.3% on navigation, matching the performance of models up to 7 times larger while using scalable desktop data instead of expensive physical robot training data.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation

Researchers propose a new IMPRINT framework for transfer learning that improves foundation model adaptation to new tasks without parameter optimization. The framework identifies three key components and introduces a clustering-based variant that outperforms existing methods by 4%.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 4

EvoSkill: Automated Skill Discovery for Multi-Agent Systems

Researchers have developed EvoSkill, an automated framework that enables AI agents to discover and refine domain-specific skills through iterative failure analysis. The system demonstrated significant performance improvements on specialized tasks, with accuracy gains of 7.3% on financial data analysis and 12.1% on search-augmented QA, while showing transferable capabilities across different domains.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 5

Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling

Researchers identified that fine-tuning non-robust pretrained AI models with robust objectives can lead to poor performance, termed 'suboptimal transfer.' They propose Epsilon-Scheduling, a novel training technique that adjusts perturbation strength during training to improve both task adaptation and adversarial robustness.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

GCT-MARL: Graph-Based Contrastive Transfer for Sample-Efficient Cooperative Multi-Agent Reinforcement Learning

Researchers introduce GCT-MARL, a transfer learning framework for multi-agent reinforcement learning that enables faster training across different environments by combining graph-based contrastive learning with adaptive alignment techniques. The method demonstrates significant convergence improvements over from-scratch training in both homogeneous and heterogeneous agent scenarios, while supporting continual learning across sequential tasks.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Offline Multi-agent Continual Cooperation via Skill Partition and Reuse

Researchers introduce COMAD, a framework that lets multi-agent reinforcement learning systems continually discover and reuse coordination skills from offline data without catastrophic forgetting. The approach uses skill partitioning and density-based reusability estimation so that agents can efficiently transfer knowledge across sequential tasks in open environments.
