y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#supervised-learning News & Analysis

9 articles tagged with #supervised-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

9 articles
AIBullisharXiv – CS AI · Mar 56/10
🧠

R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning

Researchers developed R1-Code-Interpreter, a large language model that uses multi-stage reinforcement learning to autonomously generate code for step-by-step reasoning across diverse tasks. The 14B parameter model achieves 72.4% accuracy on test tasks, outperforming GPT-4o variants and demonstrating emergent self-checking capabilities through code generation.

🏢 Hugging Face🧠 GPT-4
AIBullisharXiv – CS AI · Mar 57/10
🧠

Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

Researchers propose Supervised Calibration (SC), a new framework to improve In-Context Learning performance in Large Language Models by addressing systematic biases through optimal affine transformations in logit space. The method achieves state-of-the-art results across multiple LLMs including Mistral-7B, Llama-2-7B, and Qwen2-7B in few-shot learning scenarios.

🧠 Llama
AIBearisharXiv – CS AI · Mar 47/102
🧠

Silent Sabotage During Fine-Tuning: Few-Shot Rationale Poisoning of Compact Medical LLMs

Researchers discovered a new stealth poisoning attack method targeting medical AI language models during fine-tuning that degrades performance on specific medical topics without detection. The attack injects poisoned rationales into training data, proving more effective than traditional backdoor attacks or catastrophic forgetting methods.

AINeutralarXiv – CS AI · 3d ago6/10
🧠

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Researchers propose EKSFT, a novel fine-tuning method that selectively masks high-entropy and high-KL divergence tokens during supervised fine-tuning of large language models. The approach aims to preserve pre-trained model distributions while efficiently activating task-relevant capabilities in low-data regimes, demonstrating improved performance on mathematical reasoning benchmarks.

AINeutralarXiv – CS AI · 3d ago6/10
🧠

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

Researchers demonstrate that reinforcement learning (RL) preserves internal computational circuits in large language models better than supervised fine-tuning (SFT) during task adaptation. Using a new metric called differential circuit vulnerability on Qwen2.5-3B-Instruct, they reveal a mechanistic trade-off: SFT adapts faster but causes substantial circuit disruption and capability forgetting, while RL maintains base model circuits at the cost of slower learning.

AINeutralarXiv – CS AI · 3d ago6/10
🧠

Test Time Training for Supervised Causal Learning

Researchers propose Test-Time Training for Supervised Causal Learning (TTT-SCL), a framework addressing critical limitations in causal discovery by generating test-specific training sets. The approach significantly improves performance gaps between synthetic benchmarks and real-world applications while enhancing robustness to distribution shifts.

AINeutralarXiv – CS AI · 4d ago5/10
🧠

Supervised Distributional Reduction via Optimal Transport and Dependence Maximization

Researchers propose Supervised Distributional Reduction (SDR), a machine learning algorithm combining optimal transport theory with dependence maximization to create compact data representations that preserve both geometric structure and predictive information. The method extends the Fused Gromov-Wasserstein framework and offers applications in representation learning and adaptive kernel design for Gaussian Process modeling.

AINeutralarXiv – CS AI · May 46/10
🧠

MemRouter: Memory-as-Embedding Routing for Long-Term Conversational Agents

MemRouter is a new memory management system for conversational AI agents that uses lightweight embedding-based routing instead of expensive LLM generation to decide which conversation turns to store. The approach achieves 52.0 F1 score versus 45.6 for LLM-based alternatives while reducing latency from 970ms to 58ms, suggesting memory admission can be effectively learned through supervised classification rather than generative models.

AIBullisharXiv – CS AI · Mar 24/106
🧠

Asymptotically Stable Quaternion-valued Hopfield-structured Neural Network with Periodic Projection-based Supervised Learning Rules

Researchers propose a quaternion-valued supervised learning Hopfield neural network (QSHNN) that leverages quaternions' geometric advantages for representing rotations and postures. The model introduces periodic projection-based learning rules to maintain quaternionic consistency while achieving high accuracy and fast convergence, with potential applications in robotics and control systems.