y0news

#supervised-finetuning News & Analysis

3 articles tagged with #supervised-finetuning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Apr 10 · 7/10

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Researchers challenge the conventional wisdom that supervised finetuning (SFT) merely memorizes while reinforcement learning generalizes. Their analysis reveals that reasoning SFT with chain-of-thought supervision can generalize across domains, but success depends critically on optimization duration, data quality, and base model strength, with generalization improvements coming at the cost of degraded safety performance.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

PassiveQA: A Three-Action Framework for Epistemically Calibrated Question Answering via Supervised Finetuning

Researchers propose PassiveQA, a new AI framework that teaches language models to recognize when they don't have enough information to answer questions, choosing to ask for clarification or abstain rather than hallucinate responses. The three-action system (Answer, Ask, Abstain) uses supervised fine-tuning to align model behavior with information sufficiency, showing significant improvements in reducing hallucinations.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Researchers introduce Agent Data Protocol (ADP), a standardized format that unifies AI agent training datasets drawn from different formats and tools. The protocol enabled training on 13 unified datasets, achieving ~20% performance gains over base models and state-of-the-art results on coding, browsing, and tool use benchmarks.