#model-training News & Analysis

152 articles tagged with #model-training. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Discern Truth from Falsehood: Reducing Over-Refusal via Contrastive Refinement

Researchers introduce DCR (Discernment via Contrastive Refinement), a new method to reduce over-refusal in safety-aligned large language models. The approach helps LLMs better distinguish between genuinely toxic and seemingly toxic prompts, maintaining safety while improving helpfulness without degrading general capabilities.

AI · Bearish · arXiv – CS AI · Mar 4 · 6/10 · 3

Off-Trajectory Reasoning: Can LLMs Collaborate on Reasoning Trajectory?

New research reveals that current large language models struggle with collaborative reasoning, showing that 'stronger' models are often more fragile when distracted by misleading information. The study of 15 LLMs found they fail to effectively leverage guidance from other models, with success rates below 9.2% on challenging problems.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Researchers introduce Skywork-Reward-V2, a suite of AI reward models trained on SynPref-40M, a massive 40-million preference pair dataset created through human-AI collaboration. The models achieve state-of-the-art performance across seven major benchmarks by combining human annotation quality with AI scalability for better preference learning.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

Param$\Delta$ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Researchers introduce Param∆, a method for transferring post-training capabilities to updated language models without additional training cost. The technique achieves 95% of the performance of traditional post-training by computing weight differences between base and post-trained models, offering significant cost savings for AI model development.
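
A minimal sketch of the weight-difference idea in this summary, assuming the difference between a post-trained model and its own base is simply added to an updated base with identical parameter shapes. The function name `param_delta_merge` and the toy weight dictionaries are hypothetical; real checkpoints would hold tensors, and the paper's exact procedure may differ.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def param_delta_merge(base_old: Weights, post_old: Weights, base_new: Weights) -> Weights:
    """Add the post-training delta (post_old - base_old) onto base_new.

    Assumption: all three checkpoints share parameter names and shapes.
    """
    merged: Weights = {}
    for name, new_vals in base_new.items():
        old_base = base_old[name]
        old_post = post_old[name]
        if not (len(new_vals) == len(old_base) == len(old_post)):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [n + (p - b) for n, p, b in zip(new_vals, old_post, old_base)]
    return merged


# Hypothetical toy values, chosen so the arithmetic is exact in floating point.
base_old = {"w": [1.0, 2.0], "b": [0.0]}
post_old = {"w": [1.5, 1.0], "b": [0.25]}
base_new = {"w": [2.0, 2.0], "b": [1.0]}

merged = param_delta_merge(base_old, post_old, base_new)
print(merged)
```

No gradient step is involved: the merge is a single pass of element-wise arithmetic over the checkpoints, which is where the "zero cost" framing comes from.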

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4

Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning

Researchers introduce Self-Harmony, a new test-time reinforcement learning framework that improves AI model accuracy by having models solve problems and rephrase questions simultaneously. The method uses harmonic mean aggregation instead of majority voting to select stable answers, achieving state-of-the-art results across 28 of 30 reasoning benchmarks without requiring human supervision.
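
The contrast between harmonic-mean aggregation and majority voting can be illustrated with a small sketch. It assumes each candidate answer is scored by the harmonic mean of its frequency among samples for the original question and its frequency among samples for the rephrased question; the function name `harmonic_select` and the sample data are hypothetical, and the paper's actual scoring may differ in detail.

```python
from collections import Counter
from typing import List, Optional


def harmonic_select(original: List[str], rephrased: List[str]) -> Optional[str]:
    """Pick the answer whose frequencies in both views have the highest harmonic mean."""
    if not original or not rephrased:
        return None
    freq_orig = Counter(original)
    freq_reph = Counter(rephrased)
    best, best_score = None, 0.0
    # Only answers seen in both views can have a non-zero harmonic mean.
    for answer in sorted(freq_orig.keys() & freq_reph.keys()):
        p = freq_orig[answer] / len(original)
        q = freq_reph[answer] / len(rephrased)
        score = 2 * p * q / (p + q)
        if score > best_score:
            best, best_score = answer, score
    return best


# Hypothetical samples: "12" dominates the original view but is unstable under rephrasing.
original = ["12", "12", "12", "7", "7"]
rephrased = ["7", "7", "7", "7", "12"]

majority_on_original = Counter(original).most_common(1)[0][0]
print(majority_on_original)                  # majority vote on the original view
print(harmonic_select(original, rephrased))  # answer that is stable across both views
```

Because the harmonic mean is dominated by the smaller of the two frequencies, an answer that is popular in only one view scores poorly, which is the sense in which the selection favors stable answers.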

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3

SageBwd: A Trainable Low-bit Attention

Researchers have developed SageBwd, a trainable INT8 attention mechanism that can match full-precision attention performance during pre-training while quantizing six of seven attention matrix multiplications. The study identifies key factors for stable training including QK-norm requirements and the impact of tokens per step on quantization errors.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 5

Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling

Researchers identified that fine-tuning non-robust pretrained AI models with robust objectives can lead to poor performance, termed 'suboptimal transfer.' They propose Epsilon-Scheduling, a novel training technique that adjusts perturbation strength during training to improve both task adaptation and adversarial robustness.

AI · Bearish · arXiv – CS AI · Mar 3 · 7/10 · 3

On The Fragility of Benchmark Contamination Detection in Reasoning Models

New research reveals that benchmark contamination in large reasoning models (LRMs) is extremely difficult to detect, allowing developers to easily inflate performance scores on public leaderboards. The study shows that reinforcement learning methods like GRPO and PPO can effectively conceal contamination signals, undermining the integrity of AI model evaluations.


AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 3

What Scales in Cross-Entropy Scaling Law?

Researchers discovered that the traditional cross-entropy scaling law for large language models breaks down at very large scales because only one component (error-entropy) actually follows power-law scaling, while other components remain constant. This finding explains why model performance improvements become less predictable as models grow larger and establishes a new error-entropy scaling law for better understanding LLM development.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

Know What You Know: Metacognitive Entropy Calibration for Verifiable RL Reasoning

Researchers propose EGPO, a new framework that improves large reasoning models by incorporating uncertainty awareness into reinforcement learning training. The approach addresses the "uncertainty-reward mismatch" where current training methods treat high and low-confidence solutions equally, preventing models from developing better reasoning capabilities.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 8

FlashOptim: Optimizers for Memory Efficient Training

FlashOptim introduces memory optimization techniques that reduce AI training memory requirements by over 50% per parameter while maintaining model quality. The suite reduces AdamW memory usage from 16 bytes to 7 bytes per parameter through improved master weight splitting and 8-bit optimizer state quantization.
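
The "over 50%" figure follows directly from the two byte counts quoted in the summary. As a quick check, using only those numbers:

```latex
\[
  1 - \frac{7\ \text{bytes}}{16\ \text{bytes}} = \frac{9}{16} = 56.25\%
\]
```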

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

veScale-FSDP: Flexible and High-Performance FSDP at Scale

Researchers introduce veScale-FSDP, a redesigned Fully Sharded Data Parallel system that overcomes limitations of current FSDP implementations used for training large-scale AI models. The new system features flexible RaggedShard format and structure-aware planning, achieving 5-66% higher throughput and 16-30% lower memory usage while supporting advanced training methods and scaling to tens of thousands of GPUs.

AI · Neutral · OpenAI News · Jun 18 · 7/10 · 6

Toward understanding and preventing misalignment generalization

Researchers have identified how training language models on incorrect responses can lead to broader misalignment issues. They discovered an internal feature responsible for this behavior that can be corrected through minimal fine-tuning.

AI · Bullish · Synced Review · May 15 · 7/10 · 9

DeepSeek-V3 New Paper is coming! Unveiling the Secrets of Low-Cost Large Model Training through Hardware-Aware Co-design

DeepSeek has released a 14-page technical paper on their V3 model, focusing on scaling challenges and hardware-aware co-design for low-cost large model training. The paper, co-authored by DeepSeek CEO Wenfeng Liang, reveals insights into cost-effective AI architecture development.

AI · Bullish · OpenAI News · Aug 20 · 7/10 · 6

Fine-tuning now available for GPT-4o

OpenAI has announced that fine-tuning capabilities are now available for GPT-4o, allowing users to create custom versions of the model. This feature enables developers to improve performance and accuracy for specific applications by training the model on their particular use cases.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Improved Large Language Diffusion Models

Researchers introduce iLLaDA, an 8B masked diffusion language model trained with fully bidirectional attention instead of the standard autoregressive approach. The model demonstrates significant performance improvements over its predecessor LLaDA and remains competitive with comparably sized autoregressive models like Qwen2.5 7B, suggesting bidirectional diffusion training is a viable alternative path for building competitive language models.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

SFT Overtraining Predicts Rank Inversion via Entropy Collapse Under RLVR

Researchers demonstrate that over-training SFT (supervised fine-tuning) models can paradoxically degrade performance under RLVR (reinforcement learning with verifiable rewards) by compressing the rollout distribution's entropy, causing rank inversion where higher pre-RL pass rates correlate with worse post-RL outcomes. Testing on Qwen2.5-Coder and DeepSeek-Coder reveals this failure mode occurs when entropy collapse prevents effective group-relative reward signals, suggesting a fundamental optimization challenge in LLM post-training pipelines.

AI · Bullish · arXiv – CS AI · Jun 19 · 6/10

Which Pairs to Compare for LLM Post-Training?

Researchers present a theoretical framework for optimizing which comparison pairs to label during large language model preference-based post-training, showing that strategic pair selection can significantly improve sample efficiency. By formulating the problem as a sampling-design challenge with bounds on policy performance, the work provides practical guidance for allocating limited labeling budgets when training models like those using Direct Preference Optimization.

AI · Neutral · arXiv – CS AI · Jun 19 · 5/10

PrefSQA: Pairwise Preference Prediction for Speech Quality Assessment and the Critical Role of High Quality Datasets

Researchers introduce PrefSQA, a machine learning method that predicts speech quality through pairwise preference comparisons rather than traditional mean opinion scores (MOS). The approach incorporates uncertainty-aware logits and attention mechanisms, demonstrating that preference-based labeling produces cleaner, more reliable datasets than scalar MOS ratings, though improvements vary significantly based on dataset quality.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

Autoregressive Direct Preference Optimization

Researchers propose Autoregressive Direct Preference Optimization (ADPO), a refined theoretical framework for aligning large language models with human preferences. The innovation explicitly incorporates autoregressive assumptions before applying the Bradley-Terry model, resulting in a mathematically elegant loss function and introducing two distinct length measures—token length and feedback length—for optimizing LLM preference alignment.

AI · Bullish · arXiv – CS AI · Jun 11 · 6/10

Noise-Aware Framework for Correcting Corrupted Labels

Researchers introduce CANOLA, a framework that corrects corrupted labels in datasets by estimating noise distributions and iteratively refining labels through noise-aware deep learning. The approach achieves 19-52% error reduction compared to existing methods and enables simpler models trained on corrected data to outperform complex alternatives by up to 67%.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

The Role of Feedback Alignment in Self-Distillation

Researchers demonstrate that self-distillation in language models improves significantly when feedback is structurally aligned with the model's reasoning trace rather than using binary rewards or reference solutions. Step-aligned critique, which targets only tokens where reasoning fails, outperforms alternative approaches by 5-16 points, suggesting that feedback design fundamentally shapes model learning efficiency.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Beyond Absolute Imitation: Anchored Residual Guidance for Privileged On-Policy Distillation

Researchers introduce Anchored Residual On-Policy Distillation (AR-OPD), a new framework for training smaller language models that improves upon existing privileged distillation methods by separating locally reachable reasoning from oracle guidance. The approach achieves 2.3-point gains over full privileged distillation and 7.9-point gains over standard supervised fine-tuning, with significant improvements on long-horizon reasoning tasks.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Active Learning with Foundation Model Priors: Efficient Learning under Class Imbalance

Researchers propose an active learning framework that combines foundation model priors with smaller models to address class imbalance and label noise in real-world datasets. The method achieves over 50% annotation savings compared to existing active learning baselines while maintaining model performance across image and text domains.

AI · Neutral · arXiv – CS AI · Jun 9 · 5/10

SEF-CLGC at SemEval-2026 Task 11: Logical Notation Impact on Language Model Performance

Researchers present SEF-CLGC, a framework combining formal logical notations with Small Language Models (SLMs) to evaluate reasoning capabilities in SemEval-2026 Task 11. The study demonstrates that training SLMs on hybrid natural and symbolic languages achieves a 27.80% content score while reducing reasoning bias, offering insights into how formal notation impacts language model performance.
