y0news

#fine-tuning News & Analysis

131 articles tagged with #fine-tuning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare

Researchers have introduced MedAidDialog, a multilingual medical dialogue dataset covering seven languages, and developed MedAidLM, a conversational AI model for preliminary medical consultations. The system uses parameter-efficient fine-tuning on small language models to enable deployment without high-end computational infrastructure while incorporating patient context for personalized consultations.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

FedTreeLoRA: Reconciling Statistical and Functional Heterogeneity in Federated LoRA Fine-Tuning

Researchers propose FedTreeLoRA, a new framework for privacy-preserving fine-tuning of large language models that addresses both statistical and functional heterogeneity across federated learning clients. The method uses tree-structured aggregation to allow layer-wise specialization while maintaining shared consensus on foundational layers, significantly outperforming existing personalized federated learning approaches.
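The tree-structured aggregation itself isn't spelled out here, but the underlying federated-LoRA idea, averaging adapters only on shared foundational layers while clients keep the rest local, can be sketched as follows (layer names and shapes are invented for illustration):

```python
import numpy as np

def aggregate_shared(client_adapters, shared_layers):
    """FedAvg restricted to the shared foundational layers; every other
    adapter layer stays on its client for specialization."""
    return {
        layer: np.mean([c[layer] for c in client_adapters], axis=0)
        for layer in shared_layers
    }

clients = [
    {"layer0.lora_A": np.ones((2, 4)), "layer5.lora_A": np.zeros((2, 4))},
    {"layer0.lora_A": 3 * np.ones((2, 4)), "layer5.lora_A": np.ones((2, 4))},
]
consensus = aggregate_shared(clients, shared_layers=["layer0.lora_A"])
# consensus holds only layer0; the mean of the 1s and 3s is all 2s
```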

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

IGU-LoRA: Adaptive Rank Allocation via Integrated Gradients and Uncertainty-Aware Scoring

Researchers introduce IGU-LoRA, a new parameter-efficient fine-tuning method for large language models that adaptively allocates ranks across layers using integrated gradients and uncertainty-aware scoring. The approach addresses limitations of existing methods like AdaLoRA by providing more stable and accurate layer importance estimates, consistently outperforming baselines across diverse tasks.
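The paper's integrated-gradients and uncertainty scoring are not reproduced here, but the final step, turning per-layer importance estimates into a LoRA rank budget, might look like this rough sketch (all names and numbers are hypothetical):

```python
def allocate_ranks(importance, total_rank_budget, min_rank=2):
    """Distribute a total LoRA rank budget across layers in proportion
    to precomputed per-layer importance scores."""
    total = sum(importance.values())
    ranks = {}
    for layer, score in importance.items():
        # proportional share, floored at min_rank so no layer is starved
        ranks[layer] = max(min_rank, round(total_rank_budget * score / total))
    return ranks

scores = {"layers.0.attn": 0.1, "layers.0.mlp": 0.4, "layers.1.attn": 0.5}
ranks = allocate_ranks(scores, total_rank_budget=32)
print(ranks)  # higher-importance layers receive larger LoRA ranks
```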

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Diffusion Reinforcement Learning via Centered Reward Distillation

Researchers present Centered Reward Distillation (CRD), a new reinforcement learning framework for fine-tuning diffusion models that addresses brittleness issues in existing methods. The approach uses within-prompt centering and drift control techniques to achieve state-of-the-art performance in text-to-image generation while reducing reward hacking and convergence issues.

AI · Bullish · arXiv – CS AI · Mar 12 · 6/10

When Fine-Tuning Fails and When It Generalises: Role of Data Diversity and Mixed Training in LLM-based TTS

Research demonstrates that LoRA fine-tuning of large language models significantly improves text-to-speech systems, achieving up to 0.42 DNS-MOS gains and 34% SNR improvements when training data has sufficient acoustic diversity. The study establishes LoRA as an effective mechanism for speaker adaptation in compact LLM-based TTS systems, outperforming frozen base models across perceptual quality, speaker fidelity, and signal quality metrics.
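For readers unfamiliar with the mechanism, a minimal LoRA sketch (toy dimensions, not the paper's TTS setup) shows why adaptation starts exactly from the frozen base model:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 8, 8, 2
W = rng.normal(size=(d, k))          # frozen base weight
A = rng.normal(size=(r, k)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-init

def forward(x):
    # adapted layer: frozen W plus the low-rank delta B @ A
    return x @ (W + B @ A).T

x = rng.normal(size=(1, k))
# with B at zero, the adapted layer equals the frozen base layer
```

Only A and B are trained, so the trainable parameter count is (d + k) * r instead of d * k.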

AI · Bullish · arXiv – CS AI · Mar 11 · 6/10

MSSR: Memory-Aware Adaptive Replay for Continual LLM Fine-Tuning

Researchers propose MSSR (Memory-Inspired Sampler and Scheduler Replay), a new framework for continual fine-tuning of large language models that mitigates catastrophic forgetting while maintaining adaptability. The method estimates sample-level memory strength and schedules rehearsal at adaptive intervals, showing superior performance across three backbone models and 11 sequential tasks compared to existing replay-based strategies.
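MSSR's memory-strength estimation is model-based; as a stand-in, a spaced-repetition-style rule illustrates how adaptive rehearsal intervals could work (the mapping below is invented for illustration):

```python
def rehearsal_interval(memory_strength, max_interval=16):
    """Map an estimated memory strength in [0, 1] to a replay interval:
    nearly forgotten samples are rehearsed soon, strong ones can wait."""
    s = min(max(memory_strength, 0.0), 1.0)
    return 1 + int(s * (max_interval - 1))

print(rehearsal_interval(0.05))  # weak memory: replay almost immediately
print(rehearsal_interval(0.95))  # strong memory: defer replay
```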

AI · Bullish · arXiv – CS AI · Mar 11 · 6/10

Cognitively Layered Data Synthesis for Domain Adaptation of LLMs to Space Situational Awareness

Researchers developed BD-FDG, a framework for adapting large language models to complex engineering domains like space situational awareness. The method creates high-quality training datasets using structured knowledge organization and cognitive layering, resulting in SSA-LLM-8B that shows 144-176% BLEU-1 improvements while maintaining general performance.

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

Addressing the Ecological Fallacy in Larger LMs with Human Context

Researchers developed a method called HuLM (Human-aware Language Modeling) that improves large language model performance by considering the context of text written by the same author over time. Testing on an 8B Llama model showed that incorporating author context during fine-tuning significantly improves performance across eight downstream tasks.

🧠 Llama
AI · Bullish · arXiv – CS AI · Mar 5 · 5/10

Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

Researchers developed a hybrid AI architecture for agricultural advisory that separates factual retrieval from conversational delivery, using supervised fine-tuning on expert-curated agricultural knowledge. The system showed improved accuracy and safety for smallholder farmers while achieving comparable results to frontier models at lower cost.

AI · Bullish · arXiv – CS AI · Mar 4 · 5/10 · 3

Quantum-Inspired Fine-Tuning for Few-Shot AIGC Detection via Phase-Structured Reparameterization

Researchers propose Q-LoRA, a quantum-enhanced fine-tuning method that integrates quantum neural networks into LoRA adapters for improved AI-generated content detection. The study also introduces H-LoRA, a classical variant using Hilbert transforms that achieves similar 5%+ accuracy improvements over standard LoRA at lower computational cost.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4

Prompt and Parameter Co-Optimization for Large Language Models

Researchers introduce MetaTuner, a new framework that combines prompt optimization with fine-tuning for Large Language Models, using shared neural networks to discover optimal combinations of prompts and parameters. The approach addresses the discrete-continuous optimization challenge through supervised regularization and demonstrates consistent performance improvements across benchmarks.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3

Training Large Language Models To Reason In Parallel With Global Forking Tokens

Researchers developed Set Supervised Fine-Tuning (SSFT) and Global Forking Policy Optimization (GFPO) methods to improve large language model reasoning by enabling parallel processing through 'global forking tokens.' The techniques preserve diverse reasoning modes and demonstrate superior performance on math and code generation benchmarks compared to traditional fine-tuning approaches.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4

Regularization Through Reasoning: Systematic Improvements in Language Model Classification via Explanation-Enhanced Fine-Tuning

Researchers found that fine-tuning large language models with explanations attached to labels significantly improves classification accuracy compared to label-only training. Surprisingly, even random token sequences that mimic explanation structure provide similar benefits, suggesting the improvement comes from increased token budget and regularization rather than semantic meaning.
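A rough sketch of the data-construction idea (the field layout and the "because" template are our own, not the paper's):

```python
import random

def make_target(label, explanation=None, filler_len=0, vocab=None, seed=0):
    """Completion for fine-tuning: label alone, label plus explanation,
    or label padded with random tokens of comparable length."""
    if explanation is not None:
        return f"{label} because {explanation}"
    if filler_len and vocab:
        rng = random.Random(seed)
        filler = " ".join(rng.choice(vocab) for _ in range(filler_len))
        return f"{label} {filler}"
    return label

print(make_target("positive"))
print(make_target("positive", explanation="the review praises the plot"))
print(make_target("positive", filler_len=4, vocab=["alpha", "beta", "gamma"]))
```

The paper's finding is that the second and third variants help similarly, pointing at token budget and regularization rather than semantics.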

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7

When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation

Researchers fine-tuned the Llama 2 7B model using real patient-doctor interaction transcripts to improve medical query responses, but found significant discrepancies between automatic similarity metrics and GPT-4 evaluations. The study highlights the challenges in evaluating AI medical models and recommends human medical expert review for proper validation.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 6

Token-level Data Selection for Safe LLM Fine-tuning

Researchers have developed TOSS, a new framework for safely fine-tuning large language models that operates at the token level rather than sample level. The method identifies and removes unsafe tokens while preserving task-specific information, demonstrating superior performance compared to existing sample-level defense methods in maintaining both safety and utility.
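The scoring model is the hard part and is not shown here; given per-token unsafety scores, the filtering step itself reduces to a loss mask (scores and threshold below are hypothetical):

```python
def token_loss_weights(unsafe_scores, threshold=0.5):
    """1.0 keeps a token in the training loss; 0.0 removes it.
    Unlike sample-level defenses, the rest of the sequence survives."""
    return [0.0 if s > threshold else 1.0 for s in unsafe_scores]

scores = [0.1, 0.9, 0.2, 0.7]      # per-token unsafety estimates
print(token_loss_weights(scores))  # [1.0, 0.0, 1.0, 0.0]
```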

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 8

SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond

Researchers introduce SafeSci, a comprehensive framework for evaluating safety in large language models used for scientific applications. The framework includes a 0.25M sample benchmark and 1.5M sample training dataset, revealing critical vulnerabilities in 24 advanced LLMs while demonstrating that fine-tuning can significantly improve safety alignment.

AI · Bullish · arXiv – CS AI · Mar 3 · 5/10 · 4

EstLLM: Enhancing Estonian Capabilities in Multilingual LLMs via Continued Pretraining and Post-Training

Researchers developed EstLLM, enhancing Estonian language capabilities in multilingual LLMs through continued pretraining of Llama 3.1 8B with balanced data mixtures. The approach improved Estonian linguistic performance while maintaining English capabilities, demonstrating that targeted continued pretraining can substantially improve single-language performance in multilingual models.

AI · Neutral · arXiv – CS AI · Mar 2 · 6/10 · 13

DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science

Researchers introduce DARE-bench, a new benchmark with 6,300 Kaggle-derived tasks for evaluating Large Language Models' performance on data science and machine learning tasks. The benchmark reveals that even advanced models like GPT-4-mini struggle with ML modeling tasks, while fine-tuning on DARE-bench data can improve model accuracy by up to 8x.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 12

Task-Centric Acceleration of Small-Language Models

Researchers propose TASC (Task-Adaptive Sequence Compression), a framework for accelerating small language models through two methods: TASC-ft for fine-tuning with expanded vocabularies and TASC-spec for training-free speculative decoding. The methods demonstrate improved inference efficiency while maintaining task performance across low output-variability generation tasks.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 18

Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation

Researchers introduce LoRA-Pre, a memory-efficient optimizer that reduces memory overhead in training large language models by using low-rank approximation of momentum states. The method achieves superior performance on Llama models from 60M to 1B parameters while using only 1/8 the rank of baseline methods.
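The optimizer details belong to the paper; the memory argument alone can be illustrated with a truncated-SVD compression of a momentum matrix (toy shapes, not the method itself):

```python
import numpy as np

def compress_momentum(m, rank):
    """Keep only a rank-r factorization of the momentum matrix, storing
    (d1 + d2) * r numbers instead of d1 * d2."""
    U, S, Vt = np.linalg.svd(m, full_matrices=False)
    return U[:, :rank] * S[:rank], Vt[:rank]

def reconstruct(U_r, V_r):
    return U_r @ V_r

m = np.outer(np.arange(1.0, 5.0), np.arange(1.0, 4.0))  # rank-1 4x3 "momentum"
U_r, V_r = compress_momentum(m, rank=1)
# a rank-1 matrix is recovered exactly from its rank-1 factors
```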

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 15

FineScope: SAE-guided Data Selection Enables Domain-Specific LLM Pruning and Fine-Tuning

Researchers introduce FineScope, a framework that uses Sparse Autoencoder (SAE) techniques to create smaller, domain-specific language models from larger pretrained LLMs through structured pruning and self-data distillation. The method achieves competitive performance while significantly reducing computational requirements compared to training from scratch.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 19

Thompson Sampling via Fine-Tuning of LLMs

Researchers developed ToSFiT (Thompson Sampling via Fine-Tuning), a new Bayesian optimization method that uses fine-tuned large language models to improve search efficiency in complex discrete spaces. The approach eliminates computational bottlenecks by directly parameterizing reward probabilities and demonstrates superior performance across diverse applications including protein search and quantum circuit design.
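As a point of reference, classic Thompson sampling with a Beta posterior per candidate looks like the toy below; ToSFiT's twist is replacing the hand-built posterior with reward probabilities parameterized by a fine-tuned LLM:

```python
import random

class BetaArm:
    """Beta posterior over a candidate's success probability."""
    def __init__(self):
        self.a, self.b = 1.0, 1.0        # uniform prior

    def sample(self):
        return random.betavariate(self.a, self.b)

    def update(self, reward):            # reward in {0, 1}
        self.a += reward
        self.b += 1 - reward

random.seed(0)
arms = {"cand_a": BetaArm(), "cand_b": BetaArm()}
true_p = {"cand_a": 0.2, "cand_b": 0.8}  # unknown to the sampler
for _ in range(500):
    pick = max(arms, key=lambda c: arms[c].sample())
    arms[pick].update(1 if random.random() < true_p[pick] else 0)
# sampling concentrates pulls on the better candidate over time
```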

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 18

LIA: Supervised Fine-Tuning of Large Language Models for Automatic Issue Assignment

Researchers developed LIA, a supervised fine-tuning approach using DeepSeek-R1-Distill-Llama-8B to automatically assign software issues to developers. The system achieved up to 187.8% improvement over the base model and 211.2% better performance than existing methods in developer recommendation accuracy.

AI · Neutral · arXiv – CS AI · Feb 27 · 5/10 · 7

Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction

Researchers introduced Conditioned Comment Prediction (CCP) to evaluate how well Large Language Models can simulate social media user behavior by predicting user comments. The study found that supervised fine-tuning improves text structure but degrades semantic accuracy, and that behavioral histories are more effective than descriptive personas for user simulation.