y0news

#parameter-efficiency News & Analysis

48 articles tagged with #parameter-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Tiny Brains, Giant Impact: Uncovering the Keystone Neurons of LLM with Just a Few Prompts

Researchers have identified "keystone neurons" in large language models—a tiny subset of neurons that remain highly activated across diverse tasks and are critical for model performance. By fine-tuning only these neurons rather than updating all parameters, they achieved comparable or better task performance while preserving other capabilities, offering a more efficient approach to model adaptation.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Knowledge Graph-Driven Expert-Level Reasoning for Neuroscience

Researchers demonstrate that knowledge graphs extracted from a single neuroscience textbook can be converted into high-quality training data to fine-tune language models, enabling expert-level reasoning that outperforms larger LLMs while using far fewer parameters. This approach challenges the prevailing assumption that domain expertise requires massive, diverse datasets, showing instead that structured, curated knowledge can produce superior specialized AI systems.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Comparative Analysis of Liquid Neural Networks and LSTM for Sequential Pattern Recognition: Robustness, Efficiency, and Clinical Utility

Researchers benchmark Liquid Neural Networks (LNNs) against traditional LSTMs across four sequential data domains, finding that LNNs deliver superior parameter efficiency and robustness in handling sparse, temporal data—particularly valuable for clinical applications. The study demonstrates LNNs' continuous-time modeling approach outperforms discrete-step RNNs when data is missing or irregularly sampled, suggesting significant implications for real-world AI deployment in healthcare and edge computing.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

The Attacker in the Mirror: Breaking Self-Consistency in Safety via Anchored Bipolicy Self-Play

Researchers propose Anchored Bipolicy Self-Play, a new safety training method that addresses fundamental limitations in parameter-shared self-play red teaming by using distinct LoRA adapters for attacker and defender roles. The approach achieves 100x greater parameter efficiency and improved safety robustness across multiple language model scales without sacrificing reasoning ability.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

MC-RFM: Geometry-Aware Few-Shot Adaptation via Mixed-Curvature Riemannian Flow Matching

Researchers introduce MC-RFM, a novel framework for efficiently adapting frozen vision models to new tasks using mixed-curvature Riemannian geometry. The method represents adapted features on a product manifold combining hyperbolic and Euclidean spaces, outperforming existing parameter-efficient adaptation techniques across multiple benchmarks and backbone architectures.

AI · Bullish · Decrypt · May 11 · 7/10

Baidu's New AI Is Already Beating Top Models and Cost 94% Less to Build

Baidu's ERNIE 5.1 has reached the top of Chinese AI leaderboards while reportedly requiring 94% fewer computational resources to build than competing models. The result suggests that raw scale and spending are not prerequisites for state-of-the-art AI performance, which could reshape how organizations approach model development and deployment.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

Qwen3-VL-Seg: Unlocking Open-World Referring Segmentation with Vision-Language Grounding

Researchers introduce Qwen3-VL-Seg, an efficient vision-language model that converts bounding box predictions into pixel-level segmentation masks for open-world referring segmentation tasks. The framework adds minimal parameters (17M, 0.4% overhead) while achieving strong performance on language-intensive visual grounding across in-distribution and out-of-distribution benchmarks.
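As a rough sanity check on the figures quoted above, the implied size of the base model can be back-computed. This is a sketch that assumes the 0.4% overhead is measured relative to the base model's parameter count, which the summary does not state explicitly; the variable names are illustrative.

```python
# Figures as quoted in the summary. Reading "overhead" as
# added parameters / base parameters is an assumption.
added_params = 17e6
overhead_fraction = 0.004

# base = added / fraction
implied_base_params = added_params / overhead_fraction
print(f"implied base size: {implied_base_params / 1e9:.2f}B parameters")
```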

AI · Bullish · arXiv – CS AI · May 11 · 7/10

MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning

Researchers introduce MatryoshkaLoRA, a novel training framework that improves upon Low-Rank Adaptation (LoRA) for efficient large language model fine-tuning by learning hierarchical low-rank representations through a strategically placed diagonal scaling matrix. The method enables dynamic rank selection with minimal accuracy loss and introduces AURAC, a new evaluation metric for hierarchical adapters, addressing a key limitation in current parameter-efficient fine-tuning approaches.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

Leviathan: Decoupling Input and Output Representations in Language Models

Researchers introduce Leviathan, a Transformer architecture that decouples input embeddings from output projections using learned embedding vectorization (LEV), achieving 9% perplexity reduction at 1.2B parameters with minimal overhead. The approach concentrates improvements on rare tokens while requiring 2.1x fewer training tokens to match baseline performance.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

From History to State: Constant-Context Skill Learning for LLM Agents

Researchers propose constant-context skill learning, a framework enabling LLM agents to learn reusable task procedures as lightweight modules rather than storing long prompts in memory. The approach reduces token usage per inference by 2-7x while maintaining or improving performance across multiple benchmark environments, addressing the privacy-capability tradeoff in agent deployment.

🧠 Llama
AI · Bullish · arXiv – CS AI · May 4 · 7/10

Lightweight Domain Adaptation of a Large Language Model for Legal Assistance in the Indian Context

Researchers developed Legal Assist AI, a framework using an 8-billion-parameter Llama 3.1 model enhanced with Retrieval-Augmented Generation to provide legal assistance tailored to Indian law. The system achieved 60.08% on the All-India Bar Examination benchmark, outperforming OpenAI's 175-billion-parameter GPT-3.5 Turbo while being 22 times more parameter-efficient.

🧠 Llama
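The "22 times" figure in the Legal Assist AI summary above can be checked with simple arithmetic. The sketch below takes both parameter counts exactly as quoted in the summary, without independent verification, and treats "parameter-efficient" as a plain ratio of parameter counts, which is an assumption.

```python
# Parameter counts as quoted in the summary (not independently verified).
llama_params = 8e9
gpt35_params = 175e9

# Ratio of the larger model's size to the smaller model's size.
ratio = gpt35_params / llama_params
print(f"size ratio: {ratio:.3f}x (about {round(ratio)}x)")
```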
AI · Bullish · arXiv – CS AI · May 1 · 7/10

Post-Optimization Adaptive Rank Allocation for LoRA

Researchers introduce PARA, a post-optimization compression method for LoRA (Low-Rank Adaptation) that reduces parameter count by 75-90% while maintaining performance. The technique uses Singular Value Decomposition to allocate non-uniform ranks across model layers based on spectral importance, addressing inefficiencies in standard LoRA implementations.
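To illustrate what non-uniform rank allocation from a singular value spectrum can look like, here is a minimal sketch. It is not PARA's actual procedure: the energy-threshold rule, the function name `allocate_ranks`, and the example spectra are all assumptions chosen for illustration, and the singular values are taken as given rather than computed by an SVD.

```python
def allocate_ranks(singular_values_by_layer, energy_threshold=0.95):
    """For each layer, pick the smallest rank whose leading singular values
    retain at least `energy_threshold` of the total squared spectral energy.
    (Hypothetical criterion, used here only for illustration.)"""
    ranks = {}
    for layer, values in singular_values_by_layer.items():
        # Squared singular values, largest first.
        energies = sorted((s * s for s in values), reverse=True)
        total = sum(energies)
        kept = 0.0
        rank = 0
        for energy in energies:
            kept += energy
            rank += 1
            if kept >= energy_threshold * total:
                break
        ranks[layer] = rank
    return ranks


# Made-up spectra: one layer dominated by a single direction, one flat.
spectra = {
    "layer_a": [5.0, 1.0, 0.5, 0.1],
    "layer_b": [2.0, 2.0, 2.0, 2.0],
}
ranks = allocate_ranks(spectra)
print(ranks)
```

Under this rule a layer with a sharply decaying spectrum keeps a small rank, while a layer with a flat spectrum keeps most of its rank, which is the general intuition behind allocating ranks by spectral importance.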

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Efficient Adversarial Training via Criticality-Aware Fine-Tuning

Researchers introduce Criticality-Aware Adversarial Training (CAAT), a parameter-efficient method that identifies and fine-tunes only the most robustness-critical parameters in Vision Transformers, achieving 94.3% of standard adversarial training robustness while tuning just 6% of model parameters. This breakthrough addresses the computational bottleneck preventing large-scale adversarial training deployment.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing

Researchers introduce LightMoE, a new framework that compresses Mixture-of-Experts language models by replacing redundant expert modules with parameter-efficient alternatives. The method achieves 30-50% compression rates while maintaining or improving performance, addressing the substantial memory demands that limit MoE model deployment.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

Logos: An evolvable reasoning engine for rational molecular design

Researchers introduce Logos, a compact AI model that combines multi-step logical reasoning with chemical consistency for molecular design. The model achieves strong performance in structural accuracy and chemical validity while using fewer parameters than larger language models, and provides transparent reasoning that can be inspected by humans.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Spectral Surgery: Training-Free Refinement of LoRA via Gradient-Guided Singular Value Reweighting

Researchers have developed Spectral Surgery, a training-free method to improve LoRA (Low-Rank Adaptation) model performance by reweighting singular values based on gradient sensitivity. The technique achieves significant performance gains (up to +4.4 points on CommonsenseQA) by adjusting only about 1,000 scalar coefficients without requiring retraining.

🧠 Llama
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

Researchers introduce Draft-Conditioned Constrained Decoding (DCCD), a training-free method that improves structured output generation in large language models by up to 24 percentage points. The technique uses a two-step process that first generates an unconstrained draft, then applies constraints to ensure valid outputs like JSON and API calls.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 2

DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

DiaBlo introduces a new Parameter-Efficient Fine-Tuning (PEFT) method that updates only diagonal blocks of weight matrices in large language models, offering better performance than LoRA while maintaining similar memory efficiency. The approach eliminates the need for low-rank matrix products and provides theoretical guarantees for convergence, showing competitive results across various AI tasks including reasoning and code generation.
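The memory comparison with LoRA can be made concrete by counting trainable parameters. The sketch below assumes a weight matrix split into equally sized diagonal blocks and a standard LoRA update with two rank-r factors; the function names and the 4096-dimensional example are illustrative and not taken from the paper.

```python
def diagonal_block_params(d_out, d_in, num_blocks):
    """Trainable entries when only the diagonal blocks of a
    d_out x d_in matrix are updated (equal block sizes assumed)."""
    if d_out % num_blocks or d_in % num_blocks:
        raise ValueError("dimensions must be divisible by num_blocks")
    return num_blocks * (d_out // num_blocks) * (d_in // num_blocks)


def lora_params(d_out, d_in, rank):
    """Trainable entries in a LoRA update B @ A with
    B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


def diagonal_block_mask(d_out, d_in, num_blocks):
    """0/1 mask marking which entries lie inside a diagonal block."""
    rows, cols = d_out // num_blocks, d_in // num_blocks
    return [
        [1 if i // rows == j // cols else 0 for j in range(d_in)]
        for i in range(d_out)
    ]


# Hypothetical 4096 x 4096 projection: 64 diagonal blocks vs. LoRA rank 32.
block_count = diagonal_block_params(4096, 4096, 64)
lora_count = lora_params(4096, 4096, 32)
print(block_count, lora_count)

# Small mask to show the structure being updated.
mask = diagonal_block_mask(4, 4, 2)
```

With these example settings the two budgets coincide, which shows how a block-diagonal update can be sized to match a given LoRA rank without forming a low-rank matrix product.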

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

Knowledge Fusion of Large Language Models Via Modular SkillPacks

Researchers introduce GraftLLM, a new method for transferring knowledge between large language models using a "SkillPack" format that preserves capabilities while avoiding catastrophic forgetting. The approach enables efficient model fusion and continual learning for heterogeneous models through modular knowledge storage.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

Context Distillation as Latent Memory Management

Researchers propose a novel approach to context distillation that treats compressed contextual information as a latent memory management problem, using modular LoRA adapters with intelligent retrieval and self-gating mechanisms to improve efficiency and robustness in machine learning systems.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Researchers propose EKSFT, a novel fine-tuning method that selectively masks high-entropy and high-KL divergence tokens during supervised fine-tuning of large language models. The approach aims to preserve pre-trained model distributions while efficiently activating task-relevant capabilities in low-data regimes, demonstrating improved performance on mathematical reasoning benchmarks.

AI · Bullish · arXiv – CS AI · 2d ago · 6/10

DynSess: Dynamic Session-Level Evaluation and Optimization Framework for Role-Playing Agents

Researchers introduce DynSess, a framework that evaluates and optimizes role-playing agents at the session level rather than individual turns, enabling LLMs to maintain character consistency across extended conversations. The framework includes improved evaluation metrics, optimized training methods (DSPO and GSRPO), and demonstrates performance matching larger models with fewer parameters.

AI · Bullish · arXiv – CS AI · 3d ago · 6/10

Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

Researchers present a method for aggressively pruning expert modules from mixture-of-experts large language models to create specialized translation systems. The approach removes up to 90% of experts with minimal performance degradation, demonstrating that translation tasks require only a fraction of a full LLM's parameters, enabling substantial model compression.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Energy-Structured Low-Rank Adaptation for Continual Learning

Researchers propose E²-LoRA, a novel continual learning method that addresses task interference by concentrating knowledge into low-rank representations rather than spreading it across multiple basis vectors. The authors prove theoretically that preserving parameters along principal drift directions minimizes reconstruction error while freeing model capacity for future tasks.
