#parameter-efficiency News & Analysis

71 articles tagged with #parameter-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 25 · 7/10

Rational Neural Networks have Expressivity Advantages

Researchers demonstrate that neural networks using trainable rational activation functions achieve exponentially better parameter efficiency and expressivity compared to standard activations like ReLU, Sigmoid, and Tanh. The findings show rational activations require only polylogarithmic overhead to approximate fixed-activation networks, while the reverse requires logarithmic parameters—a theoretical advantage that translates to practical performance gains.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

Tapered Language Models

Researchers propose Tapered Language Models (TLMs), an architectural principle that allocates more parameters to earlier layers and fewer to later layers, contrary to the uniform allocation standard since the original transformer. Experiments across multiple model scales and architectures show this depth-aware capacity distribution improves perplexity and benchmark performance at no additional computational cost.

AI · Bullish · arXiv – CS AI · Jun 11 · 7/10

SirenFNO: Efficient and Full Frequency Learning of Fourier Neural Operators

Researchers introduce SirenFNO, a neural network framework that improves Fourier Neural Operators by eliminating frequency truncation limitations and enabling full-spectrum learning. The approach achieves 4-15x parameter reduction while maintaining discretization invariance, with functional decomposition variants reaching up to 73x fewer parameters across multiple PDE benchmarks.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

Researchers demonstrate that smaller language models (270M-8B parameters) can match or nearly match the performance of larger models for merchant information extraction in financial transactions through strategic fine-tuning techniques. The study identifies Qwen 3.5 4B as achieving 96.60% F1 score with half the parameters of the baseline LLaMA 3.1-8B model, offering significant cost and latency improvements for production deployment.

AI · Bullish · arXiv – CS AI · May 29 · 7/10

Tiny Brains, Giant Impact: Uncovering the Keystone Neurons of LLM with Just a Few Prompts

Researchers have identified "keystone neurons" in large language models—a tiny subset of neurons that remain highly activated across diverse tasks and are critical for model performance. By fine-tuning only these neurons rather than updating all parameters, they achieved comparable or better task performance while preserving other capabilities, offering a more efficient approach to model adaptation.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Comparative Analysis of Liquid Neural Networks and LSTM for Sequential Pattern Recognition: Robustness, Efficiency, and Clinical Utility

Researchers benchmark Liquid Neural Networks (LNNs) against traditional LSTMs across four sequential data domains, finding that LNNs deliver superior parameter efficiency and robustness in handling sparse, temporal data—particularly valuable for clinical applications. The study demonstrates LNNs' continuous-time modeling approach outperforms discrete-step RNNs when data is missing or irregularly sampled, suggesting significant implications for real-world AI deployment in healthcare and edge computing.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Knowledge Graph-Driven Expert-Level Reasoning for Neuroscience

Researchers demonstrate that knowledge graphs extracted from a single neuroscience textbook can be converted into high-quality training data to fine-tune language models, enabling expert-level reasoning that outperforms larger LLMs while using far fewer parameters. This approach challenges the prevailing assumption that domain expertise requires massive, diverse datasets, showing instead that structured, curated knowledge can produce superior specialized AI systems.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

The Attacker in the Mirror: Breaking Self-Consistency in Safety via Anchored Bipolicy Self-Play

Researchers propose Anchored Bipolicy Self-Play, a new safety training method that addresses fundamental limitations in parameter-shared self-play red teaming by using distinct LoRA adapters for attacker and defender roles. The approach achieves 100x greater parameter efficiency and improved safety robustness across multiple language model scales without sacrificing reasoning ability.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

MC-RFM: Geometry-Aware Few-Shot Adaptation via Mixed-Curvature Riemannian Flow Matching

Researchers introduce MC-RFM, a novel framework for efficiently adapting frozen vision models to new tasks using mixed-curvature Riemannian geometry. The method represents adapted features on a product manifold combining hyperbolic and Euclidean spaces, outperforming existing parameter-efficient adaptation techniques across multiple benchmarks and backbone architectures.

AI · Bullish · Decrypt · May 11 · 7/10

Baidu's New AI Is Already Beating Top Models and Cost 94% Less to Build

Baidu's ERNIE 5.1 has reached the top of Chinese AI leaderboards while requiring 94% fewer computational resources to build than competing models. This gain in parameter efficiency suggests that raw scale and spending aren't prerequisites for state-of-the-art AI performance, potentially reshaping how organizations approach model development and deployment.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning

Researchers introduce MatryoshkaLoRA, a novel training framework that improves upon Low-Rank Adaptation (LoRA) for efficient large language model fine-tuning by learning hierarchical low-rank representations through a strategically placed diagonal scaling matrix. The method enables dynamic rank selection with minimal accuracy loss and introduces AURAC, a new evaluation metric for hierarchical adapters, addressing a key limitation in current parameter-efficient fine-tuning approaches.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

Qwen3-VL-Seg: Unlocking Open-World Referring Segmentation with Vision-Language Grounding

Researchers introduce Qwen3-VL-Seg, an efficient vision-language model that converts bounding box predictions into pixel-level segmentation masks for open-world referring segmentation tasks. The framework adds minimal parameters (17M, 0.4% overhead) while achieving strong performance on language-intensive visual grounding across in-distribution and out-of-distribution benchmarks.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

From History to State: Constant-Context Skill Learning for LLM Agents

Researchers propose constant-context skill learning, a framework enabling LLM agents to learn reusable task procedures as lightweight modules rather than storing long prompts in memory. The approach reduces token usage per inference by 2-7x while maintaining or improving performance across multiple benchmark environments, addressing the privacy-capability tradeoff in agent deployment.

🧠 Llama

AI · Bullish · arXiv – CS AI · May 9 · 7/10

Leviathan: Decoupling Input and Output Representations in Language Models

Researchers introduce Leviathan, a Transformer architecture that decouples input embeddings from output projections using learned embedding vectorization (LEV), achieving 9% perplexity reduction at 1.2B parameters with minimal overhead. The approach concentrates improvements on rare tokens while requiring 2.1x fewer training tokens to match baseline performance.

AI · Bullish · arXiv – CS AI · May 4 · 7/10

Lightweight Domain Adaptation of a Large Language Model for Legal Assistance in the Indian Context

Researchers developed Legal Assist AI, a framework using an 8-billion-parameter Llama 3.1 model enhanced with Retrieval-Augmented Generation to provide legal assistance tailored to Indian law. The system achieved 60.08% on the All-India Bar Examination benchmark, outperforming OpenAI's 175-billion-parameter GPT-3.5 Turbo while being 22 times more parameter-efficient.

🧠 Llama

AI · Bullish · arXiv – CS AI · May 1 · 7/10

Post-Optimization Adaptive Rank Allocation for LoRA

Researchers introduce PARA, a post-optimization compression method for LoRA (Low-Rank Adaptation) that reduces parameter count by 75-90% while maintaining performance. The technique uses Singular Value Decomposition to allocate non-uniform ranks across model layers based on spectral importance, addressing inefficiencies in standard LoRA implementations.
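The idea of allocating non-uniform ranks by spectral importance can be sketched as follows. This is only an illustration, not the PARA algorithm itself: it assumes each layer's singular values are already available (for example from an SVD of that layer's LoRA update), and it uses a simple "retain a fixed share of squared singular-value energy" rule, which is an assumption rather than the paper's criterion. The layer names, the `keep` parameter, and both function names are hypothetical.

```python
def rank_for_energy(singular_values, keep=0.9):
    """Smallest rank whose squared singular values retain `keep` of the total energy.

    The energy-threshold rule is an illustrative assumption, not PARA's criterion.
    """
    energy = [s * s for s in sorted(singular_values, reverse=True)]
    total = sum(energy)
    if total == 0:
        return 0
    running = 0.0
    for rank, e in enumerate(energy, start=1):
        running += e
        if running >= keep * total:
            return rank
    return len(energy)


def allocate_ranks(spectra, keep=0.9):
    """Map each layer name to its own truncated rank (non-uniform across layers)."""
    return {name: rank_for_energy(values, keep) for name, values in spectra.items()}


# Hypothetical singular values for two rank-4 adapters.
spectra = {
    "layer0.q_proj": [4.0, 1.0, 0.5, 0.1],  # energy concentrated in one direction
    "layer1.q_proj": [2.0, 2.0, 2.0, 2.0],  # energy spread evenly
}
ranks = allocate_ranks(spectra, keep=0.9)
print(ranks)
```

In this toy input the first layer can be cut to a single direction while the second keeps its full rank, which is the kind of per-layer difference a uniform-rank adapter cannot express.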

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Efficient Adversarial Training via Criticality-Aware Fine-Tuning

Researchers introduce Criticality-Aware Adversarial Training (CAAT), a parameter-efficient method that identifies and fine-tunes only the most robustness-critical parameters in Vision Transformers, achieving 94.3% of standard adversarial training robustness while tuning just 6% of model parameters. This breakthrough addresses the computational bottleneck preventing large-scale adversarial training deployment.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing

Researchers introduce LightMoE, a new framework that compresses Mixture-of-Experts language models by replacing redundant expert modules with parameter-efficient alternatives. The method achieves 30-50% compression rates while maintaining or improving performance, addressing the substantial memory demands that limit MoE model deployment.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

Logos: An evolvable reasoning engine for rational molecular design

Researchers introduce Logos, a compact AI model that combines multi-step logical reasoning with chemical consistency for molecular design. The model achieves strong performance in structural accuracy and chemical validity while using fewer parameters than larger language models, and provides transparent reasoning that can be inspected by humans.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Spectral Surgery: Training-Free Refinement of LoRA via Gradient-Guided Singular Value Reweighting

Researchers have developed Spectral Surgery, a training-free method to improve LoRA (Low-Rank Adaptation) model performance by reweighting singular values based on gradient sensitivity. The technique achieves significant performance gains (up to +4.4 points on CommonsenseQA) by adjusting only about 1,000 scalar coefficients without requiring retraining.

🧠 Llama

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

Researchers introduce Draft-Conditioned Constrained Decoding (DCCD), a training-free method that improves structured output generation in large language models by up to 24 percentage points. The technique uses a two-step process that first generates an unconstrained draft, then applies constraints to ensure valid outputs like JSON and API calls.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10

DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

DiaBlo is a Parameter-Efficient Fine-Tuning (PEFT) method that updates only the diagonal blocks of weight matrices in large language models, offering better performance than LoRA while maintaining similar memory efficiency. The approach eliminates the need for low-rank matrix products and provides theoretical convergence guarantees, showing competitive results across tasks including reasoning and code generation.
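To make the "similar memory efficiency" comparison concrete, here is a minimal parameter-counting sketch. It assumes a square 4096 x 4096 weight matrix split into 64 diagonal blocks and a LoRA adapter of rank 32; these dimensions and all function names are illustrative assumptions, not values or APIs from the paper.

```python
def diagonal_block_params(d_out, d_in, n_blocks):
    """Trainable entries when only the n_blocks diagonal blocks of a d_out x d_in matrix are updated."""
    if d_out % n_blocks or d_in % n_blocks:
        raise ValueError("dimensions must be divisible by n_blocks")
    return n_blocks * (d_out // n_blocks) * (d_in // n_blocks)


def lora_params(d_out, d_in, rank):
    """Trainable entries in a LoRA update B @ A (B: d_out x rank, A: rank x d_in)."""
    return rank * (d_out + d_in)


def diagonal_block_mask(d_out, d_in, n_blocks):
    """Boolean mask that is True inside the diagonal blocks (assumes divisible dimensions)."""
    rows, cols = d_out // n_blocks, d_in // n_blocks
    return [[(i // rows) == (j // cols) for j in range(d_in)] for i in range(d_out)]


full = 4096 * 4096                                        # full fine-tuning of one matrix
blocks = diagonal_block_params(4096, 4096, n_blocks=64)   # diagonal-block update
lora = lora_params(4096, 4096, rank=32)                   # low-rank update
print(full, blocks, lora)

# Small mask example: a 4 x 4 matrix with two 2 x 2 diagonal blocks.
mask = diagonal_block_mask(4, 4, n_blocks=2)
for row in mask:
    print(["x" if cell else "." for cell in row])
```

With these particular choices the two adapters train the same number of entries, but the diagonal-block update is applied directly to the weights rather than formed as a product of two low-rank factors.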

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models

Researchers introduce Uni-X, a novel architecture for unified multimodal AI models that addresses gradient conflicts between vision and text processing. The X-shaped design uses modality-specific processing at input/output layers while sharing middle layers, achieving superior efficiency and matching 7B parameter models with only 3B parameters.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10

Knowledge Fusion of Large Language Models Via Modular SkillPacks

Researchers introduce GraftLLM, a new method for transferring knowledge between large language models using a "SkillPack" format that preserves capabilities while avoiding catastrophic forgetting. The approach enables efficient model fusion and continual learning for heterogeneous models through modular knowledge storage.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

CADRE: Stable, Parameter Efficient Adaptation of Medical Vision Language Models with Bounded Forgetting and Prior Drift

Researchers present CADRE, a parameter-efficient adaptation framework for medical vision-language models that addresses catastrophic forgetting and model drift when updating deployed systems. By combining low-rank adaptation with elastic weight consolidation and prior-anchoring penalties, CADRE reduces forgetting sevenfold while training only 0.23% of parameters, demonstrating improved stability across different medical imaging modalities.
