y0news

#parameter-efficiency News & Analysis

18 articles tagged with #parameter-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing

Researchers introduce LightMoE, a new framework that compresses Mixture-of-Experts language models by replacing redundant expert modules with parameter-efficient alternatives. The method achieves 30-50% compression rates while maintaining or improving performance, addressing the substantial memory demands that limit MoE model deployment.
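The summary does not say how LightMoE decides which experts are redundant, so here is a minimal sketch of one plausible criterion (assumed, not the paper's actual method): flag the pair of experts whose weights are most cosine-similar as candidates for replacement.

```python
import numpy as np

def most_redundant_pair(experts):
    """Toy redundancy probe (illustrative criterion, not LightMoE's own):
    return the indices of the two experts whose flattened weight tensors
    are most cosine-similar, i.e. the best candidates for replacement."""
    flat = [e.ravel() / np.linalg.norm(e.ravel()) for e in experts]
    best_sim, best_pair = -1.0, (0, 1)
    for i in range(len(flat)):
        for j in range(i + 1, len(flat)):
            sim = float(flat[i] @ flat[j])
            if sim > best_sim:
                best_sim, best_pair = sim, (i, j)
    return best_pair, best_sim
```

Replacing one expert of such a pair with a cheap parameter-efficient module (rather than a full FFN) is what would yield the 30-50% compression the summary reports.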

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

Logos: An evolvable reasoning engine for rational molecular design

Researchers introduce Logos, a compact AI model that combines multi-step logical reasoning with chemical consistency for molecular design. The model achieves strong performance in structural accuracy and chemical validity while using fewer parameters than larger language models, and provides transparent reasoning that can be inspected by humans.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

Researchers introduce Draft-Conditioned Constrained Decoding (DCCD), a training-free method that improves structured output generation in large language models by up to 24 percentage points. The technique uses a two-step process that first generates an unconstrained draft, then applies constraints to ensure valid outputs like JSON and API calls.
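The two-step shape of DCCD can be sketched in a few lines. Everything below is illustrative: the function name, the string-repair heuristic, and the key-filtering constraint are stand-ins for the paper's actual draft model and grammar-level constraints.

```python
import json

def draft_conditioned_decode(draft: str, allowed_keys: set) -> str:
    """Toy sketch of DCCD's two-step idea. Step 1 (the draft) is assumed
    to come from an unconstrained LLM pass; step 2 re-emits it under a
    hard constraint so the final output is guaranteed valid JSON."""
    try:
        fields = json.loads(draft)            # draft may already be valid
    except json.JSONDecodeError:
        # crude repair: salvage whatever `key: value` pairs we can parse
        fields = {}
        for part in draft.strip("{} \n").split(","):
            if ":" in part:
                k, _, v = part.partition(":")
                fields[k.strip(' "')] = v.strip(' "')
    constrained = {k: v for k, v in fields.items() if k in allowed_keys}
    return json.dumps(constrained)
```

The point of conditioning on a draft, rather than constraining from token one, is that the unconstrained pass keeps the model's fluent content intact while the second pass only enforces structure.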

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

Spectral Surgery: Training-Free Refinement of LoRA via Gradient-Guided Singular Value Reweighting

Researchers have developed Spectral Surgery, a training-free method to improve LoRA (Low-Rank Adaptation) model performance by reweighting singular values based on gradient sensitivity. The technique achieves significant performance gains (up to +4.4 points on CommonsenseQA) by adjusting only about 1,000 scalar coefficients without requiring retraining.
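A minimal NumPy sketch of the idea, with the update rule assumed from the summary (the paper's exact reweighting scheme may differ): SVD the merged LoRA delta, use the standard identity dL/ds_i = u_i^T G v_i for the loss sensitivity of each singular value, and adjust only those few scalars.

```python
import numpy as np

def spectral_reweight(delta_w, grad, lr=0.5):
    """Sketch of gradient-guided singular-value reweighting (illustrative,
    not the paper's code). delta_w is the merged LoRA update B @ A; grad
    is an estimate of dL/dW at the adapted weights."""
    u, s, vt = np.linalg.svd(delta_w, full_matrices=False)
    # dL/ds_i = u_i^T G v_i  when  W = U diag(s) V^T
    sens = np.array([u[:, i] @ grad @ vt[i, :] for i in range(s.size)])
    s_new = s - lr * sens        # one closed-form step: no retraining
    return u @ np.diag(s_new) @ vt
```

This makes the "~1,000 scalar coefficients" figure concrete: only the singular values change, never the full weight matrices.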

🧠 Llama
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10

DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

DiaBlo introduces a new Parameter-Efficient Fine-Tuning (PEFT) method that updates only diagonal blocks of weight matrices in large language models, offering better performance than LoRA while maintaining similar memory efficiency. The approach eliminates the need for low-rank matrix products and provides theoretical guarantees for convergence, showing competitive results across various AI tasks including reasoning and code generation.
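The core update is easy to picture: a minimal sketch, assuming a square weight matrix split into equal diagonal blocks (function and variable names are illustrative, not the paper's code).

```python
import numpy as np

def diablo_update(W, blocks):
    """DiaBlo-style PEFT sketch: add trainable deltas only to the
    diagonal blocks of W. With n equal blocks, the trainable fraction
    is 1/n of the layer's parameters, and no low-rank product B @ A
    is ever formed."""
    b = W.shape[0] // len(blocks)
    out = W.copy()
    for i, D in enumerate(blocks):
        out[i * b:(i + 1) * b, i * b:(i + 1) * b] += D
    return out
```

With, say, 16 blocks this trains 1/16 of each layer, which is in the same memory regime as a moderate-rank LoRA but without the low-rank bottleneck.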

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models

Researchers introduce Uni-X, a novel architecture for unified multimodal AI models that addresses gradient conflicts between vision and text processing. The X-shaped design uses modality-specific processing at input/output layers while sharing middle layers, achieving superior efficiency and matching 7B parameter models with only 3B parameters.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10

Knowledge Fusion of Large Language Models Via Modular SkillPacks

Researchers introduce GraftLLM, a new method for transferring knowledge between large language models via a modular 'SkillPack' format that preserves capabilities while avoiding catastrophic forgetting. Storing skills as detachable modules enables efficient model fusion and continual learning across heterogeneous models.

AI · Bullish · arXiv – CS AI · 2d ago · 6/10

New Hybrid Fine-Tuning Paradigm for LLMs: Algorithm Design and Convergence Analysis Framework

Researchers propose a novel hybrid fine-tuning method for Large Language Models that combines full parameter updates with Parameter-Efficient Fine-Tuning (PEFT) modules using zeroth-order and first-order optimization. The approach addresses computational constraints of full fine-tuning while overcoming PEFT's limitations in knowledge acquisition, backed by theoretical convergence analysis and empirical validation across multiple tasks.
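The zeroth-order half of such a hybrid scheme is typically an SPSA-style estimator: two forward passes, no backprop state. A minimal sketch (the paper's exact estimator and how it is mixed with first-order PEFT updates are not specified in the summary):

```python
import numpy as np

def spsa_grad(loss_fn, w, eps=1e-3, rng=None):
    """Zeroth-order (SPSA) gradient estimate: perturb all weights with a
    random sign vector z, difference two loss evaluations, and project
    back onto z. Memory cost is two forward passes, no activations kept."""
    rng = rng or np.random.default_rng(0)
    z = rng.choice([-1.0, 1.0], size=w.shape)
    g = (loss_fn(w + eps * z) - loss_fn(w - eps * z)) / (2 * eps)
    return g * z
```

In a hybrid scheme like the one described, an estimator of this kind would drive the full-parameter updates cheaply, while the small PEFT modules still get exact first-order gradients.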

AI · Bullish · arXiv – CS AI · 6d ago · 6/10

Instance-Adaptive Parametrization for Amortized Variational Inference

Researchers introduce Instance-Adaptive VAE (IA-VAE), a new framework that uses hypernetworks to generate input-specific parameter modulations for variational autoencoders, reducing the amortization gap while maintaining computational efficiency. The approach demonstrates improved posterior approximation accuracy on synthetic data and consistently better ELBO performance on image benchmarks compared to standard VAEs.
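The mechanism can be sketched in two lines (shapes and names are illustrative assumptions, not the paper's architecture): a tiny hypernetwork reads the input and emits a per-instance modulation of the shared encoder weights.

```python
import numpy as np

def ia_encode(x, W, H):
    """Instance-adaptive parametrization sketch: hypernetwork H maps the
    input to a per-dimension scale on the shared encoder output W @ x.
    With H = 0 this reduces exactly to the standard amortized encoder,
    so the modulation only has to learn the amortization-gap correction."""
    m = 1.0 + np.tanh(H @ x)     # input-specific modulation, near 1 at init
    return m * (W @ x)
```

Because H is small relative to W, the per-input adaptation adds little compute, which is how the amortization gap shrinks without per-instance optimization.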

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers

AdapterTune introduces a new method for efficiently fine-tuning Vision Transformers by using zero-initialized low-rank adapters that start at the pretrained function to prevent optimization instability. The technique achieves +14.9 point accuracy improvement over head-only transfer while using only 0.92% of parameters needed for full fine-tuning.
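"Starts at the pretrained function" has a simple concrete meaning: one of the two low-rank factors is initialized to zero, so the adapter contributes nothing at step 0. A minimal sketch (names and the 0.01 scale are illustrative):

```python
import numpy as np

def init_zero_adapter(d_out, d_in, rank, rng):
    """Zero-initialized low-rank adapter: B = 0 makes the adapted layer
    exactly equal the frozen pretrained layer at initialization, avoiding
    the early-training instability the summary mentions."""
    A = 0.01 * rng.standard_normal((rank, d_in))
    B = np.zeros((d_out, rank))
    return A, B

def adapted_forward(x, W, A, B):
    return W @ x + B @ (A @ x)   # only A and B train; W stays frozen
```

The parameter count rank*(d_in + d_out) versus d_in*d_out is where figures like "0.92% of full fine-tuning" come from.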

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

Tiny Recursive Reasoning with Mamba-2 Attention Hybrid

Researchers developed a hybrid model combining Mamba-2 state space operators with Transformer blocks for recursive reasoning, achieving a 2% improvement in pass@2 performance on ARC-AGI-1 tasks with only 6.83M parameters. The study demonstrates that Mamba-2 operators can preserve reasoning capabilities while improving solution candidate coverage in tiny neural networks.

AI · Bullish · arXiv – CS AI · Mar 4 · 5/10

GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR

Researchers developed GLoRIA, a parameter-efficient framework for automatic speech recognition that adapts to regional dialects using location metadata. The system achieves state-of-the-art performance while updating less than 10% of model parameters and demonstrates strong generalization to unseen dialects.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Polynomial Surrogate Training for Differentiable Ternary Logic Gate Networks

Researchers introduce Polynomial Surrogate Training (PST) to enable differentiable ternary logic gate networks, reducing parameters by 2,187x while maintaining performance. The method extends beyond binary logic gates to ternary systems with an UNKNOWN state for uncertainty handling, training 2-3x faster than binary networks.
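PST's actual polynomial construction is not given in the summary; one natural instance is the unique biquadratic polynomial interpolating a gate's 3x3 truth table over the ternary grid {-1, 0, +1}, which is exact on the grid and smooth in between for gradients. A sketch under that assumption, using Kleene AND (min, with 0 as UNKNOWN) as the example gate:

```python
import numpy as np

def lagrange_basis(x):
    """Degree-2 polynomials selecting the ternary states on the grid."""
    return np.array([x * (x - 1) / 2,      # 1 at x = -1, else 0 on grid
                     (1 - x) * (1 + x),    # 1 at x =  0
                     x * (x + 1) / 2])     # 1 at x = +1

def poly_gate(a, b, table):
    """Differentiable surrogate of a ternary gate: the biquadratic
    interpolant of its 3x3 truth table (rows/cols ordered -1, 0, +1)."""
    return lagrange_basis(a) @ table @ lagrange_basis(b)

# Kleene AND = min(a, b); the middle state 0 is the UNKNOWN value
KLEENE_AND = np.array([[-1, -1, -1],
                       [-1,  0,  0],
                       [-1,  0,  1]], dtype=float)
```

Since the surrogate is polynomial, gradients flow through gate choices during training, while at inference the inputs snap back to the discrete ternary states.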

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Symbol-Equivariant Recurrent Reasoning Models

Researchers introduced Symbol-Equivariant Recurrent Reasoning Models (SE-RRMs), a new neural network architecture that solves reasoning problems like Sudoku and ARC-AGI more efficiently than existing models. SE-RRMs achieve competitive performance with only 2 million parameters and can generalize across different puzzle sizes without requiring extensive data augmentation.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Researchers propose QKAN-LSTM, a quantum-inspired neural network that integrates quantum variational activation functions into LSTM architecture for sequential modeling. The model achieves superior predictive accuracy with 79% fewer parameters than classical LSTMs while remaining executable on classical hardware.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Raising Bars, Not Parameters: LilMoo Compact Language Model for Hindi

Researchers have developed LilMoo, a 0.6-billion parameter Hindi language model trained from scratch using a transparent, reproducible pipeline optimized for limited compute environments. The model outperforms similarly sized multilingual baselines like Qwen2.5-0.5B and Qwen3-0.6B, demonstrating that language-specific pretraining can rival larger multilingual models.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10

No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models

Researchers developed CDD (Contamination Detection via output Distribution) to identify data contamination in small language models by measuring output peakedness. The study found that CDD only works when fine-tuning produces verbatim memorization, failing at chance level with parameter-efficient methods like low-rank adaptation that avoid memorization.
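The "output peakedness" signal reduces to something very simple. A deliberately stripped-down sketch (CDD's real statistic is computed from sampled model outputs, not given distributions):

```python
def peakedness(token_dists):
    """Simplified CDD-style signal: average probability mass on the
    argmax token across output positions. Values near 1.0 mean sharply
    peaked outputs, the signature of verbatim memorization; LoRA-style
    fine-tuning rarely produces this, which is why detection fails there."""
    return sum(max(d) for d in token_dists) / len(token_dists)
```

The study's negative result follows directly: if parameter-efficient fine-tuning injects knowledge without driving these distributions toward one-hot, a peakedness threshold cannot separate contaminated from clean models.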

AI · Bullish · Hugging Face Blog · Feb 10 · 5/10

Parameter-Efficient Fine-Tuning using 🤗 PEFT

The article discusses parameter-efficient fine-tuning methods using Hugging Face's PEFT library. PEFT enables efficient adaptation of large language models by updating only a small subset of parameters rather than full model retraining.
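"Small subset" is easy to quantify for the LoRA method the library popularized. A back-of-envelope helper (illustrative arithmetic, not the PEFT library's API; in the library itself, a `LoraConfig` passed to `get_peft_model` sets this up and `print_trainable_parameters()` reports the same ratio):

```python
def lora_param_fraction(d_in, d_out, rank):
    """Fraction of a dense d_in x d_out layer's parameters that a LoRA
    adapter of the given rank actually trains: rank * (d_in + d_out)
    new weights versus d_in * d_out frozen ones."""
    return rank * (d_in + d_out) / (d_in * d_out)
```

For a 4096x4096 attention projection at rank 8 this is about 0.4%, which is why adapter checkpoints are megabytes while the base model stays frozen on disk.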