y0news

#neural-networks News & Analysis

358 articles tagged with #neural-networks. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠

Phase-Aware Mixture of Experts for Agentic Reinforcement Learning

Researchers propose Phase-Aware Mixture of Experts (PA-MoE) to improve reinforcement learning for LLM agents by addressing the simplicity bias in which simple tasks come to dominate network parameters. The approach uses a phase router to keep expert assignments temporally consistent, allowing better specialization for complex tasks.
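
A hypothetical sketch of what a phase-conditioned MoE router could look like; the class name, the additive phase bias, and all shapes are illustrative assumptions, not PA-MoE's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class PhaseAwareRouter:
    """Gate logits depend on token features AND a learned embedding of the
    current training phase, so assignments stay consistent within a phase."""
    def __init__(self, dim, n_experts, n_phases):
        self.w_gate = rng.normal(0.0, 0.1, (n_experts, dim))      # feature gate
        self.phase_bias = rng.normal(0.0, 0.1, (n_phases, n_experts))  # per-phase bias

    def route(self, x, phase):
        logits = self.w_gate @ x + self.phase_bias[phase]
        return softmax(logits)

router = PhaseAwareRouter(dim=8, n_experts=4, n_phases=2)
x = rng.normal(size=8)
gates = router.route(x, phase=0)  # a probability distribution over experts
```

Holding the phase bias fixed within a phase is one simple way to get the temporal consistency the summary mentions.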

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠

Prompt and Parameter Co-Optimization for Large Language Models

Researchers introduce MetaTuner, a new framework that combines prompt optimization with fine-tuning for Large Language Models, using shared neural networks to discover optimal combinations of prompts and parameters. The approach addresses the discrete-continuous optimization challenge through supervised regularization and demonstrates consistent performance improvements across benchmarks.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠

Distillation of Large Language Models via Concrete Score Matching

Researchers propose Concrete Score Distillation (CSD), a knowledge-distillation method that makes large language models more efficient by preserving logit information that traditional softmax-based objectives discard. CSD delivers consistent gains across multiple model families, including GPT-2, OpenLLaMA, and Gemma, while maintaining training stability.
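
To illustrate the contrast the summary draws (not the paper's exact CSD objective): a softmax-KL distillation loss throws away the absolute logit gaps, while a "score"-style loss on pairwise logit differences keeps them. Both losses below are toy constructions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_distill(student_logits, teacher_logits):
    # classic softmax-based distillation: KL(teacher || student)
    p, q = softmax(teacher_logits), softmax(student_logits)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def score_distill(student_logits, teacher_logits):
    # match all pairwise logit differences; invariant to an additive
    # shift but sensitive to the relative gaps softmax compresses
    ds = student_logits[:, None] - student_logits[None, :]
    dt = teacher_logits[:, None] - teacher_logits[None, :]
    return float(np.mean((ds - dt) ** 2))

t = np.array([2.0, 0.5, -1.0])   # teacher logits
s = np.array([1.5, 0.6, -0.8])   # student logits
```

Both losses vanish when student and teacher logits agree; they differ in how much logit-level structure they penalize.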

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠

Hard-constraint physics-residual networks enable robust extrapolation for hydrogen crossover prediction in PEM water electrolyzers

Researchers developed a hard-constraint physics-residual network (PR-Net) that significantly improves hydrogen crossover prediction in water electrolyzers for green hydrogen production. The AI model achieves 99.57% accuracy and maintains performance when extrapolating beyond training conditions, outperforming traditional neural networks and physics-informed networks.
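
A minimal sketch of the physics-residual pattern only; the toy "physics" term, variable names, and bounds below are invented for illustration, not taken from the paper:

```python
import numpy as np

def physics_model(pressure, current_density):
    # toy first-principles baseline: crossover grows with pressure and current
    return 0.01 * pressure + 0.002 * current_density

class ResidualNet:
    """A small network corrects only the residual of the physics baseline.
    Bounding the correction (tanh) means the physics term dominates far
    outside the training range, which is the source of robust extrapolation."""
    def __init__(self, rng, scale=0.001):
        self.w = rng.normal(0.0, 0.01, 2)
        self.scale = scale

    def predict(self, pressure, current_density):
        features = np.array([pressure, current_density])
        correction = self.scale * np.tanh(self.w @ features)  # hard-bounded
        return physics_model(pressure, current_density) + correction

net = ResidualNet(np.random.default_rng(1))
pred = net.predict(pressure=30.0, current_density=1.2)
```

The design choice: a pure neural network can predict anything off-distribution, whereas here the learned part can never move the output more than `scale` away from the physics baseline.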

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠

Structure-Informed Estimation for Pilot-Limited MIMO Channels via Tensor Decomposition

Researchers developed a hybrid AI approach combining tensor decomposition with neural networks to improve MIMO channel estimation for 6G wireless systems under pilot signal limitations. The method achieves significant performance improvements over traditional approaches, with up to 13.11 dB better accuracy in specific scenarios.

AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 3
🧠

FIRE: Frobenius-Isometry Reinitialization for Balancing the Stability-Plasticity Tradeoff

Researchers propose FIRE, a new reinitialization method for deep neural networks that balances stability and plasticity when learning from nonstationary data. The method uses mathematical optimization to maintain prior knowledge while adapting to new tasks, showing superior performance across visual learning, language modeling, and reinforcement learning domains.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠

A Graph Meta-Network for Learning on Kolmogorov-Arnold Networks

Researchers developed WS-KAN, the first weight-space architecture designed specifically for Kolmogorov-Arnold Networks (KANs), which learns directly from neural network parameters. The study shows KANs share permutation symmetries with MLPs and introduces a graph representation to better understand their computation structure.

AI · Bullish · Decrypt · Mar 3 · 7/10 · 7
🧠

Human Brain Cells Learn to Play Doom in Cortical Labs Experiment

Cortical Labs successfully trained living human neurons to play the video game Doom, marking a significant advancement in biological computing. This experiment demonstrates the potential for using biological neural networks in computing applications, extending traditional engineering benchmarks into the realm of living tissue.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 14
🧠

ReDON: Recurrent Diffractive Optical Neural Processor with Reconfigurable Self-Modulated Nonlinearity

Researchers introduce ReDON, a new recurrent diffractive optical neural processor that overcomes limitations of traditional optical neural networks through reconfigurable self-modulated nonlinearity. The architecture demonstrates up to 20% improved accuracy on image recognition tasks while maintaining energy efficiency, establishing a new paradigm for non-von Neumann analog processors.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 10
🧠

UPath: Universal Planner Across Topological Heterogeneity For Grid-Based Pathfinding

Researchers developed UPath, a universal AI-powered pathfinding algorithm that improves A* search performance by up to 2.2x across diverse grid environments. The deep learning model generalizes across different map types without retraining, achieving near-optimal solutions within 3% of optimal cost on unseen tasks.
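
A sketch of the A*-plus-swappable-heuristic pattern the summary describes; the grid, and the Manhattan distance standing in for UPath's learned cost-to-go model, are illustrative assumptions:

```python
import heapq

def a_star(grid, start, goal, heuristic):
    """A* on a 4-connected grid; `heuristic` could be a learned estimator."""
    rows, cols = len(grid), len(grid[0])
    open_set = [(heuristic(start, goal), 0, start)]
    best_g = {start: 0}
    while open_set:
        _, cost, node = heapq.heappop(open_set)
        if node == goal:
            return cost
        if cost > best_g.get(node, float("inf")):
            continue  # stale queue entry
        r, c = node
        for nbr in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nbr
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                g2 = cost + 1
                if g2 < best_g.get(nbr, float("inf")):
                    best_g[nbr] = g2
                    heapq.heappush(open_set, (g2 + heuristic(nbr, goal), g2, nbr))
    return None  # no path

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

grid = [[0, 0, 0],
        [1, 1, 0],   # 1 = obstacle
        [0, 0, 0]]
path_cost = a_star(grid, (0, 0), (2, 0), manhattan)  # detour around the wall
```

A tighter (learned) heuristic expands fewer nodes, which is where the reported speedups over plain A* would come from.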

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 12
🧠

FedNSAM: Consistency of Local and Global Flatness for Federated Learning

Researchers propose FedNSAM, a new federated learning algorithm that improves global model performance by addressing the inconsistency between local and global flatness in distributed training environments. The algorithm uses global Nesterov momentum to harmonize local and global optimization, showing superior performance compared to existing FedSAM approaches.
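
A toy sketch of server-side Nesterov momentum applied to the averaged client update, the general mechanism the summary points to; FedNSAM's SAM perturbation and exact update rule are omitted, and all names and objectives here are invented:

```python
import numpy as np

def server_round(w, client_grads, momentum, lr=0.1, mu=0.9):
    """One federated round: average client gradients, then take a
    Nesterov-style momentum step on the global model."""
    avg = np.mean(client_grads, axis=0)     # aggregate client updates
    momentum = mu * momentum + avg          # momentum buffer
    w = w - lr * (avg + mu * momentum)      # look-ahead step
    return w, momentum

w = np.array([2.0, -1.0])
m = np.zeros(2)
for _ in range(50):
    # two toy clients with slightly different quadratic objectives,
    # so their local gradients disagree (heterogeneity)
    grads = [w - np.array([0.1, 0.0]), w + np.array([0.1, 0.0])]
    w, m = server_round(w, grads, m)
```

The momentum buffer lives on the server, giving every client's next round a shared global direction rather than purely local ones.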

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 19
🧠

Smoothing DiLoCo with Primal Averaging for Faster Training of LLMs

Researchers propose Generalized Primal Averaging (GPA), a new optimization method that improves training speed for large language models by 8-10% over standard AdamW while using less memory. GPA unifies and enhances existing averaging-based optimizers like DiLoCo by enabling smooth iterate averaging at every step without complex two-loop structures.
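
A toy illustration of single-loop iterate averaging, the general idea behind primal-averaging optimizers; this is not GPA's actual update rule, and the optimizer, objective, and constants are assumptions:

```python
import numpy as np

def sgd_with_averaging(grad_fn, w0, lr=0.1, beta=0.9, steps=100):
    """Plain SGD plus an exponential average of the iterates maintained
    at every step -- no separate outer averaging loop needed."""
    w = w0.copy()
    w_avg = w0.copy()
    for _ in range(steps):
        w = w - lr * grad_fn(w)
        w_avg = beta * w_avg + (1 - beta) * w  # smoothed iterate
    return w, w_avg

# toy objective f(w) = 0.5 * ||w||^2, so grad(w) = w
grad = lambda w: w
w, w_avg = sgd_with_averaging(grad, np.array([4.0, -2.0]))
```

The averaged iterate follows a smoother trajectory than the raw one, which is the effect two-loop schemes like DiLoCo approximate with periodic outer averaging.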

AI · Neutral · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠

Test-Time Training with KV Binding Is Secretly Linear Attention

Researchers reveal that Test-Time Training (TTT) with KV binding, previously understood as online meta-learning for memorization, can actually be reformulated as a learned linear attention operator. This new perspective explains previously puzzling behaviors and enables architectural simplifications and efficiency improvements.
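
A toy identity illustrating the flavor of this result (not the paper's exact construction): writing (key, value) pairs into a linear fast-weight memory with one gradient step each, then querying it, reproduces an unnormalized linear-attention readout:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 5
K = rng.normal(size=(n, d))   # keys
V = rng.normal(size=(n, d))   # values
q = rng.normal(size=d)        # query

# "test-time training" view: sequential gradient writes into W.
# Each step is lr=1 gradient descent on the inner loss -v.(W k),
# whose gradient w.r.t. W is -outer(v, k).
W = np.zeros((d, d))
for k, v in zip(K, V):
    W += np.outer(v, k)
ttt_out = W @ q

# linear-attention view: sum_i (k_i . q) * v_i, batched
attn_out = V.T @ (K @ q)
```

Both views compute the same operator, which is what makes the reformulation useful for simplification and efficiency analysis.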

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 13
🧠

Efficient Discovery of Approximate Causal Abstractions via Neural Mechanism Sparsification

Researchers have developed a new method to extract interpretable causal mechanisms from neural networks using structured pruning as a search technique. The approach reframes network pruning as finding approximate causal abstractions, yielding closed-form criteria for simplifying networks while maintaining their causal structure under interventions.

AI · Neutral · arXiv – CS AI · Mar 2 · 6/10 · 11
🧠

Memory Caching: RNNs with Growing Memory

Researchers introduce Memory Caching (MC), a technique that enhances recurrent neural networks by allowing their memory capacity to grow with sequence length, bridging the gap between fixed-memory RNNs and growing-memory Transformers. The approach offers four variants and shows competitive performance with Transformers on language modeling and long-context tasks while maintaining better computational efficiency.
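
A toy sketch of the growing-memory idea; all shapes and updates are invented and do not correspond to any of the paper's four MC variants:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wx = rng.normal(0, 0.3, (d, d))
Wh = rng.normal(0, 0.3, (d, d))

def run(xs):
    """Vanilla RNN cell whose readout also attends over a cache of all
    past hidden states, so usable memory grows with sequence length."""
    h = np.zeros(d)
    cache, outputs = [], []
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h)
        cache.append(h.copy())            # memory grows one slot per step
        C = np.stack(cache)
        attn = np.exp(C @ h)
        attn /= attn.sum()                # softmax over the whole history
        outputs.append(attn @ C)          # Transformer-style read
    return outputs, cache

xs = rng.normal(size=(6, d))
outs, cache = run(xs)
```

The recurrence stays O(1) per step like an RNN, while the cached read gives Transformer-like access to the full history.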

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 16
🧠

Activation Function Design Sustains Plasticity in Continual Learning

Researchers from arXiv demonstrate that activation function design is crucial for maintaining neural network plasticity in continual learning scenarios. They introduce two new activation functions (Smooth-Leaky and Randomized Smooth-Leaky) that help prevent models from losing their ability to adapt to new tasks over time.
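
The paper's exact definitions are not reproduced here; the following is one common way to build a smooth leaky nonlinearity (a softplus-smoothed leaky ReLU), shown only to illustrate the design space the summary refers to:

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # the standard kinked baseline
    return np.where(x > 0, x, alpha * x)

def smooth_leaky(x, alpha=0.1, beta=5.0):
    """Smooth interpolation: behaves like alpha*x for very negative x
    and like x for very positive x, with no kink at zero."""
    return alpha * x + (1 - alpha) * np.log1p(np.exp(beta * x)) / beta

x = np.linspace(-5, 5, 11)
y_kinked = leaky_relu(x)
y_smooth = smooth_leaky(x)
```

The non-zero slope everywhere keeps gradients flowing through "dead" units, one mechanism by which activation design can preserve plasticity.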

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 16
🧠

Context and Diversity Matter: The Emergence of In-Context Learning in World Models

Researchers investigate in-context learning (ICL) in world models, identifying two core mechanisms - environment recognition and environment learning - that enable AI systems to adapt to new configurations. The study provides theoretical error bounds and empirical evidence showing that diverse environments and long context windows are crucial for developing self-adapting world models.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 10
🧠

SHINE: Sequential Hierarchical Integration Network for EEG and MEG

Researchers developed SHINE, a Sequential Hierarchical Integration Network for analyzing brain signals (EEG/MEG) to detect speech from neural activity. The system achieved high F1-macro scores of 0.9155-0.9184 in the LibriBrain Competition 2025 by reconstructing speech-silence patterns from magnetoencephalography signals.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠

SceneTok: A Compressed, Diffusable Token Space for 3D Scenes

SceneTok introduces a novel 3D scene tokenizer that compresses view sets into permutation-invariant tokens, achieving 1-3 orders of magnitude better compression than existing methods while maintaining state-of-the-art reconstruction quality. The system enables efficient 3D scene generation in 5 seconds using a lightweight decoder that can render novel viewpoints.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 18
🧠

QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Researchers propose QKAN-LSTM, a quantum-inspired neural network that integrates quantum variational activation functions into LSTM architecture for sequential modeling. The model achieves superior predictive accuracy with 79% fewer parameters than classical LSTMs while remaining executable on classical hardware.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠

SemVideo: Reconstructs What You Watch from Brain Activity via Hierarchical Semantic Guidance

Researchers introduced SemVideo, a breakthrough AI framework that can reconstruct videos from brain activity using fMRI scans. The system uses hierarchical semantic guidance to overcome previous limitations in visual consistency and temporal coherence, achieving state-of-the-art results in brain-to-video reconstruction.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14
🧠

BiKA: Kolmogorov-Arnold-Network-inspired Ultra Lightweight Neural Network Hardware Accelerator

Researchers propose BiKA, a new ultra-lightweight neural network accelerator inspired by Kolmogorov-Arnold Networks that uses binary thresholds instead of complex computations. The FPGA prototype demonstrates 27-51% reduction in hardware resource usage compared to existing binarized and quantized neural network accelerators while maintaining competitive accuracy.
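
An illustrative sketch only; BiKA's actual design is a hardware accelerator, and the thresholds, values, and function name below are invented. The idea it gestures at: a KAN-style per-edge function reduced to a bank of binary threshold comparisons, replacing multiply-accumulate units with comparators and adders:

```python
def binary_threshold_edge(x, thresholds, step_values):
    """Piecewise-constant edge function: a sum of step functions.
    Each term is a single comparison -- cheap in FPGA logic."""
    return sum(v for t, v in zip(thresholds, step_values) if x >= t)

thresholds = [-1.0, 0.0, 1.0]
steps = [0.5, 1.0, 0.25]
y = binary_threshold_edge(0.3, thresholds, steps)  # crosses two thresholds
```

Enough such steps can approximate the smooth learned univariate functions KANs place on edges, trading accuracy for drastically simpler hardware.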

AI · Neutral · arXiv – CS AI · Feb 27 · 5/10 · 5
🧠

Scaling Laws for Precision in High-Dimensional Linear Regression

Researchers developed theoretical scaling laws for low-precision AI model training, analyzing how quantization affects performance in high-dimensional linear regression. The study reveals that multiplicative and additive quantization schemes have distinct effects: multiplicative schemes preserve the full-precision effective model size, while additive schemes reduce it.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠

Large Language Model Compression with Global Rank and Sparsity Optimization

Researchers propose a novel two-stage compression method for Large Language Models that uses global rank and sparsity optimization to significantly reduce model size. The approach combines low-rank and sparse matrix decomposition with probabilistic global allocation to automatically detect redundancy across different layers and manage component interactions.
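
A toy illustration of the low-rank plus sparse split the summary describes (a robust-PCA-style decomposition, not the paper's two-stage optimization procedure); the rank, sparsity level, and matrix here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(32, 32))  # stand-in for a weight matrix

def low_rank_plus_sparse(W, rank=4, keep=0.05):
    """Approximate W as L + S: L from a truncated SVD, S keeping only
    the largest-magnitude entries of the residual."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]      # best rank-4 approximation
    R = W - L
    thresh = np.quantile(np.abs(R), 1 - keep)     # keep top 5% of residuals
    S = np.where(np.abs(R) >= thresh, R, 0.0)
    return L, S

L, S = low_rank_plus_sparse(W)
err = np.linalg.norm(W - (L + S)) / np.linalg.norm(W)
```

Storing L as two thin factors and S in a sparse format is where the compression comes from; the paper's contribution is choosing rank and sparsity budgets globally across layers rather than per layer.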

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 8
🧠

Autoregressive Visual Decoding from EEG Signals

Researchers developed AVDE, a lightweight framework for decoding visual information from EEG brain signals using autoregressive generation. The system outperforms existing methods while using only 10% of the parameters, potentially advancing practical brain-computer interface applications.