y0news

#neural-networks News & Analysis

327 articles tagged with #neural-networks. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 3
🧠

FIRE: Frobenius-Isometry Reinitialization for Balancing the Stability-Plasticity Tradeoff

Researchers propose FIRE, a new reinitialization method for deep neural networks that balances stability and plasticity when learning from nonstationary data. The method reinitializes weights with Frobenius-isometric transformations that retain prior knowledge while restoring the network's ability to adapt to new tasks, showing superior performance across visual learning, language modeling, and reinforcement learning domains.
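
A minimal sketch of one plausible reading of "Frobenius-isometry reinitialization" (not the paper's code; `fire_reinit` is a hypothetical helper): replace a layer's weights with an orthogonal (isometric) matrix rescaled so the old Frobenius norm is preserved, keeping the layer's overall scale stable while restoring full-rank structure for plasticity.

```python
import numpy as np

def fire_reinit(W: np.ndarray, rng=None) -> np.ndarray:
    """Fresh isometric weights rescaled so ||W_new||_F == ||W||_F."""
    rng = rng or np.random.default_rng(0)
    out_dim, in_dim = W.shape
    A = rng.standard_normal((out_dim, in_dim))
    if out_dim >= in_dim:
        Q, _ = np.linalg.qr(A)   # orthonormal columns (an isometry)
    else:
        Q, _ = np.linalg.qr(A.T)
        Q = Q.T                  # orthonormal rows
    return Q * (np.linalg.norm(W) / np.linalg.norm(Q))

W_old = np.random.default_rng(1).standard_normal((64, 32))
W_new = fire_reinit(W_old)
print(np.linalg.norm(W_old), np.linalg.norm(W_new))  # Frobenius norms match
```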

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠

A Graph Meta-Network for Learning on Kolmogorov-Arnold Networks

Researchers developed WS-KAN, the first weight-space architecture designed specifically for Kolmogorov-Arnold Networks (KANs), which learns directly from neural network parameters. The study shows KANs share permutation symmetries with MLPs and introduces a graph representation to better understand their computation structure.
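
The permutation-symmetry claim is easy to check numerically for a plain MLP, which is exactly the property the paper says KANs inherit. A quick verification (construction mine, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 8)), rng.standard_normal(16)
W2 = rng.standard_normal((4, 16))
x = rng.standard_normal(8)

def mlp(W1, b1, W2, x):
    return W2 @ np.tanh(W1 @ x + b1)

perm = rng.permutation(16)  # relabel the 16 hidden units
same = np.allclose(mlp(W1, b1, W2, x),
                   mlp(W1[perm], b1[perm], W2[:, perm], x))
print(same)  # True: different weight tensors, identical function
```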

AI · Bullish · Decrypt · Mar 3 · 7/10 · 7
🧠

Human Brain Cells Learn to Play Doom in Cortical Labs Experiment

Cortical Labs successfully trained living human neurons to play the video game Doom, marking a significant advancement in biological computing. This experiment demonstrates the potential for using biological neural networks in computing applications, extending traditional engineering benchmarks into the realm of living tissue.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠

SemVideo: Reconstructs What You Watch from Brain Activity via Hierarchical Semantic Guidance

Researchers introduced SemVideo, a breakthrough AI framework that can reconstruct videos from brain activity using fMRI scans. The system uses hierarchical semantic guidance to overcome previous limitations in visual consistency and temporal coherence, achieving state-of-the-art results in brain-to-video reconstruction.

$RNDR
AI · Neutral · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠

Test-Time Training with KV Binding Is Secretly Linear Attention

Researchers reveal that Test-Time Training (TTT) with KV binding, previously understood as online meta-learning for memorization, can actually be reformulated as a learned linear attention operator. This new perspective explains previously puzzling behaviors and enables architectural simplifications and efficiency improvements.
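
A toy version of the correspondence, in the simplest setting I can construct (a linear fast-weight memorizer with an inner-product loss and learning rate 1; the paper's KV-binding formulation is more general): one gradient step per token accumulates exactly the outer-product state of unnormalized linear attention.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 4
K, V, Q = rng.standard_normal((3, T, d))   # per-token keys, values, queries

# TTT view: one SGD step per token on the fast weights W, with the linear
# memorization loss L_t = -<W @ K[t], V[t]> (gradient is -outer(V[t], K[t])).
W = np.zeros((d, d))
ttt_out = []
for t in range(T):
    W += np.outer(V[t], K[t])              # gradient step, lr = 1
    ttt_out.append(W @ Q[t])               # "recall" with the current query

# Linear-attention view: the same outer products, read as a recurrent KV state.
S = np.zeros((d, d))
attn_out = []
for t in range(T):
    S += np.outer(V[t], K[t])              # S_t = S_{t-1} + v_t k_t^T
    attn_out.append(S @ Q[t])

print(np.allclose(ttt_out, attn_out))      # True: identical operators
```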

AI · Neutral · arXiv – CS AI · Mar 2 · 6/10 · 11
🧠

Memory Caching: RNNs with Growing Memory

Researchers introduce Memory Caching (MC), a technique that enhances recurrent neural networks by allowing their memory capacity to grow with sequence length, bridging the gap between fixed-memory RNNs and growing-memory Transformers. The approach offers four variants and shows competitive performance with Transformers on language modeling and long-context tasks while maintaining better computational efficiency.
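
The summary names the idea but not the mechanics, so this is a hypothetical sketch of one plausible shape (the paper's four variants will differ): a recurrent cell that appends each hidden state to an ever-growing cache and attends over it, so memory grows with sequence length.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wh, Wx = rng.standard_normal((2, d, d)) * 0.1

def step(h, x, cache):
    # Read: soft attention over all cached states, queried by the current state.
    if cache:
        C = np.stack(cache)                    # (n_cached, d)
        w = np.exp(C @ h)
        read = (w / w.sum()) @ C
    else:
        read = np.zeros(d)
    h_new = np.tanh(Wh @ (h + read) + Wx @ x)  # ordinary recurrent update
    cache.append(h_new)                        # write: memory grows by one slot
    return h_new

h, cache = np.zeros(d), []
for x in rng.standard_normal((12, d)):         # length-12 input sequence
    h = step(h, x, cache)
print(len(cache))                              # 12: memory grew with the sequence
```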

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 19
🧠

Smoothing DiLoCo with Primal Averaging for Faster Training of LLMs

Researchers propose Generalized Primal Averaging (GPA), a new optimization method that improves training speed for large language models by 8-10% over standard AdamW while using less memory. GPA unifies and enhances existing averaging-based optimizers like DiLoCo by enabling smooth iterate averaging at every step without complex two-loop structures.
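
A rough single-loop sketch of "smooth iterate averaging at every step" (toy quadratic loss, plain gradient steps standing in for AdamW; not the authors' GPA update): a running average `z` is folded in each step, with no DiLoCo-style inner/outer loops.

```python
import numpy as np

def grad(x):                      # toy quadratic loss ||x||^2 / 2
    return x

x = np.full(4, 5.0)               # fast iterate, moved by the base optimizer
z = x.copy()                      # averaged iterate, refreshed every step
lr, c = 0.1, 0.05                 # c: averaging coefficient
for _ in range(200):
    x -= lr * grad(x)             # base optimizer step (stand-in for AdamW)
    z = (1 - c) * z + c * x       # smooth iterate averaging, single loop
print(np.linalg.norm(x), np.linalg.norm(z))   # z follows a smoothed trajectory
```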

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 18
🧠

QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Researchers propose QKAN-LSTM, a quantum-inspired neural network that integrates quantum variational activation functions into LSTM architecture for sequential modeling. The model achieves superior predictive accuracy with 79% fewer parameters than classical LSTMs while remaining executable on classical hardware.
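
The circuit itself isn't in the summary, but variational quantum models with data re-uploading are known to realize truncated Fourier series, which suggests one classical emulation of a "quantum variational activation" (illustrative assumption only; `QuantumInspiredActivation` is a hypothetical stand-in):

```python
import numpy as np

class QuantumInspiredActivation:
    """phi(x) = sum_k a_k cos(k x) + b_k sin(k x): a truncated Fourier series
    with trainable coefficients, only a few parameters per activation."""
    def __init__(self, n_freqs=3, rng=None):
        rng = rng or np.random.default_rng(0)
        self.a = 0.5 * rng.standard_normal(n_freqs)
        self.b = 0.5 * rng.standard_normal(n_freqs)
        self.k = np.arange(1, n_freqs + 1)

    def __call__(self, x):
        kx = np.multiply.outer(x, self.k)      # (..., n_freqs)
        return np.cos(kx) @ self.a + np.sin(kx) @ self.b

phi = QuantumInspiredActivation()
print(phi(np.linspace(-1, 1, 5)))  # would replace tanh/sigmoid inside an LSTM cell
```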

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14
🧠

BiKA: Kolmogorov-Arnold-Network-inspired Ultra Lightweight Neural Network Hardware Accelerator

Researchers propose BiKA, a new ultra-lightweight neural network accelerator inspired by Kolmogorov-Arnold Networks that uses binary thresholds instead of complex computations. The FPGA prototype demonstrates 27-51% reduction in hardware resource usage compared to existing binarized and quantized neural network accelerators while maintaining competitive accuracy.
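
A hedged sketch of "binary thresholds instead of complex computations", assuming each KAN-style edge reduces to a learned threshold comparison so a layer needs only compares and adds (my reading, not the paper's accelerator design):

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, out_dim = 8, 4
theta = rng.standard_normal((out_dim, in_dim))       # one threshold per edge
sign = rng.choice([-1, 1], size=(out_dim, in_dim))   # edge polarity

def bika_layer(x):
    # Edge (i -> j) fires 0/1 by comparing x_i to its threshold; each output
    # is a signed count of firing edges, so no multipliers are required.
    fires = (x[None, :] > theta).astype(int)         # (out_dim, in_dim)
    return (sign * fires).sum(axis=1)

print(bika_layer(rng.standard_normal(in_dim)))
```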

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 14
🧠

ReDON: Recurrent Diffractive Optical Neural Processor with Reconfigurable Self-Modulated Nonlinearity

Researchers introduce ReDON, a new recurrent diffractive optical neural processor that overcomes limitations of traditional optical neural networks through reconfigurable self-modulated nonlinearity. The architecture demonstrates up to 20% improved accuracy on image recognition tasks while maintaining energy efficiency, establishing a new paradigm for non-von Neumann analog processors.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 10
🧠

UPath: Universal Planner Across Topological Heterogeneity For Grid-Based Pathfinding

Researchers developed UPath, a universal AI-powered pathfinding algorithm that improves A* search performance by up to 2.2x across diverse grid environments. The deep learning model generalizes across different map types without retraining, achieving near-optimal solutions within 3% of optimal cost on unseen tasks.
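
The general recipe, a learned model supplying the A* heuristic, is easy to sketch with everything model-specific stubbed out (`learned_h` is a placeholder; Manhattan distance stands in for the paper's deep network):

```python
import heapq

def astar(grid, start, goal, h):
    """A* over a 0/1 occupancy grid; the heuristic h(cell, goal) is pluggable."""
    rows, cols = len(grid), len(grid[0])
    frontier, g = [(h(start, goal), start)], {start: 0}
    while frontier:
        _, cur = heapq.heappop(frontier)
        if cur == goal:
            return g[cur]
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g[cur] + 1
                if ng < g.get((nr, nc), float("inf")):
                    g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc), goal), (nr, nc)))
    return None

# Stand-in for the learned heuristic; a real model would predict cost-to-go.
learned_h = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
grid = [[0, 0, 0], [1, 1, 0], [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0), learned_h))  # 6 (detour around the wall)
```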

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 12
🧠

FedNSAM: Consistency of Local and Global Flatness for Federated Learning

Researchers propose FedNSAM, a new federated learning algorithm that improves global model performance by addressing the inconsistency between local and global flatness in distributed training environments. The algorithm uses global Nesterov momentum to harmonize local and global optimization, showing superior performance compared to existing FedSAM approaches.
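
A hedged composition of the two ingredients the summary names, SAM-style local steps plus Nesterov momentum applied to the server-side aggregate (toy loss, identical clients; not the paper's exact FedNSAM):

```python
import numpy as np

def grad(w):                                  # toy quadratic loss, optimum at 1
    return w - 1.0

def client_sam_step(w, rho=0.05, lr=0.1):
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)   # ascend toward the worst case
    return w - lr * grad(w + eps)                 # descend with the perturbed grad

w, m = np.zeros(4), np.zeros(4)               # global model, server momentum
beta = 0.9
for _ in range(50):
    deltas = [client_sam_step(w) - w for _ in range(5)]  # 5 (identical) clients
    update = np.mean(deltas, axis=0)
    m = beta * m + update
    w = w + beta * m + update                 # Nesterov-style server step
print(w)                                      # moves toward the shared optimum at 1.0
```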

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 16
🧠

Activation Function Design Sustains Plasticity in Continual Learning

Researchers demonstrate that activation function design is crucial for maintaining neural network plasticity in continual learning scenarios. They introduce two new activation functions (Smooth-Leaky and Randomized Smooth-Leaky) that help prevent models from losing their ability to adapt to new tasks over time.

$LINK
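
The summary doesn't define either activation, so the following is purely hypothetical and illustrates only the design space: a leaky negative slope (no dead units) with a smooth transition, plus a per-unit randomized slope for the second variant.

```python
import numpy as np

def smooth_leaky(x, alpha=0.1, beta=4.0):
    # A sigmoid gate blends slope alpha (x << 0) into slope 1 (x >> 0),
    # so no unit is ever fully dead and the transition stays smooth.
    gate = 1.0 / (1.0 + np.exp(-beta * x))
    return (alpha + (1 - alpha) * gate) * x

def randomized_smooth_leaky(x, rng, low=0.05, high=0.3):
    # Hypothetical "randomized" variant: sample the leak per unit.
    return smooth_leaky(x, alpha=rng.uniform(low, high, size=x.shape))

x = np.linspace(-3.0, 3.0, 7)
print(smooth_leaky(x))
print(randomized_smooth_leaky(x, np.random.default_rng(0)))
```
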
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 10
🧠

SHINE: Sequential Hierarchical Integration Network for EEG and MEG

Researchers developed SHINE, a Sequential Hierarchical Integration Network for analyzing brain signals (EEG/MEG) to detect speech from neural activity. The system achieved high F1-macro scores of 0.9155-0.9184 in the LibriBrain Competition 2025 by reconstructing speech-silence patterns from magnetoencephalography signals.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 13
🧠

Efficient Discovery of Approximate Causal Abstractions via Neural Mechanism Sparsification

Researchers have developed a new method to extract interpretable causal mechanisms from neural networks using structured pruning as a search technique. The approach reframes network pruning as finding approximate causal abstractions, yielding closed-form criteria for simplifying networks while maintaining their causal structure under interventions.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠

SceneTok: A Compressed, Diffusable Token Space for 3D Scenes

SceneTok introduces a novel 3D scene tokenizer that compresses view sets into permutation-invariant tokens, achieving 1-3 orders of magnitude better compression than existing methods while maintaining state-of-the-art reconstruction quality. The system enables efficient 3D scene generation in 5 seconds using a lightweight decoder that can render novel viewpoints.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 16
🧠

Context and Diversity Matter: The Emergence of In-Context Learning in World Models

Researchers investigate in-context learning (ICL) in world models, identifying two core mechanisms, environment recognition and environment learning, that enable AI systems to adapt to new configurations. The study provides theoretical error bounds and empirical evidence showing that diverse environments and long context windows are crucial for developing self-adapting world models.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠

On Sample-Efficient Generalized Planning via Learned Transition Models

Researchers propose a new approach to generalized planning that learns explicit transition models rather than directly predicting action sequences. This method achieves better out-of-distribution performance with fewer training instances and smaller models compared to Transformer-based planners like PlanGPT.
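
The contrast is concrete enough to sketch: learn a transition model T(s, a) -> s' and recover plans by ordinary search over its predictions, rather than emitting an action sequence directly (`T` below is a hand-written stand-in for a learned model):

```python
from collections import deque

def T(state, action):             # hand-written stand-in for a learned model
    x, y = state
    dx, dy = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}[action]
    return (x + dx, y + dy)

def plan(start, goal, max_depth=10):
    """Breadth-first search through the (learned) model's predictions."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        if len(actions) < max_depth:
            for a in "UDLR":
                nxt = T(state, a)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, actions + [a]))
    return None

print(plan((0, 0), (2, 1)))       # a 3-step plan such as ['U', 'R', 'R']
```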

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 3
🧠

DisQ-HNet: A Disentangled Quantized Half-UNet for Interpretable Multimodal Image Synthesis: Applications to Tau-PET Synthesis from T1 and FLAIR MRI

Researchers developed DisQ-HNet, a new AI framework that synthesizes tau-PET brain scans from MRI data to detect Alzheimer's disease pathology. The method uses advanced neural network architectures to generate cost-effective alternatives to expensive PET imaging while maintaining diagnostic accuracy.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠

Large Language Model Compression with Global Rank and Sparsity Optimization

Researchers propose a novel two-stage compression method for Large Language Models that uses global rank and sparsity optimization to significantly reduce model size. The approach combines low-rank and sparse matrix decomposition with probabilistic global allocation to automatically detect redundancy across different layers and manage component interactions.
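
The per-matrix building block, a low-rank-plus-sparse split, is standard and sketched below; the paper's actual contribution, the probabilistic global allocation across layers, is not attempted here:

```python
import numpy as np

def lowrank_plus_sparse(W, rank=8, keep=0.01):
    """W ~= L + S: best rank-r part via SVD, plus the largest residual entries."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]      # best rank-r approximation
    R = W - L
    thresh = np.quantile(np.abs(R), 1 - keep)     # keep top 1% of residuals
    S = np.where(np.abs(R) >= thresh, R, 0.0)
    return L, S

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
L, S = lowrank_plus_sparse(W)
err = np.linalg.norm(W - (L + S)) / np.linalg.norm(W)
print(f"relative error: {err:.3f}, nonzeros in S: {int((S != 0).sum())}")
```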

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 8
🧠

GRAU: Generic Reconfigurable Activation Unit Design for Neural Network Hardware Accelerators

Researchers propose GRAU, a new reconfigurable activation unit design for neural network hardware accelerators that uses piecewise linear fitting with power-of-two slopes. The design reduces LUT consumption by over 90% compared to traditional multi-threshold activators while supporting mixed-precision quantization and nonlinear functions.
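
The stated trick can be sketched directly: a piecewise-linear activation whose segment slopes are powers of two, so each segment's multiply is a bit shift in fixed-point hardware (toy tanh-like curve of my own design, not the paper's circuit):

```python
import numpy as np

BREAKS = np.array([-1.0, 1.0])            # segment boundaries
SHIFTS = np.array([-3, 0, -3])            # per-segment slope = 2**shift
OFFSETS = np.array([-0.875, 0.0, 0.875])  # chosen so the pieces join continuously

def grau(x):
    seg = np.searchsorted(BREAKS, x)      # which segment each input falls in
    slope = 2.0 ** SHIFTS[seg]            # power-of-two slope: a shift in hardware
    return slope * x + OFFSETS[seg]       # shift-and-add, no general multiplier

x = np.linspace(-3.0, 3.0, 7)
print(grau(x))   # [-1.25, -1.125, -1.0, 0.0, 1.0, 1.125, 1.25], tanh-like
```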

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 8
🧠

Deep Sequence Modeling with Quantum Dynamics: Language as a Wave Function

Researchers introduce a quantum-inspired sequence modeling framework that uses complex-valued wave functions and quantum interference for language processing. The approach shows theoretical advantages over traditional recurrent neural networks by utilizing quantum dynamics and the Born rule for token probability extraction.
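
A toy illustration of the Born-rule readout the summary mentions (construction mine): hold a complex-valued state, compute a complex amplitude per candidate token, and read probabilities as normalized squared magnitudes.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 5, 8
E = rng.standard_normal((vocab, d)) + 1j * rng.standard_normal((vocab, d))

psi = rng.standard_normal(d) + 1j * rng.standard_normal(d)  # "wave function" state
psi /= np.linalg.norm(psi)

amps = E.conj() @ psi      # complex amplitude per token (amplitudes can interfere)
probs = np.abs(amps) ** 2  # Born rule: P(token) proportional to |amplitude|^2
probs /= probs.sum()
print(probs, probs.sum())
```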

AI · Neutral · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠

ReCoN-Ipsundrum: An Inspectable Recurrent Persistence Loop Agent with Affect-Coupled Control and Mechanism-Linked Consciousness Indicator Assays

Researchers developed ReCoN-Ipsundrum, an AI agent architecture designed to exhibit consciousness-like behaviors through recurrent persistence loops and affect-coupled control mechanisms. The study demonstrates how engineered systems can display preference stability, exploratory scanning, and sustained caution behaviors that mimic aspects of conscious experience.

$LINK
AI · Neutral · arXiv – CS AI · Feb 27 · 5/10 · 5
🧠

Scaling Laws for Precision in High-Dimensional Linear Regression

Researchers developed theoretical scaling laws for low-precision AI model training, analyzing how quantization affects model performance in high-dimensional linear regression. The study reveals that multiplicative and additive quantization schemes have distinct effects on effective model size, with multiplicative maintaining full precision while additive reduces it.
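
A quick simulation in the spirit of that distinction (toy linear regression, not the paper's analysis): corrupt fitted weights with multiplicative vs. additive noise of the same nominal level; the additive scheme degrades the fit far more when individual weights are small.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 2000, 50, 0.1
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)     # small per-coordinate weights
y = X @ w_star + sigma * rng.standard_normal(n)

w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
eps = 0.05
w_mult = w_hat * (1 + eps * rng.standard_normal(d))  # noise scales with each weight
w_add = w_hat + eps * rng.standard_normal(d)         # noise at a fixed absolute level

for name, w in [("exact", w_hat), ("multiplicative", w_mult), ("additive", w_add)]:
    print(name, np.mean((X @ w - y) ** 2))           # additive hurts far more here
```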

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 8
🧠

Autoregressive Visual Decoding from EEG Signals

Researchers developed AVDE, a lightweight framework for decoding visual information from EEG brain signals using autoregressive generation. The system outperforms existing methods while using only 10% of the parameters, potentially advancing practical brain-computer interface applications.