y0news

#neural-networks News & Analysis

358 articles tagged with #neural-networks. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

GPrune-LLM: Generalization-Aware Structured Pruning for Large Language Models

Researchers introduce GPrune-LLM, a new structured pruning framework that improves compression of large language models by addressing calibration bias and cross-task generalization issues. The method partitions neurons into behavior-consistent modules and uses adaptive metrics based on distribution sensitivity, showing consistent improvements in post-compression performance.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

IGU-LoRA: Adaptive Rank Allocation via Integrated Gradients and Uncertainty-Aware Scoring

Researchers introduce IGU-LoRA, a new parameter-efficient fine-tuning method for large language models that adaptively allocates ranks across layers using integrated gradients and uncertainty-aware scoring. The approach addresses limitations of existing methods like AdaLoRA by providing more stable and accurate layer importance estimates, consistently outperforming baselines across diverse tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Collapse or Preserve: Data-Dependent Temporal Aggregation for Spiking Neural Network Acceleration

Researchers developed Temporal Aggregated Convolution (TAC) to accelerate spiking neural networks by aggregating spike frames before convolution, achieving a 13.8x speedup on rate-coded data. The study finds that the optimal temporal aggregation strategy depends on the data type: collapsing the temporal dimension suits rate-coded data, while preserving it suits event-based data.

🏢 Nvidia
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

SimCert: Probabilistic Certification for Behavioral Similarity in Deep Neural Network Compression

Researchers developed SimCert, a probabilistic certification framework that verifies behavioral similarity between compressed neural networks and their original versions. The framework addresses critical safety challenges in deploying compressed DNNs on resource-constrained systems by providing quantitative safety guarantees with adjustable confidence levels.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

GradCFA: A Hybrid Gradient-Based Counterfactual and Feature Attribution Explanation Algorithm for Local Interpretation of Neural Networks

Researchers introduce GradCFA, a new hybrid AI explanation framework that combines counterfactual explanations and feature attribution to improve transparency in neural network decisions. The algorithm extends beyond binary classification to multi-class scenarios and demonstrates superior performance in generating feasible, plausible, and diverse explanations compared to existing methods.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Conceptual Views of Neural Networks: A Framework for Neuro-Symbolic Analysis

Researchers introduce 'conceptual views' as a formal framework based on Formal Concept Analysis to globally explain neural networks. Testing on 24 ImageNet models and Fruits-360 datasets shows the framework can faithfully represent models, enable architecture comparison, and extract human-comprehensible rules from neurons.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

On the Adversarial Transferability of Generalized "Skip Connections"

Researchers discovered that skip connections in deep neural networks make adversarial attacks more transferable across different AI models. They developed the Skip Gradient Method (SGM) which exploits this vulnerability in ResNets, Vision Transformers, and even Large Language Models to create more effective adversarial examples.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Protecting Deep Neural Network Intellectual Property with Chaos-Based White-Box Watermarking

Researchers have developed a new white-box watermarking framework that uses chaotic sequences to embed ownership information into deep neural network parameters for intellectual property protection. The method uses logistic maps and genetic algorithms to verify model ownership without degrading performance, showing effectiveness on MNIST and CIFAR-10 datasets.
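The logistic map mentioned in the summary is the recurrence x_{k+1} = r · x_k · (1 − x_k). As a rough illustration of how such a map can act as a keyed bit generator, here is a minimal sketch. The function names, the key values, and the 0.5 binarization threshold are assumptions made for this example; the paper's actual embedding scheme and its genetic-algorithm step are not described in the summary and are not reproduced here.

```python
def logistic_sequence(x0: float, r: float, n: int) -> list[float]:
    """Iterate x_{k+1} = r * x_k * (1 - x_k) and return the n values after x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def watermark_bits(x0: float, r: float, n: int, threshold: float = 0.5) -> list[int]:
    """Binarize the sequence; the 0.5 threshold is an assumption of this sketch."""
    return [1 if x > threshold else 0 for x in logistic_sequence(x0, r, n)]


def matches_key(candidate_bits: list[int], x0: float, r: float) -> bool:
    """Check whether a bit string is exactly what the secret key (x0, r) generates."""
    return candidate_bits == watermark_bits(x0, r, len(candidate_bits))


# Hypothetical secret key: an initial value in (0, 1) and a parameter near 4.
secret_key = (0.3141, 3.99)
bits = watermark_bits(*secret_key, n=16)
print(bits, matches_key(bits, *secret_key))
```

Only the holder of the key can regenerate the same bit string, which is the property an ownership check would rely on in a scheme of this kind.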

AI · Bullish · MarkTechPost · Mar 16 · 7/10

Moonshot AI Releases Attention Residuals to Replace Fixed Residual Mixing with Depth-Wise Attention for Better Scaling in Transformers

Moonshot AI has released Attention Residuals, a new approach that replaces traditional fixed residual connections in Transformer architectures with depth-wise attention mechanisms. The innovation addresses structural problems in PreNorm architectures where all prior layer outputs are mixed equally, potentially improving model scaling capabilities.

AI · Neutral · arXiv – CS AI · Mar 16 · 6/10

Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency

Researchers propose Global Evolutionary Refined Steering (GER-steer), a new training-free framework for controlling Large Language Models without fine-tuning costs. The method addresses issues with existing activation engineering approaches by using geometric stability to improve steering vector accuracy and reduce noise.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

DART: Input-Difficulty-AwaRe Adaptive Threshold for Early-Exit DNNs

Researchers introduce DART, a new framework for early-exit deep neural networks that achieves up to 3.3x speedup and 5.1x lower energy consumption while maintaining accuracy. The system uses input difficulty estimation and adaptive thresholds to optimize AI inference for resource-constrained edge devices.
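For readers unfamiliar with early-exit networks, the general idea can be sketched as follows: each intermediate exit head produces class logits, and inference stops at the first head whose confidence clears a threshold. The linear rule that maps a difficulty score to a threshold below is a hypothetical stand-in, as are the function names and constants; the summary does not say how DART estimates difficulty or sets its thresholds.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_threshold(difficulty: float, low: float = 0.6, high: float = 0.95) -> float:
    """Hypothetical rule: harder inputs (difficulty near 1) need more confidence to exit."""
    d = min(max(difficulty, 0.0), 1.0)
    return low + (high - low) * d


def early_exit(exit_logits: list[list[float]], difficulty: float) -> tuple[int, int]:
    """Return (exit index, predicted class) for the first sufficiently confident head."""
    tau = adaptive_threshold(difficulty)
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= tau:
            return i, probs.index(confidence)
    # No head was confident enough: fall back to the final exit.
    probs = softmax(exit_logits[-1])
    return len(exit_logits) - 1, probs.index(max(probs))


# Logits from three exit heads for one input (made-up numbers).
heads = [[2.0, 0.5, 0.1], [4.0, 0.2, 0.1], [6.0, 0.1, 0.0]]
print(early_exit(heads, difficulty=0.0))  # easy input: leaves at the first head
print(early_exit(heads, difficulty=1.0))  # hard input: needs a deeper head
```

Exiting at an earlier head skips the remaining layers, which is where speed and energy savings of this kind come from.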

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

Tiny Recursive Reasoning with Mamba-2 Attention Hybrid

Researchers developed a hybrid model combining Mamba-2 state space operators with Transformer blocks for recursive reasoning, achieving a 2% improvement in pass@2 performance on ARC-AGI-1 tasks with only 6.83M parameters. The study demonstrates that Mamba-2 operators can preserve reasoning capabilities while improving solution candidate coverage in tiny neural networks.

AI · Bullish · arXiv – CS AI · Mar 12 · 6/10

FAME: Formal Abstract Minimal Explanation for Neural Networks

Researchers introduce FAME (Formal Abstract Minimal Explanations), a new method for explaining neural network decisions that scales to large networks while producing smaller explanations. The approach uses abstract interpretation and dedicated perturbation domains to eliminate irrelevant features and converge to minimal explanations more efficiently than existing methods.

AI · Neutral · arXiv – CS AI · Mar 12 · 6/10

Contract And Conquer: How to Provably Compute Adversarial Examples for a Black-Box Model?

Researchers propose Contract And Conquer (CAC), a new method for provably generating adversarial examples against black-box neural networks using knowledge distillation and search space contraction. The approach provides theoretical guarantees for finding adversarial examples within a fixed number of iterations and outperforms existing methods on ImageNet, including against vision transformers.

AI · Bullish · arXiv – CS AI · Mar 12 · 6/10

When Fine-Tuning Fails and when it Generalises: Role of Data Diversity and Mixed Training in LLM-based TTS

Research demonstrates that LoRA fine-tuning of large language models significantly improves text-to-speech systems, achieving up to 0.42 DNS-MOS gains and 34% SNR improvements when training data has sufficient acoustic diversity. The study establishes LoRA as an effective mechanism for speaker adaptation in compact LLM-based TTS systems, outperforming frozen base models across perceptual quality, speaker fidelity, and signal quality metrics.

AI · Neutral · arXiv – CS AI · Mar 11 · 6/10

CRANE: Causal Relevance Analysis of Language-Specific Neurons in Multilingual Large Language Models

Researchers introduce CRANE, a new framework for analyzing how multilingual large language models organize language capabilities at the neuron level. The method uses targeted interventions to identify language-specific neurons based on functional necessity rather than activation patterns, revealing asymmetric specialization where neurons contribute selectively to specific languages while maintaining broader functionality.

AI · Neutral · arXiv – CS AI · Mar 11 · 6/10

Latent Generative Models with Tunable Complexity for Compressed Sensing and other Inverse Problems

Researchers developed tunable-complexity priors for generative models (diffusion models, normalizing flows, and variational autoencoders) that can dynamically adjust complexity based on the specific inverse problem. The approach uses nested dropout and demonstrates superior performance across compressed sensing, inpainting, denoising, and phase retrieval tasks compared to fixed-complexity baselines.

AI · Bullish · arXiv – CS AI · Mar 11 · 6/10

Architectural Design and Performance Analysis of FPGA based AI Accelerators: A Comprehensive Review

This comprehensive review examines FPGA-based AI accelerators as a promising solution for deep learning workloads, addressing the limitations of ASIC and GPU accelerators. The paper analyzes hardware optimizations including loop pipelining, parallelism, and quantization techniques that make FPGAs attractive for AI applications requiring high performance and energy efficiency.

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

Boosting deep Reinforcement Learning using pretraining with Logical Options

Researchers propose Hybrid Hierarchical RL (H²RL), a new framework that combines symbolic logic with deep reinforcement learning to address misalignment issues in AI agents. The method uses logical option-based pretraining to improve long-horizon decision-making and prevent agents from over-exploiting short-term rewards.

AI · Neutral · arXiv – CS AI · Mar 9 · 6/10

Towards Neural Graph Data Management

Researchers introduce NGDBench, a comprehensive benchmark for evaluating neural networks' ability to work with graph databases across five domains including finance and medicine. The benchmark supports full Cypher query language capabilities and reveals significant limitations in current AI models when handling structured graph data, noise, and complex analytical tasks.

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

Maximizing Asynchronicity in Event-based Neural Networks

Researchers have developed EVA (EVent Asynchronous feature learning), a new framework that improves event-based neural networks by adapting language modeling techniques to process asynchronous visual data from event cameras. EVA demonstrates superior performance on recognition and detection tasks, including 0.477 mAP on the Gen1 detection dataset.

AI · Neutral · arXiv – CS AI · Mar 5 · 5/10

Zono-Conformal Prediction: Zonotope-Based Uncertainty Quantification for Regression and Classification Tasks

Researchers introduce zono-conformal prediction, a new uncertainty quantification method for machine learning that uses zonotope-based prediction sets instead of traditional intervals. The approach is more computationally efficient and less conservative than existing conformal prediction methods while maintaining statistical coverage guarantees for both regression and classification tasks.
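As background, the interval-based baseline that the summary contrasts against can be sketched with standard split conformal prediction for regression. This is not the zonotope method from the paper: it only shows where the coverage guarantee comes from in the usual interval setting. The function names and the calibration residuals are made up for the example.

```python
import math


def conformal_radius(calibration_residuals: list[float], alpha: float) -> float:
    """Split conformal radius: the ceil((n + 1) * (1 - alpha))-th smallest |residual|."""
    n = len(calibration_residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this coverage level: the interval is unbounded.
        return math.inf
    return sorted(calibration_residuals)[k - 1]


def prediction_interval(point_prediction: float, radius: float) -> tuple[float, float]:
    """Symmetric interval around the model's point prediction."""
    return point_prediction - radius, point_prediction + radius


# Absolute errors |y - f(x)| of some model on 10 held-out calibration points (made up).
residuals = [0.3, 0.1, 0.7, 0.5, 0.2, 1.0, 0.9, 0.4, 0.8, 0.6]
radius = conformal_radius(residuals, alpha=0.2)  # targets at least 80% coverage
print(radius, prediction_interval(5.0, radius))
```

Under exchangeability of calibration and test points, an interval built this way covers the true value with probability at least 1 − alpha.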

AI · Bullish · arXiv – CS AI · Mar 5 · 5/10

Weight Space Representation Learning via Neural Field Adaptation

Researchers have developed a new approach using multiplicative LoRA (Low-Rank Adaptation) weights for neural field representation learning, achieving improved quality in reconstruction, generation, and analysis tasks. The method constrains optimization space through pre-trained base models, creating structured weight representations that outperform existing weight-space methods when used with latent diffusion models.

AI · Bullish · arXiv – CS AI · Mar 5 · 5/10

JPmHC Dynamical Isometry via Orthogonal Hyper-Connections

Researchers propose JPmHC (Jacobian-spectrum Preserving manifold-constrained Hyper-Connections), a new deep learning framework that improves upon existing Hyper-Connections by replacing identity skips with trainable linear mixers while controlling gradient conditioning. The framework addresses training instability and memory overhead issues in current deep learning architectures through constrained optimization on specific mathematical manifolds.

AI · Bullish · arXiv – CS AI · Mar 4 · 5/10

Quantum-Inspired Fine-Tuning for Few-Shot AIGC Detection via Phase-Structured Reparameterization

Researchers propose Q-LoRA, a quantum-enhanced fine-tuning method that integrates quantum neural networks into LoRA adapters for improved AI-generated content detection. The study also introduces H-LoRA, a classical variant using Hilbert transforms that achieves similar accuracy improvements of more than 5% over standard LoRA at lower computational cost.