AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers developed NextHAM, a deep learning method for predicting electronic-structure Hamiltonians of materials, offering significant computational efficiency advantages over traditional DFT methods. The system introduces a neural E(3)-symmetry architecture and a new dataset, Materials-HAM-SOC, with 17,000 material structures spanning 68 elements.
AI · Bullish · OpenAI News · Apr 23 · 7/10 · 5
🧠Researchers have developed the Sparse Transformer, a deep neural network that achieves new performance records in sequence prediction for text, images, and sound. The model uses an improved attention mechanism that can process sequences 30 times longer than previously possible.
AI · Bullish · arXiv – CS AI · 1d ago · 6/10
🧠Researchers introduce the Bond Smoothness Characterization Test (BSCT), a new evaluation metric for Machine Learning Interatomic Potentials that efficiently detects physical inaccuracies in quantum potential energy surfaces. By combining BSCT with architectural refinements like differentiable k-nearest neighbors and temperature-controlled attention, the team demonstrates how systematic model design can achieve both low regression errors and stable molecular dynamics simulations.
AI · Bullish · arXiv – CS AI · 1d ago · 6/10
🧠Researchers introduce SpikeWFM, a hybrid neural architecture combining spiking neural networks with transformer-based models for wireless communications. The approach aims to improve noise resilience and energy efficiency in wireless foundation models while maintaining strong performance across diverse prediction tasks like channel estimation and positioning.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers introduce LALE, a lightweight transformer architecture for remote sensing image segmentation that achieves strong efficiency-performance trade-offs by separating high-resolution local feature processing (via ConvMixer) from low-resolution global context modeling (via transformers). The approach demonstrates that a 1.6M-parameter model can reach near-SOTA performance while requiring 4.5x fewer parameters and 17x fewer computational operations.
AI · Bullish · arXiv – CS AI · 1d ago · 6/10
🧠Researchers introduce DAStatFormer, a hybrid Transformer model that dramatically improves Distributed Acoustic Sensing (DAS) event classification by extracting 24 statistical features per channel instead of processing raw signals. It achieves 99.4% accuracy on benchmark datasets while significantly reducing computational requirements compared to existing deep learning approaches.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠GIRL-DETR introduces a novel reinforcement learning approach for video moment retrieval that addresses the optimization gap between training losses and evaluation metrics. By freezing backbone networks and applying progressive RL only to detection heads, the method achieves significant accuracy improvements while protecting learned feature representations in lightweight models.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers propose a novel offline meta-reinforcement learning framework combining information-theoretic task representation learning with Transformer-based world models to address distribution shifts in sparse-reward environments. The approach extracts behavior-invariant task representations and applies conservative value penalties to prevent model exploitation, demonstrating improved generalization over existing methods.
AI · Bullish · arXiv – CS AI · 1d ago · 6/10
🧠Researchers propose Morlet Spectral Transformer (MST), a novel neural network architecture for detecting emotions from EEG brain signals across different subjects. The method outperforms larger pretrained models by using specialized wavelet-based signal processing and frequency-specific spatial analysis, demonstrating that intelligent representation design can replace computationally expensive pretraining approaches.
AI · Neutral · arXiv – CS AI · 1d ago · 5/10
🧠Researchers introduce HRTFformer, a transformer-based neural network that improves the spatial upsampling of Head-Related Transfer Functions (HRTFs) used in immersive audio applications. By leveraging attention mechanisms and spherical harmonic domain processing, the model reconstructs high-fidelity spatial audio from sparse measurements with improved accuracy and realistic spatial coherence.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers developed deep learning models using BLSTM and transformer architectures to predict full-body human posture during dynamic load-reaching tasks. A novel cost function enforcing constant body segment lengths improved prediction accuracy by 8-21%, with transformer models achieving 58% better long-term performance than LSTM alternatives.
AI · Neutral · arXiv – CS AI · 2d ago · 5/10
🧠ConTrans, a novel neural network architecture, advances zero-shot temporal action localization by combining convolutional and transformer layers to capture both local frame dependencies and long-range video context. The approach achieves new benchmark performance on standard datasets, addressing limitations in existing methods that underutilize local correlations between frames.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose Bottom-up Policy Optimization (BuPO), a novel reinforcement learning approach that optimizes internal layers of language models rather than treating them as unified policies. The study reveals that LLMs contain distinct internal policy structures with different entropy patterns across layers, offering new insights into how transformer-based models process reasoning tasks.
🧠 Llama
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose block-based double decoders, a transformer architecture that combines the training efficiency of decoder-only models with the inference speed advantages of encoder-decoder models. The innovation uses doubly-causal block-based attention masks to enable full loss supervision and static sequence packing, achieving a 2/3 reduction in KV-cache memory and per-token compute at inference time.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers introduce the Cognitive Categorical Transformer (CCT), a 306M-parameter language model that applies category-theoretic principles to improve upon GPT-2 Small, achieving a 12% relative perplexity reduction on WikiText-103. The work provides empirical validation that simplicial message passing enhances language modeling performance and identifies a distinction between topology-adding and consistency-enforcing categorical priors.
🏢 Perplexity
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers introduce CosmicFish-HRM, a compact language model that uses a Hierarchical Reasoning Module to dynamically adjust computational effort during inference based on input complexity. The approach challenges the assumption that larger models are necessary for advanced reasoning, suggesting adaptive computation depth could offer efficiency gains as model scale increases.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers introduce GASP, a framework that enhances Vision-Language Models' 3D spatial reasoning by injecting geometric priors directly into transformer layers rather than relying on 3D VQA datasets. The approach uses contrastive learning on point correspondences and depth consistency supervision, achieving 70%+ correspondence accuracy and 18-29% improvements on spatial benchmarks without any 3D VQA training data.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers propose In-Context Reward Adaptation, a transformer-based framework that dynamically models diverse human preferences without costly retraining. By incorporating human response time as an auxiliary signal, the approach enables language models to adapt to unseen preference domains on-the-fly, addressing a critical limitation of static reward models used in RLHF systems.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce Trinity, a transformer-based AI system that unifies terrain and semantic segmentation for outdoor robots using synthetic data. The approach enables robot-agnostic terrain understanding without predefined labels, improving transferability across different robotic platforms and reducing annotation costs.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce EigeNet, a geometry-informed deep learning framework for predicting Room Impulse Response (RIR) in spatial audio from limited observations. The model combines transformer architecture with acoustic ray tracing principles to achieve state-of-the-art performance in few-shot novel view RIR prediction and demonstrates strong sim-to-real generalization capabilities.
AI · Bullish · arXiv – CS AI · 6d ago · 6/10
🧠Researchers propose LaneRoPE, a novel technique that enables multiple parallel language model sequences to coordinate and share information during generation, improving reasoning accuracy without significant architectural changes or inference overhead.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers have developed methods to identify which attention heads in Large Language Models are responsible for specific reasoning steps, revealing that only ~3% of heads handle factual retrieval while higher layers coordinate multi-step reasoning algorithms. This work provides insights into how LLMs learn logical reasoning from limited demonstrations and could improve model interpretability and design.
AI · Bullish · arXiv – CS AI · May 27 · 6/10
🧠Researchers introduce HRVConformer, a deep learning model combining convolutional and Transformer architectures to classify neonatal hypoxic-ischemic encephalopathy (HIE) from heart rate signals. The model achieves 83.23% AUC and 74.56% accuracy, outperforming traditional baselines by automating HIE detection without requiring handcrafted features.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers introduce PaGeR, a framework that adapts 3D foundation models trained on perspective images to work with panoramic imagery, enabling geometry estimation from 360-degree scenes. The unified model predicts depth, surface normals, and sky masks from both standard and panoramic images in a single pass, achieving state-of-the-art performance on indoor and outdoor scenes.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers propose CAT (Cross-scale Aligned Transformer), a new GAN training method that addresses the cross-scale trajectory misalignment problem in multi-stage image generation. By adding consistency regularization between intermediate and final outputs, CAT achieves state-of-the-art results on ImageNet-256 with one-step inference, reaching an FID-50K of 1.56 after just 60 training epochs.