🧠

AI

21,454 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

Fully-analog array signal processor using 3D aperture engineering

Researchers developed a fully-analog array signal processor (FASP) using 3D aperture engineering with cascaded metasurface layers that achieves N times higher angular resolution than the Rayleigh diffraction limit. The system performs super-resolution direction-of-arrival estimation and multi-channel source separation, demonstrating 20 dB of radar jamming suppression and a 13.5x communication capacity enhancement at 36-41 GHz.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 9

AWE: Adaptive Agents for Dynamic Web Penetration Testing

Researchers introduced AWE, a memory-augmented multi-agent framework for autonomous web penetration testing that outperforms existing tools on injection vulnerabilities. AWE achieved 87% XSS success and 66.7% blind SQL injection success on benchmark tests, demonstrating superior accuracy and efficiency compared to general-purpose AI penetration testing tools.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7

RepoRepair: Leveraging Code Documentation for Repository-Level Automated Program Repair

RepoRepair is a new AI-powered automated program repair system that uses hierarchical code documentation to fix bugs across entire software repositories. The system achieves a 45.7% repair rate on SWE-bench Lite at $0.44 per fix by leveraging LLMs like DeepSeek-V3 and Claude-4 for fault localization and code repair.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 9

Data-Free PINNs for Compressible Flows: Mitigating Spectral Bias and Gradient Pathologies via Mach-Guided Scaling and Hybrid Convolutions

Researchers developed a data-free Physics-Informed Neural Network (PINN) that can solve compressible flows around circular cylinders at extreme speeds up to Mach 15. The system uses hybrid convolutions and Mach-guided scaling to overcome traditional limitations and successfully captures shock waves without requiring training data.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 9

Improving Text-to-Image Generation with Intrinsic Self-Confidence Rewards

Researchers introduced ARC (Adaptive Rewarding by self-Confidence), a new framework for improving text-to-image generation models through self-confidence signals rather than external rewards. The method uses internal self-denoising probes to evaluate model accuracy and converts this into scalar rewards for unsupervised optimization, showing improvements in compositional generation and text-image alignment.

AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 9

Prompt Sensitivity and Answer Consistency of Small Open-Source Large Language Models on Clinical Question Answering: Implications for Low-Resource Healthcare Deployment

Researchers evaluated five small open-source language models on clinical question answering and found that high consistency does not guarantee accuracy: models can be reliably wrong. Llama 3.2 showed the best balance of accuracy and reliability, while roleplay prompts consistently reduced performance across all models.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 5

AMDS: Attack-Aware Multi-Stage Defense System for Network Intrusion Detection with Two-Stage Adaptive Weight Learning

Researchers developed AMDS, an attack-aware multi-stage defense system for network intrusion detection that uses adaptive weight learning to counter adversarial attacks. The system achieved 94.2% AUC and improved classification accuracy by 4.5 percentage points over existing adversarially trained ensembles by learning attack-specific detection strategies.

AI · Bearish · arXiv – CS AI · Mar 3 · 7/10 · 7

Artificial Superintelligence May be Useless: Equilibria in the Economy of Multiple AI Agents

A new research paper analyzes economic equilibria between AI and human agents in trading scenarios, finding that unless agents can at least double their marginal utility from purchases, no trading will occur. The study reveals that more powerful AI agents may contribute zero utility to less capable agents in certain equilibria.

AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 6

Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact

Research reveals that leading LLM foundation models perform poorly on real-world educational tasks despite excelling on AI benchmarks. The study found that 50% of misalignment errors are shared across models due to common pretraining approaches, and that model ensembles actually worsen performance on learning outcomes.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 6

MultiPUFFIN: A Multimodal Domain-Constrained Foundation Model for Molecular Property Prediction of Small Molecules

Researchers introduce MultiPUFFIN, a multimodal AI foundation model that predicts molecular properties for drug discovery and materials science. The model combines multiple data types and thermodynamic principles to achieve superior performance while using 2000x fewer training molecules than existing models like ChemBERTa-2.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning

Researchers introduce CHIMERA, a compact 9K-sample synthetic dataset that enables smaller AI models to achieve reasoning performance comparable to much larger models. The dataset addresses key challenges in training reasoning-capable LLMs through automated generation and cross-validation across 8 scientific disciplines.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8

PARCER as an Operational Contract to Reduce Variance, Cost, and Risk in LLM Systems

Researchers propose PARCER, a new framework that acts as an operational contract to address major governance challenges in Large Language Model systems. The framework uses structured YAML configurations to reduce variance, improve cost control, and enhance predictability in LLM operations through seven operational phases and decision hygiene practices.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 7

Constitutional Black-Box Monitoring for Scheming in LLM Agents

Researchers developed constitutional black-box monitors to detect scheming behavior in LLM agents using only observable inputs and outputs. The study found that monitors trained on synthetic data can generalize to realistic environments, but performance improvements plateau quickly, and simple optimization techniques outperform complex methods.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7

A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations

Researchers propose a new gauge-theoretic framework for understanding superposition in large language models, replacing traditional single-dictionary approaches with local semantic charts. The method introduces three measurable obstructions to interpretability and demonstrates results on the Llama 3.2 3B model with various datasets.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 9

QANTIS: A Hardware-Validated Quantum Platform for POMDP Planning and Multi-Target Data Association

QANTIS is a hardware-validated quantum computing platform that demonstrates quadratic improvements in autonomous navigation planning problems and multi-target data association tasks. The research shows successful implementation on IBM quantum hardware, achieving 5.1x amplification of rare observation probabilities while maintaining Bayesian posterior accuracy.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 6

Identifying and Characterising Response in Clinical Trials: Development and Validation of a Machine Learning Approach in Colorectal Cancer

Researchers developed a machine learning approach that combines the Virtual Twins method with survLIME to identify patient subgroups who respond differently to treatments in clinical trials. The method achieved an AUC of 0.77 for identifying treatment responders in colorectal cancer trials and found genetic mutations, metastasis sites, and ethnicity to be key response factors.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7

ContextCov: Deriving and Enforcing Executable Constraints from Agent Instruction Files

Researchers have developed ContextCov, a framework that converts passive natural language instructions for AI agents into active, executable guardrails to prevent code violations. The system addresses 'Context Drift' where AI agents deviate from project guidelines, creating automated compliance checks across static code analysis, runtime commands, and architectural validation.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7

Stroke outcome and evolution prediction from CT brain using a spatiotemporal diffusion autoencoder

Researchers developed a spatiotemporal diffusion autoencoder using CT brain images to predict stroke outcomes and evolution. The AI model achieved best-in-class performance for predicting next-day severity and functional outcomes using a dataset of 5,824 CT images from 3,573 patients across two medical centers.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 9

MM-DeepResearch: A Simple and Effective Multimodal Agentic Search Baseline

Researchers introduce MM-DeepResearch, a multimodal AI agent that combines visual and textual reasoning for complex research tasks. The system addresses key challenges in multimodal AI through novel training methods including hypergraph-based data generation and offline search engine optimization.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7

Curvature-Weighted Capacity Allocation: A Minimum Description Length Framework for Layer-Adaptive Large Language Model Optimization

Researchers developed a new mathematical framework called Curvature-Weighted Capacity Allocation that optimizes large language model performance by identifying which layers contribute most to loss reduction. The method uses the Minimum Description Length principle to make principled decisions about layer pruning and capacity allocation under hardware constraints.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 7

A Comprehensive Evaluation of LLM Unlearning Robustness under Multi-Turn Interaction

Researchers found that machine unlearning in large language models, which aims to remove specific training data influence, is less effective in interactive settings than previously thought. Knowledge that appears forgotten in static tests can often be recovered through multi-turn conversations and self-correction interactions.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 6

General Proximal Flow Networks

Researchers introduce General Proximal Flow Networks (GPFNs), a generalization of Bayesian Flow Networks that allows for arbitrary divergence functions instead of fixed Kullback-Leibler divergence. The framework enables iterative generative modeling with improved generation quality when divergence functions are adapted to underlying data geometry.

AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 6

LangGap: Diagnosing and Closing the Language Gap in Vision-Language-Action Models

Researchers reveal that state-of-the-art Vision-Language-Action (VLA) models largely ignore language instructions despite achieving 95% success on standard benchmarks. The new LangGap benchmark exposes significant language understanding deficits, with targeted data augmentation only partially addressing the fundamental challenge of diverse instruction comprehension.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8

AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution

Researchers introduced AlignVAR, a new visual autoregressive framework for image super-resolution that delivers 10x faster inference with 50% fewer parameters than leading diffusion-based approaches. The system addresses key challenges in image reconstruction through improved spatial consistency and hierarchical constraints, establishing a more efficient paradigm for high-quality image enhancement.

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7

Theory of Code Space: Do Code Agents Understand Software Architecture?

Researchers introduce Theory of Code Space (ToCS), a new benchmark that evaluates AI agents' ability to understand software architecture across multi-file codebases. The study reveals significant performance gaps between frontier LLM agents and rule-based baselines, with F1 scores ranging from 0.129 to 0.646.
