y0news
🧠 AI

12,980 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Unified Vision-Language Modeling via Concept Space Alignment

Researchers introduce V-SONAR, a vision-language embedding system that extends text-only SONAR to support 1500+ languages with vision capabilities. The system demonstrates state-of-the-art performance on video captioning and multilingual vision tasks through V-LCM, which combines vision and language processing in a unified framework.

AI · Bearish · arXiv – CS AI · Mar 3 · 6/10

Hide&Seek: Remove Image Watermarks with Negligible Cost via Pixel-wise Reconstruction

Researchers have developed HIDE&SEEK (HS), a new attack method that can effectively remove watermarks from machine-generated images while maintaining visual quality. This research exposes vulnerabilities in current state-of-the-art proactive image watermarking defenses, highlighting the ongoing arms race between watermarking protection and removal techniques.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

RepoRepair: Leveraging Code Documentation for Repository-Level Automated Program Repair

RepoRepair is a new AI-powered automated program repair system that uses hierarchical code documentation to fix bugs across entire software repositories. The system achieves a 45.7% repair rate on SWE-bench Lite at $0.44 per fix by leveraging LLMs like DeepSeek-V3 and Claude-4 for fault localization and code repair.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

MM-DeepResearch: A Simple and Effective Multimodal Agentic Search Baseline

Researchers introduce MM-DeepResearch, a multimodal AI agent that combines visual and textual reasoning for complex research tasks. The system addresses key challenges in multimodal AI through novel training methods including hypergraph-based data generation and offline search engine optimization.

AI · Bearish · arXiv – CS AI · Mar 3 · 7/10

Turning Black Box into White Box: Dataset Distillation Leaks

Researchers discovered that dataset distillation, a technique for compressing large datasets into smaller synthetic ones, has serious privacy vulnerabilities. The study introduces an Information Revelation Attack (IRA) that can extract sensitive information from synthetic datasets, including predicting the distillation algorithm, model architecture, and recovering original training samples.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

One-Token Verification for Reasoning Correctness Estimation

Researchers introduce One-Token Verification (OTV), a new method that estimates reasoning correctness in large language models during a single forward pass, reducing computational overhead. OTV reduces token usage by up to 90% through early termination while improving accuracy on mathematical reasoning tasks compared to existing verification methods.
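The early-termination idea the summary describes can be sketched as a toy gate: score each reasoning step with a single confidence value (standing in for the probability of a one-token "correct" readout) and stop as soon as the signal is decisive. The function name, thresholds, and fallback rule below are illustrative assumptions, not the paper's OTV.

```python
def one_token_verify(step_scores, threshold=0.9):
    """Toy single-signal verifier with early termination.

    step_scores: per-step probabilities that the chain so far is correct
    (a stand-in for a one-token readout). Returns (steps_used, accepted).
    """
    for i, p in enumerate(step_scores, start=1):
        if p >= threshold:          # confidently right: accept early
            return i, True
        if p <= 1.0 - threshold:    # confidently wrong: reject early
            return i, False
    # No decisive signal: fall back to the final score.
    return len(step_scores), step_scores[-1] >= 0.5

# Confidence rises quickly, so only 3 of 10 steps are scored: a 70% saving.
steps, accepted = one_token_verify([0.4, 0.7, 0.95, 0.96] + [0.99] * 6)
```

In this toy, the token saving reported in the summary corresponds to how early the decisive step arrives relative to the full chain length.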

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Thoth: Mid-Training Bridges LLMs to Time Series Understanding

Researchers have developed Thoth, the first family of Large Language Models specifically designed to understand and reason about time series data through a mid-training approach. The model uses a specialized corpus called Book-of-Thoth to bridge the gap between temporal data and natural language, significantly outperforming existing LLMs in time series analysis tasks.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Data-Free PINNs for Compressible Flows: Mitigating Spectral Bias and Gradient Pathologies via Mach-Guided Scaling and Hybrid Convolutions

Researchers developed a data-free Physics-Informed Neural Network (PINN) that can solve compressible flows around circular cylinders at extreme speeds up to Mach 15. The system uses hybrid convolutions and Mach-guided scaling to overcome traditional limitations and successfully captures shock waves without requiring training data.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10

AG-REPA: Causal Layer Selection for Representation Alignment in Audio Flow Matching

Researchers introduce AG-REPA, a new method for improving audio generation models by strategically selecting which neural network layers to align with teacher models. The approach identifies that layers storing the most information aren't necessarily the most important for generation, leading to better performance in speech and audio synthesis.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

FastCode: Fast and Cost-Efficient Code Understanding and Reasoning

Researchers introduce FastCode, a new framework for AI-assisted software engineering that improves code understanding and reasoning efficiency. The system uses structural scouting to navigate codebases without full-text ingestion, significantly reducing computational costs while maintaining accuracy across multiple benchmarks.
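"Structural scouting" in this spirit can be illustrated with the standard library: parse a source file and surface only top-level names and signatures, so an agent can decide which parts to read in full instead of ingesting the whole text. This is a minimal sketch of the general idea under that assumption, not the FastCode implementation.

```python
import ast

def scout_structure(source):
    """Return an outline of top-level definitions in a Python module,
    without keeping the full source text around."""
    tree = ast.parse(source)
    outline = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            outline.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            outline.append(f"class {node.name}")
    return outline
```

Feeding only such outlines to a model keeps the context cost proportional to the number of definitions rather than the number of source lines.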

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Fully-analog array signal processor using 3D aperture engineering

Researchers developed a fully-analog array signal processor (FASP) using 3D aperture engineering with cascaded metasurface layers that achieves N times higher angular resolution than the Rayleigh diffraction limit. The system can perform super-resolution direction-of-arrival estimation and multi-channel source separation, demonstrating 20 dB radar jamming suppression and 13.5x communication capacity enhancement at 36-41 GHz frequencies.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10

Forgetting is Competition: Rethinking Unlearning as Representation Interference in Diffusion Models

Researchers introduce SurgUn, a surgical unlearning method for text-to-image diffusion models that enables precise removal of specific visual concepts while preserving other capabilities. The approach addresses challenges in copyright compliance and content policy enforcement by applying targeted weight-space updates based on retroactive interference theory.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

AWE: Adaptive Agents for Dynamic Web Penetration Testing

Researchers introduced AWE, a memory-augmented multi-agent framework for autonomous web penetration testing that outperforms existing tools on injection vulnerabilities. AWE achieved 87% XSS success and 66.7% blind SQL injection success on benchmark tests, demonstrating superior accuracy and efficiency compared to general-purpose AI penetration testing tools.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10

EraseAnything++: Enabling Concept Erasure in Rectified Flow Transformers Leveraging Multi-Object Optimization

Researchers introduced EraseAnything++, a new framework for removing unwanted concepts from advanced AI image and video generation models like Stable Diffusion v3 and Flux. The method uses multi-objective optimization to balance concept removal while preserving overall generative quality, showing superior performance compared to existing approaches.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Improving Text-to-Image Generation with Intrinsic Self-Confidence Rewards

Researchers introduced ARC (Adaptive Rewarding by self-Confidence), a new framework for improving text-to-image generation models through self-confidence signals rather than external rewards. The method uses internal self-denoising probes to evaluate model accuracy and converts this into scalar rewards for unsupervised optimization, showing improvements in compositional generation and text-image alignment.
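One way to read "converting self-confidence into scalar rewards" is as an advantage over a running baseline, so the reward signal adapts as the model improves. The class below is a toy with made-up names and a made-up normalisation, not the paper's ARC mechanism.

```python
class SelfConfidenceReward:
    """Toy adaptive reward from a model's own confidence signal.

    Each sample's reward is its confidence centred on a running mean,
    so consistently confident outputs stop being rewarded over time.
    """
    def __init__(self, momentum=0.9):
        self.baseline = 0.5   # running mean confidence
        self.momentum = momentum

    def reward(self, confidence):
        r = confidence - self.baseline  # advantage over the baseline
        # Update the baseline with an exponential moving average.
        self.baseline = (self.momentum * self.baseline
                         + (1 - self.momentum) * confidence)
        return r
```

Because the baseline chases the confidence, the same confidence value earns a smaller reward on each repeat, which is the "adaptive" part of the sketch.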

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Curvature-Weighted Capacity Allocation: A Minimum Description Length Framework for Layer-Adaptive Large Language Model Optimization

Researchers developed a new mathematical framework called Curvature-Weighted Capacity Allocation that optimizes large language model performance by identifying which layers contribute most to loss reduction. The method uses the Minimum Description Length principle to make principled decisions about layer pruning and capacity allocation under hardware constraints.
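The core allocation idea, giving layers with a more curved loss surface a larger share of a fixed parameter budget and marking near-flat layers for pruning, can be sketched with a simple proportional rule. The paper's MDL derivation is far more principled than this toy; the 1% pruning cutoff here is an arbitrary illustration.

```python
def allocate_capacity(curvatures, budget):
    """Toy curvature-weighted allocation.

    curvatures: per-layer curvature proxies (larger = loss drops faster
    when this layer gets capacity). Returns (per-layer shares of the
    budget, indices of near-flat layers flagged as pruning candidates).
    """
    total = sum(curvatures)
    shares = [budget * c / total for c in curvatures]
    # Layers contributing under 1% of total curvature are prune candidates.
    pruned = [i for i, c in enumerate(curvatures) if c / total < 0.01]
    return shares, pruned
```

Under a hardware constraint, `budget` would be the total parameter or FLOP allowance and the shares would be rounded to feasible layer widths.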

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

SimAB: Simulating A/B Tests with Persona-Conditioned AI Agents for Rapid Design Evaluation

SimAB is a new system that uses persona-conditioned AI agents to simulate A/B tests for rapid design evaluation without requiring real user traffic. The system achieved 67% overall accuracy against 47 historical A/B tests, rising to 83% for high-confidence cases, potentially transforming how companies validate design decisions.
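The mechanics of a persona-conditioned simulated A/B test can be sketched in a few lines: sample a vote per persona, take the majority as the predicted winner, and use the vote margin as confidence. The paper's agents are LLM-based; a plain lookup function stands in for them here, so everything below is illustrative.

```python
def simulate_ab(personas, preference_fn):
    """Toy A/B simulation: each simulated user picks variant 'A' or 'B'
    via preference_fn(persona). Returns (predicted winner, confidence),
    where confidence is the vote margin (0 = coin flip, 1 = unanimous)."""
    votes = [preference_fn(p) for p in personas]
    share_a = votes.count("A") / len(votes)
    winner = "A" if share_a >= 0.5 else "B"
    confidence = abs(share_a - 0.5) * 2.0
    return winner, confidence

# 7 of 10 simulated personas prefer variant A: a medium-confidence call.
winner, conf = simulate_ab(
    ["power_user"] * 7 + ["newcomer"] * 3,
    lambda p: "A" if p == "power_user" else "B",
)
```

The summary's "83% accuracy for high-confidence cases" maps to filtering predictions by this margin before comparing them against historical outcomes.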

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

TC-SSA: Token Compression via Semantic Slot Aggregation for Gigapixel Pathology Reasoning

Researchers propose TC-SSA, a token compression framework that enables large vision-language models to process gigapixel pathology images by reducing visual tokens to 1.7% of original size while maintaining diagnostic accuracy. The method achieves 78.34% overall accuracy on SlideBench and demonstrates strong performance across multiple cancer classification tasks.
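Aggregating many tokens into a few semantic slots can be shown with a deliberately tiny version: assign each token (reduced to a 1-D feature here) to its nearest slot and summarise each slot by the mean of its members. Real visual tokens are high-dimensional and the slots are learned; this is only a sketch of the compression step.

```python
def compress_tokens(tokens, slots):
    """Toy semantic-slot aggregation: N scalar token features shrink to
    at most len(slots) aggregated features, one mean per occupied slot."""
    buckets = {i: [] for i in range(len(slots))}
    for t in tokens:
        # Assign the token to the nearest slot centre.
        nearest = min(range(len(slots)), key=lambda i: abs(t - slots[i]))
        buckets[nearest].append(t)
    # Summarise each non-empty slot by its mean.
    return [sum(b) / len(b) for b in buckets.values() if b]

# Five tokens collapse to three slot summaries.
compressed = compress_tokens([0.1, 0.2, 0.9, 1.1, 5.0], slots=[0.0, 1.0, 5.0])
```

At gigapixel scale the same move is what takes the token count down to a small fraction of the original while keeping one representative per semantic region.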

AI · Bearish · arXiv – CS AI · Mar 3 · 6/10

Prompt Sensitivity and Answer Consistency of Small Open-Source Large Language Models on Clinical Question Answering: Implications for Low-Resource Healthcare Deployment

Research evaluated five small open-source language models on clinical question answering, finding that high consistency doesn't guarantee accuracy: models can be reliably wrong. Llama 3.2 showed the best balance of accuracy and reliability, while roleplay prompts consistently reduced performance across all models.
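The "reliably wrong" finding comes from measuring consistency and accuracy separately, and the distinction is easy to make concrete. The metric definitions below are common-sense stand-ins, not necessarily the exact ones used in the paper.

```python
from collections import Counter

def consistency_and_accuracy(runs, gold):
    """runs: per-question lists of repeated model answers; gold: the
    correct answers. Consistency = share of questions where every
    repeat agrees; accuracy = share where the majority answer is right."""
    consistent = sum(len(set(r)) == 1 for r in runs) / len(runs)
    majority = [Counter(r).most_common(1)[0][0] for r in runs]
    accuracy = sum(m == g for m, g in zip(majority, gold)) / len(gold)
    return consistent, accuracy

# A model that always answers 'B' is perfectly consistent yet 25% accurate.
c, a = consistency_and_accuracy([["B", "B", "B"]] * 4, ["A", "B", "C", "D"])
```

This is exactly why consistency alone is a poor deployment gate in low-resource healthcare settings: it measures stability of the answer, not its correctness.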

AI · Bearish · arXiv – CS AI · Mar 3 · 7/10

Artificial Superintelligence May be Useless: Equilibria in the Economy of Multiple AI Agents

A new research paper analyzes economic equilibria between AI and human agents in trading scenarios, finding that unless agents can at least double their marginal utility from purchases, no trading will occur. The study reveals that more powerful AI agents may contribute zero utility to less capable agents in certain equilibria.
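The factor-of-two trade condition reported in the summary can be checked with simple arithmetic. The function below is a toy reading of that headline condition, with hypothetical variable names; the paper's equilibrium model is more detailed.

```python
def trade_occurs(utility_of_good, utility_per_unit_money, price):
    """Toy no-trade condition: a purchase happens only when the utility
    gained from the good at least doubles the utility of the money
    spent (the factor-of-two threshold reported in the summary)."""
    return utility_of_good >= 2 * price * utility_per_unit_money

# Gaining 3 utils for a good priced at 1 money unit worth 1 util:
# 3 >= 2 * 1 * 1, so trade happens; at 1.5 utils it would not.
```

Under this reading, an arbitrarily capable AI agent that can only offer sub-doubling gains contributes nothing tradeable to a less capable agent, which is the paper's "useless superintelligence" equilibrium.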

AI · Neutral · arXiv – CS AI · Mar 3 · 6/10

A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations

Researchers propose a new gauge-theoretic framework for understanding superposition in large language models, replacing traditional single-dictionary approaches with local semantic charts. The method introduces three measurable obstructions to interpretability and demonstrates results on Llama 3.2 3B model with various datasets.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10

A Comprehensive Evaluation of LLM Unlearning Robustness under Multi-Turn Interaction

Researchers found that machine unlearning in large language models, which aims to remove specific training data influence, is less effective in interactive settings than previously thought. Knowledge that appears forgotten in static tests can often be recovered through multi-turn conversations and self-correction interactions.
