12,908 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce EpisTwin, a neuro-symbolic AI framework that creates Personal Knowledge Graphs from fragmented user data across applications. The system combines Graph Retrieval-Augmented Generation with visual refinement to enable complex reasoning over personal semantic data, addressing current limitations in personal AI systems.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce CARE (Contrastive Anchored REflection), a new AI training framework that improves multimodal reasoning by learning from failures rather than just successes. The method achieved a 4.6-point accuracy improvement on visual-reasoning benchmarks and reached state-of-the-art results on MathVista and MMMU-Pro when tested on Qwen models.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed A-3PO, an optimization technique for training large language models that removes a key computational overhead in reinforcement learning algorithms. The approach achieves a 1.8x training speedup while maintaining comparable performance by approximating the proximal policy through interpolation rather than computing it explicitly.
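By way of illustration (a hypothetical sketch, not A-3PO's actual formulation — the function names and the interpolation scheme here are assumptions), PPO-style training computes an explicit clipped surrogate per token, while an interpolation-based proxy can skip that step:

```python
import math

def ppo_clip_loss(ratio, advantage, eps=0.2):
    # Standard PPO surrogate: needs the explicit new/old probability
    # ratio and a clip to keep the update proximal.
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return -min(ratio * advantage, clipped * advantage)

def interpolated_proxy_loss(old_logp, new_logp, advantage, alpha=0.5):
    # Hypothetical alternative: approximate the proximal policy's
    # log-prob by interpolating between the old and new policies,
    # avoiding the explicit clipped-objective computation.
    proxy_logp = (1.0 - alpha) * old_logp + alpha * new_logp
    return -math.exp(proxy_logp - old_logp) * advantage
```

For an unchanged policy (old_logp == new_logp), both losses reduce to -advantage; the interpolation damps how far the effective ratio can drift in a single update.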
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce 3DThinker, a new framework that enables vision-language models to perform 3D spatial reasoning from limited 2D views without requiring 3D training data. The system uses a two-stage training approach to align 3D representations with foundation models and demonstrates superior performance across multiple benchmarks.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce Answer-Then-Check, a novel safety alignment approach for large language models that enables them to evaluate the safety of a response before outputting it to users. The method uses a new 80K-sample dataset called Reasoned Safety Alignment (ReSA) and demonstrates improved jailbreak defense while maintaining general reasoning capabilities.
🏢 Hugging Face
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed MAP (Map-Level Attention Processing), a training-free method to reduce hallucinations in Large Vision-Language Models by treating hidden states as 2D semantic maps. The approach uses attention-based operations to better leverage factual information and improve consistency between generated text and visual inputs.
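By way of analogy (a simplified sketch with assumed names; MAP's actual attention-based operations differ), treating hidden states as a 2D map makes spatial operations like neighborhood aggregation available — here a uniform 4-neighbor average stands in for the attention step:

```python
def smooth_semantic_map(grid):
    # Hypothetical map-level operation: average each cell with its
    # 4-neighbors, a stand-in for attention over a 2D semantic map.
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            neigh = [grid[i][j]]
            if i > 0:
                neigh.append(grid[i - 1][j])
            if i < h - 1:
                neigh.append(grid[i + 1][j])
            if j > 0:
                neigh.append(grid[i][j - 1])
            if j < w - 1:
                neigh.append(grid[i][j + 1])
            out[i][j] = sum(neigh) / len(neigh)
    return out
```

The point of the map view is that spatially adjacent hidden states can inform one another, which a purely sequential treatment of hidden states does not allow.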
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduced VLMQ, a post-training quantization framework specifically designed for vision-language models that addresses visual over-representation and modality gaps. The method achieves significant performance improvements, including 16.45% better results on MME-RealWorld under 2-bit quantization compared to existing approaches.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers have developed EVA (EVent Asynchronous feature learning), a new framework that improves event-based neural networks by adapting language modeling techniques to process asynchronous visual data from event cameras. EVA demonstrates superior performance on recognition and detection tasks, achieving breakthrough results including 0.477 mAP on the Gen1 dataset for demanding detection applications.
AI · Neutral · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce KramaBench, a comprehensive benchmark testing AI systems' ability to execute end-to-end data processing pipelines on real-world data lakes. The study reveals significant limitations in current AI systems, with the best-performing system achieving only 55% accuracy in full data-lake scenarios and leading LLMs correctly implementing just 20% of individual data tasks.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠This research survey examines Federated Learning (FL), a distributed machine learning approach that enables collaborative AI model training without centralizing sensitive data. The paper covers FL's technical challenges, privacy mechanisms, and applications across healthcare, finance, and IoT systems.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed XR-DT, an Extended Reality-enhanced Digital Twin framework that combines augmented, virtual, and mixed reality to improve human-robot interaction in shared workspaces. The system uses a novel Human-Aware Model Predictive Path Integral control model with ATLAS, a Transformer-based trajectory prediction system, to enable safer and more interpretable robot navigation around humans.
AI · Bearish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers conducted a controlled study examining the effectiveness of large language models (LLMs) for time series forecasting, finding that existing approaches often overfit to small datasets. Despite some promise, LLMs did not consistently outperform models specifically trained on large-scale time series data.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠A comprehensive survey examines how large multimodal language models are transforming scientific research across five key areas: literature search, idea generation, content creation, multimodal artifact production, and peer review evaluation. The research highlights both the potential for AI-assisted scientific discovery and the ethical concerns regarding research integrity and misuse of generative models.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed an interpretable AI framework for fetal ultrasound image classification that incorporates medical concepts and clinical knowledge. The system uses graph convolutional networks to establish relationships between key medical concepts, providing explanations that align with clinicians' cognitive processes rather than just pixel-level analysis.
AI · Neutral · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce AgoraBench, a new framework for improving Large Language Models' bargaining and negotiation capabilities through utility-based feedback mechanisms. The study reveals that current LLMs struggle with strategic depth in negotiations and proposes human-aligned metrics and training methods to enhance their performance.
AI · Neutral · arXiv – CS AI · Mar 9 · 6/10
🧠A systematic literature review of 346 papers reveals critical flaws in AI data annotation practices, arguing that treating human disagreement as 'noise' rather than meaningful signal undermines model quality. The study proposes pluralistic annotation frameworks that embrace diverse human perspectives instead of forcing artificial consensus.
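The contrast can be made concrete (a minimal sketch; the pluralistic frameworks proposed in the review are richer than this): majority voting collapses disagreement into a single label, while a pluralistic aggregation keeps the full distribution of annotator judgments as a soft training target:

```python
from collections import Counter

def majority_vote(labels):
    # Conventional aggregation: disagreement is discarded as noise.
    return Counter(labels).most_common(1)[0][0]

def label_distribution(labels):
    # Pluralistic aggregation: keep the full distribution of annotator
    # judgments as a soft target instead of forcing consensus.
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}
```

For an item annotated ["toxic", "toxic", "ok"], majority voting erases the minority view entirely, while the distribution {toxic: 2/3, ok: 1/3} preserves it as signal the model can learn from.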
AI · Neutral · arXiv – CS AI · Mar 9 · 6/10
🧠This position paper argues against anthropomorphizing intermediate tokens generated by language models as 'reasoning traces' or 'thoughts'. The authors contend that treating these computational outputs as human-like thinking processes is misleading and potentially harmful to AI research and understanding.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduced RAMoEA-QA, a new AI system that uses hierarchical specialization to answer questions about respiratory audio recordings from mobile devices. The system employs a two-stage routing approach with Audio Mixture-of-Experts and Language Mixture-of-Adapters to handle diverse recording conditions and query types, achieving 0.72 test accuracy compared to 0.61-0.67 for existing baselines.
AI · Neutral · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduced VisioMath, a new benchmark with 1,800 K-12 math problems designed to test Large Multimodal Models' ability to distinguish between visually similar diagrams. The study reveals that current state-of-the-art models struggle with fine-grained visual reasoning, often relying on shallow positional heuristics rather than proper image-text alignment.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce PONTE, a human-in-the-loop framework that creates personalized, trustworthy AI explanations by combining user preference modeling with verification modules. The system addresses the challenge of one-size-fits-all AI explanations by adapting to individual user expertise and cognitive needs while maintaining faithfulness and reducing hallucinations.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed an AI system that can detect fetal orofacial clefts in ultrasound images with over 93% sensitivity and 95% specificity, matching senior radiologist performance. The system was trained on 45,139 ultrasound images from 9,215 fetuses across 22 hospitals and can also improve junior radiologist diagnostic accuracy by 6%.
🏢 Microsoft
AI · Neutral · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers have developed ESAA-Security, a new architecture for conducting secure, verifiable audits of AI-generated code using structured agent workflows rather than unstructured LLM conversations. The system creates an immutable audit trail through event-sourcing and produces comprehensive security reports across 26 tasks and 95 executable checks.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce Dynamic Chunking Diffusion Transformer (DC-DiT), a new AI model that adaptively processes images by allocating more computational resources to detail-rich regions and fewer to uniform backgrounds. The system improves image generation quality while reducing computational costs by up to 16x compared to traditional diffusion transformers.
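A toy version of adaptive allocation (hypothetical; DC-DiT's actual chunking mechanism is learned, and `chunk_budget` is an invented helper) assigns each region a token budget proportional to its pixel variance, so detail-rich regions get more compute:

```python
def chunk_budget(patch_variances, total_tokens):
    # Hypothetical allocation: give each image region a token budget
    # proportional to its pixel variance, with at least one token each,
    # so detail-rich regions receive more compute than flat backgrounds.
    total_var = sum(patch_variances) or 1.0
    return [max(1, round(total_tokens * v / total_var))
            for v in patch_variances]
```

A region eight times as varied as its neighbors ends up with roughly eight times the token budget, which is where the reported compute savings on uniform backgrounds would come from.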
AI · Bearish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed a new framework to assess moral competence in large language models, finding that current evaluations may overestimate AI moral reasoning capabilities. While LLMs outperformed humans on standard ethical scenarios, they performed significantly worse when required to identify morally relevant information from noisy data.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed a new training method to improve the robustness of AI foundation models like SAM3 for medical image segmentation by reducing sensitivity to prompt variations. The approach groups semantically similar prompts together and uses consistency constraints to ensure more reliable predictions across different prompt formulations.
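One simple way to express such a consistency constraint (an assumed sketch, not the paper's exact loss) is to penalize the variance of predictions produced for a group of semantically equivalent prompts:

```python
def consistency_penalty(group_predictions):
    # Hypothetical consistency constraint: penalize the variance of the
    # model's predictions across semantically equivalent prompts, so
    # rephrasing a prompt cannot change the segmentation output.
    mean = sum(group_predictions) / len(group_predictions)
    return sum((p - mean) ** 2 for p in group_predictions) / len(group_predictions)
```

Minimizing this term during fine-tuning pushes the model toward identical predictions regardless of how a prompt is phrased, which is the robustness property the summary describes.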