85 articles tagged with #explainable-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠Researchers propose a two-stage LLM framework that uses one model to translate XAI technical outputs into natural language and a second model to verify accuracy, faithfulness, and completeness before delivering explanations to users. The framework includes iterative refinement mechanisms and demonstrates improved reliability across multiple XAI techniques and LLM families.
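The translate-then-verify loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `explain_with_verification`, `toy_translate`, and `toy_verify` are hypothetical names, and the two toy functions are plain string-matching stand-ins for the explainer and verifier LLMs.

```python
from typing import Callable, Tuple


def explain_with_verification(
    xai_output: str,
    translate: Callable[[str, str], str],
    verify: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Draft an explanation, check it, and refine it until it is accepted
    or the round budget runs out. Returns (explanation, accepted)."""
    feedback = ""
    explanation = ""
    for _ in range(max_rounds):
        explanation = translate(xai_output, feedback)
        accepted, feedback = verify(xai_output, explanation)
        if accepted:
            return explanation, True
    return explanation, False


def toy_translate(xai_output: str, feedback: str) -> str:
    # Hypothetical stand-in for the explainer model: it mentions only the
    # first attribution unless the verifier asked for more.
    top = xai_output.split(",")[0].strip()
    text = f"The prediction was driven mainly by {top}."
    if feedback:
        text += f" Other factors: {feedback}."
    return text


def toy_verify(xai_output: str, explanation: str) -> Tuple[bool, str]:
    # Hypothetical stand-in for the verifier model: a completeness check
    # that every attribution appears verbatim in the explanation.
    missing = [f.strip() for f in xai_output.split(",")
               if f.strip() not in explanation]
    return (not missing, ", ".join(missing))


explanation, accepted = explain_with_verification(
    "age=+0.42, income=-0.17", toy_translate, toy_verify
)
print(accepted, explanation)
```

In this toy run the first draft omits one attribution, the verifier reports it as feedback, and the second draft is accepted. A real verifier would also need to judge accuracy and faithfulness, which simple string matching cannot do.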
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠A user study with 200 participants found that explanation correctness in AI systems affects human understanding, but the relationship is not linear: performance drops significantly at 70% correctness and does not degrade further below that threshold. The research challenges assumptions that higher computational correctness metrics automatically translate to better human comprehension of AI decisions.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers developed FairMed-XGB, a machine learning framework that reduces gender bias in healthcare AI models by 40-72% while maintaining predictive accuracy. The system uses Bayesian optimization and explainable AI to ensure equitable treatment decisions in critical care settings.
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠A research paper argues that the most valuable capabilities of large language models are precisely those that cannot be captured by human-readable rules. The thesis is supported by a proof showing that if LLM capabilities could be fully rule-encoded, they would be equivalent to expert systems, which have historically proven weaker than LLMs.
AI · Bullish · arXiv – CS AI · Mar 12 · 7/10
🧠Researchers have developed a new method to detect and eliminate backdoor triggers in neural networks using active path analysis. The approach shows promising results in experiments with machine learning models used for intrusion detection, addressing a critical cybersecurity vulnerability.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce RAG-Driver, a retrieval-augmented multi-modal large language model designed for autonomous driving that can provide explainable decisions and control predictions. The system addresses data scarcity and generalization challenges in AI-driven autonomous vehicles by using in-context learning and expert demonstration retrieval.
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers demonstrate that traditional explainable AI methods designed for static predictions fail when applied to agentic AI systems that make sequential decisions over time. The study shows attribution-based explanations work well for static tasks but trace-based diagnostics are needed to understand failures in multi-step AI agent behaviors.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3
🧠Researchers developed a Neuro-Symbolic Agentic Framework combining machine learning with LLM-based reasoning to predict colorectal cancer drug responses. The system achieved significant predictive accuracy (r=0.504) and introduces 'Inverse Reasoning' for simulating genomic edits to predict drug sensitivity changes.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3
🧠Researchers developed COOL-MC, a tool that combines reinforcement learning with model checking to verify and explain AI policies for platelet inventory management in blood banks. The system achieved a 2.9% stockout probability while providing transparent decision-making explanations for safety-critical healthcare applications.
AI · Neutral · arXiv – CS AI · Mar 4 · 6/10 · 2
🧠Researchers propose PURE, a new framework for AI-powered recommendation systems that addresses preference-inconsistent explanations, where the AI provides factually correct but unconvincing reasoning that conflicts with user preferences. The system uses a select-then-generate approach to improve both evidence selection and explanation generation, demonstrating reduced hallucinations while maintaining recommendation accuracy.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers developed a new graph concept bottleneck layer (GCBM) that can be integrated into Graph Neural Networks to make their decision-making process more interpretable. The method treats graph concepts as 'words' and uses language models to improve understanding of how GNNs make predictions, achieving state-of-the-art performance in both classification accuracy and interpretability.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 5
🧠Researchers have developed DeepMedix-R1, a foundation model for chest X-ray interpretation that provides transparent, step-by-step reasoning alongside accurate diagnoses to address the black-box problem in medical AI. The model uses reinforcement learning to align diagnostic outputs with clinical plausibility and significantly outperforms existing models in report generation and visual question answering tasks.
AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 5
🧠Researchers propose Geodesic Integrated Gradients (GIG), a new method for explaining AI model decisions that uses curved paths instead of straight lines to compute feature importance. The method addresses flawed attributions in existing approaches by integrating gradients along geodesic paths under a model-induced Riemannian metric.
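For context, the straight-line baseline that GIG modifies is standard Integrated Gradients, where feature i receives (x_i - x'_i) times the average of the gradient along the straight path from a baseline x' to the input x. The sketch below implements only that straight-line version, not the geodesic variant. The toy `model` and its hand-written gradient `model_grad` are assumptions chosen so the example runs without an autodiff library.

```python
import numpy as np


def model(x):
    # Toy differentiable model (illustrative assumption):
    # f(x) = sum_i x_i^2 + x_0 * x_1
    return float(np.sum(x ** 2) + x[0] * x[1])


def model_grad(x):
    # Analytic gradient of the toy model above.
    g = 2.0 * x
    g[0] += x[1]
    g[1] += x[0]
    return g


def integrated_gradients(x, baseline, grad_fn, steps=200):
    """Straight-line Integrated Gradients via a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints in (0, 1)
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps


x = np.array([1.0, 2.0, -1.0])
baseline = np.zeros(3)
attributions = integrated_gradients(x, baseline, model_grad)

# Completeness check: attributions should sum to f(x) - f(baseline).
print(attributions, attributions.sum(), model(x) - model(baseline))
```

The completeness property checked in the final line holds for any path between the baseline and the input, so swapping the straight line for a curved path changes how the total is divided among features rather than the total itself.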
AI · Bullish · OpenAI News · May 9 · 7/10 · 6
🧠Researchers used GPT-4 to automatically generate explanations for how individual neurons behave in large language models and to evaluate the quality of those explanations. They have released a comprehensive dataset containing explanations and quality scores for every neuron in GPT-2, advancing AI interpretability research.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers propose a pattern reduction framework for explainable clustering that eliminates redundant k-relaxed frequent patterns (k-RFPs) while maintaining cluster quality. The approach uses formal characterization and optimization strategies to reduce computational complexity in knowledge-driven unsupervised learning systems.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers introduce an interactive workflow combining Sparse Autoencoders (SAE) and activation steering to make AI explainability actionable for practitioners. Through expert interviews with debugging tasks on CLIP, the study reveals that activation steering enables hypothesis testing and intervention-based debugging, though practitioners emphasize trust in observed model behavior over explanation plausibility and identify risks like ripple effects and limited generalization.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers introduce CREAM (Concept Reasoning Models), an advanced framework for Concept Bottleneck Models that allows explicit encoding of concept relationships and concept-to-task mappings. The model maintains interpretability while achieving competitive performance even with incomplete concept sets through an optional side-channel, addressing a key limitation in explainable AI systems.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers have developed a novel algorithm for detecting invariant manifolds in ReLU-based recurrent neural networks (RNNs), enabling analysis of dynamical system behavior through topological and geometrical properties. The method identifies basin boundaries, multistability, and chaotic dynamics, with applications to scientific computing and explainable AI.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers introduce X-SYS, a reference architecture for building interactive explanation systems that operationalize explainable AI (XAI) across production environments. The framework addresses the gap between XAI algorithms and deployable systems by organizing around four quality attributes (scalability, traceability, responsiveness, adaptability) and five service components, with SemanticLens as a concrete implementation for vision-language models.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠This academic paper proposes a neuro-symbolic approach for AGI robots combining neural networks with formal logic reasoning using Belnap's 4-valued logic system. The framework enables robots to handle unknown information, inconsistencies, and paradoxes while maintaining controlled security through axiom-based logic inference.
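Belnap's four-valued logic, mentioned above, can be encoded by tracking two independent bits per statement: whether there is evidence for it and whether there is evidence against it. The sketch below shows that standard encoding. The class name `Belnap` and its field names are illustrative choices and say nothing about how the paper implements its inference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belnap:
    told_true: bool   # some source asserts the statement
    told_false: bool  # some source denies the statement

    def __invert__(self):
        # Negation swaps evidence for and against.
        return Belnap(self.told_false, self.told_true)

    def __and__(self, other):
        # Conjunction: true only if both are true, false if either is false.
        return Belnap(self.told_true and other.told_true,
                      self.told_false or other.told_false)

    def __or__(self, other):
        # Disjunction: true if either is true, false only if both are false.
        return Belnap(self.told_true or other.told_true,
                      self.told_false and other.told_false)


TRUE = Belnap(True, False)
FALSE = Belnap(False, True)
BOTH = Belnap(True, True)       # contradictory information
NEITHER = Belnap(False, False)  # no information

print(BOTH & NEITHER == FALSE)
print(BOTH | NEITHER == TRUE)
print(~BOTH == BOTH, ~NEITHER == NEITHER)
```

Keeping contradiction (`BOTH`) and ignorance (`NEITHER`) as separate values is what lets a reasoner keep working with inconsistent or missing inputs instead of deriving everything from a contradiction.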
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠A comprehensive review examines explainable AI methods for human activity recognition (HAR) systems across wearable, ambient, and physiological sensors. The paper addresses the critical gap between deep learning's performance improvements and the opacity that limits real-world deployment, proposing a unified framework for understanding XAI mechanisms in HAR applications.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠A new thesis examines explainable AI planning (XAIP) for hybrid systems, addressing the critical challenge of making autonomous planning decisions interpretable in safety-critical applications. As AI automation expands into domains like autonomous vehicles, energy grids, and healthcare, the ability to explain system reasoning becomes essential for trust and regulatory compliance.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce REVEAL, an explainable AI framework for detecting AI-generated images through forensic evidence chains and expert-grounded reinforcement learning. The approach addresses the growing challenge of distinguishing synthetic images from authentic ones while providing transparent, verifiable reasoning for detection decisions.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers propose using Inductive Learning of Answer Set Programs (ILASP) to create interpretable approximations of neural networks trained on preference learning tasks. The approach combines dimensionality reduction through Principal Component Analysis with logic-based explanations, addressing the challenge of explaining black-box AI models while maintaining computational efficiency.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce chain-of-illocution (CoI) prompting to improve source faithfulness in retrieval-augmented language models, achieving up to 63% gains in source adherence for programming education tasks. The study reveals that standard RAG systems exhibit low fidelity to source materials, with non-RAG models performing worse, while a user study confirms improved faithfulness does not compromise user satisfaction.