#explainable-ai News & Analysis

123 articles tagged with #explainable-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

123 articles

AINeutralarXiv – CS AI · 3d ago7/10

🧠

Towards Faithful Agentic XAI: A Verification Method and an Open-World Benchmark for Better Model Faithfulness

Researchers propose Faithful Agentic XAI (FAX), a framework that improves the reliability of AI explanations generated by large language models through explicit verification mechanisms. The study introduces CRAFTER-XAI-Bench, a new benchmark for testing explanation faithfulness in complex environments, demonstrating that current XAI systems can produce plausible but inaccurate explanations that mislead users.

AIBearisharXiv – CS AI · May 97/10

🧠

Evaluating Explainability in Safety-Critical ATR Systems: Limitations of Post-Hoc Methods and Paths Toward Robust XAI

A peer-reviewed study evaluates explainability methods in AI systems used for automatic target recognition in safety-critical applications, revealing that popular post-hoc explanation techniques have significant limitations including spurious explanations and vulnerability to manipulation. The research argues that current XAI approaches are insufficient for deployment in high-stakes environments and calls for more robust, causally-grounded methods that prioritize system assurance over visual plausibility.

AINeutralarXiv – CS AI · Apr 207/10

🧠

Towards Intrinsic Interpretability of Large Language Models:A Survey of Design Principles and Architectures

A new survey examines intrinsic interpretability approaches for Large Language Models, categorizing design methods that build transparency directly into model architectures rather than applying post-hoc explanations. The research identifies five key paradigms—functional transparency, concept alignment, representational decomposability, explicit modularization, and latent sparsity induction—addressing the critical challenge of making LLMs more trustworthy and safer for deployment.

AIBullisharXiv – CS AI · Apr 157/10

🧠

A Two-Stage LLM Framework for Accessible and Verified XAI Explanations

Researchers propose a two-stage LLM framework that uses one model to translate XAI technical outputs into natural language and a second model to verify accuracy, faithfulness, and completeness before delivering explanations to users. The framework includes iterative refinement mechanisms and demonstrates improved reliability across multiple XAI techniques and LLM families.

AINeutralarXiv – CS AI · Mar 277/10

🧠

Does Explanation Correctness Matter? Linking Computational XAI Evaluation to Human Understanding

A user study with 200 participants found that while explanation correctness in AI systems affects human understanding, the relationship is not linear - performance drops significantly at 70% correctness but doesn't degrade further below that threshold. The research challenges assumptions that higher computational correctness metrics automatically translate to better human comprehension of AI decisions.

AINeutralarXiv – CS AI · Mar 177/10

🧠

Why the Valuable Capabilities of LLMs Are Precisely the Unexplainable Ones

A research paper argues that the most valuable capabilities of large language models are precisely those that cannot be captured by human-readable rules. The thesis is supported by proof showing that if LLM capabilities could be fully rule-encoded, they would be equivalent to expert systems, which have been proven historically weaker than LLMs.

AIBullisharXiv – CS AI · Mar 177/10

🧠

FairMed-XGB: A Bayesian-Optimised Multi-Metric Framework with Explainability for Demographic Equity in Critical Healthcare Data

Researchers developed FairMed-XGB, a machine learning framework that reduces gender bias in healthcare AI models by 40-72% while maintaining predictive accuracy. The system uses Bayesian optimization and explainable AI to ensure equitable treatment decisions in critical care settings.

AIBullisharXiv – CS AI · Mar 127/10

🧠

Detecting and Eliminating Neural Network Backdoors Through Active Paths with Application to Intrusion Detection

Researchers have developed a new method to detect and eliminate backdoor triggers in neural networks using active path analysis. The approach shows promising results in experiments with machine learning models used for intrusion detection, addressing a critical cybersecurity vulnerability.

AINeutralarXiv – CS AI · Mar 97/10

🧠

From Features to Actions: Explainability in Traditional and Agentic AI Systems

Researchers demonstrate that traditional explainable AI methods designed for static predictions fail when applied to agentic AI systems that make sequential decisions over time. The study shows attribution-based explanations work well for static tasks but trace-based diagnostics are needed to understand failures in multi-step AI agent behaviors.

AIBullisharXiv – CS AI · Mar 97/10

🧠

RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model

Researchers introduce RAG-Driver, a retrieval-augmented multi-modal large language model designed for autonomous driving that can provide explainable decisions and control predictions. The system addresses data scarcity and generalization challenges in AI-driven autonomous vehicles by using in-context learning and expert demonstration retrieval.

AIBullisharXiv – CS AI · Mar 46/103

🧠

Contextual Invertible World Models: A Neuro-Symbolic Agentic Framework for Colorectal Cancer Drug Response

Researchers developed a Neuro-Symbolic Agentic Framework combining machine learning with LLM-based reasoning to predict colorectal cancer drug responses. The system achieved significant predictive accuracy (r=0.504) and introduces 'Inverse Reasoning' for simulating genomic edits to predict drug sensitivity changes.

AINeutralarXiv – CS AI · Mar 46/102

🧠

Beyond Factual Correctness: Mitigating Preference-Inconsistent Explanations in Explainable Recommendation

Researchers propose PURE, a new framework for AI-powered recommendation systems that addresses preference-inconsistent explanations - where AI provides factually correct but unconvincing reasoning that conflicts with user preferences. The system uses a select-then-generate approach to improve both evidence selection and explanation generation, demonstrating reduced hallucinations while maintaining recommendation accuracy.

AIBullisharXiv – CS AI · Mar 46/103

🧠

COOL-MC: Verifying and Explaining RL Policies for Platelet Inventory Management

Researchers developed COOL-MC, a tool that combines reinforcement learning with model checking to verify and explain AI policies for platelet inventory management in blood banks. The system achieved a 2.9% stockout probability while providing transparent decision-making explanations for safety-critical healthcare applications.

AINeutralarXiv – CS AI · Mar 37/104

🧠

Revealing Combinatorial Reasoning of GNNs via Graph Concept Bottleneck Layer

Researchers developed a new graph concept bottleneck layer (GCBM) that can be integrated into Graph Neural Networks to make their decision-making process more interpretable. The method treats graph concepts as 'words' and uses language models to improve understanding of how GNNs make predictions, achieving state-of-the-art performance in both classification accuracy and interpretability.

AIBullisharXiv – CS AI · Mar 37/105

🧠

Toward Clinically Explainable AI for Medical Diagnosis: A Foundation Model with Human-Compatible Reasoning via Reinforcement Learning

Researchers have developed DeepMedix-R1, a foundation model for chest X-ray interpretation that provides transparent, step-by-step reasoning alongside accurate diagnoses to address the black-box problem in medical AI. The model uses reinforcement learning to align diagnostic outputs with clinical plausibility and significantly outperforms existing models in report generation and visual question answering tasks.

AINeutralarXiv – CS AI · Feb 277/105

🧠

Using the Path of Least Resistance to Explain Deep Networks

Researchers propose Geodesic Integrated Gradients (GIG), a new method for explaining AI model decisions that uses curved paths instead of straight lines to compute feature importance. The method addresses flawed attributions in existing approaches by integrating gradients along geodesic paths under a model-induced Riemannian metric.

AIBullishOpenAI News · May 97/106

🧠

Language models can explain neurons in language models

Researchers used GPT-4 to automatically generate explanations for how individual neurons behave in large language models and to evaluate the quality of those explanations. They have released a comprehensive dataset containing explanations and quality scores for every neuron in GPT-2, advancing AI interpretability research.

AINeutralarXiv – CS AI · 2d ago6/10

🧠

Comparing Post-Hoc Explainable AI Methods for Interpreting Black-Box EEG Models in Depression Detection

Researchers compared five post-hoc explainability methods for interpreting deep learning models trained to detect Major Depressive Disorder from EEG data. While different attribution approaches showed partially overlapping patterns emphasizing frontal and temporal brain regions, the study reveals methodological assumptions significantly influence interpretability results, cautioning against treating findings as definitive clinical biomarkers.

AINeutralarXiv – CS AI · 2d ago6/10

🧠

Evolving Features vs Evolving Entire Trees with GP for Interpretable Survival Analysis

Researchers propose using genetic programming to evolve interpretable feature sets and tree structures for survival analysis models, demonstrating improved predictive performance while maintaining shallow, explainable decision trees. The approach addresses the fundamental trade-off between accuracy and interpretability in medical survival prediction by optimizing both feature construction and tree logic simultaneously.

AINeutralarXiv – CS AI · 2d ago6/10

🧠

Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback

Researchers propose integrating explicit user feedback (comments, reviews, verbal text) into Large Language Model-based recommendation systems to better align with actual user preferences. The approach addresses limitations in traditional recommender systems that rely solely on implicit signals like clicks and purchases, potentially reducing filter bubbles and improving transparency.

AIBullisharXiv – CS AI · 2d ago6/10

🧠

Learn from A Rationalist: Distilling Intermediate Interpretable Rationales

Researchers propose REKD (Rationale Extraction with Knowledge Distillation), a method that improves the interpretability and performance of smaller deep neural networks by having them learn from larger teacher models' rationales and predictions. The approach demonstrates significant performance gains across language and vision tasks, offering a practical framework for making AI systems more transparent and verifiable in high-stakes applications.

AINeutralarXiv – CS AI · 2d ago6/10

🧠

Evolutionary Rule Extraction from Corporate Default Prediction Models

Researchers developed DEXiRE-EVO, an evolutionary rule extraction framework combining machine learning with explainable AI to predict SME defaults in Italy. The approach outperforms traditional logistic regression while maintaining interpretability, identifying key risk factors like weak liquidity, high leverage, and operational inefficiency across 50,718 firms from 2015-2024.

AINeutralarXiv – CS AI · 3d ago6/10

🧠

Identifying Explicit Parsimonious Piece-wise Polynomial Relationships in Industrial time-series: Application to manipulator robots

Researchers have developed an algorithm to identify parsimonious explicit piece-wise polynomial relationships in industrial time-series data, with application to robotic manipulator control. The method derives simpler, interpretable models that outperform deep neural networks on unseen contexts while maintaining computational efficiency.

AIBullisharXiv – CS AI · 3d ago6/10

🧠

Tell Me a Story! Narrative-Driven XAI with Large Language Models

Researchers introduce XAIstories, a framework that uses Large Language Models to convert complex AI explanations (SHAP values and counterfactual explanations) into human-readable narratives. User studies show over 90% of general audiences find these AI-generated stories convincing, with data scientists viewing them as valuable for explaining AI decisions to non-technical stakeholders.

AIBullisharXiv – CS AI · 3d ago6/10

🧠

Case-Aware Medical Image Classification with Multimodal Knowledge Graphs and Reliability-Guided Refinement

Researchers propose a case-aware medical image classification framework that leverages multimodal knowledge graphs to retrieve similar historical cases and integrate external clinical knowledge, improving diagnostic accuracy through interpretable evidence-based reasoning rather than relying solely on isolated visual analysis.

Page 1 of 5Next →