AI · Neutral · arXiv – CS AI · 3d ago · 7/10
🧠Researchers propose Faithful Agentic XAI (FAX), a framework that improves the reliability of AI explanations generated by large language models through explicit verification mechanisms. The study introduces CRAFTER-XAI-Bench, a new benchmark for testing explanation faithfulness in complex environments, demonstrating that current XAI systems can produce plausible but inaccurate explanations that mislead users.
AI · Bearish · arXiv – CS AI · May 9 · 7/10
🧠A peer-reviewed study evaluates explainability methods in AI systems used for automatic target recognition in safety-critical applications, revealing that popular post-hoc explanation techniques have significant limitations including spurious explanations and vulnerability to manipulation. The research argues that current XAI approaches are insufficient for deployment in high-stakes environments and calls for more robust, causally-grounded methods that prioritize system assurance over visual plausibility.
AI · Neutral · arXiv – CS AI · Apr 20 · 7/10
🧠A new survey examines intrinsic interpretability approaches for Large Language Models, categorizing design methods that build transparency directly into model architectures rather than applying post-hoc explanations. The research identifies five key paradigms—functional transparency, concept alignment, representational decomposability, explicit modularization, and latent sparsity induction—addressing the critical challenge of making LLMs more trustworthy and safer for deployment.
AI · Bullish · arXiv – CS AI · Apr 15 · 7/10
🧠Researchers propose a two-stage LLM framework that uses one model to translate XAI technical outputs into natural language and a second model to verify accuracy, faithfulness, and completeness before delivering explanations to users. The framework includes iterative refinement mechanisms and demonstrates improved reliability across multiple XAI techniques and LLM families.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠A user study with 200 participants found that while explanation correctness in AI systems affects human understanding, the relationship is not linear: performance drops significantly at 70% correctness but does not degrade further below that threshold. The research challenges the assumption that higher computational correctness metrics automatically translate into better human comprehension of AI decisions.
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠A research paper argues that the most valuable capabilities of large language models are precisely those that cannot be captured by human-readable rules. The thesis is supported by a proof showing that if LLM capabilities could be fully rule-encoded, they would be equivalent to expert systems, which have historically proven weaker than LLMs.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers developed FairMed-XGB, a machine learning framework that reduces gender bias in healthcare AI models by 40-72% while maintaining predictive accuracy. The system uses Bayesian optimization and explainable AI to ensure equitable treatment decisions in critical care settings.
AI · Bullish · arXiv – CS AI · Mar 12 · 7/10
🧠Researchers have developed a new method to detect and eliminate backdoor triggers in neural networks using active path analysis. The approach shows promising results in experiments with machine learning models used for intrusion detection, addressing a critical cybersecurity vulnerability.
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers demonstrate that traditional explainable AI methods designed for static predictions fail when applied to agentic AI systems that make sequential decisions over time. The study shows attribution-based explanations work well for static tasks but trace-based diagnostics are needed to understand failures in multi-step AI agent behaviors.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce RAG-Driver, a retrieval-augmented multi-modal large language model designed for autonomous driving that can provide explainable decisions and control predictions. The system addresses data scarcity and generalization challenges in AI-driven autonomous vehicles by using in-context learning and expert demonstration retrieval.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3
🧠Researchers developed a Neuro-Symbolic Agentic Framework combining machine learning with LLM-based reasoning to predict colorectal cancer drug responses. The system achieved a significant predictive correlation (r=0.504) and introduces 'Inverse Reasoning' for simulating genomic edits to predict drug sensitivity changes.
AI · Neutral · arXiv – CS AI · Mar 4 · 6/10 · 2
🧠Researchers propose PURE, a new framework for AI-powered recommendation systems that addresses preference-inconsistent explanations, where the AI provides factually correct but unconvincing reasoning that conflicts with user preferences. The system uses a select-then-generate approach to improve both evidence selection and explanation generation, demonstrating reduced hallucinations while maintaining recommendation accuracy.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3
🧠Researchers developed COOL-MC, a tool that combines reinforcement learning with model checking to verify and explain AI policies for platelet inventory management in blood banks. The system achieved a 2.9% stockout probability while providing transparent decision-making explanations for safety-critical healthcare applications.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers developed a new graph concept bottleneck layer (GCBM) that can be integrated into Graph Neural Networks to make their decision-making process more interpretable. The method treats graph concepts as 'words' and uses language models to improve understanding of how GNNs make predictions, achieving state-of-the-art performance in both classification accuracy and interpretability.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 5
🧠Researchers have developed DeepMedix-R1, a foundation model for chest X-ray interpretation that provides transparent, step-by-step reasoning alongside accurate diagnoses to address the black-box problem in medical AI. The model uses reinforcement learning to align diagnostic outputs with clinical plausibility and significantly outperforms existing models in report generation and visual question answering tasks.
AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 5
🧠Researchers propose Geodesic Integrated Gradients (GIG), a new method for explaining AI model decisions that uses curved paths instead of straight lines to compute feature importance. The method addresses flawed attributions in existing approaches by integrating gradients along geodesic paths under a model-induced Riemannian metric.
AI · Bullish · OpenAI News · May 9 · 7/10 · 6
🧠Researchers used GPT-4 to automatically generate explanations for how individual neurons behave in large language models and to evaluate the quality of those explanations. They have released a comprehensive dataset containing explanations and quality scores for every neuron in GPT-2, advancing AI interpretability research.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers compared five post-hoc explainability methods for interpreting deep learning models trained to detect Major Depressive Disorder from EEG data. While different attribution approaches showed partially overlapping patterns emphasizing frontal and temporal brain regions, the study reveals methodological assumptions significantly influence interpretability results, cautioning against treating findings as definitive clinical biomarkers.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose using genetic programming to evolve interpretable feature sets and tree structures for survival analysis models, demonstrating improved predictive performance while maintaining shallow, explainable decision trees. The approach addresses the fundamental trade-off between accuracy and interpretability in medical survival prediction by optimizing both feature construction and tree logic simultaneously.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose integrating explicit user feedback (comments, reviews, verbal text) into Large Language Model-based recommendation systems to better align with actual user preferences. The approach addresses limitations in traditional recommender systems that rely solely on implicit signals like clicks and purchases, potentially reducing filter bubbles and improving transparency.
AI · Bullish · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose REKD (Rationale Extraction with Knowledge Distillation), a method that improves the interpretability and performance of smaller deep neural networks by having them learn from larger teacher models' rationales and predictions. The approach demonstrates significant performance gains across language and vision tasks, offering a practical framework for making AI systems more transparent and verifiable in high-stakes applications.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers developed DEXiRE-EVO, an evolutionary rule extraction framework combining machine learning with explainable AI to predict SME defaults in Italy. The approach outperforms traditional logistic regression while maintaining interpretability, identifying key risk factors like weak liquidity, high leverage, and operational inefficiency across 50,718 firms from 2015 to 2024.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers have developed an algorithm to identify parsimonious, explicit piecewise polynomial relationships in industrial time-series data, with an application to robotic manipulator control. The method derives simpler, interpretable models that outperform deep neural networks on unseen contexts while maintaining computational efficiency.
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce XAIstories, a framework that uses Large Language Models to convert complex AI explanations (SHAP values and counterfactual explanations) into human-readable narratives. User studies show over 90% of general audiences find these AI-generated stories convincing, with data scientists viewing them as valuable for explaining AI decisions to non-technical stakeholders.
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose a case-aware medical image classification framework that leverages multimodal knowledge graphs to retrieve similar historical cases and integrate external clinical knowledge, improving diagnostic accuracy through interpretable evidence-based reasoning rather than relying solely on isolated visual analysis.