arXiv · CS AI · 6d ago
GNN Explanations that do not Explain and How to find Them
Researchers have identified critical failures in self-explainable Graph Neural Networks (SE-GNNs), where the explanations a model produces can be completely unrelated to how it actually makes predictions. The study shows that these degenerate explanations can hide a model's reliance on sensitive attributes, can arise both maliciously and naturally, and go undetected by existing faithfulness metrics.