AI · Neutral · arXiv – CS AI · 10h ago · 5/10
Jacobian Scopes: token-level causal attributions in LLMs
Researchers introduce Jacobian Scopes, a new gradient-based method for interpreting how individual tokens influence Large Language Model predictions. The technique uses perturbation theory and information geometry to reveal model biases, translation strategies, and learning mechanisms, with open-source implementations and an interactive demo available.
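To make the idea of a gradient-based, token-level attribution concrete, here is a minimal sketch of a generic input-Jacobian score. It is not the Jacobian Scopes method itself, whose details are not given in this summary. The names `toy_model` and `token_jacobian_norms`, the small attention-pooling network standing in for an LLM, and the use of central differences instead of automatic differentiation are all assumptions made for illustration.

```python
import numpy as np


def toy_model(X, W, V):
    """Hypothetical stand-in for an LLM: token embeddings X (T, d) -> logits (k,)."""
    # The last token attends to every token (softmax over scaled dot products).
    scores = X @ X[-1] / np.sqrt(X.shape[1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    pooled = weights @ X  # attention-weighted mix of token embeddings, shape (d,)
    return V @ np.tanh(W @ pooled)


def token_jacobian_norms(f, X, eps=1e-5):
    """Frobenius norm of d f / d X[t] for each token t, via central differences."""
    T, d = X.shape
    k = f(X).shape[0]
    norms = np.zeros(T)
    for t in range(T):
        J = np.zeros((k, d))  # Jacobian of the logits w.r.t. token t's embedding
        for j in range(d):
            dX = np.zeros_like(X)
            dX[t, j] = eps
            J[:, j] = (f(X + dX) - f(X - dX)) / (2 * eps)
        norms[t] = np.linalg.norm(J)
    return norms


# Random toy data (assumed sizes): 5 tokens, 4-dim embeddings, 3 output logits.
rng = np.random.default_rng(0)
T, d, h, k = 5, 4, 6, 3
X = rng.normal(size=(T, d))
W = rng.normal(size=(h, d))
V = rng.normal(size=(k, h))

scores = token_jacobian_norms(lambda Z: toy_model(Z, W, V), X)
attribution = scores / scores.sum()  # normalized per-token influence
print(attribution)
```

Each entry of `attribution` estimates how strongly the logits respond to a small perturbation of that token's embedding. With a real model, the finite-difference loop would normally be replaced by an automatic-differentiation Jacobian.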
Hugging Face