ABLE: Representing and Mapping LLMs via Attribution-Based Large-model Embedding
Researchers introduce ABLE, a framework that represents and compares large language models through gradient-based feature attributions rather than parameter analysis or output comparison. The training-free method achieves competitive performance on model comparison tasks across 239 open-source LLMs while providing theoretical stability guarantees.
ABLE addresses an infrastructure problem in the rapidly fragmenting LLM ecosystem: there is no standardized way to compare heterogeneous models that differ in architecture and tokenizer. As organizations deploy increasingly diverse LLMs for different use cases, the ability to systematically audit model provenance, assess security properties, and select appropriate models becomes operationally essential. Existing approaches fall short on both sides: parameter-level analysis requires architectural compatibility, while output-level comparison can obscure meaningful differences between functionally similar models.
This research is part of a broader trend toward LLM standardization and governance. As enterprises move beyond proof-of-concept deployments, they need reliable methods for model evaluation and selection. ABLE's innovation is to represent each model in interpretability space: it uses gradient-based attributions, which capture how a model responds to inputs at a finer granularity than behavioral outputs alone. Attributions are aligned at the word level rather than the token level, which makes the method tokenizer-agnostic. That is particularly valuable given the proliferation of specialized tokenizers across model families.
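A minimal sketch of what tokenizer-agnostic word-level alignment could look like, assuming each model's tokenizer reports a character offset for every token and that a per-token attribution score is already available. The helper names (`word_spans`, `align_to_words`), the whitespace word definition, and the sum-by-overlap rule are illustrative assumptions, not the aggregation ABLE is confirmed to use.

```python
import re


def word_spans(text):
    """Return (start, end) character spans of whitespace-delimited words."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def align_to_words(text, token_offsets, token_scores):
    """Sum per-token attribution scores into one score per word.

    Hypothetical helper: each token is credited to the word whose character
    span it overlaps most, so tokens with a leading space still land on
    the following word.
    """
    if len(token_offsets) != len(token_scores):
        raise ValueError("need exactly one score per token")
    spans = word_spans(text)
    totals = [0.0] * len(spans)
    for (tok_start, tok_end), score in zip(token_offsets, token_scores):
        best_word, best_overlap = None, 0
        for i, (w_start, w_end) in enumerate(spans):
            overlap = min(tok_end, w_end) - max(tok_start, w_start)
            if overlap > best_overlap:
                best_word, best_overlap = i, overlap
        # Tokens that overlap no word (e.g. special tokens) are skipped.
        if best_word is not None:
            totals[best_word] += score
    return [(text[s:e], total) for (s, e), total in zip(spans, totals)]


text = "unbelievable results"
# Made-up tokenizations and scores for two hypothetical models.
offsets_a = [(0, 2), (2, 8), (8, 12), (12, 20)]  # subword pieces
scores_a = [0.125, 0.25, 0.125, 0.5]
offsets_b = [(0, 12), (13, 20)]  # whole-word pieces
scores_b = [0.5, 0.5]

words_a = align_to_words(text, offsets_a, scores_a)
words_b = align_to_words(text, offsets_b, scores_b)
print(words_a)
print(words_b)
```

Both tokenizations end up as a two-entry, word-indexed vector, so the two models can be compared position by position even though their token sequences differ in length. The toy scores were chosen to coincide; real models would generally differ.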
The practical implications are significant for model researchers, AI infrastructure teams, and security auditors. The reported results indicate that ABLE embeddings can be used to detect model similarity, route inference across model portfolios, and predict benchmark performance without retraining. The theoretical contribution is a proof that the parameter-to-embedding mapping is Lipschitz-continuous with finite-sample convergence, under standard differentiable transformer assumptions. This gives some assurance that the approach rests on a principled foundation rather than on empirical luck.
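For readers unfamiliar with the terms, the generic shape of these two properties can be written as follows. The notation is assumed here for illustration: $E$ is the map from model parameters to an embedding, $\theta$ and $\theta'$ are two parameter settings, $L$ is the Lipschitz constant, and $\hat{E}_n$ is the embedding estimated from $n$ probe inputs. The source does not state the constant, the norm, or the form of the bound $\varepsilon(n, \delta)$.

```latex
% Lipschitz continuity: nearby parameters give nearby embeddings.
\[
  \lVert E(\theta) - E(\theta') \rVert \;\le\; L \,\lVert \theta - \theta' \rVert
  \qquad \text{for all } \theta, \theta' .
\]

% Finite-sample convergence (generic form): with probability at least 1 - \delta,
\[
  \lVert \hat{E}_n(\theta) - E(\theta) \rVert \;\le\; \varepsilon(n, \delta),
  \qquad \varepsilon(n, \delta) \to 0 \ \text{as } n \to \infty .
\]
```

Read together, the first inequality says small parameter changes cannot move the embedding arbitrarily far, and the second says an embedding computed from a finite set of inputs approximates its population counterpart with a controllable error.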
The experiments across 239 models suggest the method scales across the ecosystem's growing diversity. Because the method is training-free, adopting it requires no additional model training. As LLM governance frameworks mature, standardized representation methods of this kind could become foundational tools for compliance, security analysis, and efficient resource allocation.
- ABLE provides a training-free method for comparing LLMs across different architectures using gradient-based feature attributions.
- The approach achieves competitive performance on model routing, relation prediction, and benchmark score prediction tasks.
- Tokenizer-agnostic word-level alignment enables meaningful comparison across models with incompatible tokenization schemes.
- Theoretical analysis shows that the parameter-to-embedding mapping is Lipschitz-continuous, with finite-sample convergence, under standard differentiable transformer assumptions.
- Experiments across 239 open-source LLMs suggest the method is applicable to real-world model governance and selection workflows.
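As a rough illustration of how fixed-length model embeddings could feed a similarity or selection workflow, the sketch below ranks models by cosine similarity to a query model. Cosine similarity is a common choice assumed here, not a metric the source confirms; the model names and vectors are made up.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for all-zero vectors
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings):
    """Rank every other model by descending similarity to `query`."""
    q = embeddings[query]
    scored = [
        (name, cosine_similarity(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings; real ones would come from the attribution pipeline.
embeddings = {
    "model_a": [1.0, 0.0, 0.0],
    "model_a_finetune": [0.9, 0.1, 0.0],
    "model_b": [0.0, 1.0, 0.0],
}

ranked = most_similar("model_a", embeddings)
print(ranked)
```

In this toy setup the fine-tuned variant ranks closest to its base model, which is the kind of signal a provenance or relation-prediction check would look for.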