LLM attribution analysis across different fine-tuning strategies and model scales for automated code compliance
Researchers conducted a comparative study of how large language models trained with different fine-tuning methods (full fine-tuning, LoRA, and quantized LoRA) interpret code compliance tasks. The study reveals that full fine-tuning produces more focused attribution patterns than parameter-efficient methods, and larger models develop distinct interpretive strategies despite performance gains plateauing above 7B parameters.
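To make the contrast between full fine-tuning and the parameter-efficient methods concrete, here is a minimal sketch of the LoRA idea: instead of updating a frozen weight matrix W, one trains a low-rank update B·A alongside it. The dimensions and initialization below are illustrative assumptions, not the study's actual configuration.

```python
import numpy as np

# Hypothetical dimensions for a single projection matrix in a transformer layer.
d_in, d_out, rank = 4096, 4096, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight

# LoRA trains only the low-rank factors A and B; W stays untouched.
# B starts at zero, so the adapted model initially matches the base model.
A = rng.standard_normal((rank, d_in)) * 0.01
B = np.zeros((d_out, rank))

def adapted_forward(x):
    # Equivalent to (W + B @ A) @ x without materializing the full update.
    return W @ x + B @ (A @ x)

full_params = W.size                     # trainable under full fine-tuning
lora_params = A.size + B.size            # trainable under LoRA
print(f"trainable: full={full_params:,}  lora={lora_params:,}  "
      f"({100 * lora_params / full_params:.2f}% of full)")
```

At rank 8 the LoRA factors hold well under one percent of the matrix's parameters, which is the efficiency side of the trade-off the study examines; quantized LoRA additionally stores W in low precision.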
This research addresses a critical gap in LLM development for regulated industries by moving beyond performance metrics to examine interpretability. The study's focus on attribution analysis—understanding which input features drive model decisions—matters because compliance tasks in architecture, engineering, and construction require explainability, not just accuracy. Regulators and practitioners need to understand not just whether models produce correct outputs, but why they generate those outputs.
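Attribution analysis can take many forms; a simple gradient-based variant scores each input feature by gradient × input. The toy model below is a stand-in assumption (a logistic scorer over per-token features), not the study's method, but it shows the mechanics of asking "which tokens drove this compliance decision?"

```python
import numpy as np

# Toy "compliance scorer": logistic regression over per-token features,
# standing in for an LLM's output head. Gradient x input is one of the
# simplest feature-attribution methods.
rng = np.random.default_rng(1)
n_tokens = 6
w = rng.standard_normal(n_tokens)        # model weights
x = rng.standard_normal(n_tokens)        # input features, one per token

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p = sigmoid(w @ x)                       # predicted compliance probability
grad = p * (1 - p) * w                   # dp/dx for this model (chain rule)
attribution = grad * x                   # gradient x input, per token

# Tokens with large |attribution| are the ones driving the decision.
ranked = np.argsort(-np.abs(attribution))
print("most influential token index:", ranked[0])
```

For a real LLM the gradient comes from autodiff rather than a closed form, but the output is the same shape: one influence score per input token, which is what regulators would audit.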
The findings reveal important trade-offs between efficiency and interpretability. Full fine-tuning creates more concentrated, focused attribution patterns compared to parameter-efficient methods like LoRA, suggesting these approaches may sacrifice interpretive clarity for computational efficiency. This tension becomes crucial in high-stakes domains where stakeholders must audit model reasoning. Additionally, the discovery that larger models develop specialized strategies—prioritizing numerical constraints and rule identifiers—indicates that scale itself shapes how models process regulatory text, a phenomenon previously overlooked in practical deployments.
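One way to quantify "concentrated versus diffuse" attribution is the Shannon entropy of the normalized attribution magnitudes: lower entropy means the model's decision rests on fewer input features. The vectors below are fabricated illustrations of the two regimes, not data from the study.

```python
import numpy as np

def attribution_entropy(scores):
    """Shannon entropy of normalized |attribution|; lower = more focused."""
    p = np.abs(scores) / np.abs(scores).sum()
    p = p[p > 0]                          # 0 * log(0) contributes nothing
    return float(-(p * np.log(p)).sum())

# Hypothetical attribution vectors over the same 8 input tokens.
focused = np.array([0.02, 0.01, 0.85, 0.03, 0.02, 0.03, 0.02, 0.02])
diffuse = np.full(8, 1 / 8)              # uniform: maximally spread out

print(f"focused entropy: {attribution_entropy(focused):.2f}")
print(f"diffuse entropy: {attribution_entropy(diffuse):.2f}")  # log(8) ≈ 2.08
```

Under a metric like this, the study's finding reads as: full fine-tuning yields lower-entropy attribution maps than LoRA, which is easier for a human auditor to inspect.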
For the AEC industry and similar regulated sectors, these insights suggest that model selection requires balancing computational costs against explainability demands. The performance plateau above 7B parameters is particularly valuable: larger models may offer diminishing returns on compliance accuracy while increasing audit complexity. Organizations implementing LLMs for code compliance should weigh whether the computational savings of parameter-efficient fine-tuning justify its interpretability costs.
Future work should examine whether these attribution patterns correlate with actual compliance errors in real-world applications, and whether industry-specific explanations can be systematically validated through domain expert review.
- Full fine-tuning produces more focused and interpretable attribution patterns than parameter-efficient alternatives like LoRA
- Model scale shapes interpretive strategies, with larger models prioritizing numerical constraints and rule identifiers in compliance tasks
- Performance gains in semantic similarity plateau above 7B parameters, questioning the value of scaling for code compliance applications
- Parameter-efficient fine-tuning trades interpretability for computational efficiency, creating trade-offs for regulated industry applications
- Attribution analysis provides crucial explainability insights for deploying LLMs in high-stakes regulatory and compliance contexts