
Localizing Input Uncertainty Quantification for Large Language Models via Shapley Values

arXiv – CS AI | Seongjun Lee, Suwan Yoon, Changhee Lee
🤖 AI Summary

Researchers introduce ShaQ, a Shapley-value-based framework that identifies which specific parts of user input cause uncertainty in large language models, rather than just flagging overall uncertainty. The method achieves state-of-the-art ambiguity detection on multiple benchmarks and demonstrates practical value in high-stakes domains like clinical settings by enabling targeted input clarification.

Analysis

This research addresses a critical gap in LLM deployment: distinguishing whether model errors stem from inherent knowledge limitations or from poorly specified user inputs. Current uncertainty quantification methods provide only aggregate confidence scores, leaving users without actionable guidance on improving their queries. ShaQ addresses this by decomposing input-induced uncertainty at the span level, treating input text spans as players in a cooperative game and quantifying each span's contribution with Shapley values. This game-theoretic approach captures interactions between input spans, which coarser input-level approaches cannot do. Because Shapley values satisfy the efficiency property, the attributions sum exactly to the total input uncertainty, a guarantee that ad-hoc alternatives lack.
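To make the span-level decomposition concrete, here is a minimal sketch of exact Shapley attribution over input spans. It is not the authors' implementation: the spans, the `toy_uncertainty` value function, and its numbers are hypothetical stand-ins, since this summary does not describe how ShaQ segments input or measures uncertainty. The sketch only illustrates how an interaction between two spans is shared out and how the attributions add up to the total.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition (exponential in len(players))."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


def toy_uncertainty(coalition):
    """Hypothetical value function: uncertainty when the spans in `coalition` are left unclarified."""
    u = 0.0
    if "the bank" in coalition:
        u += 0.5
    if "last year" in coalition:
        u += 0.2
    if "the bank" in coalition and "last year" in coalition:
        u += 0.1  # interaction: the two ambiguities compound each other
    return u


spans = ["the bank", "last year", "opened"]  # hypothetical input spans
phi = shapley_values(spans, toy_uncertainty)
total = toy_uncertainty(frozenset(spans)) - toy_uncertainty(frozenset())

for span, score in phi.items():
    print(f"{span!r}: {score:.3f}")
print(f"sum of attributions = {sum(phi.values()):.3f}, total = {total:.3f}")
```

In this toy setup the 0.1 interaction term is split evenly between the two ambiguous spans, and the unambiguous span receives zero. Exact enumeration scales exponentially with the number of spans, so a practical system would presumably rely on sampling or another approximation.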

The research reflects growing industry recognition that LLM safety requires understanding failure modes beyond raw accuracy metrics. As organizations deploy LLMs in high-stakes domains such as medical diagnostics, legal analysis, and financial advisory, distinguishing user ambiguity from model incompetence becomes essential for liability and trust. The evaluation on the AmbigQA, AmbiEnt, and notably MediTOD datasets demonstrates practical applicability across diverse domains.

For developers and enterprises, ShaQ enables building better human-AI collaboration systems in which users receive specific prompts to clarify problematic input sections rather than generic confidence warnings. This can reduce back-and-forth iterations and improve decision quality. The work also suggests how future LLM interfaces could provide transparency, moving beyond binary confidence indicators toward diagnostic uncertainty attribution that guides user behavior.
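As an illustration of what targeted clarification might look like in an interface, the sketch below turns per-span attribution scores into a specific follow-up question. The function name `clarification_request`, the threshold, and the example scores are all hypothetical; the paper's actual clarification procedure is not described in this summary.

```python
def clarification_request(attributions, threshold=0.1):
    """Build a follow-up question naming the spans whose attribution meets the threshold.

    `attributions` maps each input span to its (hypothetical) uncertainty score.
    Returns None when no span is uncertain enough to ask about.
    """
    ranked = sorted(attributions.items(), key=lambda kv: kv[1], reverse=True)
    flagged = [span for span, score in ranked if score >= threshold]
    if not flagged:
        return None
    quoted = ", ".join(f'"{span}"' for span in flagged)
    return f"Could you clarify: {quoted}?"


# Hypothetical scores for the spans of a single user query.
example_scores = {"the bank": 0.55, "last year": 0.25, "opened": 0.0}
message = clarification_request(example_scores)
print(message)
```

Ranking spans before filtering keeps the most uncertain span first in the question, which is one simple way to direct the user's attention; other designs might ask about only the top span.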

Expect follow-up research extending these methods to multimodal models and exploring computational efficiency for production deployments.

Key Takeaways
  • ShaQ uses Shapley values to localize input-induced uncertainty at the span level rather than providing only aggregate scores.
  • The framework achieves state-of-the-art performance on ambiguity detection benchmarks including AmbigQA and AmbiEnt.
  • Mathematical decomposition ensures individual span attributions sum exactly to total input uncertainty, providing principled uncertainty accounting.
  • Results on the MediTOD dataset suggest practical utility in high-stakes clinical settings that require human-AI collaboration.
  • The approach distinguishes model knowledge gaps from input ambiguity, enabling targeted clarification strategies.
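For reference, the "sum exactly" takeaway corresponds to the efficiency property of the standard Shapley value. The notation here is an assumption rather than the paper's own: $N$ is the set of $n$ input spans and $v(S)$ is the uncertainty attributed to a subset $S$ of spans.

```latex
% Shapley value of span i: its average marginal contribution over all coalitions
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n - |S| - 1)!}{n!}\,
  \bigl[ v(S \cup \{i\}) - v(S) \bigr]

% Efficiency: the per-span attributions sum to the total input uncertainty
\sum_{i \in N} \phi_i(v) = v(N) - v(\emptyset)
```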