🧠 AI · Neutral · Importance 6/10

How Frontier LLMs Adapt to Neurodivergence Context: A Measurement Framework for Surface vs. Structural Change in System-Prompted Responses

arXiv – CS AI | Ishan Gupta, Pavlo Buryi
🤖 AI Summary

Researchers propose NDBench, a benchmark framework that tests how frontier LLMs adapt their outputs when given neurodivergence context in system prompts. The study finds that LLMs increase structural complexity (headings, steps, length) under explicit ND instructions, but that persona assertion alone fails to suppress harmful behaviors, a critical finding for equitable AI system design.
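The surface-vs-structural distinction can be operationalized with simple text metrics. A minimal sketch of the idea, counting headings, numbered steps, and length in a model response (the function name, regexes, and example texts are illustrative, not NDBench's actual implementation):

```python
import re

def structural_metrics(text: str) -> dict:
    """Count coarse structural features of an LLM response.

    These are illustrative proxies for the kinds of signals a benchmark
    might compare across prompt conditions (heading count, numbered
    steps, overall length); the real NDBench metrics may differ.
    """
    # Markdown-style headings: 1-6 leading '#' characters at line start.
    headings = len(re.findall(r"^#{1,6}\s+\S", text, flags=re.MULTILINE))
    # Numbered steps such as "1. ..." or "2) ..." at line start.
    steps = len(re.findall(r"^\s*\d+[.)]\s+\S", text, flags=re.MULTILINE))
    words = len(text.split())
    return {"headings": headings, "numbered_steps": steps, "words": words}

baseline = "Just take the medicine twice a day with food."
adapted = (
    "# Plan\n"
    "1. Take the medicine in the morning with breakfast.\n"
    "2. Take it again in the evening with dinner.\n"
    "## Reminder\n"
    "Set a phone alarm for both times."
)
print(structural_metrics(baseline))
print(structural_metrics(adapted))
```

Comparing such metric vectors between baseline and instructed conditions is what lets a benchmark distinguish genuine structural adaptation from unchanged output dressed in ND-aware wording.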

Analysis

This research addresses a fundamental gap in LLM evaluation: whether AI systems genuinely accommodate neurodivergent users or merely make cosmetic, surface-level changes. The NDBench framework tests two frontier models across 576 outputs under three conditions: baseline prompts, neurodivergence profile assertions, and explicitly instructed adjustments.

The distinction between structural and surface change proves crucial. While the models significantly increase output length, headings, and granularity under explicit instructions (p < 10^-8), persona assertion alone produces minimal behavioral shift. This reveals a critical limitation: models may recognize ND context linguistically, but they do not genuinely adapt without explicit operational directives. The finding that masking reinforcement decreases by only 36-44% even in explicitly instructed conditions, and barely changes in persona-only conditions, suggests current systems cannot reliably self-correct potentially harmful outputs merely by being told about user needs.

The work also highlights measurement challenges: only two of six harm-assessment dimensions achieved acceptable inter-judge reliability (alpha ≥ 0.67), indicating that evaluating AI equity requires rigorous methodological development. For developers and organizations building accessible AI systems, this demonstrates that prompt-based accommodation strategies are insufficient without explicit behavioral constraints. The public release of NDBench, including prompts, outputs, and code, establishes reproducible standards for future LLM auditing. This is important foundational work showing that accessibility requires intentional system design rather than reliance on models to infer user needs from context alone.
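The reliability threshold cited above (alpha ≥ 0.67) is Krippendorff's conventional cutoff for drawing tentative conclusions from coded data. A self-contained sketch of Krippendorff's alpha for nominal ratings shows how such inter-judge agreement is computed (the function name, data layout, and example labels are hypothetical; NDBench's exact computation may differ):

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal ratings.

    `units` is a list of rating lists, one per rated item; each inner
    list holds the labels assigned by the judges who rated that item.
    Items with fewer than two ratings are skipped.
    """
    coincidence = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue
        # Each ordered pair of ratings from different judges on the same
        # item contributes 1/(m - 1) to the coincidence matrix.
        for a, b in permutations(ratings, 2):
            coincidence[(a, b)] += 1.0 / (m - 1)
    # Marginal totals per label, and the grand total n.
    n_c = Counter()
    for (a, _), w in coincidence.items():
        n_c[a] += w
    n = sum(n_c.values())
    # Observed vs. expected disagreement for nominal data.
    d_o = sum(w for (a, b), w in coincidence.items() if a != b) / n
    d_e = sum(n_c[a] * n_c[b] for a in n_c for b in n_c if a != b) / (n * (n - 1))
    return 1.0 - d_o / d_e

# Two judges scoring four outputs on one (hypothetical) harm dimension:
ratings = [["harm", "harm"], ["safe", "safe"], ["safe", "harm"], ["safe", "safe"]]
print(round(krippendorff_alpha_nominal(ratings), 3))
```

Dimensions whose judge panels fall below the 0.67 line on this statistic are the ones the paper flags as not reliably measurable.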

Key Takeaways
  • LLMs adapt structurally (more headings, detail) under explicit neurodivergence instructions but show minimal behavioral change from context assertion alone
  • Harmful tendency suppression requires explicit instructions, not just neurodivergence persona acknowledgment
  • Measuring LLM adaptation to accessibility needs demands rigorous methodology—only 2 of 6 harm-assessment dimensions proved reliably measurable
  • NDBench provides the first reproducible benchmark framework for auditing LLM accessibility and equity across frontier models
  • Current system-prompt-based accommodation strategies are insufficient for truly accessible AI without explicit operational constraints