AeroSpectra Sentinel: An Auditable LLM Prompt-Chaining Decision-Support Workflow for Acute Asthma Risk Assessment from Respiratory Sounds and Clinical Signals
AeroSpectra Sentinel is a research prototype that combines STFT audio analysis, machine learning, and LLM prompt-chaining to assist in acute asthma risk assessment from respiratory sounds and clinical signals. Evaluated on respiratory sound datasets, the system achieved up to 91.10% binary accuracy with random forest models, while structured prompting with guardrails and FHIR validation showed strongest safety consistency in simulated clinical scenarios.
AeroSpectra Sentinel addresses a critical gap in clinical decision-support by automating the interpretation of respiratory sounds—a task requiring speed and transparency in acute asthma assessment. The system architecture separates concerns across signal processing, machine learning screening, and clinical reasoning stages, enabling auditability at each step. This modular design reflects growing healthcare sector demand for explainable AI systems that clinicians can trust and regulators can scrutinize.
The research demonstrates substantial performance variance across model architectures, with traditional random forest and MLP classifiers outperforming end-to-end CNN approaches on this task. This finding underscores that domain-specific feature engineering remains competitive against deep learning for medical audio classification, particularly when datasets are limited. The five-stage LLM prompt-chaining methodology—especially the guardrail-plus-FHIR variant—shows how structured reasoning can improve consistency and documentation quality beyond single-pass language model outputs.
For healthcare AI development, this prototype illustrates a pragmatic path forward: hybrid systems combining lightweight ML screening with guardrailed LLM reasoning produce safer, more interpretable outputs than black-box approaches. The explicit FHIR integration positions outputs for healthcare information exchange systems. However, the authors' clear disclaimer that this is a research prototype, not a validated medical device, reflects appropriate caution—clinical deployment would require validation on diverse populations, regulatory approval, and integration into established clinical workflows.
- →Random forest models achieved 91.10% binary accuracy for asthma screening on respiratory sounds, outperforming CNN approaches on this dataset.
- →Structured LLM prompt-chaining with guardrails and FHIR schema validation demonstrated superior safety and documentation consistency versus simpler prompting strategies.
- →The modular architecture separating signal processing, ML screening, and clinical reasoning enables auditability at each decision stage.
- →Hybrid approaches combining traditional ML with LLM reasoning appear more practical for regulated healthcare domains than end-to-end deep learning.
- →Researchers explicitly positioned this as a research prototype, not a clinically validated medical device, reflecting appropriate regulatory awareness.