9 articles tagged with #human-in-the-loop. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AINeutralarXiv โ CS AI ยท Apr 146/10
๐ง Researchers propose a reactor-model-of-computation approach using the Lingua Franca framework to address nondeterminism challenges in AI-powered human-in-the-loop cyber-physical systems. The study uses an agentic driving coach as a case study to demonstrate how foundation models like LLMs can be deployed in safety-critical applications while maintaining deterministic behavior despite unpredictable human and environmental variables.
AIBullisharXiv โ CS AI ยท Mar 166/10
๐ง Researchers developed a human-in-the-loop LLM system for grading handwritten mathematics assessments that reduces grading time by 23% while maintaining accuracy comparable to manual grading. The system combines automated scanning, multi-pass LLM scoring, consistency checks, and mandatory human verification to handle pen-and-paper tests at scale.
AIBullisharXiv โ CS AI ยท Mar 116/10
๐ง Researchers introduce DexHiL, a human-in-the-loop framework for improving Vision-Language-Action models in robotic dexterous manipulation tasks. The system allows real-time human corrections during robot execution and demonstrates 25% better success rates compared to standard offline training methods.
AIBullisharXiv โ CS AI ยท Mar 96/10
๐ง Researchers introduce PONTE, a human-in-the-loop framework that creates personalized, trustworthy AI explanations by combining user preference modeling with verification modules. The system addresses the challenge of one-size-fits-all AI explanations by adapting to individual user expertise and cognitive needs while maintaining faithfulness and reducing hallucinations.
AIBullisharXiv โ CS AI ยท Feb 276/107
๐ง Researchers developed a framework for analyzing AI diagnostic systems in clinical settings by preserving original AI inferences and comparing them with physician corrections. The study of 21 dermatological cases showed 71.4% exact agreement between AI and physicians, with 100% comprehensive concordance when using structured analysis methods.
AINeutralarXiv โ CS AI ยท Apr 135/10
๐ง MuTSE is an interactive web application designed to evaluate Large Language Model outputs for text simplification tasks across multiple prompting strategies and proficiency levels. The tool addresses a methodological gap in NLP research by providing researchers and educators with a structured, visual framework for comparing prompt-model combinations in real-time.
AIBullisharXiv โ CS AI ยท Apr 74/10
๐ง Researchers developed CODE-GEN, a human-in-the-loop AI system that uses retrieval-augmented generation to create multiple-choice programming questions for educational purposes. The system achieved 79.9% to 98.6% success rates across seven pedagogical dimensions when evaluated by subject-matter experts, demonstrating strong performance in computational verification tasks while still requiring human expertise for complex instructional design.
AINeutralarXiv โ CS AI ยท Mar 125/10
๐ง Research comparing human-in-the-loop versus automated chain-of-thought prompting for behavioral interview evaluation found that human involvement significantly outperforms automated methods. The human approach required 5x fewer iterations, achieved 100% success rate versus 84% for automated methods, and showed substantial improvements in confidence and authenticity scores.
AINeutralarXiv โ CS AI ยท Mar 94/10
๐ง Researchers conducted a qualitative study analyzing Human-in-the-Loop (HITL) themes in AI application development through diary studies and expert interviews. The study identified four key themes around AI governance, iterative refinement, system lifecycle constraints, and human-AI collaboration to guide future HITL framework design.