AI · Bullish · arXiv – CS AI · 6d ago · 7/10
🧠OmniDrive-R1 is a new Vision-Language Model framework that addresses critical reliability failures in autonomous driving by combining perception and reasoning through an interleaved multi-modal chain-of-thought mechanism, improving accuracy from 37.81% to 73.62% without requiring dense localization labels.
AI · Bullish · arXiv – CS AI · Apr 14 · 7/10
🧠Researchers have developed an LLM-based framework that automatically generates safety-critical driving scenarios for autonomous vehicle testing using the CARLA simulator and realistic video synthesis. The system uses few-shot code generation to create diverse edge cases like pedestrian occlusions and vehicle cut-ins, bridging simulation and real-world realism through advanced video generation techniques.
AI · Neutral · arXiv – CS AI · Apr 13 · 7/10
🧠Researchers introduce PilotBench, a benchmark evaluating large language models on safety-critical aviation tasks using 708 real-world flight trajectories. The study reveals a fundamental trade-off: traditional forecasters achieve superior numerical precision (7.01 MAE) while LLMs provide better instruction-following (86-89%) but with significantly degraded prediction accuracy (11-14 MAE), exposing brittleness in implicit physics reasoning for embodied AI applications.
AI · Neutral · arXiv – CS AI · 7h ago · 6/10
🧠Researchers present a novel Safety-by-Design method to define Operational Design Domains (ODDs) for safety-critical AI systems using data-driven approaches rather than traditional expert-led design. The approach uses kernel-based representations to retroactively characterize environmental conditions from collected data and is validated through aviation collision-avoidance system testing, potentially enabling future certification of AI systems in critical domains.
AI · Neutral · arXiv – CS AI · Apr 13 · 6/10
🧠Researchers introduce VOLTA, a simplified deep learning approach for uncertainty quantification that outperforms ten established baselines including ensemble methods and MC Dropout. The method achieves superior calibration with expected calibration error of 0.010 and competitive accuracy across multiple datasets, suggesting that complex auxiliary losses may be unnecessary for reliable uncertainty estimation in safety-critical applications.
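The expected calibration error (ECE) cited above is the standard metric for how well predicted confidences match observed accuracy. As a reference point, a minimal binned-ECE computation looks roughly like this (the function name and toy inputs are illustrative, not from the VOLTA paper):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: the weighted average gap between mean confidence
    and empirical accuracy within each confidence bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight bin by its fraction of samples
    return ece

# Perfectly calibrated toy case: 80%-confident predictions, right 80% of the time.
conf = np.array([0.8] * 10)
hits = np.array([1] * 8 + [0] * 2)
print(expected_calibration_error(conf, hits))  # → 0.0
```

An ECE of 0.010, as reported, means confidences deviate from observed accuracy by about one percentage point on average.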
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers developed SimCert, a probabilistic certification framework that verifies behavioral similarity between compressed neural networks and their original versions. The framework addresses critical safety challenges in deploying compressed DNNs on resource-constrained systems by providing quantitative safety guarantees with adjustable confidence levels.
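SimCert's exact certificate is not reproduced in the summary, but the general shape of a probabilistic behavioral-similarity guarantee can be sketched with sampling plus a concentration bound. The sketch below uses a one-sided Hoeffding bound; all names and the toy "models" are hypothetical, not the paper's method:

```python
import math
import random

def certify_agreement(f_orig, f_comp, sample_input, n=5000, delta=1e-3):
    """Sampling sketch of probabilistic similarity certification:
    returns a lower bound on the input-distribution agreement probability
    between the original and compressed model, valid with confidence
    1 - delta (one-sided Hoeffding). Illustrative only."""
    agree = 0
    for _ in range(n):
        x = sample_input()
        agree += int(f_orig(x) == f_comp(x))
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return agree / n - eps

# Toy check: two threshold "classifiers" that disagree on ~2% of uniform inputs.
rng = random.Random(0)
bound = certify_agreement(lambda x: x > 0.50, lambda x: x > 0.52,
                          rng.random, n=5000)
```

The adjustable confidence level mentioned in the summary corresponds to `delta` here: a smaller `delta` yields a more conservative (lower) certified agreement bound.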
AI · Neutral · arXiv – CS AI · Mar 17 · 4/10
🧠Researchers developed a symbolic machine learning approach for predicting failures in chemical processes, specifically testing on ethylene oxidation. The method outperformed traditional AI models while maintaining interpretability through rule-based systems, addressing safety concerns in chemical industries where black-box AI models are unsuitable.