166 articles tagged with #medical-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · OpenAI News · Jan 7 · 7/10 · 5
🧠OpenAI has launched ChatGPT Health, a specialized version of its AI assistant designed to securely integrate with health data and applications. The platform emphasizes privacy protections and incorporates physician-informed design principles for healthcare applications.
AI · Bullish · Google DeepMind Blog · Oct 23 · 7/10 · 3
🧠Google has launched a new 27 billion parameter foundation model for single-cell analysis, built on the Gemma family of open models. The model has reportedly helped discover a new potential cancer therapy pathway, demonstrating practical medical applications of AI technology.
AI · Bullish · Google Research Blog · Jul 9 · 7/10 · 8
🧠Google has released MedGemma, described as their most capable open-source models specifically designed for health AI development. This represents a significant advancement in making specialized medical AI tools accessible to developers and researchers in the healthcare sector.
AI · Bullish · OpenAI News · May 12 · 7/10 · 6
🧠HealthBench is a new evaluation benchmark for AI in healthcare that assesses models in realistic clinical scenarios. Developed with input from over 250 physicians, it aims to establish standardized performance and safety metrics for healthcare AI models.
AI · Bullish · Wall Street Journal – Tech · Jan 27 · 7/10 · 3
🧠LinkedIn co-founder Reid Hoffman has raised $24.6 million to launch Manas AI, a startup focused on AI-driven cancer research. The venture partners with Siddhartha Mukherjee, renowned oncologist and author of 'The Emperor of All Maladies,' combining Hoffman's tech expertise with medical authority.
AI · Bullish · arXiv – CS AI · 1d ago · 6/10
🧠Researchers propose INFORM-CT, an AI framework combining large language models and vision-language models to automate detection and reporting of incidental findings in abdominal CT scans. The system uses a planner-executor approach that outperforms traditional manual inspection and existing pure vision-based models in accuracy and efficiency.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠SymptomWise introduces a deterministic reasoning framework that separates language understanding from diagnostic inference in AI-driven medical systems, combining expert-curated knowledge with constrained LLM use to improve reliability and reduce hallucinations. The system achieved 88% accuracy in placing correct diagnoses in top-five differentials on challenging pediatric neurology cases, demonstrating how structured approaches can enhance AI safety in critical domains.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers propose an ethical framework for sensor-fused health AI agents that combine biometric data with large language models. The paper identifies critical risks at the user-facing layer where sensor data is translated into health guidance, arguing that the perceived objectivity of biometrics can mask AI errors and turn them into harmful medical directives.
AI · Bearish · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce MedDialBench, a comprehensive benchmark testing how large language models maintain diagnostic accuracy when patients exhibit adversarial behaviors across five dimensions. The study reveals that fabricating symptoms causes 1.7-3.4x larger accuracy drops than withholding information, with worst-case performance degradation ranging from 38.8 to 54.1 percentage points across tested models.
AI · Bullish · arXiv – CS AI · Apr 7 · 6/10
🧠Researchers introduced VERT, a new LLM-based metric for evaluating radiology reports that shows up to 11.7% better correlation with radiologist judgments compared to existing methods. The study demonstrates that fine-tuned smaller models can achieve significant performance gains while reducing inference time by up to 37.2 times.
AI · Bullish · arXiv – CS AI · Apr 7 · 6/10
🧠Researchers introduce a new framework for evaluating adaptive AI models in medical devices, using three key measurements: learning, potential, and retention. The approach addresses challenges in assessing AI systems that continuously update, providing insights for regulatory oversight of adaptive medical AI safety and effectiveness.
AI · Bullish · arXiv – CS AI · Mar 27 · 6/10
🧠Photon is a new framework that efficiently processes 3D medical imaging for AI visual question answering by using variable-length token sequences and adaptive compression. The system reduces computational costs while maintaining accuracy through instruction-conditioned token scheduling and custom gradient propagation techniques.
AI · Bullish · arXiv – CS AI · Mar 27 · 6/10
🧠DeepFAN, a transformer-based AI model, achieved 93.9% diagnostic accuracy for lung nodule classification and significantly improved junior radiologists' performance by 10.9% in clinical trials. The model was trained on over 10,000 pathology-confirmed nodules and validated across 400 cases at three medical institutions.
🏢 Meta
AI · Bullish · arXiv – CS AI · Mar 27 · 6/10
🧠Researchers successfully fine-tuned LLaMA 3.1-8B for medical transcription in Finnish, a low-resource language, achieving strong semantic similarity despite low n-gram overlap. The study used simulated clinical conversations from students and demonstrates the feasibility of privacy-oriented domain-specific language models for clinical documentation in underrepresented languages.
AI · Neutral · arXiv – CS AI · Mar 27 · 6/10
🧠Researchers benchmarked 20 multimodal AI models on neuroimaging tasks using MRI and CT scans, finding that while recognizing technical attributes such as imaging modality is nearly solved, diagnostic reasoning remains challenging. Gemini-2.5-Pro and GPT-5-Chat showed the strongest diagnostic performance, while the open-source MedGemma-1.5-4B demonstrated promising results under few-shot prompting.
🏢 Meta · 🧠 GPT-5 · 🧠 Gemini
AI · Bullish · arXiv – CS AI · Mar 26 · 6/10
🧠Researchers introduce Learning to Guide (LTG), a new AI framework where machines provide interpretable guidance to human decision-makers rather than making automated decisions. The SLOG approach transforms vision-language models into guidance generators using human feedback, showing promise in medical diagnosis applications.
AI · Bullish · arXiv – CS AI · Mar 26 · 6/10
🧠Researchers have introduced MedAidDialog, a multilingual medical dialogue dataset covering seven languages, and developed MedAidLM, a conversational AI model for preliminary medical consultations. The system uses parameter-efficient fine-tuning on small language models to enable deployment without high-end computational infrastructure while incorporating patient context for personalized consultations.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers introduce OpenHospital, a new interactive arena designed to develop and benchmark Large Language Model-based Collective Intelligence through physician-patient agent interactions. The platform uses a data-in-agent-self paradigm to rapidly enhance AI agent capabilities while providing evaluation metrics for medical proficiency and system efficiency.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers developed PREBA, a retrieval-augmented framework that uses PCA-weighted retrieval and Bayesian averaging to improve surgical duration prediction accuracy by up to 40% using large language models. The system grounds LLM predictions in institution-specific clinical data without requiring computationally intensive training, achieving performance competitive with supervised machine learning methods.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers developed LUMINA, a new Graph Convolutional Network architecture that improves AI-driven diagnosis of neurodevelopmental disorders using fMRI brain data. The system achieved 84.66% accuracy for ADHD and 88.41% for autism spectrum disorder detection by addressing traditional GCN limitations in capturing neural connection dynamics.
AI · Neutral · arXiv – CS AI · Mar 17 · 6/10
🧠Researchers introduced QuarkMedBench, a new benchmark that evaluates large language models on more than 20,000 real-world medical queries across clinical care scenarios. It addresses a limitation of current medical AI evaluations, which rely on multiple-choice questions, by using an automated scoring framework that achieves 91.8% concordance with clinical expert assessments.
AI · Bullish · arXiv – CS AI · Mar 16 · 6/10
🧠Researchers introduce DeCode, a training-free framework that adapts large language models to provide better contextualized medical answers by decoupling content from delivery. The system significantly improves clinical question answering performance, boosting zero-shot results from 28.4% to 49.8% on medical benchmarks.
🏢 OpenAI
AI · Bullish · arXiv – CS AI · Mar 16 · 6/10
🧠Researchers developed UniPrompt-CL, a new continual learning method specifically designed for medical AI that addresses the limitations of existing approaches when applied to medical data. The method uses a unified prompt pool design and regularization to achieve better performance while reducing computational costs, improving accuracy by 1-3 percentage points in domain-incremental learning settings.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers developed an AI system that can detect fetal orofacial clefts in ultrasound images with over 93% sensitivity and 95% specificity, matching senior radiologist performance. The system was trained on 45,139 ultrasound images from 9,215 fetuses across 22 hospitals and can also improve junior radiologist diagnostic accuracy by 6%.
🏢 Microsoft
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduced RAMoEA-QA, a new AI system that uses hierarchical specialization to answer questions about respiratory audio recordings from mobile devices. The system employs a two-stage routing approach with Audio Mixture-of-Experts and Language Mixture-of-Adapters to handle diverse recording conditions and query types, achieving 0.72 test accuracy compared to 0.61-0.67 for existing baselines.