AI · Neutral · arXiv – CS AI · 5d ago · 7/10
🧠Researchers introduced MMBU, the largest biomedical vision-language benchmark covering 35 medical imaging modalities with structured metadata. Testing 15 open-weight and 2 frontier VLMs revealed that while medical adaptation helps some models, high reported accuracy on existing benchmarks masks significant deficiencies in visual perception and domain generalization.
AI · Bullish · arXiv – CS AI · Jun 5 · 7/10
🧠Researchers propose biomedical world models as an AI paradigm that learns dynamic representations of biological systems to simulate future states and predict responses to interventions. These models could accelerate drug discovery, personalized medicine, and surgical planning by enabling simulation-based experimentation before real-world testing.
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠Researchers introduce Ryze, an automated system that converts biomedical papers into evidence-enriched training datasets for specialized vision-language models. The resulting BioVLM-8B model achieves 48.0% accuracy on LAB-Bench, outperforming GPT-4V by 3.8 percentage points while costing under $200 to develop.
🧠 GPT-5
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠EvoPool is an evolutionary multi-agent framework that generates specialized annotation code to label training data more efficiently than LLMs for domain-specific tasks. The system operates 4,500-31,000x faster than LLM annotation while achieving superior performance across biomedical, legal, and reasoning tasks, with improvements up to +0.301 macro-F1 on specialized benchmarks.
AI · Bullish · arXiv – CS AI · May 28 · 7/10
🧠Researchers introduce CaMBRAIN, a causal state space model based on the Mamba architecture that enables real-time, continuous EEG signal processing with linear-time complexity. The model achieves state-of-the-art results across multiple datasets while processing signals more than 10x faster than existing attention-based methods, overcoming critical limitations in handling variable-length brain activity recordings.
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers developed Uni-NTFM, a new foundation model for EEG signal analysis that incorporates biological neural mechanisms and scales to a record-breaking 1.9 billion parameters. The model was pre-trained on 28,000 hours of EEG data and outperformed existing models across nine downstream tasks by aligning its architecture with actual brain functionality.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers demonstrate that pretrained biomedical language models fail catastrophically at cross-domain discrimination, assigning high similarity scores (0.76-0.92) to unrelated concepts. They propose BODHI, a contrastive learning approach that improves domain separation 2.3x while maintaining correlation accuracy, and show that optimized inference achieves 133x latency reduction on specialized hardware.
AI · Bullish · arXiv – CS AI · Jun 4 · 6/10
🧠Researchers introduce BioManus, an AI agent system that uses graph-based planning and standardized Model Context Protocol (MCP) servers to automate biomedical workflows. The system addresses scalability challenges by organizing bioinformatics tools into structured capability graphs rather than relying on flat prompt-based retrieval, achieving significant improvements in execution accuracy and context efficiency.
AI · Bullish · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers propose a unified deep learning framework combining ResNet-based CNNs with attention mechanisms and novel data augmentation techniques for analyzing biomedical time-series signals like ECG and EEG. The approach achieves near-perfect accuracy (99.78-100%) on benchmark datasets while remaining lightweight enough for wearable deployment, addressing critical gaps in multi-signal analysis and class imbalance handling.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce UF-AMA, a unified framework for cross-domain emotion recognition using multimodal physiological signals like EEG and eye-tracking data. The model employs adaptive alignment mechanisms and multi-level domain adaptation to achieve state-of-the-art performance in cross-subject and cross-session emotion recognition tasks.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers demonstrate that large language models fail to accurately predict gene expression changes in cellular perturbation experiments despite producing biologically plausible explanations. They introduce CORE, a contrastive learning method that significantly improves prediction accuracy by organizing evidence from related perturbations rather than evaluating them in isolation.
AI · Neutral · arXiv – CS AI · Jun 1 · 6/10
🧠HypoAgent is a new AI framework that uses multiple specialized agents to generate logical hypotheses from knowledge graphs through interactive dialogue. The system excels at understanding evolving user intent across multi-turn conversations and diagnosing why generated hypotheses fail, achieving state-of-the-art performance on both commonsense and biomedical knowledge graphs.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers introduce ClinPivot, a benchmark testing whether clinical AI models adjust treatment decisions when patient contexts change. The study reveals that strong medical QA performance does not correlate with sound clinical decision-making, with leading models often failing to modify treatment choices appropriately when clinical constraints shift.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers propose a multi-dimensional evaluation framework for EEG foundation models that tests performance under realistic biomedical constraints like limited labeled data and reduced sensor coverage. Analysis of models including LaBraM, CSBrain, and CBraMod reveals that foundation models excel at long-context tasks but, compared to supervised alternatives, struggle with short-window Brain-Computer Interface applications and channel constraints.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers introduce BIRDNet, a neurosymbolic deep learning architecture that mines Boolean implication relationships from tabular data and encodes them as sparse, interpretable neural networks. The model achieves near-baseline performance on biomedical datasets while using 96× fewer active parameters and maintaining human-readable symbolic rules without external rule bases.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers introduce SCENE, a multi-agent AI framework that transforms general biomedical knowledge into specific, evidence-supported hypotheses grounded in experimental data. The system successfully identifies patient subgroups with different treatment responses in clinical trials and context-specific biological responses in genomic studies, bridging the gap between broad theoretical knowledge and actionable dataset-specific insights.
AI · Bullish · arXiv – CS AI · May 27 · 6/10
🧠BioFormer, a new machine learning framework, addresses cross-subject generalization in biomedical time-series analysis by using spectral structural alignment to suppress individual variability. The model improves F1-score by 6% over 12 baselines through frequency-band alignment and adaptive normalization techniques.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce TESSERA, a neuro-symbolic framework that combines Large Language Models with Monte Carlo Tree Search to extract multi-step explanations from knowledge graphs, specifically for drug-disease mechanism discovery. The system uses LLMs for local judgments rather than autonomous generation, enforcing structural constraints through knowledge graphs while employing MCTS for principled credit assignment across extended reasoning chains.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduced PrimeKG-CL, a benchmark dataset for continual graph learning built from nine biomedical databases with 129K+ nodes and 8.1M+ edges across two temporal snapshots (2021-2023). The work evaluates how different machine learning strategies handle evolving biomedical knowledge graphs, revealing that decoder choice and learning strategy interact significantly and that standard metrics fail to distinguish between retaining valid facts and forgetting outdated ones.
🏢 Hugging Face
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers introduce PPI-Net, a hierarchical graph neural network that integrates protein-protein interaction networks with biological pathway data to predict cancer outcomes and mechanisms. Demonstrating over 90% balanced accuracy across ten cancer types, the model reveals how molecular changes propagate through biological systems to drive disease, offering both predictive power and mechanistic interpretability.
AI · Bullish · arXiv – CS AI · Mar 26 · 6/10
🧠Researchers developed PLACID, a privacy-preserving system using small on-device AI models (2B-10B parameters) for clinical acronym disambiguation in healthcare settings. The cascaded approach combines general-purpose models for detection with domain-specific biomedical models, achieving 81% expansion accuracy while keeping sensitive health data local.
AI · Bullish · arXiv – CS AI · Mar 26/1010
🧠Researchers developed SHINE, a Sequential Hierarchical Integration Network for analyzing brain signals (EEG/MEG) to detect speech from neural activity. The system achieved high F1-macro scores of 0.9155-0.9184 in the LibriBrain Competition 2025 by reconstructing speech-silence patterns from magnetoencephalography signals.