y0news
#multimodal · 6 articles
AIBullish · arXiv – CS AI · 4h ago · 3
🧠

SleepLM: Natural-Language Intelligence for Human Sleep

Researchers have developed SleepLM, a family of AI foundation models that couples natural language processing with sleep analysis of polysomnography data. Trained on over 100K hours of sleep recordings from more than 10,000 individuals, the system interprets and describes sleep patterns in natural language, enabling capabilities such as language-guided sleep event detection and zero-shot generalization to novel sleep analysis tasks.

AIBullish · arXiv – CS AI · 4h ago · 4
🧠

MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

Researchers introduce MMKG-RDS, a framework that mines multimodal knowledge graphs to synthesize high-quality training data for improving AI models' reasoning abilities. On Qwen3 models it yielded a 9.2% improvement in reasoning accuracy, with applications to constructing complex benchmarks involving tables and formulas.

AIBullish · arXiv – CS AI · 4h ago · 4
🧠

Brain-OF: An Omnifunctional Foundation Model for fMRI, EEG and MEG

Researchers have developed Brain-OF, described as the first omnifunctional brain foundation model able to process fMRI, EEG, and MEG data within a single unified framework. The model introduces techniques such as an Any-Resolution Neural Signal Sampler and Masked Temporal-Frequency Modeling, and is trained on 40 datasets, achieving strong performance across diverse neuroscience tasks.
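
The masked temporal-frequency idea follows the general masked-autoencoder recipe: hide random patches of a time-frequency map of the neural signal and train the model to reconstruct them. The paper's exact patching and masking scheme isn't given in the summary, so the sketch below is a generic, hypothetical version (numpy only, no model):

```python
import numpy as np

def mask_time_freq(spec, mask_ratio=0.5, patch=(4, 4), rng=None):
    """Randomly zero out patches of a time-frequency map (freq x time).

    A reconstruction loss would then be computed only on the masked
    patches, as in masked-autoencoder-style pretraining. Patch size
    and ratio here are illustrative, not the paper's settings.
    """
    rng = rng or np.random.default_rng(0)
    f, t = spec.shape
    pf, pt = patch
    grid = (f // pf, t // pt)                  # patches along freq, time
    n_patches = grid[0] * grid[1]
    n_mask = int(mask_ratio * n_patches)
    idx = rng.choice(n_patches, size=n_mask, replace=False)
    mask = np.zeros(n_patches, dtype=bool)
    mask[idx] = True
    out = spec.copy()
    for k in np.flatnonzero(mask):             # zero each masked patch
        i, j = divmod(k, grid[1])
        out[i * pf:(i + 1) * pf, j * pt:(j + 1) * pt] = 0.0
    return out, mask.reshape(grid)

# toy spectrogram: 16 frequency bins x 32 time frames
spec = np.random.default_rng(1).normal(size=(16, 32))
masked, mask = mask_time_freq(spec, mask_ratio=0.5)
```

The appeal for a multi-signal model is that fMRI, EEG, and MEG all admit some time-frequency view, so one masking objective can be shared across modalities.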

AIBullish · arXiv – CS AI · 4h ago · 2
🧠

Pseudo Contrastive Learning for Diagram Comprehension in Multimodal Models

Researchers propose a new training method called pseudo contrastive learning to improve diagram comprehension in multimodal AI models like CLIP. The approach uses synthetic diagram samples to help models better understand fine-grained structural differences in diagrams, showing significant improvements in flowchart understanding tasks.
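The core of any contrastive scheme like this is an InfoNCE-style loss where the synthetic diagram variants act as hard negatives. The paper's actual sampling and loss details aren't in the summary; this is a minimal numpy sketch of the standard loss under that assumption:

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE contrastive loss for a single anchor embedding.

    `positive` is the matching sample's embedding; `negatives` are
    embeddings of synthetic hard negatives (e.g. the same flowchart
    with one edge rewired). All embeddings are L2-normalized first.
    """
    def norm(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a, p, n = norm(anchor), norm(positive), norm(negatives)
    logits = np.concatenate([[a @ p], n @ a]) / temperature
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # positive sits at index 0

rng = np.random.default_rng(0)
anchor = rng.normal(size=64)
positive = anchor + 0.1 * rng.normal(size=64)    # near-duplicate embedding
negatives = rng.normal(size=(8, 64))             # unrelated synthetic variants
loss = info_nce_loss(anchor, positive, negatives)
```

Because the negatives are generated rather than mined, the method can target exactly the fine-grained structural differences (swapped arrows, relabeled nodes) that off-the-shelf CLIP embeddings tend to blur together.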

AIBullish · arXiv – CS AI · 4h ago · 6
🧠

MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening

Researchers developed MINT, a framework that transfers knowledge from MRI brain scans to speech analysis for early Alzheimer's detection. The system achieves performance comparable to that of speech-only methods while being grounded in neuroimaging biomarkers, enabling population-scale screening without requiring expensive MRI scans at inference time.

AIBullish · arXiv – CS AI · 4h ago · 2
🧠

Multimodal Optimal Transport for Unsupervised Temporal Segmentation in Surgical Robotics

Researchers developed TASOT, an unsupervised method for surgical phase recognition that combines visual and textual information without requiring expensive large-scale pre-training. It showed significant improvements over existing zero-shot methods across multiple surgical datasets, demonstrating that effective surgical AI can be achieved with more efficient training.
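
Optimal transport in this setting typically means computing a soft assignment between per-frame visual features and text descriptions of each phase. The summary doesn't specify TASOT's solver, but the standard entropic-OT approach is Sinkhorn iteration; a toy numpy sketch under that assumption:

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=100):
    """Entropic optimal transport via Sinkhorn iterations.

    Rows are video frames, columns are text phase prototypes; uniform
    marginals yield a soft frame-to-phase transport plan.
    """
    n, m = cost.shape
    r = np.full(n, 1.0 / n)          # uniform mass over frames
    c = np.full(m, 1.0 / m)          # uniform mass over phases
    K = np.exp(-cost / eps)          # Gibbs kernel
    v = np.ones(m)
    for _ in range(n_iters):         # alternate row/column scaling
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

# toy frame-vs-phase cost (e.g. 1 - cosine similarity of embeddings):
# frames 0-2 are close to phase 0, frames 3-5 to phase 1
cost = np.array([[0.1, 0.9]] * 3 + [[0.9, 0.1]] * 3)
plan = sinkhorn(cost)
labels = plan.argmax(axis=1)         # hard per-frame phase assignment
```

The uniform column marginal is what encourages every phase to receive some frames, which is the usual way OT-based segmentation avoids collapsing all frames onto one dominant label.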