89 articles tagged with #foundation-models. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AIBullisharXiv – CS AI · Mar 46/103
🧠Researchers propose RL3DEdit, a reinforcement learning framework that addresses multi-view consistency challenges in 3D scene editing by using 2D diffusion model priors with novel reward signals from 3D foundation models. The method achieves stable multi-view consistency and outperforms existing approaches in editing quality and efficiency.
AIBullisharXiv – CS AI · Mar 47/103
🧠Researchers propose a new IMPRINT framework for transfer learning that improves foundation model adaptation to new tasks without parameter optimization. The framework identifies three key components and introduces a clustering-based variant that outperforms existing methods by 4%.
AIBullisharXiv – CS AI · Mar 47/102
🧠Researchers have released MedXIAOHE, a new medical vision-language AI foundation model that achieves state-of-the-art performance across medical benchmarks and surpasses leading closed-source systems. The model incorporates advanced features like entity-aware pretraining, reinforcement learning for medical reasoning, and evidence-grounded report generation to improve reliability in clinical applications.
AIBullisharXiv – CS AI · Mar 47/103
🧠Researchers introduce OptMerge, a new benchmark and method for combining multiple expert Multimodal Large Language Models (MLLMs) into single, more capable models without requiring additional training data. The approach achieves 2.48% average performance gains while reducing storage and serving costs by merging models across different modalities like vision, audio, and video.
AIBullisharXiv – CS AI · Mar 37/104
🧠GeneZip is a new DNA compression model that achieves 137.6x compression with minimal performance loss by recognizing that genomic information is highly imbalanced. The system enables training of much larger AI models for genomic analysis using single GPU setups instead of expensive multi-GPU configurations.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers have developed MagicAgent, a series of foundation models designed for generalized AI agent planning that outperforms existing sub-100B models and even surpasses leading ultra-scale models like GPT-5.2. The models achieve superior performance through a novel synthetic data framework and two-stage training paradigm that addresses gradient interference in multi-task learning.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers developed a new Brain-to-Text (BIT) framework that uses cross-species neural foundation models to decode speech from brain activity with significantly improved accuracy. The system reduces word error rates from 24.69% to 10.22% compared to previous methods and enables seamless translation of both attempted and imagined speech into text.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers from Stanford introduce the Relational Transformer (RT), a new AI architecture that can work with relational databases without task-specific fine-tuning. The 22M parameter model achieves 93% performance of fully supervised models on binary classification tasks, significantly outperforming a 27B parameter LLM at 84%.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers developed UrbanFM, a foundation model for urban spatio-temporal data that can analyze traffic patterns and city dynamics across over 100 global cities. The model demonstrates zero-shot generalization capabilities, meaning it can make predictions for unseen cities without additional training, potentially revolutionizing urban planning and smart city applications.
AIBullisharXiv – CS AI · Mar 37/105
🧠Researchers have developed DeepMedix-R1, a foundation model for chest X-ray interpretation that provides transparent, step-by-step reasoning alongside accurate diagnoses to address the black-box problem in medical AI. The model uses reinforcement learning to align diagnostic outputs with clinical plausibility and significantly outperforms existing models in report generation and visual question answering tasks.
AIBullisharXiv – CS AI · Feb 277/106
🧠Researchers developed a method to improve foundation models in medical histopathology by introducing robustness losses during training, reducing sensitivity to technical variations while maintaining accuracy. The approach was tested on over 27,000 whole slide images from 6,155 patients across eight popular foundation models, showing improved robustness and prediction accuracy without requiring retraining of the foundation models themselves.
AIBullisharXiv – CS AI · Feb 277/106
🧠Researchers introduce Zatom-1, the first foundation model that unifies generative and predictive learning for both 3D molecules and materials using a multimodal flow matching approach. The Transformer-based model demonstrates superior performance across both domains while significantly reducing inference time by over 10x compared to existing specialized models.
$ATOM
AINeutralarXiv – CS AI · Feb 277/107
🧠Researchers introduce SC-ARENA, a new natural language evaluation framework for testing large language models in single-cell biology research. The framework addresses limitations in existing benchmarks by incorporating biological knowledge and real-world task formats to better assess AI models' understanding of cellular biology.
AIBullisharXiv – CS AI · Feb 277/107
🧠Researchers introduce OmniGAIA, a comprehensive benchmark for evaluating omni-modal AI agents that can process video, audio, and image data simultaneously with complex reasoning capabilities. They also propose OmniAtlas, a foundation agent that enhances existing open-source models' ability to use tools across multiple modalities, marking progress toward more capable AI assistants.
AINeutralarXiv – CS AI · 6d ago6/10
🧠ConceptTracer is an interactive tool for analyzing neural network representations through human-interpretable concepts, using information-theoretic measures to identify neurons responsive to specific ideas. The tool demonstrates how foundation models like TabPFN encode conceptual information, advancing mechanistic interpretability research.
AIBullisharXiv – CS AI · Apr 76/10
🧠Researchers introduce VLA-Forget, a new unlearning framework for vision-language-action (VLA) models used in robotic manipulation. The hybrid approach addresses the challenge of removing unsafe or unwanted behaviors from embodied AI foundation models while preserving their core perception, language, and action capabilities.
AIBearisharXiv – CS AI · Apr 76/10
🧠A new research study reveals that major large language models exhibit systematic bias toward American English over British English across training data, tokenization, and outputs. The research introduces DiAlign, a method for measuring dialectal alignment, and finds evidence of linguistic homogenization that could impact global AI equity.
AIBullishMarkTechPost · Apr 56/10
🧠MaxToki is a new AI foundation model that can predict cellular aging patterns and trajectories, addressing a key limitation in existing biological models that only analyze cells as static snapshots. The technology represents a significant advancement in computational biology by incorporating temporal dynamics into cellular analysis.
AINeutralarXiv – CS AI · Mar 176/10
🧠Research shows that synthetic data designed to enhance in-context learning capabilities in AI models doesn't necessarily improve performance. The study found that while targeted training can increase specific neural mechanisms, it doesn't make them more functionally important compared to natural training approaches.
🏢 Perplexity
AIBullisharXiv – CS AI · Mar 176/10
🧠Researchers propose ES-Merging, a new framework for combining specialized biological multimodal large language models (MLLMs) by using embedding space signals rather than traditional parameter-based methods. The approach estimates merging coefficients at both layer-wise and element-wise granularities, outperforming existing merging techniques and even task-specific fine-tuned models on cross-modal scientific problems.
AIBullisharXiv – CS AI · Mar 176/10
🧠Researchers introduce MVHOI, a new AI framework that significantly improves human-object interaction video generation by handling complex 3D manipulations through a two-stage process using 3D foundation models. The system can create realistic long-duration videos showing intricate object manipulations from multiple viewpoints, addressing limitations of existing approaches that struggle with non-planar movements.
AINeutralarXiv – CS AI · Mar 166/10
🧠Researchers propose integrating causal methods into machine learning systems to balance competing objectives like fairness, privacy, robustness, accuracy, and explainability. The paper argues that addressing these principles in isolation leads to conflicts and suboptimal solutions, while causal approaches can help navigate trade-offs in both trustworthy ML and foundation models.
AINeutralarXiv – CS AI · Mar 126/10
🧠Researchers propose RandMark, a new method for watermarking visual foundation models to protect intellectual property rights. The approach uses a small encoder-decoder network to embed random digital watermarks into internal representations, enabling ownership verification with low false detection rates.
AIBullisharXiv – CS AI · Mar 116/10
🧠FALCON introduces a novel vision-language-action model that bridges the spatial reasoning gap by injecting 3D spatial tokens into action heads while preserving language reasoning capabilities. The system achieves state-of-the-art performance across simulation benchmarks and real-world tasks by leveraging spatial foundation models to provide geometric priors from RGB input alone.
AIBullisharXiv – CS AI · Mar 96/10
🧠Researchers developed a new training method to improve the robustness of AI foundation models like SAM3 for medical image segmentation by reducing sensitivity to prompt variations. The approach groups semantically similar prompts together and uses consistency constraints to ensure more reliable predictions across different prompt formulations.