Real-time AI-curated news from 31,649+ articles across 50+ sources. Sentiment analysis, importance scoring, and key takeaways — updated every 15 minutes.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers developed ZeroDVFS, a system that uses Large Language Models to optimize power management in embedded systems without requiring extensive profiling. The system achieves 7.09 times better energy efficiency and enables zero-shot deployment for new workloads in under 5 seconds through LLM-based code analysis.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers introduce MAS-Orchestra, a new framework for multi-agent AI systems that uses reinforcement learning to orchestrate multiple AI agents more efficiently. The system achieves 10x efficiency improvements over existing methods and includes a benchmark (MASBENCH) to better understand when multi-agent systems outperform single-agent approaches.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers have developed MSP-LLM, a unified large language model framework for complete material synthesis planning that addresses both precursor prediction and synthesis operation prediction. The system outperforms existing methods by breaking down the complex task into structured subproblems with chemical consistency.
AIBullisharXiv – CS AI · Mar 37/105
🧠Researchers introduce Arbor, a framework that decomposes large language model decision-making into specialized node-level tasks for critical applications like healthcare triage. The system improves accuracy by 29.4 percentage points while reducing latency by 57.1% and costs by 14.4x compared to single-prompt approaches.
AIBullisharXiv – CS AI · Mar 37/104
🧠Surge AI introduces CoreCraft, the first environment in EnterpriseBench for training AI agents on realistic enterprise workflows. Training GLM 4.6 on this high-fidelity customer support simulation improved task performance from 25% to 37% and showed positive transfer to other benchmarks, demonstrating that quality training environments enable generalizable AI capabilities.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers have developed MagicAgent, a series of foundation models designed for generalized AI agent planning that outperforms existing sub-100B models and even surpasses leading ultra-scale models like GPT-5.2. The models achieve superior performance through a novel synthetic data framework and two-stage training paradigm that addresses gradient interference in multi-task learning.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers developed OS-Det3D, a two-stage framework for camera-based 3D object detection in autonomous vehicles that can identify unknown objects beyond predefined categories. The system uses LiDAR geometric cues and a joint selection module to discover novel objects while improving detection of known objects, addressing safety risks in real-world driving scenarios.
AINeutralarXiv – CS AI · Mar 37/104
🧠Researchers introduce GLEE, a new framework for studying how Large Language Models behave in economic games and strategic interactions. The study reveals that LLM performance in economic scenarios depends heavily on market parameters and model selection, with complex interdependent effects on outcomes.
AINeutralarXiv – CS AI · Mar 37/103
🧠Researchers introduce FSW-GNN, the first Message Passing Neural Network that is fully bi-Lipschitz with respect to standard WL-equivalent graph metrics. This addresses the limitation where standard MPNNs produce poorly distinguishable outputs for separable graphs, with empirical results showing competitive performance and superior accuracy in long-range tasks.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers have developed a new AI architecture that learns high-level symbolic skills from minimal low-level demonstrations, enabling robots to manipulate objects and execute complex tasks in unseen environments. The system combines neural networks for symbol discovery with visual language models for high-level planning and gradient-based methods for low-level execution.
AINeutralarXiv – CS AI · Mar 37/104
🧠Researchers introduce Interaction2Code, the first benchmark for evaluating Multimodal Large Language Models' ability to generate interactive webpage code from prototypes. The study identifies four critical limitations in current MLLMs and proposes enhancement strategies to improve their performance on dynamic web interactions.
AIBearisharXiv – CS AI · Mar 37/103
🧠Researchers introduce Multi-PA, a comprehensive benchmark for evaluating privacy risks in Large Vision-Language Models (LVLMs), covering 26 personal privacy categories, 15 trade secrets, and 18 state secrets across 31,962 samples. Testing 21 open-source and 2 closed-source LVLMs revealed significant privacy vulnerabilities, with models generally posing high risks of facilitating privacy breaches across different privacy categories.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers developed new activation functions for deep neural networks based on polynomial and trigonometric orthonormal bases that can successfully train models like GPT-2 and ConvNeXt. The work addresses gradient problems common with polynomial activations and shows these networks can be interpreted as multivariate polynomial mappings.
AINeutralarXiv – CS AI · Mar 37/103
🧠Researchers have introduced WorldSense, the first benchmark for evaluating multimodal AI systems that process visual, audio, and text inputs simultaneously. The benchmark contains 1,662 synchronized audio-visual videos across 67 subcategories and 3,172 QA pairs, revealing that current state-of-the-art models achieve only 65.1% accuracy on real-world understanding tasks.
AINeutralarXiv – CS AI · Mar 37/104
🧠Researchers developed a novel algorithm using topological derivatives to automatically determine where and how to add new layers to neural networks during training. The approach uses mathematical principles from optimal control theory and topology optimization to adaptively grow network architecture, showing superior performance compared to baseline networks and other adaptation strategies.
AIBullisharXiv – CS AI · Mar 37/102
🧠Researchers introduce Sparse Shift Autoencoders (SSAEs), a new method for improving large language model interpretability by learning sparse representations of differences between embeddings rather than the embeddings themselves. This approach addresses the identifiability problem in current sparse autoencoder techniques, potentially enabling more precise control over specific AI behaviors without unintended side effects.
AIBullisharXiv – CS AI · Mar 37/102
🧠Researchers propose GradientStabilizer, a new technique to address training instability in deep learning by replacing gradient magnitude with statistically stabilized estimates while preserving direction. The method outperforms gradient clipping across multiple AI training scenarios including LLM pre-training, reinforcement learning, and computer vision tasks.
AINeutralarXiv – CS AI · Mar 37/104
🧠New research analyzing 92 open-source language models reveals that factors beyond model size and training data significantly impact performance. The study shows that incorporating design features like data composition and architectural choices can improve performance prediction by 3-28% compared to using scale alone.
AIBullisharXiv – CS AI · Mar 37/104
🧠Open-Sora 2.0 is a commercial-level video generation model that achieves performance comparable to leading models like Runway Gen-3 Alpha while costing only $200k to train. The fully open-source model demonstrates significant cost reduction in AI video generation training through optimized data curation, architecture, and training strategies.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers propose ROMA, a new hardware accelerator for running large language models on edge devices using QLoRA. The system uses ROM storage for quantized base models and SRAM for LoRA weights, achieving over 20,000 tokens/s generation speed without external memory.
AIBullisharXiv – CS AI · Mar 37/103
🧠Researchers introduce AdaRank, a new AI model merging framework that adaptively selects optimal singular directions from task vectors to combine multiple fine-tuned models. The technique addresses cross-task interference issues in existing SVD-based approaches by dynamically pruning problematic components during test-time, achieving state-of-the-art performance with nearly 1% gap from individual fine-tuned models.
AINeutralarXiv – CS AI · Mar 37/104
🧠Researchers analyzed compression effects on large reasoning models (LRMs) through quantization, distillation, and pruning methods. They found that dynamically quantized 2.51-bit models maintain near-original performance, while identifying critical weight components and showing that protecting just 2% of excessively compressed weights can improve accuracy by 6.57%.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers released two open-source datasets, SwallowCode and SwallowMath, that significantly improve large language model performance in coding and mathematics through systematic data rewriting rather than filtering. The datasets boost Llama-3.1-8B performance by +17.0 on HumanEval for coding and +12.4 on GSM8K for math tasks.
AINeutralarXiv – CS AI · Mar 37/104
🧠New research connects initial guessing bias in untrained deep neural networks to established mean field theories, proving that optimal initialization for learning requires systematic bias toward specific classes rather than neutral initialization. The study demonstrates that efficient training is fundamentally linked to architectural prejudices present before data exposure.
AIBullisharXiv – CS AI · Mar 37/105
🧠Researchers introduce SEAM, a novel defense mechanism that makes large language models 'self-destructive' when adversaries attempt harmful fine-tuning attacks. The system allows models to function normally for legitimate tasks but causes catastrophic performance degradation when fine-tuned on harmful data, creating robust protection against malicious modifications.