y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#machine-learning News & Analysis

2514 articles tagged with #machine-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

2514 articles
AIBullisharXiv – CS AI · Mar 36/104
🧠

Reliable Fine-Grained Evaluation of Natural Language Math Proofs

Researchers have developed ProofGrader, a new AI system that can reliably evaluate natural language mathematical proofs generated by large language models on a fine-grained 0-7 scale. The system was trained using ProofBench, the first expert-annotated dataset of proof ratings covering 145 competition math problems and 435 LLM solutions, achieving significant improvements over basic evaluation methods.

AIBullisharXiv – CS AI · Mar 36/104
🧠

Contribution-aware Token Compression for Efficient Video Understanding via Reinforcement Learning

Researchers developed CaCoVID, a reinforcement learning-based algorithm that compresses video tokens for large language models by selecting tokens based on their actual contribution to correct predictions rather than attention scores. The method uses combinatorial policy optimization to reduce computational overhead while maintaining video understanding performance.

AIBullisharXiv – CS AI · Mar 36/104
🧠

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Researchers developed EditReward, a human-aligned reward model for instruction-guided image editing trained on over 200K preference pairs. The model demonstrates superior performance on established benchmarks and can effectively filter high-quality training data, addressing a key bottleneck in open-source image editing models.

AIBullisharXiv – CS AI · Mar 36/104
🧠

Decoding Open-Ended Information Seeking Goals from Eye Movements in Reading

Researchers have developed AI models that can decode readers' information-seeking goals solely from their eye movements while reading text. The study introduces new evaluation frameworks using large-scale eye tracking data and demonstrates success in both selecting correct goals from options and reconstructing precise goal formulations.

AIBullisharXiv – CS AI · Mar 36/104
🧠

Scalable Multi-Task Learning through Spiking Neural Networks with Adaptive Task-Switching Policy for Intelligent Autonomous Agents

Researchers have developed SwitchMT, a novel methodology using Spiking Neural Networks with adaptive task-switching for multi-task learning in autonomous agents. The approach addresses task interference issues and demonstrates competitive performance in multiple Atari games while maintaining low power consumption and network complexity.

AINeutralarXiv – CS AI · Mar 27/1017
🧠

Exploring Robust Intrusion Detection: A Benchmark Study of Feature Transferability in IoT Botnet Attack Detection

Researchers conducted a benchmark study on IoT botnet intrusion detection systems, finding that models trained on one network domain suffer significant performance degradation when applied to different environments. The study evaluated three feature sets across four IoT datasets and provided guidelines for improving cross-domain robustness through better feature engineering and algorithm selection.

AIBullisharXiv – CS AI · Mar 26/1012
🧠

Enhancing Continual Learning for Software Vulnerability Prediction: Addressing Catastrophic Forgetting via Hybrid-Confidence-Aware Selective Replay for Temporal LLM Fine-Tuning

Researchers developed Hybrid Class-Aware Selective Replay (Hybrid-CASR), a continual learning method that improves AI-based software vulnerability detection by addressing catastrophic forgetting in temporal scenarios. The method achieved 0.667 Macro-F1 score while reducing training time by 17% compared to baseline approaches on CVE data from 2018-2024.

AINeutralarXiv – CS AI · Mar 27/1019
🧠

Biases in the Blind Spot: Detecting What LLMs Fail to Mention

Researchers have developed an automated pipeline to detect hidden biases in Large Language Models that don't appear in their reasoning explanations. The system discovered previously unknown biases like Spanish fluency and writing formality across seven LLMs in hiring, loan approval, and university admission tasks.

AIBullisharXiv – CS AI · Mar 27/1019
🧠

Smoothing DiLoCo with Primal Averaging for Faster Training of LLMs

Researchers propose Generalized Primal Averaging (GPA), a new optimization method that improves training speed for large language models by 8-10% over standard AdamW while using less memory. GPA unifies and enhances existing averaging-based optimizers like DiLoCo by enabling smooth iterate averaging at every step without complex two-loop structures.

AIBullisharXiv – CS AI · Mar 27/1012
🧠

The Geometry of Transfer: Unlocking Medical Vision Manifolds for Training-Free Model Ranking

Researchers developed a new framework for selecting optimal medical AI foundation models without costly fine-tuning, achieving 31% better performance than existing methods. The topology-driven approach evaluates manifold tractability rather than statistical overlap to better assess model transferability for medical image segmentation tasks.

AIBullisharXiv – CS AI · Mar 27/1012
🧠

FedNSAM:Consistency of Local and Global Flatness for Federated Learning

Researchers propose FedNSAM, a new federated learning algorithm that improves global model performance by addressing the inconsistency between local and global flatness in distributed training environments. The algorithm uses global Nesterov momentum to harmonize local and global optimization, showing superior performance compared to existing FedSAM approaches.

AIBullisharXiv – CS AI · Mar 27/1016
🧠

TradeFM: A Generative Foundation Model for Trade-flow and Market Microstructure

Researchers introduced TradeFM, a 524M-parameter generative AI model that learns from billions of trade events across 9,000+ equities to understand market microstructure. The model can generate synthetic market data and generalizes across different markets without asset-specific calibration, potentially enabling new applications in trading and market simulation.

$COMP
AINeutralarXiv – CS AI · Mar 27/1010
🧠

From Static Benchmarks to Dynamic Protocol: Agent-Centric Text Anomaly Detection for Evaluating LLM Reasoning

Researchers propose a dynamic agent-centric benchmarking system for evaluating large language models that replaces static datasets with autonomous agents that generate, validate, and solve problems iteratively. The protocol uses teacher, orchestrator, and student agents to create progressively challenging text anomaly detection tasks that expose reasoning errors missed by conventional benchmarks.

AIBullisharXiv – CS AI · Mar 26/1012
🧠

TRIZ-RAGNER: A Retrieval-Augmented Large Language Model for TRIZ-Aware Named Entity Recognition in Patent-Based Contradiction Mining

Researchers developed TRIZ-RAGNER, a retrieval-augmented large language model framework that improves patent analysis and systematic innovation by extracting technical contradictions from patent documents. The system achieved 84.2% F1-score accuracy, outperforming existing methods by 7.3 percentage points through better integration of domain-specific knowledge.

AIBullisharXiv – CS AI · Mar 26/1018
🧠

QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Researchers propose QKAN-LSTM, a quantum-inspired neural network that integrates quantum variational activation functions into LSTM architecture for sequential modeling. The model achieves superior predictive accuracy with 79% fewer parameters than classical LSTMs while remaining executable on classical hardware.

AIBullisharXiv – CS AI · Mar 26/1012
🧠

See, Act, Adapt: Active Perception for Unsupervised Cross-Domain Visual Adaptation via Personalized VLM-Guided Agent

Researchers introduce Sea² (See, Act, Adapt), a novel approach that improves AI perception models in new environments by using an intelligent pose-control agent rather than retraining the models themselves. The method keeps perception modules frozen and uses a vision-language model as a controller, achieving significant performance improvements of 13-27% across visual tasks without requiring additional training data.

AIBullisharXiv – CS AI · Mar 27/1015
🧠

PointCoT: A Multi-modal Benchmark for Explicit 3D Geometric Reasoning

Researchers introduce PointCoT, a new AI framework that enables multimodal large language models to perform explicit geometric reasoning on 3D point cloud data using Chain-of-Thought methodology. The framework addresses current limitations where AI models suffer from geometric hallucinations by implementing a 'Look, Think, then Answer' paradigm with 86k instruction-tuning samples.

AINeutralarXiv – CS AI · Mar 27/1013
🧠

Learning to maintain safety through expert demonstrations in settings with unknown constraints: A Q-learning perspective

Researchers propose SafeQIL, a new Q-learning algorithm that learns safe policies from expert demonstrations in constrained environments where safety constraints are unknown. The approach balances maximizing task rewards while maintaining safety by learning from demonstrated trajectories that successfully complete tasks without violating hidden constraints.

AIBullisharXiv – CS AI · Mar 27/1016
🧠

SMAC: Score-Matched Actor-Critics for Robust Offline-to-Online Transfer

Researchers developed Score Matched Actor-Critic (SMAC), a new offline reinforcement learning method that enables smooth transition to online RL algorithms without performance drops. SMAC achieved successful transfer in all 6 D4RL tasks tested and reduced regret by 34-58% in 4 of 6 environments compared to best baselines.

AIBullisharXiv – CS AI · Mar 26/1014
🧠

WisPaper: Your AI Scholar Search Engine

WisPaper is a new AI-powered academic search system that combines semantic search capabilities with automated paper validation and organization tools. The system achieved 22.26% recall on TaxoBench and 93.70% validation accuracy, addressing key limitations in current academic search engines by integrating discovery, organization, and monitoring workflows.

AIBullisharXiv – CS AI · Mar 27/1019
🧠

SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation

Researchers developed SocialNav, a foundation model for socially-aware robot navigation that uses a hierarchical architecture to understand social norms and generate compliant movement paths. The model was trained on 7 million samples and achieved 38% better success rates and 46% improved social compliance compared to existing methods.

← PrevPage 54 of 101Next →