AI Pulse News

Models, papers, tools. 34,451 articles with AI-powered sentiment analysis and key takeaways.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Learning What Matters: Probabilistic Task Selection via Mutual Information for Model Finetuning

Researchers introduce TaskPGM, a framework that optimizes how training data is distributed across multiple tasks when fine-tuning large language models by modeling task relationships through an energy-based probabilistic approach. The method balances task coverage against redundancy, demonstrating improvements over conventional uniform or size-proportional sampling strategies across multiple model families and evaluation benchmarks.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime

Researchers present a theoretical framework analyzing scaling laws for shallow neural networks in the feature learning regime, deriving phase diagrams that connect sample complexity and weight decay to risk exponents. The work bridges empirical observations in deep learning with rigorous mathematical analysis, establishing links between weight spectrum properties and generalization performance through matrix compressed sensing and LASSO theory.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

Researchers conducted an in-depth study of 14 open-source large language model projects through developer interviews, revealing how collaboration, governance, and participation evolve across different development stages. The study maps motivations ranging from democratizing AI to expanding language representation, showing that openness in open-source AI emerges from complex interactions between artifact domains, lifecycle stages, and institutional contexts rather than being a uniform property.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Correcting Prompt Dependence in LLM Benchmarks: A Bayesian Hierarchical Model with Embedding-Space Clustering

Researchers propose a Bayesian hierarchical model with embedding-space clustering to correct fundamental flaws in LLM benchmarking methodology. The approach addresses two critical issues, insufficient evaluation samples and non-independent test prompts, and reduces the mean absolute error of performance metrics by 4-73%, a gain that is particularly relevant for adversarial robustness evaluation.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

CTIConnect: A Benchmark for Retrieval-Augmented LLMs over Heterogeneous Cyber Threat Intelligence

Researchers introduce CTIConnect, a benchmark for evaluating retrieval-augmented large language models on cyber threat intelligence tasks. The study integrates five heterogeneous CTI sources into 1,860 expert-verified QA pairs across nine tasks, revealing that different task categories require fundamentally different retrieval strategies and that domain-specific approaches outperform generic retrieval methods.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Efficient Asynchronous Federated Evaluation with Strategy Similarity Awareness for Intent-Based Networking in Industrial Internet of Things

Researchers propose FEIBN, a federated learning framework that combines large language models with distributed strategy evaluation for Intent-Based Networking in industrial IoT environments. The system introduces SSAFL, a mechanism that optimizes federated learning through strategy similarity awareness and asynchronous updates, reducing communication overhead and improving convergence speed while maintaining privacy across heterogeneous nodes.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Using street view images and visual LLMs to predict heritage values for governance support: Risks, ethics, and policy implications

Swedish authorities are using visual Large Language Models to analyze 154,710 street view images across Sweden to identify buildings with heritage values, supporting implementation of the EU's Energy Performance of Buildings Directive. The research addresses Sweden's lack of a comprehensive heritage building register while raising critical concerns about LLM transparency, error detection, and potential misuse in public governance.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Reward Learning through Ranking Mean Squared Error

Researchers introduce R4 (Ranked Return Regression for RL), a new reinforcement learning method that learns reward functions from human ratings rather than binary preferences. The approach uses a novel ranking mean squared error loss and provides formal mathematical guarantees about solution completeness and minimality, demonstrating competitive or superior performance against existing methods on robotic benchmarks.

🏢 OpenAI · 🏢 Google

AI · Bullish · arXiv – CS AI · Jun 5 · 6/10

🧠

A2RAG: Adaptive Agentic Graph Retrieval for Cost-Aware and Reliable Reasoning

Researchers introduce A2RAG, an adaptive framework that improves Graph-Retrieval-Augmented Generation (Graph-RAG) for multi-hop question answering by dynamically adjusting retrieval effort based on query difficulty. The system reduces token consumption and latency by ~50% while achieving significant accuracy gains, addressing practical deployment challenges in AI reasoning systems.

AI · Bullish · arXiv – CS AI · Jun 5 · 6/10

🧠

Toward Culturally Aligned LLMs through Ontology-Guided Multi-Agent Reasoning

Researchers introduce OG-MAR, a framework that uses cultural ontologies and multi-agent reasoning to align Large Language Models with diverse cultural values derived from the World Values Survey. The system improves LLM cultural sensitivity and consistency by grounding outputs in structured demographic profiles and enforcing value relationships at inference time.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Beyond Rewards in Reinforcement Learning for Cyber Defence

Researchers demonstrate that sparse reward functions outperform dense, engineered rewards when training autonomous cyber defence agents using deep reinforcement learning. The study reveals that sparse rewards produce more reliable training, lower-risk policies, and better alignment with defender objectives without explicit penalties for costly actions.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Aligning Tree-Search Policies with Fixed Token Budgets in Test-Time Scaling of LLMs

Researchers propose Budget-Guided MCTS, a tree-search algorithm that optimizes large language model inference by dynamically adjusting exploration and refinement strategies based on remaining token budgets. The method addresses a practical deployment challenge where fixed computational budgets vary across use cases, outperforming budget-agnostic approaches on mathematical and physics reasoning tasks.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

MAviS: A Multimodal Conversational Assistant For Avian Species

Researchers introduce MAviS, a specialized multimodal AI system combining image, audio, and text data for avian species identification and ecological monitoring. The system includes a large dataset covering 1,000+ bird species, a fine-tuned language model, and a comprehensive benchmark, demonstrating state-of-the-art performance in domain-specific biodiversity conservation applications.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Learning to Theorize the World from Observation

Researchers introduce Learning-to-Theorize, a new AI paradigm that builds explicit explanatory theories of the world from observations rather than simply predicting future states. The Neural Theorizer (NEO) model represents understanding as executable, compositional programs whose learned primitives can be recombined to explain novel phenomena, enabling explanation-driven generalization.

AI · Neutral · arXiv – CS AI · Jun 5 · 5/10

🧠

Fault tolerance estimation in digital circuits with visualised generative networks

Researchers propose a novel computational method using Generative Adversarial Networks (GANs) to estimate fault tolerance in digital circuits. The approach compares ideal digital outputs against realistic signals to identify and quantify how various failure modes—such as missing or malfunctioning logical gates—affect circuit robustness.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Towards Generalization of Block Attention via Automatic Segmentation and Block Distillation

Researchers introduce SemanticSeg, a large semantic segmentation dataset, and a block distillation framework to improve block attention mechanisms for long-context language models. The approach uses a frozen full-attention teacher to train block-attention students more efficiently, addressing key challenges in KV cache reuse for applications like RAG.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Surrogate Neural Architecture Codesign Package (SNAC-Pack)

SNAC-Pack is an open-source AutoML framework that automates neural architecture design for FPGA deployment by combining hardware-aware search with quantization and pruning. The tool reduces design cycles from months to hours while matching or exceeding baseline performance on tasks like jet classification and quantum computing applications.

AI · Bullish · arXiv – CS AI · Jun 5 · 6/10

🧠

Scalable Reinforcement Learning via Adaptive Batch Scaling

Researchers propose Adaptive Batch Scaling (ABS), a technique that dynamically adjusts batch sizes during reinforcement learning training by measuring policy stability through a novel 'Behavioral Divergence' metric. The approach challenges the conventional belief that large batches are incompatible with RL, demonstrating that combining larger networks with larger batch sizes can achieve superior performance when batch size adapts to training phase stability.

AI · Bullish · arXiv – CS AI · Jun 5 · 6/10

🧠

Reflex: Reinforcement Learning with Reflection Symmetry Exploitation in State-Based Continuous Control

Researchers introduce Reflex, a reinforcement learning framework that exploits reflection symmetry in state-based continuous control tasks to improve sample efficiency. The method integrates with both on-policy (PPO) and off-policy (SAC) algorithms and demonstrates superior performance on standard benchmarks compared to baseline approaches.

🏢 OpenAI · 🏢 Google

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

Extreme Region Policy Distillation

Researchers propose Extreme Region Policy Distillation (ERPD), a two-stage framework that improves reinforcement learning efficiency for large language models by first extracting maximum training signals through aggressive off-policy optimization, then distilling those signals into a base policy with tighter constraints. The approach achieves comparable or better performance with significantly reduced KL divergence, addressing a fundamental trade-off between sample efficiency and asymptotic performance in LLM training.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

🧠

When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges

Researchers identify critical failure modes in multi-objective prompt optimization for LLM judges, finding that jointly optimizing across multiple evaluation criteria reduces gradient task-focus by 59% and that combining single-objective prompts degrades performance by 27%. The study reveals fundamental limitations in extending textual gradient methods to multi-criteria scenarios, which constrains practical applications of automated LLM judge customization.

General · Neutral · Crypto Briefing · Jun 5 · 6/10

📰

SpaceX sets $135 price for blockbuster IPO, challenging Wall Street norms

SpaceX has set its IPO price at $135 per share, marking a significant departure from traditional Wall Street underwriting practices. The pricing strategy reflects the company's confidence in direct market control and could reshape how major tech companies approach future public offerings.

AI × Crypto · Bullish · Crypto Briefing · Jun 5 · 6/10

🤖

Hut 8 hires NextEra veteran Mark Eidelman to lead investor relations amid AI infrastructure pivot

Hut 8 Mining has appointed Mark Eidelman, a veteran from NextEra Energy, to lead investor relations as the company pivots toward AI infrastructure. This strategic repositioning aims to reduce capital expenditures and strengthen competitive positioning in the emerging AI hardware market.

AI × Crypto · Bullish · Crypto Briefing · Jun 5 · 6/10

🤖

Microchip receives US export license for advanced FPGA R&D in Armenia

Microchip Technology has obtained a US export license to conduct advanced FPGA research and development in Armenia, strengthening its technological capabilities and competitive positioning. The approval signals regulatory confidence in the operation and carries implications for cryptocurrency mining infrastructure and the broader semiconductor sector.

General · Bullish · Crypto Briefing · Jun 4 · 6/10

📰

Dimon and Musk launch SpaceX IPO roadshow in New York

Jamie Dimon and Elon Musk are launching a SpaceX IPO roadshow in New York, signaling potential plans to take the private space company public. The move could reshape market expectations around private space ventures and their economic impact on global markets.
