🧠

AI

20,607 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

AI · Bullish · Blockonomi · Apr 13 · 6/10

Micron (MU) Stock Could Soar 40% Higher, According to Wall Street Analyst

KeyBanc Capital Markets has issued a $600 price target for Micron Technology (MU), implying 40% upside potential. The bullish outlook is driven by strong demand for AI memory chips and supply constraints expected to persist through mid-2027, positioning the semiconductor company to capitalize on the AI infrastructure buildout.

AI · Bullish · Blockonomi · Apr 13 · 6/10

BofA Elevates ON Semiconductor (ON) Stock to Buy With $85 Target Amid AI Growth

Bank of America upgraded ON Semiconductor to Buy with an $85 price target, citing strength in AI-related power solutions and the Treo product line. The upgrade reflects confidence in ON's positioning within the AI semiconductor supply chain, backed by a $6 billion three-year buyback commitment.

AI · Bullish · AI News · Apr 13 · 6/10

Companies expand AI adoption while keeping control

Companies are adopting a measured approach to AI implementation, prioritizing human-in-the-loop systems that augment decision-making rather than fully autonomous solutions. This cautious strategy is particularly pronounced in high-risk sectors like finance and legal services, where errors carry significant financial or compliance consequences.

AI · Neutral · Blockonomi · Apr 13 · 6/10

Oracle (ORCL) Stock Plunges 29% Despite Record AI Backlog — Is It a Buying Opportunity?

Oracle stock has declined 29% year-to-date despite maintaining a record $553B AI backlog and strong revenue performance, raising questions about whether the sell-off represents a genuine buying opportunity or reflects legitimate concerns about the company's debt burden and valuation relative to growth prospects.

AI · Neutral · Blockonomi · Apr 13 · 6/10

ARK Invest Rotates $10M from AMD into Palantir (PLTR) Stock Amid Market Volatility

ARK Invest executed a $10M+ portfolio rotation on April 10-11, 2026, selling AMD stock while buying Palantir shares amid disagreement among analysts about AI sector valuations. The move reflects evolving institutional confidence in Palantir's AI capabilities relative to semiconductor plays during a period of market uncertainty.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Structured Exploration and Exploitation of Label Functions for Automated Data Annotation

Researchers introduce EXPONA, an automated framework for generating label functions that improve weak label quality in machine learning datasets. The system balances exploration across surface, structural, and semantic levels with reliability filtering, achieving up to 98.9% label coverage and 46% downstream performance improvements across diverse classification tasks.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning Large Language Models

Researchers systematically evaluated how sampling temperature and prompting strategies affect extended reasoning performance in large language models, finding that zero-shot prompting peaks at moderate temperatures (T=0.4-0.7) while chain-of-thought performs better at extremes. The study reveals that extended reasoning benefits grow substantially with higher temperatures, suggesting that T=0 is suboptimal for reasoning tasks.
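
For readers unfamiliar with the sampling temperature T referred to above, the following is a minimal sketch of temperature-scaled softmax. The function name and the example logits are illustrative assumptions and are not taken from the study; treating T = 0 as greedy argmax is a common convention rather than something the summary specifies.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return token probabilities after dividing the logits by the temperature."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Convention (assumption): T = 0 means greedy decoding, all mass on the argmax.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
example_logits = [2.0, 1.0, 0.0]
for t in (0, 0.4, 0.7, 1.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution and make sampling more diverse.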

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

GNN-as-Judge: Unleashing the Power of LLMs for Graph Learning with GNN Feedback

Researchers propose GNN-as-Judge, a framework combining Large Language Models with Graph Neural Networks to improve learning on text-attributed graphs in low-resource settings. The approach uses collaborative pseudo-labeling and weakly-supervised fine-tuning to generate reliable labels while reducing noise, demonstrating significant performance gains when labeled data is scarce.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

WAND: Windowed Attention and Knowledge Distillation for Efficient Autoregressive Text-to-Speech Models

Researchers introduce WAND, a framework that reduces computational and memory costs of autoregressive text-to-speech models by replacing full self-attention with windowed attention combined with knowledge distillation. The approach achieves up to 66.2% KV cache memory reduction while maintaining speech quality, addressing a critical scalability bottleneck in modern AR-TTS systems.
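
To illustrate why windowed attention shrinks the KV cache, here is a minimal sketch of a generic sliding-window causal mask. It is not WAND's specific design, which the summary does not detail; the function names, the window size, and the sequence length are all illustrative assumptions.

```python
def sliding_window_causal_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # Causal (j <= i) and restricted to the most recent `window` positions.
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]

def kv_cache_entries(seq_len, window=None):
    """Key/value positions kept per layer at the last decoding step (assumed model)."""
    return seq_len if window is None else min(seq_len, window)

# Hypothetical sizes chosen only for display.
mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

print(kv_cache_entries(1000), kv_cache_entries(1000, window=256))
```

With full attention the cache grows with the sequence length, whereas with a fixed window it is capped at the window size, which is the general source of the memory savings described above.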

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Silhouette Loss: Differentiable Global Structure Learning for Deep Representations

Researchers introduce Soft Silhouette Loss, a novel machine learning objective that improves deep neural network representations by enforcing intra-class compactness and inter-class separation. The lightweight differentiable loss outperforms cross-entropy and supervised contrastive learning when combined, achieving 39.08% top-1 accuracy compared to 37.85% for existing methods while reducing computational overhead.
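
As background for the compactness and separation idea above, here is a minimal sketch of the classic (non-differentiable) silhouette coefficient. It is not the paper's Soft Silhouette Loss, whose exact formulation the summary does not give; the function name and the toy points are illustrative assumptions.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean_dist(i, own)  # intra-class compactness (smaller is better)
        # Inter-class separation: mean distance to the nearest other cluster.
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != label)
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)

# Hypothetical 2-D embeddings forming two well-separated classes.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(round(silhouette_score(toy_points, [0, 0, 1, 1]), 3))
```

Scores near 1 indicate tight, well-separated classes, and negative scores indicate points that sit closer to another class than to their own.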

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Act or Escalate? Evaluating Escalation Behavior in Automation with Language Models

Researchers analyzed how large language models decide whether to act on predictions or escalate to humans, finding that models use inconsistent and miscalibrated thresholds across five real-world domains. Supervised fine-tuning on chain-of-thought reasoning proved most effective at establishing robust escalation policies that generalize across contexts, suggesting escalation behavior requires explicit characterization before AI system deployment.

AI · Bearish · arXiv – CS AI · Apr 13 · 6/10

Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces

Researchers introduce OmniBehavior, a benchmark for evaluating large language models' ability to simulate real-world human behavior across complex, long-horizon scenarios. The study reveals that current LLMs struggle with authentic behavioral simulation and exhibit systematic biases toward homogenized, overly-positive personas rather than capturing individual differences and realistic long-tail behaviors.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

On Divergence Measures for Training GFlowNets

Researchers propose improved divergence measures for training Generative Flow Networks (GFlowNets), comparing Rényi-α, Tsallis-α, and KL divergences to enhance statistical efficiency. The work introduces control variates that reduce gradient variance and achieve faster convergence than existing methods, connecting GFlowNet training with generalized variational inference frameworks.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

E3-TIR: Enhanced Experience Exploitation for Tool-Integrated Reasoning

Researchers introduce E3-TIR, a new training paradigm for Large Language Models that improves tool-use reasoning by combining expert guidance with self-exploration. The method achieves 6% performance gains while using less than 10% of typical synthetic data, addressing key limitations in current reinforcement learning approaches for AI agents.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

StaRPO: Stability-Augmented Reinforcement Policy Optimization

Researchers propose StaRPO, a reinforcement learning framework that improves large language model reasoning by incorporating stability metrics alongside task rewards. The method uses Autocorrelation Function and Path Efficiency measurements to evaluate logical coherence and goal-directedness, demonstrating improved accuracy and reasoning consistency across four benchmarks.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

Enhancing LLM Problem Solving via Tutor-Student Multi-Agent Interaction

Researchers present PETITE, a tutor-student multi-agent framework that enhances LLM problem-solving by assigning complementary roles to agents from the same model. Evaluated on coding benchmarks, the approach achieves comparable or superior accuracy to existing methods while consuming significantly fewer tokens, demonstrating that structured role-differentiated interactions can improve LLM performance more efficiently than larger models or heterogeneous ensembles.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks

Researchers introduce Sequence-Level PPO (SPPO), a new algorithm that improves how large language models are trained for reasoning tasks by addressing stability and computational efficiency issues in standard reinforcement learning approaches. SPPO matches the performance of resource-heavy methods while significantly reducing memory and computational costs, potentially accelerating LLM alignment for complex problem-solving.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

SEA-Eval: A Benchmark for Evaluating Self-Evolving Agents Beyond Episodic Assessment

Researchers introduce SEA-Eval, a new benchmark for evaluating self-evolving AI agents that go beyond single-task execution by measuring how agents improve across sequential tasks and accumulate experience over time. The benchmark reveals significant inefficiencies in current state-of-the-art frameworks, exposing up to 31.2x differences in token consumption despite identical success rates, highlighting a critical bottleneck in agent development.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Mind the Gap Between Spatial Reasoning and Acting! Step-by-Step Evaluation of Agents With Spatial-Gym

Researchers introduce Spatial-Gym, a benchmarking environment that evaluates AI models on spatial reasoning tasks through step-by-step pathfinding in 2D grids rather than one-shot generation. Testing eight models reveals a significant performance gap, with the best model achieving only a 16% solve rate versus 98% for humans, exposing critical limitations in how AI systems scale reasoning effort and process spatial information.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

CONDESION-BENCH: Conditional Decision-Making of Large Language Models in Compositional Action Space

Researchers introduce CONDESION-BENCH, a new benchmark for evaluating how large language models make decisions in complex, real-world scenarios with compositional actions and conditional constraints. The benchmark addresses limitations in existing decision-making frameworks by incorporating variable-level, contextual, and allocation-level restrictions that better reflect actual decision-making environments.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Structuring versus Problematizing: How LLM-based Agents Scaffold Learning in Diagnostic Reasoning

Researchers developed PharmaSim Switch, an AI-powered educational platform that uses large language models to scaffold diagnostic reasoning in pharmacy technician training through two distinct pedagogical approaches: structuring and problematizing. A 63-student experiment found both methods effective, with structuring promoting more accurate participation and problematizing encouraging deeper constructive engagement, suggesting hybrid scaffolding strategies optimize learning outcomes.

AI · Bearish · arXiv – CS AI · Apr 13 · 6/10

GRM: Utility-Aware Jailbreak Attacks on Audio LLMs via Gradient-Ratio Masking

Researchers introduce GRM, a frequency-selective jailbreak framework that exploits vulnerabilities in audio large language models while preserving utility. By perturbing specific frequency bands rather than the entire spectrum, GRM achieves an 88.46% jailbreak success rate with a better trade-off between attack effectiveness and transcription quality than existing methods.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion

Researchers introduce CLIP-Inspector, a backdoor detection method for prompt-tuned CLIP models that reconstructs hidden triggers using out-of-distribution images to identify whether a model has been maliciously compromised. The technique achieves 94% detection accuracy and enables post-hoc model repair, addressing critical security vulnerabilities in outsourced machine learning services.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Researchers propose Interactive ASR, a new framework that combines semantic-aware evaluation using LLM-as-a-Judge with multi-turn interactive correction to improve automatic speech recognition beyond traditional word error rate metrics. The approach simulates human-like interaction, enabling iterative refinement of recognition outputs across English, Chinese, and code-switching datasets.

AI · Bullish · arXiv – CS AI · Apr 13 · 6/10

RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval

Researchers introduce RecaLLM, a post-trained language model that addresses the 'lost-in-thought' phenomenon where retrieval performance degrades during extended reasoning chains. The model interleaves explicit in-context retrieval with reasoning steps and achieves strong performance on long-context benchmarks using training data significantly shorter than existing approaches.
