AI Pulse News

Models, papers, tools. 19,542 articles with AI-powered sentiment analysis and key takeaways.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

🧠

On the Adversarial Transferability of Generalized "Skip Connections"

Researchers discovered that skip connections in deep neural networks make adversarial attacks more transferable across different AI models. They developed the Skip Gradient Method (SGM), which exploits this vulnerability in ResNets, Vision Transformers, and even Large Language Models to create more effective adversarial examples.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

🧠

Estimating Causal Effects of Text Interventions Leveraging LLMs

Researchers propose CausalDANN, a novel method using large language models to estimate causal effects of textual interventions in social systems. The approach addresses limitations of traditional causal inference methods when dealing with complex, high-dimensional textual data and can handle arbitrary text interventions even with observational data only.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Researchers introduce VisionZip, a new method that reduces redundant visual tokens in vision-language models while maintaining performance. The technique improves inference speed by 8x and achieves 5% better performance than existing methods by selecting only informative tokens for processing.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

SyncSpeech: Efficient and Low-Latency Text-to-Speech based on Temporal Masked Transformer

Researchers introduce SyncSpeech, a new text-to-speech model that combines autoregressive and non-autoregressive approaches using a Temporal Masked Transformer architecture. The model achieves 5.8x lower first-packet latency and 8.8x improved real-time performance while maintaining speech quality comparable to existing models.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

🧠

NetArena: Dynamic Benchmarks for AI Agents in Network Automation

NetArena introduces a dynamic benchmarking framework for evaluating AI agents in network automation tasks. It addresses the limitations of static benchmarks through runtime query generation and network emulator integration. On realistic network queries, AI agents achieve only 13-38% performance, and the dynamic design improves statistical reliability by reducing confidence-interval overlap from 85% to 0%.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning

Researchers developed E2H Reasoner, a curriculum reinforcement learning method that improves LLM reasoning by training on tasks from easy to hard. The approach shows significant improvements for small LLMs (1.5B-3B parameters) that struggle with vanilla RL training alone.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

EvolvR: Self-Evolving Pairwise Reasoning for Story Evaluation to Enhance Generation

Researchers have developed EvolvR, a self-evolving framework that improves AI's ability to evaluate and generate stories through pairwise reasoning and multi-agent data filtering. The system achieves state-of-the-art performance on three evaluation benchmarks and significantly enhances story generation quality when used as a reward model.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

🧠

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Researchers conducted the first systematic study on post-training quantization for diffusion large language models (dLLMs), identifying activation outliers as a key challenge for compression. The study evaluated state-of-the-art quantization methods across multiple dimensions to provide insights for efficient dLLM deployment on edge devices.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

🧠

Induction Signatures Are Not Enough: A Matched-Compute Study of Load-Bearing Structure in In-Context Learning

Research shows that synthetic data designed to enhance in-context learning capabilities in AI models doesn't necessarily improve performance. The study found that while targeted training can increase specific neural mechanisms, it doesn't make them more functionally important compared to natural training approaches.

🏢 Perplexity

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

XQC: Well-conditioned Optimization Accelerates Deep Reinforcement Learning

Researchers introduce XQC, a deep reinforcement learning algorithm that achieves state-of-the-art sample efficiency by optimizing the critic network's condition number through batch normalization, weight normalization, and distributional cross-entropy loss. The method outperforms existing approaches across 70 continuous control tasks while using fewer parameters.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

Diverse Text-to-Image Generation via Contrastive Noise Optimization

Researchers introduce Contrastive Noise Optimization, a new method that improves diversity in text-to-image AI generation by optimizing initial noise patterns rather than intermediate outputs. The technique uses contrastive loss to maximize diversity while preserving image quality, achieving superior results across multiple text-to-image model architectures.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning

Researchers introduce Slow-Fast Policy Optimization (SFPO), a new reinforcement learning framework that improves training stability and efficiency for large language model reasoning. SFPO outperforms existing methods like GRPO by up to 2.80 points on math benchmarks while requiring up to 4.93x fewer rollouts and 4.19x less training time.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning

GlobalRAG is a new reinforcement learning framework that significantly improves multi-hop question answering by decomposing questions into subgoals and coordinating retrieval with reasoning. The system improves performance metrics by 14.2% on average while using only 42% of the training data required by baseline models.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

VLAD-Grasp: Zero-shot Grasp Detection via Vision-Language Models

Researchers developed VLAD-Grasp, a training-free robotic grasping system that uses vision-language models to detect graspable objects without requiring curated datasets. The system achieves competitive performance with state-of-the-art methods on benchmark datasets and demonstrates zero-shot generalization to real-world robotic manipulation tasks.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

🧠

LabelFusion: Fusing Large Language Models with Transformer Encoders for Robust Financial News Classification

Researchers developed LabelFusion, a hybrid AI architecture combining Large Language Models with transformer encoders for financial news classification. The system achieves a 96% F1 score on full datasets, but LLMs alone perform better in low-data scenarios, which suggests choosing a strategy based on the amount of available training data.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

🧠

Protecting Deep Neural Network Intellectual Property with Chaos-Based White-Box Watermarking

Researchers have developed a new white-box watermarking framework that uses chaotic sequences to embed ownership information into deep neural network parameters for intellectual property protection. The method uses logistic maps and genetic algorithms to verify model ownership without degrading performance, showing effectiveness on MNIST and CIFAR-10 datasets.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

🧠

EgoGrasp: World-Space Hand-Object Interaction Estimation from Egocentric Videos

EgoGrasp introduces the first method to reconstruct world-space hand-object interactions from egocentric videos using open-vocabulary objects. The multi-stage framework combines vision foundation models with body-guided diffusion models to achieve state-of-the-art performance in 3D scene reconstruction and hand pose estimation.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

Agentic Retoucher for Text-To-Image Generation

Researchers introduce Agentic Retoucher, a new AI framework that fixes common distortions in text-to-image generation through a three-agent system for perception, reasoning, and correction. The system outperformed existing methods on a new 27K-image dataset, potentially improving the quality and reliability of AI-generated images.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

🧠

Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models

Researchers introduce Imagine-then-Plan (ITP), a new AI framework that enables agents to learn through adaptive lookahead imagination using world models. The system allows AI agents to simulate multi-step future scenarios and adjust planning horizons dynamically, significantly outperforming existing methods in benchmark tests.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

🧠

Should LLMs, like, Generate How Users Talk? Building Dialect-Accurate Dialog[ue]s Beyond the American Default with MDial

Researchers introduced MDial, the first large-scale framework for generating multi-dialectal conversational data across nine English dialects, noting that over 80% of English speakers don't use Standard American English. An evaluation of 17 LLMs showed that even frontier models achieve under 70% accuracy in dialect identification, with particularly poor performance on non-American dialects.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

🧠

A Coin Flip for Safety: LLM Judges Fail to Reliably Measure Adversarial Robustness

A new research study reveals that AI judges used to evaluate the safety of large language models perform poorly when assessing adversarial attacks, often degrading to near-random accuracy. The research analyzed 6,642 human-verified labels and found that many attacks artificially inflate their success rates by exploiting judge weaknesses rather than generating genuinely harmful content.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

🧠

HEARTS: Benchmarking LLM Reasoning on Health Time Series

Researchers introduce HEARTS, a comprehensive benchmark for evaluating large language models' ability to reason over health time series data across 16 datasets and 12 health domains. The study reveals that current LLMs significantly underperform compared to specialized models and struggle with multi-step temporal reasoning in healthcare applications.

AI · Bearish · The Register – AI · Mar 17 · 6/10

🧠

AI still doesn't work very well, businesses are faking it, and a reckoning is coming

The article raises concerns about the current limitations of AI technology and suggests that businesses may be overstating AI capabilities. It argues that a market correction or reassessment of AI's actual effectiveness may be approaching.

AI · Bearish · Decrypt – AI · Mar 16 · 6/10

🧠

Did ChatGPT Really Cure a Dog's Cancer? It's Complicated

A viral story claiming ChatGPT helped cure a dog's cancer by designing a custom vaccine has been disputed by the actual scientists involved. The researchers say the AI's role was minimal and the credit for the breakthrough belongs to traditional scientific methods and expertise.

🧠 ChatGPT
