y0news

AI Pulse News

Models, papers, tools. 15,687 articles with AI-powered sentiment analysis and key takeaways.

AI · Bearish · arXiv – CS AI · Apr 15 · 7/10

One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness

Researchers demonstrate that instruction-tuned large language models suffer severe performance degradation when subjected to simple lexical constraints, such as banning a single punctuation mark or common word, losing 14-48% of response quality. This fragility stems from a planning failure in which models couple task competence to narrow surface-form templates, affecting both open-weight models and commercially deployed closed-weight models like GPT-4o-mini.

🧠 GPT-4
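
A ban like this is typically applied as a hard logit mask at each decoding step, which is why the reported collapse is notable: the mechanical intervention is tiny. A minimal sketch of constrained greedy decoding over an invented toy vocabulary (the logits are made up for illustration):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def constrained_greedy(logits, banned_ids):
    """One greedy decoding step with banned token ids masked to -inf."""
    masked = [-math.inf if i in banned_ids else x for i, x in enumerate(logits)]
    probs = softmax(masked)
    return max(range(len(probs)), key=probs.__getitem__)

# Toy vocabulary: 0 = ".", 1 = "the", 2 = "so", 3 = "answer"
logits = [3.0, 2.5, 1.0, 0.5]
print(constrained_greedy(logits, banned_ids=set()))  # → 0
print(constrained_greedy(logits, banned_ids={0}))    # → 1 (mass shifts to "the")
```

The mask merely redistributes probability to the remaining tokens; the paper's finding is that response quality nonetheless drops far more than this mechanism alone would suggest.
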
AI · Neutral · arXiv – CS AI · Apr 15 · 7/10

Parallax: Why AI Agents That Think Must Never Act

Researchers introduce Parallax, a security framework that structurally separates AI reasoning from execution to prevent autonomous agents from carrying out malicious actions even when compromised. The system achieves 98.9% attack prevention across adversarial tests, addressing a critical vulnerability in enterprise AI deployments where prompt-based safeguards alone prove insufficient.

AI · Neutral · arXiv – CS AI · Apr 15 · 7/10

Distorted or Fabricated? A Survey on Hallucination in Video LLMs

Researchers have conducted a comprehensive survey on hallucinations in Video Large Language Models (Vid-LLMs), identifying two core types—dynamic distortion and content fabrication—and their root causes in temporal representation limitations and insufficient visual grounding. The study reviews evaluation benchmarks, mitigation strategies, and proposes future directions including motion-aware encoders and counterfactual learning to improve reliability.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Efficient Adversarial Training via Criticality-Aware Fine-Tuning

Researchers introduce Criticality-Aware Adversarial Training (CAAT), a parameter-efficient method that identifies and fine-tunes only the most robustness-critical parameters in Vision Transformers, achieving 94.3% of standard adversarial training robustness while tuning just 6% of model parameters. This breakthrough addresses the computational bottleneck preventing large-scale adversarial training deployment.
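
The selection step can be pictured as saliency-based masking. The sketch below scores each parameter with the generic |weight × gradient| heuristic and updates only the top fraction; CAAT's actual criticality criterion is not detailed above, so treat the scoring rule and all names as assumptions:

```python
def select_critical(params, grads, fraction=0.06):
    """Rank parameters by a generic saliency score |w * g| and keep the
    top fraction (not necessarily CAAT's actual criterion)."""
    scores = [abs(w * g) for w, g in zip(params, grads)]
    k = max(1, int(len(params) * fraction))
    ranked = sorted(range(len(params)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

def masked_sgd_step(params, grads, critical, lr=0.1):
    """Update only the critical parameters; freeze the rest."""
    return [w - lr * g if i in critical else w
            for i, (w, g) in enumerate(zip(params, grads))]

params = [0.5, -1.2, 0.05, 2.0, -0.3]
grads  = [0.1,  0.4, 0.9,  0.2,  0.0]
crit = select_critical(params, grads, fraction=0.4)   # top 2 of 5
print(sorted(crit))                                   # → [1, 3]
print(masked_sgd_step(params, grads, crit))
```

Freezing 94% of parameters keeps the optimizer state and gradient-update cost proportional to the small critical subset, which is where the claimed efficiency comes from.
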

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

OSC: Hardware Efficient W4A4 Quantization via Outlier Separation in Channel Dimension

Researchers present OSC, a hardware-efficient framework that addresses the challenge of deploying Large Language Models with 4-bit quantization by intelligently separating activation outliers into a high-precision processing path while maintaining low-precision computation for standard values. The technique achieves 1.78x speedup over standard 8-bit approaches while limiting accuracy degradation to under 2.2% on state-of-the-art models.
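
The core idea can be illustrated with a toy per-channel quantizer that keeps the largest-magnitude activations in an exact high-precision path and rounds the rest to a 4-bit grid. This is a schematic sketch, not OSC's kernel layout; the 10% outlier budget and symmetric rounding scheme are assumptions:

```python
def quantize_outlier_split(xs, bits=4, outlier_frac=0.1):
    """Quantize a channel to `bits` bits, routing the largest-magnitude
    values through an exact high-precision path (a hypothetical scheme in
    the spirit of outlier separation)."""
    n_out = max(1, int(len(xs) * outlier_frac))
    by_mag = sorted(range(len(xs)), key=lambda i: abs(xs[i]), reverse=True)
    outliers = set(by_mag[:n_out])
    qmax = 2 ** (bits - 1) - 1          # 7 positive levels for 4-bit signed
    inlier_max = max((abs(xs[i]) for i in range(len(xs))
                      if i not in outliers), default=0.0)
    scale = inlier_max / qmax if inlier_max else 1.0
    recon = []
    for i, x in enumerate(xs):
        if i in outliers:
            recon.append(x)             # kept exactly
        else:
            q = max(-qmax - 1, min(qmax, round(x / scale)))
            recon.append(q * scale)
    return recon

xs = [0.10, -0.20, 8.0, 0.30, -0.05, 0.15, 0.02, -0.12, 0.25, 0.07]
recon = quantize_outlier_split(xs)
err = max(abs(a - b) for a, b in zip(xs, recon))
print(recon[2], round(err, 4))
```

Because the scale is set by the inlier range (0.30 here) rather than the 8.0 outlier, worst-case inlier error stays near half a quantization step instead of being dominated by the outlier.
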

AI · Neutral · arXiv – CS AI · Apr 15 · 7/10

Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training

Researchers demonstrate that post-training in reasoning models creates specialized attention heads that enable complex problem-solving, but this capability introduces trade-offs where sophisticated reasoning can degrade performance on simpler tasks. Different training methods—SFT, distillation, and GRPO—produce fundamentally different architectural mechanisms, revealing tensions between reasoning capability and computational reliability.

AI · Neutral · arXiv – CS AI · Apr 15 · 7/10

Latent Planning Emerges with Scale

Researchers demonstrate that large language models develop internal planning representations that scale with model size, enabling them to implicitly plan future outputs without explicit verbalization. The study on Qwen-3 models (0.6B-14B parameters) reveals mechanistic evidence of latent planning through neural features that predict and shape token generation, with planning capabilities increasing consistently across model scales.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Decoding by Perturbation: Mitigating MLLM Hallucinations via Dynamic Textual Perturbation

Researchers introduce Decoding by Perturbation (DeP), a training-free method that reduces hallucinations in multimodal large language models by applying controlled textual perturbations during decoding. The approach addresses the core issue where language priors override visual evidence, achieving improvements across multiple benchmarks without requiring model retraining or visual manipulation.
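
DeP's precise adjustment rule is not given above, so the sketch below uses the standard contrastive-decoding form as a stand-in: tokens whose logits survive a textual perturbation are treated as prior-driven and down-weighted relative to tokens that depend on the intact input. All numbers are invented:

```python
def perturbation_adjusted(logits_orig, logits_pert, alpha=1.0):
    """Contrastive-style adjustment: amplify tokens the perturbation
    suppresses (evidence-driven) and demote ones it leaves intact
    (prior-driven). A generic stand-in, not necessarily DeP's rule."""
    return [(1 + alpha) * o - alpha * p
            for o, p in zip(logits_orig, logits_pert)]

orig = [2.0, 1.8, 0.5]   # token 0 narrowly favored with the intact input
pert = [2.0, 0.5, 0.4]   # token 0 survives the perturbation: a prior
adj = perturbation_adjusted(orig, pert)
print(max(range(3), key=orig.__getitem__))  # → 0 (prior wins unadjusted)
print(max(range(3), key=adj.__getitem__))   # → 1 (evidence-driven token wins)
```

The appeal of this family of methods is that everything happens at decode time, which is why the summary can claim improvements without retraining or visual manipulation.
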

AI · Bearish · arXiv – CS AI · Apr 15 · 7/10

Is Vibe Coding the Future? An Empirical Assessment of LLM Generated Codes for Construction Safety

Researchers empirically evaluated 450 LLM-generated Python scripts for construction safety and found alarming reliability gaps, including a 45% silent failure rate where code executes but produces mathematically incorrect safety outputs. The study demonstrates that current frontier LLMs lack the deterministic rigor required for autonomous safety-critical engineering applications, necessitating human oversight and governance frameworks.
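
The "silent failure" category is the crux: code that executes but returns the wrong number. A minimal harness for separating the three outcomes (the lambda "scripts" are stand-ins for generated code, and the formula is a toy, not a real safety calculation):

```python
def assess(script, cases, tol=1e-9):
    """Classify a generated numeric routine per test case as 'correct',
    'crash', or 'silent failure' (runs fine, wrong answer)."""
    verdicts = []
    for args, expected in cases:
        try:
            got = script(*args)
        except Exception:
            verdicts.append("crash")
            continue
        verdicts.append("correct" if abs(got - expected) <= tol
                        else "silent failure")
    return verdicts

# Toy stand-ins for LLM-generated scripts (formula is illustrative only)
good   = lambda span, load: load * span / 8
buggy  = lambda span, load: load * span / 4   # executes fine, wrong constant
crashy = lambda span, load: load / 0
cases = [((4.0, 2.0), 1.0)]
print(assess(good, cases))    # → ['correct']
print(assess(buggy, cases))   # → ['silent failure']
print(assess(crashy, cases))  # → ['crash']
```

Crashes announce themselves; silent failures are only caught by checking outputs against ground truth, which is exactly the human-oversight step the study argues is non-negotiable.
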

🧠 GPT-4 · 🧠 Claude · 🧠 Gemini
AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models

Researchers present Chain-of-Models Pre-Training (CoM-PT), a novel method that accelerates vision foundation model training by up to 7.09X through sequential knowledge transfer from smaller to larger models in a unified pipeline, rather than training each model independently. The approach maintains or improves performance while significantly reducing computational costs, with efficiency gains increasing as more models are added to the training sequence.

AI · Bearish · arXiv – CS AI · Apr 15 · 7/10

TEMPLATEFUZZ: Fine-Grained Chat Template Fuzzing for Jailbreaking and Red Teaming LLMs

Researchers introduce TEMPLATEFUZZ, a fuzzing framework that systematically exploits vulnerabilities in LLM chat templates—a previously overlooked attack surface. The method achieves 98.2% jailbreak success rates on open-source models and 90% on commercial LLMs, significantly outperforming existing prompt injection techniques while revealing critical security gaps in production AI systems.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

CascadeDebate: Multi-Agent Deliberation for Cost-Aware LLM Cascades

CascadeDebate introduces a novel multi-agent deliberation system for large language model cascades that dynamically allocates computational resources based on query difficulty. By inserting lightweight agent ensembles at escalation boundaries to resolve ambiguous cases internally, the system achieves up to 26.75% performance improvement while reducing unnecessary escalations to expensive models.
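
Structurally, a deliberation-augmented cascade can be sketched in a few lines. Everything below, from the confidence threshold to the majority rule, is an invented simplification of the paper's escalation logic; the lambdas stand in for real model calls:

```python
def cascade(query, small, ensemble, large, conf_threshold=0.7):
    """Cost-aware cascade sketch: keep confident queries on the small model,
    let a lightweight agent ensemble deliberate on borderline ones, and pay
    for the large model only when the ensemble cannot reach a majority."""
    answer, conf = small(query)
    if conf >= conf_threshold:
        return answer, "small"
    votes = [agent(query) for agent in ensemble]
    top = max(set(votes), key=votes.count)
    if votes.count(top) > len(votes) // 2:     # majority resolves it internally
        return top, "ensemble"
    return large(query), "large"

# Stub models standing in for real LLM calls
small = lambda q: ("A", 0.9) if "easy" in q else ("A", 0.4)
ensemble = [lambda q: "B", lambda q: "B", lambda q: "C"]
large = lambda q: "D"
print(cascade("easy question", small, ensemble, large))  # → ('A', 'small')
print(cascade("hard question", small, ensemble, large))  # → ('B', 'ensemble')
```

The economics come from the middle tier: every borderline query the ensemble settles is one expensive large-model call avoided.
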

AI · Bearish · arXiv – CS AI · Apr 15 · 7/10

Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models

Researchers tested whether large language models exhibit the Identifiable Victim Effect (IVE)—a well-documented cognitive bias where people prioritize helping a specific individual over a larger group facing equal hardship. Across 51,955 API trials spanning 16 frontier models, instruction-tuned LLMs showed amplified IVE compared to humans, while reasoning-specialized models inverted the effect, raising critical concerns about AI deployment in humanitarian decision-making.

🏢 OpenAI · 🏢 Anthropic · 🏢 xAI
AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Towards grounded autonomous research: an end-to-end LLM mini research loop on published computational physics

Researchers demonstrate an autonomous LLM agent capable of executing a complete research loop—reading, reproducing, critiquing, and extending computational physics papers. Testing across 111 papers reveals the agent identifies substantive flaws in 42% of cases, with 97.7% of issues requiring actual computation to detect, and produces a publishable peer-review comment on a Nature Communications paper without human direction.

AI · Neutral · arXiv – CS AI · Apr 15 · 7/10

Benchmarking Deflection and Hallucination in Large Vision-Language Models

Researchers introduce VLM-DeflectionBench, a new benchmark with 2,775 samples designed to evaluate how large vision-language models handle conflicting or insufficient evidence. The study reveals that most state-of-the-art LVLMs fail to appropriately deflect when faced with noisy or misleading information, highlighting critical gaps in model reliability for knowledge-intensive tasks.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

How Transformers Learn to Plan via Multi-Token Prediction

Researchers demonstrate that multi-token prediction (MTP) outperforms standard next-token prediction (NTP) for training language models on reasoning tasks like planning and pathfinding. Through theoretical analysis of simplified Transformers, they reveal that MTP enables a reverse reasoning process where models first identify end states then reconstruct paths backward, suggesting MTP induces more interpretable and robust reasoning circuits.
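
The difference in training signal is easy to make concrete: NTP supervises one token per prefix, while MTP supervises a short window, which is what pushes the model to commit to a plan. A small sketch of the two target constructions over a toy path-like sequence:

```python
def ntp_pairs(tokens):
    """Next-token prediction targets: each prefix predicts one token."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def mtp_pairs(tokens, k=3):
    """Multi-token prediction targets: each prefix predicts the next k
    tokens jointly, forcing a short-horizon commitment."""
    return [(tokens[:i], tokens[i:i + k])
            for i in range(1, len(tokens) - k + 1)]

seq = ["S", "a", "b", "c", "G"]       # a toy start-to-goal path
print(ntp_pairs(seq)[0])              # → (['S'], 'a')
print(mtp_pairs(seq, k=3)[0])         # → (['S'], ['a', 'b', 'c'])
```

Under MTP, the very first prediction already has to be consistent with a multi-step continuation, which is consistent with the paper's claim that models learn to fix the end state and reason backward.
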

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow

AutoSurrogate is an LLM-driven framework that automates the construction of deep learning surrogate models for subsurface flow simulation, enabling domain scientists without machine learning expertise to build high-quality models through natural language instructions. The system autonomously handles data profiling, architecture selection, hyperparameter optimization, and quality assessment while managing failure modes, demonstrating superior performance to expert-designed baselines on geological carbon storage tasks.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Schema-Adaptive Tabular Representation Learning with LLMs for Generalizable Multimodal Clinical Reasoning

Researchers propose Schema-Adaptive Tabular Representation Learning, which uses LLMs to convert structured clinical data into semantic embeddings that transfer across different electronic health record schemas without retraining. When combined with imaging data for dementia diagnosis, the method achieves state-of-the-art results and outperforms board-certified neurologists on retrospective diagnostic tasks.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents

Researchers introduce dual-trace memory encoding for LLM agents, pairing factual records with narrative scene reconstructions to improve cross-session recall by 20+ percentage points. The method significantly enhances temporal reasoning and multi-session knowledge aggregation without increasing computational costs, advancing the capability of persistent AI agent systems.
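
The dual-trace idea can be sketched as a store that keeps both forms per episode and scores retrieval against their concatenation. The keyword-overlap scorer below is a deliberately naive stand-in for the LLM-based encoding and retrieval the paper presumably uses:

```python
class DualTraceMemory:
    """Keeps two traces per episode: a terse factual record and a narrative
    reconstruction. Retrieval scores a query against both together."""

    def __init__(self):
        self.entries = []

    def encode(self, facts, narrative):
        self.entries.append((facts, narrative))

    def recall(self, query, top_k=1):
        words = set(query.lower().split())
        def score(entry):
            text = " ".join(entry).lower()
            return sum(w in text for w in words)
        return sorted(self.entries, key=score, reverse=True)[:top_k]

mem = DualTraceMemory()
mem.encode("user prefers metric units",
           "Early on, the user corrected me twice when I used miles.")
mem.encode("report deadline is Friday",
           "The user sounded stressed about shipping the report on time.")
print(mem.recall("which units does the user prefer")[0][0])
```

The narrative trace matters because it carries context (corrections, tone, sequence of events) that a bare fact table drops, which is where the cross-session temporal-reasoning gains would come from.
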

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

RePAIR: Interactive Machine Unlearning through Prompt-Aware Model Repair

Researchers introduce RePAIR, a framework enabling users to instruct large language models to forget harmful knowledge, misinformation, and personal data through natural language prompts at inference time. The system uses a training-free method called STAMP that manipulates model activations to achieve selective unlearning with minimal computational overhead, outperforming existing approaches while preserving model utility.
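
One generic way to "manipulate activations" for suppression is to project a forget direction out of the hidden state; whether STAMP does exactly this is not stated above, so read the sketch as an illustration of the activation-editing family rather than the method itself:

```python
def steer(activations, forget_direction, strength=1.0):
    """Remove the component of a hidden activation vector along a 'forget'
    direction: h' = h - strength * (h·d / d·d) * d."""
    dot = sum(h * d for h, d in zip(activations, forget_direction))
    norm2 = sum(d * d for d in forget_direction)
    coef = strength * dot / norm2
    return [h - coef * d for h, d in zip(activations, forget_direction)]

h = [1.0, 2.0, 3.0]        # toy hidden state
d = [0.0, 0.0, 1.0]        # direction associated with the forgotten concept
print(steer(h, d))         # → [1.0, 2.0, 0.0]
```

Interventions of this form are training-free and cheap at inference, which matches the summary's claim of selective unlearning with minimal computational overhead.
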

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning

Researchers propose a case-based learning framework enabling LLM-based autonomous agents to extract and reuse knowledge from past tasks, improving performance on complex real-world problems. The method outperforms traditional zero-shot, few-shot, and prompt-based baselines across six task categories, with gains increasing as task complexity rises.

AI · Bearish · arXiv – CS AI · Apr 15 · 7/10

AISafetyBenchExplorer: A Metric-Aware Catalogue of AI Safety Benchmarks Reveals Fragmented Measurement and Weak Benchmark Governance

Researchers have catalogued 195 AI safety benchmarks released since 2018, revealing that rapid proliferation of evaluation tools has outpaced standardization efforts. The study identifies critical fragmentation: inconsistent metric definitions, limited language coverage, poor repository maintenance, and lack of shared measurement standards across the field.

🏢 Hugging Face
AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding

Researchers introduce DocSeeker, a multimodal AI system designed to improve long document understanding by implementing structured analysis, localization, and reasoning workflows. The breakthrough addresses critical limitations in existing large language models that struggle with lengthy documents due to high noise levels and weak training signals, achieving superior performance on both short and ultra-long documents.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

IDEA: An Interpretable and Editable Decision-Making Framework for LLMs via Verbal-to-Numeric Calibration

Researchers introduce IDEA, a framework that converts Large Language Model decision-making into interpretable, editable parametric models with calibrated probabilities. The approach outperforms major LLMs like GPT-5.2 and DeepSeek R1 on benchmarks while enabling direct expert knowledge integration and precise human-AI collaboration.
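
Verbal-to-numeric calibration can be made concrete with a lookup that maps confidence phrases to probabilities, optionally blended toward observed frequencies. The scale values and the 50/50 blend below are invented placeholders; IDEA learns and calibrates this mapping rather than fixing it:

```python
# Hypothetical phrase scale; IDEA learns this mapping rather than fixing it.
VERBAL_SCALE = {
    "almost certain": 0.95, "likely": 0.75, "even odds": 0.50,
    "unlikely": 0.25, "almost impossible": 0.05,
}

def calibrate(verbal, observed=None, weight=0.5):
    """Map a verbal confidence phrase to a probability, optionally blending
    toward an observed empirical frequency (a simple convex combination)."""
    p = VERBAL_SCALE[verbal.lower()]
    if observed is None:
        return p
    return (1 - weight) * p + weight * observed

print(calibrate("likely"))                    # → 0.75
print(calibrate("likely", observed=0.6))
```

Making the mapping an explicit table is what enables the editability the summary highlights: an expert can overwrite an entry directly instead of retraining the model.
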

🧠 GPT-5
AI · Bearish · arXiv – CS AI · Apr 15 · 7/10

Every Picture Tells a Dangerous Story: Memory-Augmented Multi-Agent Jailbreak Attacks on VLMs

Researchers introduce MemJack, a multi-agent framework that exploits semantic vulnerabilities in Vision-Language Models through coordinated jailbreak attacks, achieving 71.48% attack success rates against Qwen3-VL-Plus. The study reveals that current VLM safety measures fail against sophisticated visual-semantic attacks and introduces MemJack-Bench, a dataset of 113,000+ attack trajectories to advance defensive research.

Page 57 of 628
◆ AI Mentions
🏢 OpenAI 58× · 🏢 Anthropic 54× · 🏢 Nvidia 50× · 🧠 Claude 46× · 🧠 Gemini 43× · 🧠 GPT-5 43× · 🧠 ChatGPT 37× · 🧠 GPT-4 28× · 🧠 Llama 27× · 🧠 Opus 9× · 🏢 Meta 9× · 🏢 Hugging Face 9× · 🧠 Grok 6× · 🏢 Perplexity 6× · 🏢 Google 6× · 🏢 xAI 5× · 🧠 Sonnet 5× · 🏢 Microsoft 4× · 🧠 Stable Diffusion 2× · 🧠 Haiku 1×
▲ Trending Tags
1. #iran (518) · 2. #ai (508) · 3. #market (314) · 4. #geopolitical (269) · 5. #geopolitics (254) · 6. #geopolitical-risk (220) · 7. #market-volatility (172) · 8. #middle-east (142) · 9. #sanctions (123) · 10. #trump (118) · 11. #security (95) · 12. #energy-markets (93) · 13. #oil-markets (91) · 14. #strait-of-hormuz (86) · 15. #inflation (71)
Tag Connections
#geopolitical ↔ #iran: 167
#iran ↔ #market: 131
#geopolitical ↔ #market: 109
#geopolitics ↔ #iran: 82
#iran ↔ #trump: 67
#ai ↔ #artificial-intelligence: 56
#geopolitical-risk ↔ #market-volatility: 49
#geopolitics ↔ #middle-east: 46
#ai ↔ #market: 43
#geopolitical-risk ↔ #oil-markets: 42
© 2026 y0.exchange