#web-search News & Analysis

7 articles tagged with #web-search. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 8 · 7/10

SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating

Researchers introduce SlimSearcher, a framework that trains AI web agents to perform complex information-seeking tasks with 17-58% fewer tool calls while maintaining or improving accuracy. The approach combines efficient trajectory filtering during supervised fine-tuning with adaptive reward gating during reinforcement learning to eliminate wasteful search behaviors.

AI · Bullish · arXiv – CS AI · May 4 · 7/10

To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

Researchers present a decision-making framework to optimize when large language models should call external tools like web search. The study reveals that models often misjudge their actual need for tool use, and proposes lightweight estimators trained on hidden states to improve tool-calling decisions, demonstrating performance gains across multiple tasks.

AI · Bearish · arXiv – CS AI · Apr 20 · 7/10

When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models

Researchers introduce CREST-Search, a red-teaming framework that exposes vulnerabilities in web-augmented LLMs by crafting benign-seeming queries designed to trigger unsafe citations from the internet. The study reveals that integrating web search into language models creates new safety risks beyond traditional LLM harms, requiring specialized defensive strategies.

AI · Bullish · OpenAI News · Oct 31 · 7/10

Introducing ChatGPT search

OpenAI has launched ChatGPT search, a new feature that provides fast, timely answers with links to relevant web sources. This enhancement integrates real-time web search capabilities directly into ChatGPT, allowing users to get current information alongside AI-generated responses.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

TreeSeeker: Tree-Structured Trial, Error, and Return in Deep Search

TreeSeeker is a new inference-time framework that improves deep web search by using tree-structured trial-and-error navigation. The system balances exploration and exploitation through textual upper confidence bound (UCB) signals, demonstrating consistent improvements over baseline models on multiple benchmarks.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

VeriTrip: A Verifiable Benchmark for Travel Planning Agents over Unstructured Web Corpora

Researchers introduce VeriTrip, a new benchmark for evaluating travel planning AI agents on their ability to reason over unstructured web data rather than structured APIs. The benchmark addresses critical gaps in agent evaluation by testing performance against information noise, contradictory facts, and multimodal content, revealing a significant trade-off between autonomous information retrieval and instruction following.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

TimeSeek: Temporal Reliability of Agentic Forecasters

TimeSeek introduces a benchmark showing that AI language models perform best at predicting binary market outcomes early in a market's lifecycle and on high-uncertainty markets, but struggle near resolution and on consensus markets. Web search generally improves forecasting accuracy across models, though not uniformly, while simple ensembles reduce errors without beating market performance overall.