y0news

#mcts News & Analysis

11 articles tagged with #mcts. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

MCTS-Judge: Test-Time Scaling in LLM-as-a-Judge for Code Correctness Evaluation

Researchers introduce MCTS-Judge, a test-time scaling framework that enhances LLM-based code evaluation by applying Monte Carlo Tree Search to improve reasoning accuracy. The system achieves 80% accuracy on code correctness tasks, surpassing OpenAI's o1 models while using 3x fewer tokens. The work addresses a critical limitation in using LLMs as reliable judges for complex technical problems.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 2

SEM-CTRL: Semantically Controlled Decoding

Researchers introduce SEM-CTRL, a new approach that ensures Large Language Models produce syntactically and semantically correct outputs without requiring fine-tuning. The system uses token-level Monte Carlo Tree Search guided by Answer Set Grammars to enforce context-sensitive constraints, allowing smaller pre-trained LLMs to outperform larger models on tasks like reasoning and planning.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

COMPASS: Cognitive MCTS-Guided Process Alignment for Safe Search Agents

Researchers introduce COMPASS, a safety alignment framework for LLM-powered search agents that prevents harmful outcomes from seemingly innocent multi-step queries. The method combines cognitive tree exploration and step-wise alignment to achieve robust safety while maintaining utility, requiring less training data than existing approaches.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

When Does Memory Help Multi-Trajectory Inference for Tool-Use LLM Agents?

Researchers demonstrate that memory mechanisms in multi-trajectory LLM agents produce inconsistent results depending on the inference strategy used, revealing that previous evaluations conflated memory abstraction properties with inference method effects. The study systematically evaluates four memory methods across three inference strategies on tool-use benchmarks, showing that reflection, fact extraction, and observation injection each perform optimally under different conditions.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models

Researchers introduce McDiffuSE, an MCTS-based framework that optimizes slot-filling order in Masked Diffusion Models to improve performance on mathematical and code reasoning tasks. The approach achieves a 3.2% improvement over autoregressive baselines and gains of up to 19.5% on specific benchmarks by strategically exploring generation orderings rather than following sequential patterns.

AI · Neutral · arXiv – CS AI · 5d ago · 5/10

Monte Carlo Permutation Search

Researchers propose Monte Carlo Permutation Search (MCPS), an improved Monte Carlo Tree Search algorithm that enhances the GRAVE algorithm for game-playing AI. MCPS leverages statistics from all playouts containing moves along the path from root to node, demonstrating superior performance across multiple games while eliminating GRAVE's bias hyperparameter.
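For context on the bias hyperparameter mentioned above, here is a minimal sketch of the RAVE/GRAVE-style blending rule as it is commonly stated: a move's own mean reward is mixed with its all-moves-as-first (AMAF) mean, weighted by a factor that depends on a `bias` constant. The function name, argument names, and default value are illustrative assumptions, and this is not the MCPS method itself, which the summary says removes this constant.

```python
def grave_value(mean, n, amaf_mean, amaf_n, bias=1e-5):
    """Blend a move's own mean reward with its AMAF mean (RAVE/GRAVE style).

    mean, n           -- mean reward and playout count for the move itself
    amaf_mean, amaf_n -- mean reward and count over all playouts containing the move
    bias              -- the hand-tuned hyperparameter (illustrative default)
    """
    if amaf_n == 0:
        # No AMAF statistics yet: fall back to the move's own mean.
        return mean
    # beta is 1 when the move has no direct playouts and shrinks as n grows.
    beta = amaf_n / (amaf_n + n + bias * amaf_n * n)
    return (1.0 - beta) * mean + beta * amaf_mean
```

With few direct playouts the estimate leans on the AMAF statistics, and with many it converges to the move's own mean; how fast that shift happens is what `bias` controls.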

AI · Neutral · arXiv – CS AI · May 12 · 6/10

LLM-Guided Monte Carlo Tree Search over Knowledge Graphs: Composing Mechanistic Explanations for Drug-Disease Pairs

Researchers introduce TESSERA, a neuro-symbolic framework that combines Large Language Models with Monte Carlo Tree Search to extract multi-step explanations from knowledge graphs, specifically for drug-disease mechanism discovery. The system uses LLMs for local judgments rather than autonomous generation, enforcing structural constraints through knowledge graphs while employing MCTS for principled credit assignment across extended reasoning chains.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Finite-Time Analysis of MCTS in Continuous POMDP Planning

Researchers present the first finite-time theoretical analysis of Monte Carlo Tree Search (MCTS) applied to Partially Observable Markov Decision Processes (POMDPs), bridging a critical gap in algorithmic guarantees. The paper introduces Voro-POMCPOW, which uses Voronoi cell partitioning for continuous observation spaces, proving high-probability bounds on value estimates while maintaining competitive empirical performance.

AI · Bullish · arXiv – CS AI · May 7 · 6/10

CodeEvolve: LLM-Driven Evolutionary Optimization with Runtime-Enriched Target Selection for Multi-Language Code Enhancement

CodeEvolve is an AI-driven evolutionary framework that automates code optimization by using LLMs, runtime profiling, and Monte Carlo Tree Search to identify and improve performance bottlenecks. The system achieves significant speedups (15.22x average) on enterprise Java codebases while maintaining functional correctness through rigorous validation pipelines.

AI · Bullish · arXiv – CS AI · Apr 10 · 6/10

PyFi: Toward Pyramid-like Financial Image Understanding for VLMs via Adversarial Agents

Researchers introduce PyFi, a framework enabling vision language models to understand financial images through progressive reasoning chains, backed by a 600K synthetic dataset organized as a reasoning pyramid. The approach uses adversarial agents to automatically generate training data without human annotation, achieving accuracy improvements of up to 19.52% on fine-tuned models.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7

LiTS: A Modular Framework for LLM Tree Search

LiTS is a new modular Python framework that enables LLM reasoning through tree search algorithms like MCTS and BFS. The framework demonstrates reusable components across different domains and reveals that LLM policy diversity, not reward quality, is the key bottleneck for effective tree search in infinite action spaces.
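As a rough illustration of the kind of search loop such a framework packages, here is a minimal generic MCTS sketch with UCB1 selection on a toy sequence-building task. It is not the LiTS API: `Node`, `ucb1`, `mcts`, and all parameters are hypothetical names chosen for this example, and an LLM policy would replace the random action choice.

```python
import math
import random


class Node:
    def __init__(self, state, parent=None):
        self.state = state      # tuple of actions taken so far
        self.parent = parent
        self.children = {}      # action -> Node
        self.visits = 0
        self.total = 0.0        # sum of rewards backed up through this node


def ucb1(child, parent_visits, c=1.4):
    """Mean reward plus an exploration bonus; unvisited children come first."""
    if child.visits == 0:
        return float("inf")
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.total / child.visits + bonus


def mcts(root_state, actions, depth, reward, iters=500, seed=0):
    """Return the most-visited first action after `iters` simulations."""
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # Selection and expansion: descend by UCB1 until a node has an untried action.
        while len(node.state) < depth:
            untried = [a for a in actions if a not in node.children]
            if untried:
                a = rng.choice(untried)
                node.children[a] = Node(node.state + (a,), node)
                node = node.children[a]
                break
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch, node.visits))
        # Rollout: finish the sequence with uniformly random actions.
        state = node.state
        while len(state) < depth:
            state = state + (rng.choice(actions),)
        r = reward(state)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += r
            node = node.parent
    return max(root.children, key=lambda a: root.children[a].visits)
```

For example, `mcts((), (0, 1), 4, lambda s: sum(s) / len(s))` searches over length-4 bit sequences where the reward is the fraction of ones.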