#program-synthesis News & Analysis

18 articles tagged with #program-synthesis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 25 · 7/10

Weave of Formal Thought

Researchers introduce Weave of Formal Thought (WoFT), a framework that combines rigorous syntactic validation with learned structural representations to improve code generation in large language models. The approach uses constrained decoding with full Tree-sitter compliance and fine-tuning methods that teach models to embed grammar symbols during generation, achieving a 14.3% relative reduction in cross-entropy on Python code.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

AutoACSL: Synthesizing ACSL Specifications by Integrating LLMs with CPG-Based Static Analysis

Researchers introduce AutoACSL, a framework combining large language models with Code Property Graph analysis to automatically generate formal specifications for C programs. The system achieves a 96% verification success rate, significantly outperforming code-only baselines and advancing automated formal verification capabilities.

🧠 GPT-5 · 🧠 Gemini · 🧠 Grok

AI · Bullish · arXiv – CS AI · Jun 1 · 7/10

Fighting Numerical Hallucinations via Data-centric Compilation for Online Financial QA

Researchers propose DCRC, a data-centric framework addressing numerical hallucinations in LLM-based financial question-answering systems. The approach combines adversarial data construction, multi-stage training, and executable reasoning programs to improve reliability in high-stakes financial applications.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

LACUNA: Safe Agents as Recursive Program Holes

LACUNA is a new programming model that allows LLM agents to write code that shapes their own runtime environment while maintaining safety through type-checking and validation. The system rejects unsafe code before execution and uses compiler diagnostics to drive retries, achieving competitive performance on benchmark tests while preventing prompt injection and tool misuse attacks.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

ReaComp: Compiling LLM Reasoning into Symbolic Solvers for Efficient Program Synthesis

ReaComp introduces a method to compile reasoning traces from large language models into reusable symbolic program synthesizers that eliminate runtime LLM calls. The approach achieves 91.3% accuracy on benchmark tasks while reducing token usage by 78%, demonstrating that neuro-symbolic hybrid systems can outperform pure LLM inference on complex program synthesis problems.

AI · Bullish · arXiv – CS AI · Apr 10 · 7/10

Inference-Time Code Selection via Symbolic Equivalence Partitioning

Researchers propose Symbolic Equivalence Partitioning, a novel inference-time selection method for code generation that uses symbolic execution and SMT constraints to identify correct solutions without expensive external verifiers. The approach improves accuracy on HumanEval+ by 10.3% and on LiveCodeBench by 17.1% at N=10 without requiring additional LLM inference.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI

Researchers introduced SOAR, a self-improving language model system that combines evolutionary search with hindsight learning for program synthesis tasks. The method achieved a 52% success rate on the challenging ARC-AGI benchmark by iteratively improving through search and refinement cycles.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

Interpreting Neural Combinatorial Optimization via Evolving Programmatic Bottlenecks

Researchers introduce Evolving Programmatic Bottlenecks (EPB), a novel framework for interpreting Neural Combinatorial Optimization models by distilling them into human-readable program portfolios. The method uses large language models to autonomously evolve interpretable programs while maintaining performance comparable to the original black-box models, addressing a critical gap in AI explainability for complex sequential decision-making systems.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

Constrained Adaptive Rejection Sampling

Researchers introduce Constrained Adaptive Rejection Sampling (CARS), a novel technique that improves the efficiency of generating constrained outputs from language models while maintaining distributional fidelity. The method adaptively prunes invalid continuations using a trie data structure, achieving higher sample validity rates without sacrificing output diversity.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

LLM-Evolved Pattern Generators for Optimal Classical Planning

Researchers have developed a novel method using large language models and evolutionary algorithms to automatically generate admissible heuristics for optimal classical planning problems. Unlike existing learned heuristics that improve search speed but cannot guarantee optimal solutions, this approach preserves A* optimality guarantees while matching or exceeding the performance of traditional domain-independent methods.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Algebraic anti-unification

Researchers have developed an algebraic (semantic) theory of anti-unification that extends abstraction and generalization from syntactic term-based systems to arbitrary algebras. The work moves anti-unification beyond equational theories, establishes foundational properties that are compatible with homomorphisms and isomorphisms, and analyzes computability for finite algebras.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

PatchWorld: Gradient-Free Optimization of Executable World Models

Researchers introduce PatchWorld, a gradient-free framework that converts offline trajectories into executable Python world models for AI agents operating in partially observable environments. The method achieves 76.4% success on planning tasks without requiring LLM calls during prediction, while revealing a fundamental tradeoff between observation accuracy and decision-making utility in executable world models.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Prospective Compression in Human Abstraction Learning

Researchers demonstrate that humans learn abstractions prospectively rather than retrospectively when facing non-stationary task environments. Using a visual program synthesis experiment called the Pattern Builder Task, they show that human library learning anticipates future task structures rather than merely compressing past experience, a capability that existing algorithmic approaches and LLM-based models fail to replicate.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Sketch-and-Verify: Structured Inference-Time Scaling via Program Sketching

Sketch-and-Verify is an inference-time scaling technique that improves small language model performance by having the LLM generate multiple algorithmic strategies as program sketches, then filling and verifying them. On HumanEval+, this approach delivers superior cost-performance within a model tier compared to flat sampling, though upgrading to a stronger model tier remains more effective than scaling test-time compute on smaller models.

🧠 Gemini

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Beyond Pairs: Your Language Model is Secretly Optimizing a Preference Graph

Researchers introduce Graph Direct Preference Optimization (GraphDPO), an advancement over standard DPO that leverages full preference structures from multiple rollouts per prompt rather than collapsing data into independent pairs. The method maintains computational efficiency while improving stability and performance on reasoning and program synthesis tasks by enforcing transitivity and reducing conflicting supervision signals.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Back to the Beginning of Heuristic Design: Bridging Code and Knowledge with LLMs

Researchers propose a top-down approach to automatic heuristic design for combinatorial optimization using large language models, where interpretable knowledge becomes the primary search object rather than executable code. This knowledge-first paradigm improves discovery efficiency and generalization across problems compared to traditional code-centric methods, suggesting future progress in AI-driven optimization depends on building reusable, explicit hypotheses.

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration

Researchers introduce OneLife, a framework for learning symbolic world models from minimal unguided exploration in complex, stochastic environments. The approach uses conditionally-activated programmatic laws within a probabilistic framework and demonstrates superior performance on 16 of 23 test scenarios, advancing autonomous construction of world models for unknown environments.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10

Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments

Researchers propose a new framework for foundation world models that enables autonomous agents to learn, verify, and adapt reliably in dynamic environments. The approach combines reinforcement learning with formal verification and adaptive abstraction to create agents that can synthesize verifiable programs and maintain correctness while adapting to novel conditions.