AINeutralarXiv – CS AI · 14h ago6/10
🧠
Predicting Causal Effects from Natural Language Queries using Structured Representations
Researchers introduce Query2Effect, a 72,000-question benchmark for predicting causal effect sizes from natural language queries using LLMs. A two-step framework combining structured representation generation with supervised encoding reduces prediction error by 27-71% compared to standard LLMs, demonstrating that separating semantic interpretation from numerical estimation improves both in-domain performance and out-of-domain generalization.