Legal2LogicICL: Improving Generalization in Transforming Legal Cases to Logical Formulas via Diverse Few-Shot Learning
Researchers introduce Legal2LogicICL, an LLM-based framework that improves the conversion of natural-language legal cases into logical formulas through retrieval-augmented few-shot learning. The method addresses data scarcity in legal AI systems and introduces a new annotated dataset (Legal2Proleg) to advance interpretable legal reasoning without requiring model fine-tuning.
Legal reasoning systems have historically struggled with the gap between natural language and formal logical representations, particularly when training data is scarce. This research tackles a genuine pain point in legal AI: existing approaches depend heavily on fine-tuned models trained on limited annotated datasets, which constrains their ability to generalize across different legal domains and case structures. The Legal2LogicICL framework leverages large language models' in-context learning capabilities, using retrieval-augmented generation to identify relevant few-shot examples that balance semantic similarity with structural diversity in legal texts.
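The paper does not publish its retrieval algorithm in this summary, but the described trade-off between semantic similarity and structural diversity can be sketched with a generic maximal-marginal-relevance (MMR) heuristic. The function below is an illustrative stand-in, not the authors' exact method; the embedding vectors, `k`, and the `lam` weight are assumptions.

```python
import numpy as np

def select_exemplars(query_vec, candidate_vecs, k=4, lam=0.7):
    """Greedy MMR selection: trade off similarity to the query
    against redundancy with already-selected exemplars."""
    def cos(a, b):
        return float(np.dot(a, b) /
                     (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    selected, remaining = [], list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        best_i, best_score = None, -np.inf
        for i in remaining:
            sim_query = cos(query_vec, candidate_vecs[i])
            # highest similarity to anything already chosen = redundancy
            sim_chosen = max((cos(candidate_vecs[i], candidate_vecs[j])
                              for j in selected), default=0.0)
            score = lam * sim_query - (1 - lam) * sim_chosen
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        remaining.remove(best_i)
    return selected
```

In this formulation, a larger `lam` favors exemplars close to the query case, while a smaller `lam` pushes the selected set toward structurally varied examples.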
The innovation addresses a subtle but critical problem: entity-induced retrieval bias. In legal documents, specific names, dates, and case references often dominate semantic representations, potentially obscuring the underlying logical reasoning patterns that matter legally. By explicitly mitigating this bias, the framework ensures that selected exemplars highlight generalizable legal principles rather than superficial textual similarities.
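One simple way to realize this kind of bias mitigation is to mask concrete entities before computing embeddings, so retrieval keys on legal structure rather than surface overlap. The sketch below uses simplified regexes as stand-ins; a production system would more likely use an NER model, and these patterns and placeholder tokens are assumptions, not the paper's implementation.

```python
import re

# Simplified patterns for entity types that tend to dominate
# similarity in legal text (dates, amounts, citations, parties).
MASKS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "[DATE]"),            # ISO dates
    (re.compile(r"\$[\d,]+(?:\.\d{2})?"), "[AMOUNT]"),           # dollar amounts
    (re.compile(r"\b[A-Z][a-z]+ v\. [A-Z][a-z]+\b"), "[CASE]"),  # case citations
    (re.compile(r"\b(?:Mr|Ms|Mrs|Dr)\. [A-Z][a-z]+\b"), "[PARTY]"),
]

def mask_entities(text: str) -> str:
    """Replace concrete entities with type placeholders before embedding."""
    for pattern, token in MASKS:
        text = pattern.sub(token, text)
    return text

print(mask_entities("Mr. Tanaka signed the lease on 2021-03-15 for $1,200.00."))
# → "[PARTY] signed the lease on [DATE] for [AMOUNT]."
```

Masked this way, two cases about different parties and amounts but the same obligation structure embed close together, which is the behavior the framework is after.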
The introduction of the Legal2Proleg dataset represents a valuable resource for the legal AI community, providing aligned mappings between natural-language cases and PROLEG logical formulas. This enables reproducible evaluation and future research. The approach demonstrates practical efficiency: improved accuracy and stability without requiring expensive retraining on proprietary models, making it accessible across both open-source and commercial LLMs.
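To make the dataset's shape concrete, an aligned entry might look like the following. Both the case text and the PROLEG-style rule here are invented for illustration, not actual records from Legal2Proleg.

```python
# Hypothetical aligned pair: natural-language case text mapped to a
# Prolog-style PROLEG rule (illustrative syntax only).
example_pair = {
    "case_text": (
        "The plaintiff delivered the goods, but the defendant "
        "has not paid the agreed price."
    ),
    "proleg_formula": (
        "payment_obligation(Defendant) :- "
        "delivery(Plaintiff, Goods), agreed_price(Goods, Price), "
        "not(paid(Defendant, Price))."
    ),
}
```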
For legal tech developers and AI researchers, this framework reduces implementation barriers for logic-based legal reasoning systems. Enterprises building compliance, contract analysis, or legal research tools could adopt these techniques to improve interpretability and reliability. The work signals growing maturity in applying LLMs to specialized domains requiring formal logical reasoning.
- Legal2LogicICL enables accurate natural-language-to-logic conversion using retrieval-augmented few-shot learning without model fine-tuning.
- The framework explicitly addresses entity-induced bias in legal documents to surface meaningful reasoning patterns.
- The new Legal2Proleg dataset provides aligned legal case and logical formula pairs to benchmark legal semantic parsing.
- The method delivers consistent gains across both open-source and proprietary LLMs, with better generalization and stability.
- The approach reduces data annotation requirements, making logic-based legal reasoning more accessible to enterprises.