Source: MarkTechPost
Why Cross-Domain Reasoning Matters in Large Language Models (LLMs)
Recent breakthroughs in large reasoning models (LRMs), especially those trained with long chain-of-thought (CoT) techniques, show that they can generalize impressively across domains. Interestingly, models trained on tasks such as math or coding often perform well in unrelated areas, such as logical puzzles or creative writing. However, what enables this flexibility is not fully understood. One possible explanation is that these models learn core reasoning patterns, known as abstract reasoning prototypes, that cut across domains. These shared cognitive structures let the model focus less on how problems are presented and more on the similar thought processes required to solve them, enabling broader transfer.
From CoT to RL: A Shift in How LLMs Learn to Reason
Recent progress in large language model reasoning has shifted from simple CoT prompting and supervised fine-tuning to reinforcement learning (RL). Models like DeepSeek-R1 and Seed-Thinking-v1.5 have enhanced long CoT reasoning through mathematical problems, logic tasks, and code execution. These models use RL guided by verifiable rewards, such as accuracy against ground-truth answers, to explore complex reasoning paths. This approach lets models learn from errors, break complex problems into steps, and refine solutions iteratively. In contrast to earlier methods, this work introduces the concept of "reasoning prototypes" to better understand the core thinking patterns that allow models to generalize across vastly different domains.
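The key mechanical ingredient here is a reward that can be checked automatically rather than judged by another model. Below is a minimal sketch of such a verifiable reward, assuming a simple exact-match check against a ground-truth answer; the function name and normalization are illustrative and not taken from any of the cited systems:

```python
def verifiable_reward(model_answer: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the model's final answer matches the
    ground truth after light normalization, else 0.0. Real pipelines
    use task-specific checkers (symbolic math equivalence, unit tests
    for code, and so on); this is only the simplest possible case."""
    def normalize(s: str) -> str:
        return s.strip().lower().replace(" ", "")
    return 1.0 if normalize(model_answer) == normalize(ground_truth) else 0.0

# The reward signal that would guide RL updates on a math-style task
print(verifiable_reward(" 42 ", "42"))  # 1.0
print(verifiable_reward("41", "42"))    # 0.0
```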
ProtoReasoning Framework: Structured Reasoning with Prolog and PDDL
Researchers from ByteDance Seed and Shanghai Jiao Tong University have developed ProtoReasoning, a framework designed to enhance reasoning in large language models through structured prototype representations such as Prolog and PDDL (Planning Domain Definition Language). The system includes an automated pipeline that translates problems into these formats, a reliable verification setup based on interpreters, and scalable problem synthesis without manual labeling. Models trained on these prototypes showed notable improvements across tasks: logical reasoning (+4.7%), planning (+6.3%), general reasoning (+4.0%), and math (+1.0%). Crucially, training within this structured "prototype space" improved generalization to similar tasks, supporting the idea that abstract reasoning patterns enhance cross-domain performance.
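To make "prototype representation" concrete, the sketch below shows what a single training sample in the prototype space might look like: a natural-language puzzle paired with an invented Prolog encoding and a query a verifier can check. The field names and the puzzle are hypothetical illustrations, not items from the paper's dataset:

```python
# Hypothetical prototype-space sample: the original problem, its
# structured (Prolog) translation, and a query a verifier can check.
sample = {
    "natural_language": (
        "Tom is Bob's parent and Bob is Ann's parent. "
        "Is Tom Ann's grandparent?"
    ),
    "prolog_program": (
        "parent(tom, bob).\n"
        "parent(bob, ann).\n"
        "grandparent(X, Z) :- parent(X, Y), parent(Y, Z).\n"
    ),
    "verification_query": "grandparent(tom, ann)",
    "expected_answer": "yes",
}
```

Because the Prolog encoding is executable, correctness no longer depends on matching a reference solution string; the interpreter itself decides whether the query holds.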
Architecture Overview: Prototype Constructor and Verifier System
The ProtoReasoning framework boosts reasoning in LLMs by using structured prototypes: Prolog for logic and PDDL for planning. It includes two core modules: a Prototype Constructor that translates natural-language problems into formal representations, and a Verification System that checks solution correctness. For Prolog, a four-step pipeline generates diverse logic problems, which are verified with SWI-Prolog. For planning, tasks such as plan generation, plan completion, and plan reordering are built in PDDL, with correctness checked via the VAL validator. Training includes teacher-model distillation of reasoning paths, difficulty-based sampling, and filtering so that only high-quality data fine-tunes the model for robust generalization.
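A minimal sketch of the Prolog side of such a verifier is shown below, assuming SWI-Prolog (`swipl`) is installed and on the PATH. The helper function and puzzle are illustrative; the paper's actual verification system is more elaborate, and the PDDL side would call the VAL validator in an analogous way:

```python
import os
import subprocess
import tempfile

def check_prolog_solution(program: str, query: str, timeout_s: int = 10) -> bool:
    """Write a candidate Prolog program to a temp file, run the query with
    SWI-Prolog, and report whether it succeeds. A simplified stand-in for
    an interpreter-based verification system."""
    with tempfile.NamedTemporaryFile("w", suffix=".pl", delete=False) as f:
        f.write(program)
        path = f.name
    try:
        # -q: quiet, -s: load file, -g: goal to run, -t halt: exit afterwards
        goal = f"({query} -> write(yes) ; write(no))"
        result = subprocess.run(
            ["swipl", "-q", "-s", path, "-g", goal, "-t", "halt"],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return result.stdout.strip() == "yes"
    except subprocess.TimeoutExpired:
        return False  # treat non-terminating programs as failures
    finally:
        os.remove(path)

# Example: verify the hypothetical grandparent puzzle from the sample above
program = (
    "parent(tom, bob).\n"
    "parent(bob, ann).\n"
    "grandparent(X, Z) :- parent(X, Y), parent(Y, Z).\n"
)
print(check_prolog_solution(program, "grandparent(tom, ann)"))  # True
```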
Evaluations Show Measurable Improvements in Reasoning and Planning
The ProtoReasoning framework was evaluated using a 150B-parameter Mixture-of-Experts model (15B active parameters), trained on a curated set of high-quality Prolog and PDDL samples. Results showed consistent improvements across logical reasoning, planning, and general benchmarks, including MMLU and AIME 2024. A key ablation compared Prolog-based training with natural-language (NL) versions of the same problems on matched datasets. Both formats significantly outperformed the baseline, and Prolog reached near-parity with NL, indicating that training on structured prototypes transfers to natural-language tasks. The ablations also showed that explicit reasoning (e.g., chain-of-thought) is crucial, and that categories with few samples saw weaker gains due to insufficient data.

Key Findings and Theoretical Implications of Reasoning Prototypes
In conclusion, ProtoReasoning is a framework built on the idea that abstract reasoning prototypes, such as Prolog for logic and PDDL for planning, enable large language models to generalize across domains. By training models on these structured representations, the study observed notable improvements in logical reasoning, planning, and general problem-solving tasks. The results support the hypothesis that shared reasoning patterns across domains facilitate knowledge transfer in models. While the empirical results are promising, the exact nature of reasoning prototypes remains theoretically underexplored; future work aims to formalize these concepts mathematically and validate the findings with open-source models and datasets.
Check out the Paper. All credit for this research goes to the researchers of this project.
Sana Hassan
Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.