
LLM-Modulo framework

paper

Authors

Subbarao Kambhampati · Karthik Valmeekam · Lin Guan · Mudit Verma · Kaya Stechly · Siddhant Bhambri · Lucas Saldyt · Anil Murthy

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

This paper addresses the proper role of LLMs in planning and reasoning by proposing the LLM-Modulo framework, which clarifies how LLMs should be combined with symbolic reasoning systems—relevant to understanding AI capability limitations and safe system design.

Paper Details

Citations: 1 (9 influential)
Year: 2025
Methodology: peer-reviewed
Categories: 2025 34th IEEE International Conference on Robot a

Metadata

arXiv preprint · primary source

Abstract

There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the problem specification from one syntactic format to another, and ship the problem off to external symbolic solvers. In this position paper, we take the view that both these extremes are misguided. We argue that auto-regressive LLMs cannot, by themselves, do planning or self-verification (which is after all a form of reasoning), and shed some light on the reasons for misunderstandings in the literature. We will also argue that LLMs should be viewed as universal approximate knowledge sources that have much more meaningful roles to play in planning/reasoning tasks beyond simple front-end/back-end format translators. We present a vision of **LLM-Modulo Frameworks** that combine the strengths of LLMs with external model-based verifiers in a tighter bi-directional interaction regime. We will show how the models driving the external verifiers themselves can be acquired with the help of LLMs. We will also argue that rather than simply pipelining LLMs and symbolic components, this LLM-Modulo Framework provides a better neuro-symbolic approach that offers tighter integration between LLMs and symbolic components, and allows extending the scope of model-based planning/reasoning regimes towards more flexible knowledge, problem and preference specifications.

Summary

This position paper challenges both over-optimistic and over-pessimistic views of LLMs in planning and reasoning tasks. The authors argue that auto-regressive LLMs cannot independently perform planning or self-verification, but also should not be reduced to mere format translators. Instead, they propose the LLM-Modulo Framework, a neuro-symbolic approach that treats LLMs as universal approximate knowledge sources and pairs them with external model-based verifiers in a tight bi-directional interaction: the LLM generates candidate plans, the verifiers critique them, and the critiques drive further generation. This lets LLMs play more meaningful roles than simple front-end/back-end translation, while the models driving the external verifiers can themselves be acquired with LLM assistance.
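To make the bi-directional interaction regime concrete, here is a minimal Python sketch of a generate-test-critique loop in the spirit of LLM-Modulo. All names (`propose_plan`, `Critic`, `Critique`) and the back-prompting format are illustrative assumptions for this sketch, not the paper's actual implementation or API.

```python
# Minimal sketch of an LLM-Modulo-style generate-test-critique loop.
# The interfaces here are assumptions for illustration; the paper
# specifies the architecture, not this particular API.

from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Critique:
    ok: bool       # did this critic accept the candidate plan?
    feedback: str  # natural-language feedback returned to the LLM


# A critic is any external model-based verifier: a plan validator,
# a constraint checker over preferences, a simulator, etc.
Critic = Callable[[str], Critique]


def llm_modulo_loop(
    propose_plan: Callable[[str], str],  # LLM call: prompt -> candidate plan
    critics: List[Critic],               # sound external verifiers
    problem: str,
    max_rounds: int = 10,
) -> Optional[str]:
    """Bi-directional interaction: the LLM generates, the critics test,
    and their objections are folded back into the next prompt."""
    prompt = problem
    for _ in range(max_rounds):
        candidate = propose_plan(prompt)
        critiques = [critic(candidate) for critic in critics]
        if all(c.ok for c in critiques):
            return candidate  # every verifier accepted: emit the plan
        # Back-prompt the LLM with the critics' objections.
        feedback = "\n".join(c.feedback for c in critiques if not c.ok)
        prompt = (
            f"{problem}\n\nPrevious attempt:\n{candidate}\n\n"
            f"Critiques:\n{feedback}"
        )
    return None  # no verified plan within the round budget
```

The design point the paper stresses is visible in the loop's structure: soundness lives entirely in the critics. The LLM's candidates are treated as guesses, and a plan is only returned once the external verifiers sign off.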

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Reasoning and Planning | Capability | 65.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 64 KB
# LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks

Subbarao Kambhampati

Karthik Valmeekam  Lin Guan  Kaya Stechly

Mudit Verma  Siddhant Bhambri  Lucas Saldyt  Anil Murthy

School of Computing & AI, Arizona State University
Corresponding author. Email: rao@asu.edu

###### Abstract

There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the problem specification from one syntactic format to another, and ship the problem off to external symbolic solvers. In this position paper, we take the view that both these extremes are misguided. We argue that auto-regressive LLMs cannot, by themselves, do planning or self-verification (which is after all a form of reasoning), and shed some light on the reasons for misunderstandings in the literature. We will also argue that LLMs should be viewed as universal approximate knowledge sources that have much more meaningful roles to play in planning/reasoning tasks beyond simple front-end/back-end format translators. We present a vision of LLM-Modulo Frameworks that combine the strengths of LLMs with external model-based verifiers in a tighter bi-directional interaction regime. We will show how the models driving the external verifiers themselves can be acquired with the help of LLMs. We will also argue that rather than simply pipelining LLMs and symbolic components, this LLM-Modulo Framework provides a better neuro-symbolic approach that offers tighter integration between LLMs and symbolic components, and allows extending the scope of model-based planning/reasoning regimes towards more flexible knowledge, problem and preference specifications.

## 1 Introduction

Large Language Models (LLMs), essentially n-gram models on steroids which have been trained on web-scale language corpora (or, effectively, our collective consciousness), have caught the imagination of the AI research community with linguistic behaviors that no one expected text completion systems to possess. Their seeming versatility has led many researchers to wonder whether they can also do well on planning and reasoning tasks typically associated with System 2 competency. On the face of it, this doesn’t seem to ring true, as both by training and operation, LLMs are best seen as a giant pseudo System 1 (Kahneman, [2011](https://ar5iv.labs.arxiv.org/html/2402.01817#bib.bib22 "")) (see Figure [1](https://ar5iv.labs.arxiv.org/html/2402.01817#S1.F1 "Figure 1 ‣ 1 Introduction ‣ LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks")). Even from a pure engineering perspective, a system that takes constant time to produce the next token cannot possibly be doing principled reasoning on its own.¹ Think of askin

... (truncated, 64 KB total)
Resource ID: 5a8e0c175dc36497 | Stable ID: NTlhNGQ5NW