Longterm Wiki

[2210.03629] ReAct: Synergizing Reasoning and Acting in Language Models

paper

Authors

Shunyu Yao·Jeffrey Zhao·Dian Yu·Nan Du·Izhak Shafran·Karthik Narasimhan·Yuan Cao

Credibility Rating

3/5
Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

ReAct is a seminal paper establishing the reasoning-plus-action paradigm central to modern LLM agent systems; relevant to AI safety discussions around agent reliability, hallucination reduction, and interpretability of autonomous AI behavior.

Paper Details

Citations: 6,649 (784 influential)
Year: 2022

Metadata

Importance: 82/100 · arXiv preprint · primary source

Abstract

While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) have primarily been studied as separate topics. In this paper, we explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external sources, such as knowledge bases or environments, to gather additional information. We apply our approach, named ReAct, to a diverse set of language and decision making tasks and demonstrate its effectiveness over state-of-the-art baselines, as well as improved human interpretability and trustworthiness over methods without reasoning or acting components. Concretely, on question answering (HotpotQA) and fact verification (Fever), ReAct overcomes issues of hallucination and error propagation prevalent in chain-of-thought reasoning by interacting with a simple Wikipedia API, and generates human-like task-solving trajectories that are more interpretable than baselines without reasoning traces. On two interactive decision making benchmarks (ALFWorld and WebShop), ReAct outperforms imitation and reinforcement learning methods by an absolute success rate of 34% and 10% respectively, while being prompted with only one or two in-context examples. Project site with code: https://react-lm.github.io

Summary

ReAct introduces a prompting paradigm that interleaves reasoning traces with task-specific actions in LLMs, enabling them to use external tools (e.g., Wikipedia API) while reasoning. This approach reduces hallucination and error propagation compared to chain-of-thought alone, and outperforms imitation/reinforcement learning baselines on interactive decision-making benchmarks by large margins.
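The interleaved Thought → Action → Observation loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `llm()` is a canned stub standing in for a few-shot-prompted language model, and `wiki_search()` is a toy stand-in for the simple Wikipedia API the paper uses.

```python
import re

def wiki_search(query):
    # Toy knowledge base; a real agent would call an external API here.
    kb = {"ReAct paper year": "ReAct was posted to arXiv in 2022."}
    return kb.get(query, "No result found.")

def llm(prompt):
    # Canned responses imitating a model emitting interleaved
    # Thought/Action steps (a real system would sample an LLM).
    if "Observation:" not in prompt:
        return ("Thought: I should look up the paper's year.\n"
                "Action: Search[ReAct paper year]")
    return ("Thought: The observation answers the question.\n"
            "Action: Finish[2022]")

def react_loop(question, max_steps=5):
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(prompt)
        prompt += step + "\n"
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match is None:
            continue
        act, arg = match.groups()
        if act == "Finish":
            return arg                    # task solved
        observation = wiki_search(arg)    # ground reasoning in a tool call
        prompt += f"Observation: {observation}\n"
    return None

print(react_loop("What year was ReAct published?"))  # -> 2022
```

The key structural point is that the observation from each action is appended to the prompt, so the next reasoning step is conditioned on externally retrieved facts rather than on parametric memory alone.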

Key Points

  • Interleaves 'think' steps with 'act' steps, allowing LLMs to gather external information mid-reasoning rather than relying solely on parametric knowledge.
  • On HotpotQA and FEVER, ReAct reduces hallucination by grounding reasoning in Wikipedia API lookups, improving factual accuracy.
  • On ALFWorld and WebShop, ReAct outperforms imitation and reinforcement learning methods by 34% and 10% absolute success rate with only 1-2 in-context examples.
  • Produces interpretable, human-like task-solving trajectories that improve trustworthiness and allow easier diagnosis of model errors.
  • Foundational precursor to modern LLM agent frameworks (e.g., LangChain agents, tool-use pipelines), making it highly influential in agentic AI research.
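Since ReAct is prompted with only one or two in-context examples, the few-shot prompt itself is most of the method. A rough sketch of how such a prompt is assembled is below; the trajectory text is illustrative filler, not an example copied from the paper.

```python
# Hypothetical one-shot ReAct trajectory used as an in-context example.
# The content is made up for illustration; the paper's prompts use real
# HotpotQA trajectories in this same Thought/Action/Observation format.
EXAMPLE_TRAJECTORY = """\
Question: What year was the Eiffel Tower completed?
Thought 1: I need to search Eiffel Tower and find its completion year.
Action 1: Search[Eiffel Tower]
Observation 1: The Eiffel Tower was completed in 1889.
Thought 2: The observation gives the year directly.
Action 2: Finish[1889]
"""

def build_prompt(question):
    # Prepend the worked trajectory so the model imitates the
    # Thought/Action/Observation format on the new question.
    return EXAMPLE_TRAJECTORY + f"Question: {question}\nThought 1:"

prompt = build_prompt("Who are the authors of ReAct?")
```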

Cited by 3 pages

| Page | Type | Quality |
|---|---|---|
| Long-Horizon Autonomous Tasks | Capability | 65.0 |
| Heavy Scaffolding / Agentic Systems | Concept | 57.0 |
| Light Scaffolding | Capability | 53.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 98 KB
# ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao¹, Jeffrey Zhao², Dian Yu², Nan Du², Izhak Shafran², Karthik Narasimhan¹, Yuan Cao²

¹ {shunyuy,karthikn}@princeton.edu · ² {jeffreyzhao,dianyu,dunan,izhak,yuancao}@google.com

Work during Google internship. Project page with code: [https://react-lm.github.io/](https://react-lm.github.io/ "").

###### Abstract

While large language models (LLMs) have demonstrated impressive performance across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) have primarily been studied as separate topics.
In this paper, we explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with and gather additional information from external sources such as knowledge bases or environments.
We apply our approach, named ReAct, to a diverse set of language and decision making tasks and demonstrate its effectiveness over state-of-the-art baselines in addition to improved human interpretability and trustworthiness.
Concretely, on question answering (HotpotQA) and fact verification (Fever), ReAct overcomes prevalent issues of hallucination and error propagation in chain-of-thought reasoning
by interacting with a simple Wikipedia API, and generating human-like task-solving trajectories that are more interpretable than baselines without reasoning traces.
Furthermore, on two interactive decision making benchmarks (ALFWorld and WebShop), ReAct outperforms imitation and reinforcement learning methods by an absolute success rate of 34% and 10% respectively, while being prompted with only one or two in-context examples.

## 1 Introduction

A unique feature of human intelligence is the ability to seamlessly combine task-oriented actions with verbal reasoning (or inner speech,  Alderson-Day & Fernyhough, [2015](https://ar5iv.labs.arxiv.org/html/2210.03629#bib.bib3 "")), which has been theorized to play an important role in human cognition for enabling self-regulation or strategization (Vygotsky, [1987](https://ar5iv.labs.arxiv.org/html/2210.03629#bib.bib31 ""); Luria, [1965](https://ar5iv.labs.arxiv.org/html/2210.03629#bib.bib20 ""); Fernyhough, [2010](https://ar5iv.labs.arxiv.org/html/2210.03629#bib.bib10 "")) and maintaining a working memory (Baddeley, [1992](https://ar5iv.labs.arxiv.org/html/2210.03629#bib.bib4 "")).
Consider the example of cooking up a dish in the kitchen. Between any two specific actions, we may reason in language in order to tra

... (truncated, 98 KB total)
Resource ID: 7647307fe49844a0 | Stable ID: Mzg5MjM0MD