Longterm Wiki

RL agents · paper

Authors

Andrew Kyle Lampinen · Stephanie C. Y. Chan · Ishita Dasgupta · Andrew J. Nam · Jane X. Wang

Credibility Rating

3/5
Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

Investigates how RL agents can learn causal reasoning from passive data, addressing safety concerns about agent learning and generalization in interactive domains like tool use.

Paper Details

Citations
25
0 influential
Year
2023

Metadata

arXiv preprint · primary source

Abstract

What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long as the agent can intervene at test time. We formally illustrate that learning a strategy of first experimenting, then seeking goals, can allow generalization from passive learning in principle. We then show empirically that agents trained via imitation on expert data can indeed generalize at test time to infer and use causal links which are never present in the training data; these agents can also generalize experimentation strategies to novel variable sets never observed in training. We then show that strategies for causal intervention and exploitation can be generalized from passive data even in a more complex environment with high-dimensional observations, with the support of natural language explanations. Explanations can even allow passive learners to generalize out-of-distribution from perfectly-confounded training data. Finally, we show that language models, trained only on passive next-word prediction, can generalize causal intervention strategies from a few-shot prompt containing examples of experimentation, together with explanations and reasoning. These results highlight the surprising power of passive learning of active causal strategies, and may help to understand the behaviors and capabilities of language models.

Summary

This paper investigates how agents can learn causal reasoning and experimentation strategies from purely passive data, despite the inherent limitations of passive learning. The authors demonstrate both theoretically and empirically that agents trained via imitation on passive expert data can generalize at test time to infer causal relationships and devise experimentation strategies for novel scenarios never seen during training. Notably, they show that language models trained only on next-word prediction can acquire causal intervention strategies through few-shot prompting with explanations, suggesting that passive learning contains sufficient information for active causal reasoning when combined with appropriate inference mechanisms.
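The "first experiment, then seek goals" strategy the paper formalizes can be illustrated with a minimal toy sketch. This is an illustrative assumption, not code from the paper: `ToyCausalEnv` and `experiment_then_exploit` are hypothetical names, and the environment (several switches, exactly one of which causally controls an outcome) is a deliberately simplified stand-in for the paper's tasks.

```python
import random


class ToyCausalEnv:
    """Toy environment: one hidden switch causally controls the outcome."""

    def __init__(self, n_switches, rng):
        self.n = n_switches
        self.cause = rng.randrange(n_switches)  # hidden causal parent
        self.switches = [0] * n_switches

    def intervene(self, i, value):
        """Set switch i and return the resulting outcome."""
        self.switches[i] = value
        return self.outcome()

    def outcome(self):
        # The outcome simply mirrors the causal switch's state.
        return self.switches[self.cause]


def experiment_then_exploit(env):
    """Experimentation phase: intervene on each switch once to find the cause.
    Exploitation phase: set the identified cause to reach the goal (outcome = 1)."""
    baseline = env.outcome()
    cause = None
    for i in range(env.n):
        if env.intervene(i, 1) != baseline:
            cause = i  # this intervention changed the outcome
        env.intervene(i, 0)  # reset before the next experiment
    env.intervene(cause, 1)  # exploit the discovered causal link
    return env.outcome()


env = ToyCausalEnv(5, random.Random(0))
print(experiment_then_exploit(env))  # prints 1: goal reached
```

The point of the sketch is that the same fixed policy (probe every variable, then act on whichever probe moved the outcome) generalizes to causal structures never seen before, which is the kind of strategy the paper shows can be acquired from purely passive imitation data.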

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Deceptive Alignment Decomposition Model | Analysis | 62.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 98 KB
# Passive learning of active causal strategies in agents and language models

Andrew K. Lampinen

Google DeepMind

London, UK

lampinen@deepmind.com

Stephanie C. Y. Chan

Google DeepMind

London, UK

scychan@deepmind.com

Ishita Dasgupta

Google DeepMind

London, UK

idg@deepmind.com

Andrew J. Nam

Stanford University

Stanford, CA

ajhnam@stanford.edu

Jane X. Wang

Google DeepMind

London, UK

wangjane@deepmind.com

###### Abstract

What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long as the agent can intervene at test time. We formally illustrate that, under certain assumptions, learning a strategy of first experimenting, then seeking goals, can allow generalization from passive learning in principle. We then show empirically that agents trained via imitation on expert data can indeed generalize at test time to infer and use causal links which are never present in the training data; these agents can also generalize experimentation strategies to novel variable sets never observed in training.
We then show that strategies for causal intervention and exploitation can be generalized from passive data even in a more complex environment with high-dimensional observations, with the support of natural language explanations. Explanations can even allow passive learners to generalize out-of-distribution from otherwise perfectly-confounded training data. Finally, we show that language models, trained only on passive next-word prediction, can generalize causal intervention strategies from a few-shot prompt containing examples of experimentation, together with explanations and reasoning. These results highlight the surprising power of passive learning of active causal strategies, and may help to understand the behaviors and capabilities of language models.

## 1 Introduction

Learning from passive observational data only allows learning correlational, not causal, structure. This observation is sometimes cited as a fundamental limitation of current machine learning research [60, 61, 39]. However, reinforcement learning (RL) agents can intervene on their environment, and are therefore not entirely limited. Indeed, various works have shown that RL agents can (meta-)learn to intervene on the environment to discover and exploit its causal structure [50, 14, 41, …]

... (truncated, 98 KB total)
Resource ID: bf34410b4b3a23c6 | Stable ID: MDcyMzBhZj