You can’t imitation-learn how to continual-learn

web

2026·LessWrong·lesswrong.com/posts/9rCTjbJpZB4KzqhiQ/you-can-t-imitation...

Author

Steven Byrnes

Credibility Rating

3/5

Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: LessWrong

A LessWrong post exploring a theoretical limitation of imitation learning for instilling continual learning capabilities, relevant to debates about the adequacy of behavior-cloning-based alignment techniques.

Forum Post Details

Karma

176

Comments

Forum

lesswrong

Status

Curated

Forum Tags

Metadata

Importance: 52/100 · commentary

Summary

This LessWrong post argues that continual learning—the ability to learn new tasks without forgetting old ones—cannot be acquired through imitation learning alone. The author explains a fundamental limitation: an agent trained to mimic a continual learner would not internalize the underlying learning mechanisms, only the behavioral outputs. This has implications for AI alignment and training robust, adaptable AI systems.

Key Points

• Imitation learning captures behavioral outputs but cannot replicate the internal learning mechanisms that enable continual adaptation.
• A model trained on demonstrations of a continual learner will not itself become a continual learner: it lacks the underlying update process.
• This presents a challenge for alignment approaches that rely on behavior cloning or RLHF to instill robust long-term learning properties.
• Continual learning requires architectural or training-level interventions, not just exposure to examples of adaptive behavior.
• The argument highlights a gap between surface-level behavioral mimicry and deeper cognitive or learning capacities in AI systems.

Cached Content Preview

HTTP 200 · Fetched Apr 7, 2026 · 13 KB

# You can’t imitation-learn how to continual-learn
By Steven Byrnes
Published: 2026-03-16
In this post, I’m trying to put forward a narrow, pedagogical point, one that comes up mainly when I’m arguing that LLMs have limitations that human learning does not. (E.g. [here](https://www.lesswrong.com/posts/ZJZZEuPFKeEdkrRyf/why-we-should-expect-ruthless-sociopath-asi?commentId=5js8FmkJPQMoQ6SWv), [here](https://x.com/steve47285/status/2031423263558054155), [here](https://www.lesswrong.com/posts/RRvdRyWrSqKW2ANL9/alignment-proposal-adversarially-robust-augmentation-and?commentId=fj3Css5KZMnhJCqaa).)

See the bottom of the post for a list of subtexts that you should NOT read into this post, including “…therefore LLMs are dumb”, or “…therefore LLMs can’t possibly scale to superintelligence”.

Some intuitions on how to think about “real” continual learning
===============================================================

Consider an algorithm for training a Reinforcement Learning (RL) agent, like the [Atari-playing Deep Q network (2013)](https://arxiv.org/abs/1312.5602) or [AlphaZero (2017)](https://en.wikipedia.org/wiki/AlphaZero), or think of within-lifetime learning in the human brain, which ([I claim](https://www.lesswrong.com/posts/As7bjEAbNpidKx6LR/valence-series-1-introduction#1_2_Model_based_reinforcement_learning__RL_)) is in the general class of “model-based reinforcement learning”, broadly construed.

These are all real-deal full-fledged learning algorithms: there’s an algorithm for choosing the next action right now, and there’s one or more update rules for permanently changing some adjustable parameters (a.k.a. weights) in the model such that its actions and/or predictions will be better in the future. And indeed, the longer you run them, the more competent they get.
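
To make the two-part structure concrete, here is a minimal sketch of a learner with an action rule and an update rule. It is an illustration only, not code from the post or from the DQN or AlphaZero papers: tabular Q-learning is swapped in as a small stand-in, and the five-state corridor environment, the class name `TabularQLearner`, the helpers `step`, `run_episode`, `train`, and `greedy_policy`, and all hyperparameters are hypothetical choices made for this example.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, start at 0
GOAL = N_STATES - 1   # reaching state 4 ends the episode with reward 1


def step(state, action):
    """Toy environment: action 1 moves right, action 0 moves left (floor at 0)."""
    next_state = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


class TabularQLearner:
    """Stand-in learner with an action rule (act) and an update rule (update)."""

    def __init__(self, lr=0.5, discount=0.9, epsilon=0.1, seed=0):
        # The adjustable parameters ("weights"): one value per (state, action).
        self.q = [[0.0, 0.0] for _ in range(N_STATES)]
        self.lr, self.discount, self.epsilon = lr, discount, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        """Action rule: mostly greedy on current values, sometimes exploring."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(2)
        row = self.q[state]
        best = max(row)
        # Break ties at random so the untrained agent wanders in both directions.
        return self.rng.choice([a for a, v in enumerate(row) if v == best])

    def update(self, state, action, reward, next_state, done):
        """Update rule: permanently nudge one parameter toward a better estimate."""
        target = reward if done else reward + self.discount * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])


def run_episode(agent, learn=True):
    """Run from state 0 to the goal; return the number of steps taken."""
    state, steps, done = 0, 0, False
    while not done:
        action = agent.act(state)
        next_state, reward, done = step(state, action)
        if learn:  # with learn=False the parameters stay frozen
            agent.update(state, action, reward, next_state, done)
        state = next_state
        steps += 1
    return steps


def train(episodes=200):
    """Start from all-zero parameters and learn from reward alone, no demonstrations."""
    agent = TabularQLearner()
    for _ in range(episodes):
        run_episode(agent, learn=True)
    return agent


def greedy_policy(agent):
    """Preferred action in each non-terminal state under the current parameters."""
    return [row.index(max(row)) for row in agent.q[:GOAL]]


if __name__ == "__main__":
    agent = train()
    print("greedy policy:", greedy_policy(agent))
```

In this sketch, `act` is the "algorithm for choosing the next action right now" and `update` is the rule that permanently changes the parameters. Calling `run_episode(agent, learn=False)` runs the same action rule with the update rule switched off, so the agent behaves according to whatever it has already learned but stops improving.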

When we think of “continual learning”, I suggest that those are good central examples to keep in mind. Here are some aspects to note:

***Knowledge vs information:*** These systems allow for continual acquisition of *knowledge*, not just *information*—the “continual learning” can install wholly new ways of conceptualizing and navigating the world, not just keeping track of what’s going on.

***Huge capacity for open-ended learning:*** These examples all have huge capacity for continual learning, indeed enough that they can start from random initialization and “continually learn” all the way to expert-level competence. Likewise, new continual learning can build on previous continual learning, in an ever-growing tower.

***Ability to figure things out that aren’t already on display in the environment:*** For example, an Atari-playing RL agent will get better and better at playing an Atari game, even without having any expert examples to copy. Likewise, billions of humans over thousands of years invented language, math, science, and a whole $100T global economy from scratch, all by ourselves, without angels dropping new training data from the heavens.

I bring these up bec

... (truncated, 13 KB total)

Resource ID: 2e80c543f53d272a | Stable ID: sid_qiXTh9yJ3R