Finn et al. (2017)
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
MAML (Model-Agnostic Meta-Learning) is a foundational algorithm enabling AI systems to learn quickly from few examples, which has implications for AI capabilities, generalization, and potential safety considerations around rapid adaptation to new tasks.
Paper Details
Metadata
Abstract
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.
Summary
This paper introduces Model-Agnostic Meta-Learning (MAML), an algorithm that enables models to learn from a variety of tasks such that they can quickly adapt to new tasks with minimal training data. The key innovation is explicitly optimizing model parameters to be easily fine-tunable—requiring only a few gradient steps on new tasks to achieve good generalization. The approach is model-agnostic and applicable across diverse domains including few-shot image classification, regression, and reinforcement learning, demonstrating state-of-the-art results on standard benchmarks.
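Concretely, the meta-objective is to find an initialization θ that performs well on each task after task-specific adaptation. For the one-gradient-step case (with α the inner-loop step size and L over tasks T_i drawn from the task distribution p(T)), this can be written as:

```latex
\min_{\theta} \;\; \sum_{\mathcal{T}_i \sim p(\mathcal{T})}
  \mathcal{L}_{\mathcal{T}_i}\!\bigl(\theta - \alpha \,\nabla_{\theta}\,\mathcal{L}_{\mathcal{T}_i}(\theta)\bigr)
```

The outer minimization is over the pre-adaptation parameters θ, but each task's loss is evaluated at the post-adaptation parameters, which is what makes the learned initialization "easy to fine-tune."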
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Mesa-Optimization Risk Analysis | Analysis | 61.0 |
Cached Content Preview
# Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
###### Abstract
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.
meta-learning, deep learning
## 1 Introduction
Learning quickly is a hallmark of human intelligence, whether it involves recognizing objects from a few examples or quickly learning new skills after just minutes of experience. Our artificial agents should be able to do the same, learning and adapting quickly from only a few examples, and continuing to adapt as more data becomes available. This kind of fast and flexible learning is challenging, since the agent must integrate its prior experience with a small amount of new information, while avoiding overfitting to the new data. Furthermore, the form of prior experience and new data will depend on the task. As such, for the greatest applicability, the mechanism for learning to learn (or meta-learning) should be general to the task and the form of computation required to complete the task.
In this work, we propose a meta-learning algorithm that is general and model-agnostic, in the sense that it can be directly applied to any learning problem and model that is trained with a gradient descent procedure. Our focus is on deep neural network models, but we illustrate how our approach can easily handle different architectures and different problem settings, including classification, regression, and policy gradient reinforcement learning, with minimal modification.
In meta-learning, the goal of the trained model is to quickly learn a new task from a small amount of new data, and the model is trained by the meta-learner to be able to learn on a large number of different tasks. The key idea underlying our method is to train the model’s initial parameters such that the model has maximal performance on a new task after the parameters have been updated through one or more gradient steps computed with a small amount of data from that new task.
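The inner/outer loop structure this describes can be sketched on a toy problem. The following is a minimal illustrative sketch, not the paper's implementation: it uses a first-order approximation of the meta-gradient (the paper's full method backpropagates through the inner update), a scalar linear model, and made-up tasks of the form y = a·x with a random slope per task.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Return a data sampler for one toy task y = a * x (slope a is task-specific)."""
    a = rng.uniform(-2.0, 2.0)
    def data(n=10):
        x = rng.uniform(-1.0, 1.0, size=n)
        return x, a * x
    return data

def loss(w, x, y):
    """Mean-squared error of the scalar linear model y_hat = w * x."""
    return np.mean((w * x - y) ** 2)

def grad(w, x, y):
    """d/dw of the mean-squared error for y_hat = w * x."""
    return 2.0 * np.mean((w * x - y) * x)

w = 0.0                  # meta-learned initialization
alpha, beta = 0.1, 0.01  # inner (adaptation) and outer (meta) step sizes

for _ in range(1000):
    data = sample_task()
    x_s, y_s = data()                          # support set: adapt to the task
    w_adapted = w - alpha * grad(w, x_s, y_s)  # inner gradient step
    x_q, y_q = data()                          # query set: evaluate the adaptation
    # First-order approximation: treat w_adapted as constant w.r.t. w, so the
    # meta-gradient reduces to the query-set gradient at the adapted parameters.
    w = w - beta * grad(w_adapted, x_q, y_q)
```

The meta-update rewards initializations from which a single inner gradient step already fits the query set well, which is the "train the model to be easy to fine-tune" idea from the abstract in its simplest form.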
Unlike prior meta-learning methods that learn an update function
... (truncated, 81 KB total)