Longterm Wiki

Deepfake Detection Challenge Dataset

web
deepfakedetectionchallenge.ai

A key benchmark dataset for AI-generated media detection research; relevant to AI safety discussions around synthetic media misuse, detection capabilities, and evaluation of countermeasures against harmful deepfake technology.

Metadata

Importance: 55/100 · dataset

Summary

The Deepfake Detection Challenge (DFDC) Dataset, released by Facebook AI (now Meta AI) in 2020, is a large-scale benchmark of roughly 124,000 videos designed to accelerate research on detecting AI-manipulated media. Created in partnership with industry and academic experts, it features videos of paid, consenting actors altered with eight facial modification algorithms. The full dataset was used in a Kaggle competition and remains publicly available to support ongoing deepfake detection research.

Key Points

  • Full dataset contains about 124,000 videos featuring eight facial modification algorithms; a smaller preview dataset of about 5,000 videos, featuring two algorithms, is also available.
  • Created by Facebook/Meta AI in collaboration with industry and academic partners as part of the Deepfake Detection Challenge launched in September 2019.
  • Used in a Kaggle competition to benchmark and develop new deepfake detection models from researchers worldwide.
  • Dataset was ethically created using paid actors who consented to the use and manipulation of their likenesses.
  • Requires AWS account setup with IAM credentials to access; associated research papers available on arXiv (2006.07397 and 1910.08854).

Cited by 1 page

Page | Type | Quality
AI-Enabled Historical Revisionism | Risk | 43.0

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 19 KB

#### JUNE 25, 2020

# Deepfake Detection Challenge Dataset

The Deepfake Detection Challenge Dataset is designed to measure progress on deepfake detection technology.

[Download the Dataset](https://dfdc.ai/) [Download the Paper](https://arxiv.org/abs/2006.07397) [Read the Article](https://ai.meta.com/blog/deepfake-detection-challenge-results-an-open-initiative-to-advance-ai/)

## Overview

We partnered with other industry leaders and academic experts in September 2019 to create the Deepfake Detection Challenge (DFDC) in order to accelerate development of new ways to detect deepfake videos. In doing so, we created and shared a unique new dataset for the challenge consisting of more than 100,000 videos. The DFDC has enabled experts from around the world to come together, benchmark their deepfake detection models, try new approaches, and learn from each other’s work.

The DFDC dataset consists of two versions:

- Preview dataset
  - 5k videos
  - Featuring two facial modification algorithms
  - Associated [research paper](https://arxiv.org/abs/1910.08854/)
- Full dataset
  - 124k videos
  - Featuring eight facial modification algorithms
  - Associated [research paper](https://arxiv.org/abs/2006.07397/)


This full dataset was used by participants during a [Kaggle competition](https://www.kaggle.com/c/deepfake-detection-challenge/) to create new and better models to detect manipulated media. The dataset was created by Facebook with paid actors who agreed to the use and manipulation of their likenesses in our creation of the dataset.
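
As an illustration of how detection models can be benchmarked on a dataset like this, the sketch below computes binary log loss over per-video "fake" probabilities. This page does not describe the scoring method; the choice of log loss reflects how the Kaggle competition ranked submissions, and the function name, labels, and probabilities here are hypothetical values made up for the example.

```python
import math


def binary_log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy. labels: 1 = fake, 0 = real (assumed convention)."""
    if not labels or len(labels) != len(probs):
        raise ValueError("labels and probs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip so that log() never receives exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# Hypothetical ground truth and model outputs for four videos.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.5]
score = binary_log_loss(labels, probs)
print(f"log loss: {score:.4f}")
```

Lower is better. A model that outputs 0.5 for every video scores ln 2 (about 0.693), and confident wrong predictions are penalized much more heavily than hesitant ones.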

We hope that by making this dataset available outside the challenge, the research community will continue to accelerate progress on detecting harmful manipulated media.

More information on Facebook AI’s work in this space can be found in this [blog post](https://ai.meta.com/blog/deepfake-detection-challenge-results-an-open-initiative-to-advance-ai/).

If using this dataset, please cite the paper associated with the relevant dataset (preview/full):

@misc{DFDC2019Preview,

title={The Deepfake Detection Ch

... (truncated, 19 KB total)
Resource ID: 4d7d6773b35b5278 | Stable ID: ODhkYzIwNz