Center for AI Safety — publication: The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning — benchmark for evaluating dual-use AI capabilities in biosecurity, cybersecurity, and chemical weapons
1 evidence check
Last checked: 3/31/2026
The claim has two components. (1) The publication title, scope, and date are confirmed by the source. (2) The attribution to CAIS is not addressed in the provided source text: the source shows an arXiv paper with 56 authors but does not explicitly identify it as a CAIS publication. CAIS involvement is plausible given the nature of the work and the author list (which includes Dan Hendrycks, the director of CAIS; Alexandr Wang, also listed, is associated with Scale AI rather than CAIS), but the source excerpt does not confirm institutional affiliation. The result is a partial confirmation: the publication details are correct, but the organizational attribution cannot be verified from the provided text.
Evidence — 1 source, 1 check