External Scorecards

Side-by-side overall grades from the five major external AI-safety scorecards. Click any row to view per-dimension grades on the organization's profile.

Overall grades — latest wave per scorecard

Organization	FLI Index	SaferAI	AI Lab Watch	FMTI	Seoul Tracker
AI21 Labs	—	—	—	66	—
Alibaba Cloud	D-	—	—	26	—
Amazon	—	Very Weak	—	39	Fulfilled
Anthropic	C+	Weak	Weak	46	Fulfilled
Cohere	—	Very Weak	—	—	Partial
DeepSeek	D	—	Very Weak	32	—
G42	—	Very Weak	—	—	Fulfilled
Google	—	—	—	41	Fulfilled
Google DeepMind	C	Very Weak	Very Weak	—	—
IBM	—	—	—	95	Unfulfilled
Inflection AI	—	—	—	—	Unfulfilled
Magic	—	Very Weak	—	—	—
Meta AI (FAIR)	D	Very Weak	Very Weak	31	Partial
Microsoft AI	—	Very Weak	Very Weak	—	Fulfilled
Midjourney	—	—	—	14	—
Mistral	—	—	—	18	Unfulfilled
Naver	—	Very Weak	—	—	Partial
NVIDIA	—	Very Weak	—	—	—
OpenAI	C+	Weak	Very Weak	35	Fulfilled
Samsung Electronics	—	—	—	—	Unfulfilled
Technology Innovation Institute	—	—	—	—	Unfulfilled
Writer	—	—	—	72	—
xAI	D	Very Weak	Very Weak	14	Partial
Z.ai	D	—	—	—	Unfulfilled
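
The columns above use four different scales (letter grades, ordinal labels, a numeric score, and fulfilment verdicts), so any tooling built on this table probably wants to keep each cell as an opaque string and treat "—" as "no grade published". A minimal Python sketch of that reading, assuming the table is available as tab-separated text; the `TABLE` excerpt and the helper names `parse_grades` and `coverage` are illustrative, not part of this page or of any scorecard's tooling:

```python
import csv
import io

# Illustrative excerpt: three rows taken from the table above, tab-separated.
TABLE = (
    "Organization\tFLI Index\tSaferAI\tAI Lab Watch\tFMTI\tSeoul Tracker\n"
    "Anthropic\tC+\tWeak\tWeak\t46\tFulfilled\n"
    "IBM\t—\t—\t—\t95\tUnfulfilled\n"
    "Magic\t—\tVery Weak\t—\t—\t—\n"
)

MISSING = "—"  # the table's marker for "no grade published"


def parse_grades(text):
    """Return {organization: {scorecard: grade}}, dropping missing cells.

    Grades stay as strings; no cross-scorecard normalization is attempted.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    grades = {}
    for row in reader:
        org = row.pop("Organization")
        grades[org] = {name: cell for name, cell in row.items() if cell != MISSING}
    return grades


def coverage(grades):
    """Count how many organizations each scorecard has graded."""
    counts = {}
    for cells in grades.values():
        for scorecard in cells:
            counts[scorecard] = counts.get(scorecard, 0) + 1
    return counts


grades = parse_grades(TABLE)
print(grades["IBM"])      # {'FMTI': '95', 'Seoul Tracker': 'Unfulfilled'}
print(coverage(grades))
```

Keeping the grades as raw strings is deliberate: the scorecards measure different things on different scales, so converting them to a common number would amount to re-scoring the organizations.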

Methodology

FLI AI Safety Index

by Future of Life Institute

Letter-grade ratings of frontier AI labs across six safety domains: risk assessment, current harms, safety frameworks, existential safety, governance, and information sharing.

Source ↗ | Methodology ↗ | Latest: Winter 2025 | License: fair-use-citation

SaferAI Ratings

by SaferAI

Continuously-updated risk-management ratings for frontier developers across four pillars: risk identification, analysis & evaluation, treatment, and governance.

Source ↗ | Methodology ↗ | Latest: October 2025 | License: CC-BY-SA-4.0

AI Lab Watch

by Zach Stein-Perlman (no longer maintained)

Weighted scorecard across seven categories (risk assessment, scheming prevention, safety research, misuse prevention, security, info sharing, planning).

Source ↗ | Methodology ↗ | Latest: September 2025 (frozen) | License: fair-use-citation

Foundation Model Transparency Index

by Stanford CRFM

Transparency-focused index scoring developers on 100 indicators across upstream resources, the model itself, and downstream use.

Source ↗ | Methodology ↗ | Latest: v1.2, December 2025 | License: CC-BY-4.0

Seoul Commitment Tracker

by The Midas Project

Tracks adherence to the Seoul Frontier AI Safety Commitments via 'Fulfilled / Partial / Unfulfilled' verdicts on five red-line components.

Source ↗ | Methodology ↗ | Latest: February 2025 | License: fair-use-citation

Grades are mirrored from upstream sources; we do not score organizations ourselves. See each scorecard's source link for its methodology and full per-dimension grades. Published grades are cited with attribution, consistent with fair use.