Longterm Wiki

Grades from external scorecards. We mirror published grades only; per-source methodology lives at the link in each panel header. See the scorecards directory for cross-org comparison.

SaferAI Ratings

by SaferAI · Published 2025-10-01 · Source ↗
Overall: Very Weak
Risk Analysis and Evaluation: Very Weak
Risk Governance: Very Weak
Risk Identification: Very Weak
Risk Treatment: Very Weak

Foundation Model Transparency Index

by Stanford CRFM · Published 2025-12-01 · Source ↗
Overall: 39
Agent Protocols: 100
AI bug bounty: 100
Amount of usage: 0
AUP enforcement frequency: 0
AUP enforcement process: 100
Basic model properties: 0
Benchmarked inference: 0
Benefits Assessment: 0
Capabilities evaluation: 100
Capabilities taxonomy: 100
Carbon emissions for final training run: 0
Change log: 100
Classification of usage data: 0
Code access: 0
Compute hardware for final training run: 0
Compute provider: 100
Compute usage for final training run: 0
Compute usage including R&D: 0
Consumer/enterprise usage: 0
Crawling: 0
Data acquisition methods: 100
Data domain composition: 0
Data laborer practices: 0
Data language composition: 0
Data processing methods: 100
Data processing purpose: 0
Data processing techniques: 0
Data replicability: 0
Data retention and deletion policy: 0
Data size: 0
Deeper model properties: 0
Detection of machine-generated content: 100
Development duration for final training run: 0
Distribution channels with usage data: 100
Documentation for responsible use: 100
Downstream: 50
Acceptable use policy: 80
Accountability: 33.3
Downstream mitigations: 100
Impact: 0
Model Behavior Policy: 75
Post-deployment monitoring: 57.1
Usage data: 20
Energy usage for final training run: 0
Enterprise mitigations: 100
Enterprise users: 0
External data access: 0
External developer mitigations: 100
External products and services: 0
External reproducibility of capabilities evaluation: 0
External reproducibility of mitigations evaluation: 0
External reproducibility of risks evaluation: 0
External risk evaluation: 100
Feedback mechanisms: 0
Foundation model roadmap: 100
Geographic statistics: 0
Government commitments: 0
Government use: 0
Instructions for data generation: 0
Intermediate tokens: 100
Internal compute allocation: 0
Internal product and service mitigations: 100
Internal products and services: 0
Licensed data compensation: 0
Licensed data sources: 0
Misuse incident reporting protocol: 0
Mitigations efficacy: 0
Mitigations taxonomy: 100
Mitigations taxonomy mapped to risk taxonomy: 100
Model: 50
Capabilities: 50
Model cost: 0
Model dependencies: 0
Model access: 50
Model information: 0
Model Mitigations: 60
Model objectives: 100
Release: 75
Model response characteristics: 100
Risks: 40
Model stages: 100
Model theft prevention measures: 100
New human-generated data sources: 0
Notice of usage data used in training: 0
Open weights: 0
Organization chart: 0
Oversight mechanism: 100
Permitted and prohibited users: 100
Permitted, restricted, and prohibited model behaviors: 100
Permitted, restricted, and prohibited uses: 100
Post-deployment coordination with government: 100
Pre-deployment risk evaluation: 0
Public datasets: 0
Quantization: 0
Regional policy variations: 100
Release stages: 100
Researcher credits: 0
Responsible disclosure policy: 100
Risks evaluation: 0
Risks taxonomy: 100
Risk thresholds: 100
Safe harbor: 0
Security incident reporting protocol: 100
Specialized access: 100
Synthetic data purpose: 100
Synthetic data sources: 0
System prompt: 0
Terms of use: 100
Top distribution channels: 100
Train-test overlap: 0
Upstream: 17.6
Compute: 11.1
Data Acquisition: 16.7
Data Processing: 33.3
Data Properties: 0
Methods: 66.7
Other resources: 0
Usage data used in training: 0
Users of internal products and services: 0
Versioning protocol: 0
Water usage for final training run: 0
Whistleblower protection: 0
Grade trajectory (3 waves)
v1.2 December 2025 (latest): 39
v1.1 May 2024: 81.2
v1.0 October 2023: 12
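
The swing between waves is easier to read as point changes. Below is a minimal sketch that computes them, assuming the three overall scores are on a comparable 0-100 scale; the page does not say whether the index methodology changed between versions, so treat the differences as descriptive only. The names `waves` and `wave_deltas` are hypothetical and exist only for this illustration.

```python
# Overall scores copied from the grade trajectory above, oldest first.
# `waves` and `wave_deltas` are illustrative names, not part of any published tooling.
waves = [
    ("v1.0 October 2023", 12.0),
    ("v1.1 May 2024", 81.2),
    ("v1.2 December 2025", 39.0),
]


def wave_deltas(waves):
    """Return (label, point change vs. previous wave) for each wave after the first."""
    return [
        (label, round(score - prev_score, 1))  # round to one decimal, matching the page
        for (_, prev_score), (label, score) in zip(waves, waves[1:])
    ]


for label, change in wave_deltas(waves):
    print(f"{label}: {change:+.1f}")
```

On these figures the overall score rises by 69.2 points from v1.0 to v1.1, then falls by 42.2 points from v1.1 to v1.2.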