Skip to content
Longterm Wiki
Back

Credibility Rating

4/5
High(4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: UK AI Safety Institute

Published by the UK AI Safety Institute (now AI Security Institute), this post is an authoritative government source on frontier model evaluation methodology and represents a key pillar of the UK's approach to pre-deployment AI safety assessment.

Metadata

Importance: 62/100blog postprimary source

Summary

The UK AI Safety Institute shares early findings and methodology from its evaluations of frontier AI models, covering how they assess potentially dangerous capabilities including cybersecurity risks, CBRN threats, and autonomous behavior. The post outlines the AISI's approach to pre-deployment evaluations and the practical challenges encountered when testing leading AI systems.

Key Points

  • AISI conducts pre-deployment evaluations of frontier AI models to assess dangerous capability thresholds before public release.
  • Evaluations cover domains including cybersecurity, CBRN (chemical, biological, radiological, nuclear) risks, and autonomous AI behavior.
  • Early lessons highlight methodological challenges in designing rigorous, reproducible capability evaluations for rapidly advancing models.
  • The institute works with frontier AI developers to access models prior to deployment, establishing a government-led evaluation framework.
  • Results inform UK government policy and contribute to international coordination on AI safety standards and red-teaming practices.

Cited by 5 pages

Cached Content Preview

HTTP 200Fetched Mar 15, 202631 KB
Early lessons from evaluating frontier AI systems | AISI Work 

 

 Read the Frontier AI Trends Report Please enable javascript for this website. 
 A 
 
 A 
 Careers 
 
 
 
 Blog 
 
 Organisation Early lessons from evaluating frontier AI systems

 We look into the evolving role of third-party evaluators in assessing AI safety, and explore how to design robust, impactful testing frameworks.

 — Oct 24, 2024 Note to readers: we changed our name to the AI Security Institute on 14 February 2025. Read more here. 

 Introduction 

 The AI Safety Institute's (AISI) mission is to equip governments with an empirical understanding of the safety of frontier AI systems. To achieve this, we design and run evaluations that aim to measure the capabilities of AI systems that may pose risks.  

 Since November 2023, we have built one of the largest safety evaluation teams globally with a leadership team of world class researchers, established a Memorandum of Understanding (MoU) and begun to operationalise joint testing with the US AISI. We have also published an open-source framework for evaluations , reported our results of testing 4 large language models from major companies, and conducted several pre-deployment tests. 

 But this is just the beginning. Model evaluations are a new and rapidly evolving science, and we routinely iterate on our approach. We will continue to grow our team and build our capacity to address challenges and open scientific questions. If you are excited by this, please join us ! 

 This blog sets out our thinking to date on how to design and run third-party evaluations, including key elements to consider and open questions. This is not intended to provide robust recommendations; rather we want to start a conversation in the open about these practices and to learn from others.  

 We discuss the role of third-party evaluators and what they could target for testing, including which systems to test, when to test them, and which risks and capabilities to test for. We also examine how to evaluate effectively, including which tests to use for which purpose, how to develop robust testing, and how to ensure safety and security. 

 1. What is the role of third-party evaluators?   

 Independent evaluations of frontier AI systems can provide a snapshot of a given system’s capabilities and vulnerabilities. Whilst our sense is that the science is too nascent for independent evaluations to act as a ‘certification’ function (i.e. provide confident assurances that a particular system is ‘safe’), they are a critical part of incentivising best efforts at improving safety. Independent evaluations – completed by governments or other third parties - provide key benefits to AI companies, governments, and the public, including: 

 Providing an independent source of verification of claims made by companies about capabilities and safety : third-party evaluations will often have legitimacy through their independence and absence of conflicts of interest, in

... (truncated, 31 KB total)
Resource ID: 0fd3b1f5c81a37d8 | Stable ID: MjdhNzk5Mz