David Dalrymple

Person

A comprehensive biographical profile of David Dalrymple covering his role directing ARIA's £59M Safeguarded AI programme, his technical approach to formal verification for safety-critical AI, and his views on AI timelines. Content is well-organized and concrete but has significant citation gaps—footnotes are vague research notes rather than verifiable primary sources, and several key biographical and economic claims lack proper sourcing.

Role: Programme Director, ARIA Safeguarded AI
Known For: Provably safe AI agenda, Guaranteed Safe AI framework, ARIA Safeguarded AI programme
Related Approaches: Provably Safe AI (davidad agenda)

Quick Assessment

Full Name: David A. Dalrymple ("davidad")
Current Role: Programme Director, Advanced Research and Invention Agency (ARIA)
Programme: Safeguarded AI (£59M R&D programme)
Previous Role: Research Fellow, Oxford University Future of Humanity Institute (2021–2023)
Key Focus: Formal verification, provably safe AI, mathematical guarantees for critical infrastructure
Notable Work: Safeguarded AI framework, flexHEG, C. elegans nervous system simulation, Filecoin co-invention
Key Collaborators: Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia (co-authors); Evan Hubinger (MATS co-mentor)

Sources:
Personal website: davidad.org
CV: davidad.org/davidad_cv.pdf

Overview

David Dalrymple, widely known online as "davidad," is a researcher and programme director whose work sits at the intersection of formal verification, AI safety, and applied mathematics. Since 2023 he has served as Programme Director at the UK's Advanced Research and Invention Agency (ARIA), where he directs a £59 million R&D programme on Safeguarded AI and the broader Mathematics for Safe AI initiative. His core thesis is that AI systems deployed in safety-critical domains — such as electricity grids, climate modeling, and critical infrastructure — require formal mathematical guarantees as a complement to empirical reliability assessments.

Before joining ARIA, Dalrymple was a Research Fellow at Oxford University's Future of Humanity Institute (2021–2023), where he worked on metaethics for AI alignment and sociotechnical safety engineering. His background spans a wide range of fields: he studied computer architecture and programming languages at MIT, then biophysics and neuroinformatics at Harvard, and has published on topics ranging from Caenorhabditis elegans nervous system simulation to applied category theory and compositional machine learning semantics. He is also a co-inventor of Filecoin, a blockchain-based decentralized storage protocol.

Within the AI safety community, Dalrymple's approach — sometimes called Provably Safe AI — has been cited in technical discussions, with his Open Agency Architecture discussed on LessWrong as a detailed technical framework for AI safety requirements. He has also expressed public concern that rapid AI capability progress may outpace the development of adequate safety measures, particularly for systems that could come to control critical infrastructure.

Background and Early Career

Dalrymple's educational path was unconventional. According to available biographical sources, he completed a B.S. in Computer Science at the University of Maryland, Baltimore County (2000–2005), with a focus on algorithms and data structures, before studying computer architecture and programming languages at MIT's Media Lab and subsequently pursuing doctoral work in biophysics and neuroinformatics at Harvard — though he did not complete a PhD at either institution.1

An early significant research contribution was leading an international collaboration on imaging and simulation of the nervous system of C. elegans, one of the simplest known nervous systems. This work, funded by personal grants from Google co-founder Larry Page and PayPal co-founder Peter Thiel, was published in Nature Methods.1 The project combined advanced microscopy with biophysical simulation techniques and produced a detailed nervous system simulation.

Dalrymple also received a Thiel Foundation research grant in 2012 for work on brain analysis. His tech industry experience includes a senior software engineering role at Twitter, where he worked on cache systems, and research and co-invention work at Protocol Labs, where he co-invented Filecoin and contributed to research on public goods funding and distributed storage. During this period he also worked on dioptics — a mathematical framework unifying open games from game theory and gradient-based learners from machine learning — presenting this work at SYCO 5.2

He has held advisory roles, including membership on the Industrial Advisory Board of the University of Birmingham School of Computer Science and participation in a joint China–U.S.–Australia AI governance panel; he also attended a Meeting of Eminent Thinkers on AI Governance at Tsinghua University in 2019 as an invited delegate.1

AI Safety Work

The Safeguarded AI Framework

The central thrust of Dalrymple's current work is what he calls Safeguarded AI — an approach that distinguishes between different phases of AI capability development and tailors safety strategies accordingly. Rather than attempting to guarantee safety across open-ended or unbounded contexts — which he regards as extremely difficult given the current state of the science — Safeguarded AI focuses on generating formal safety guarantees within well-defined, scoped contexts of use, drawing on methods from safety-critical engineering.3
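
To make the scoped-context idea concrete, the sketch below is a minimal toy illustration rather than anything from ARIA's tooling: the dispatch model, controller, and invariant are invented here for exposition. It shows why bounding the context of use matters — once the state space is finite and fully specified, a safety property can be checked exhaustively instead of merely sampled by tests.

```python
# Toy sketch (assumptions: invented dispatch model, controller, and invariant).
from itertools import product

# Scoped context: a discretized storage/dispatch problem.
ENERGY_LEVELS = range(0, 11)   # 0..10 units currently in storage
DEMAND_LEVELS = range(0, 6)    # 0..5 units demanded this step

def controller(energy: int, demand: int) -> int:
    """Candidate policy: dispatch as much of the demand as storage allows."""
    return min(demand, energy)

def step(energy: int, demand: int) -> int:
    """Environment model: storage loses what was dispatched, then gains 2 units."""
    dispatched = controller(energy, demand)
    return min(10, energy - dispatched + 2)

def safety_invariant(energy: int, demand: int) -> bool:
    """Safety property: never dispatch more than is stored, never go negative."""
    dispatched = controller(energy, demand)
    return 0 <= dispatched <= energy and step(energy, demand) >= 0

# Because the context is scoped and finite, "verification" here is an exhaustive
# check over every state in the declared context, not a sample of test cases.
violations = [(e, d) for e, d in product(ENERGY_LEVELS, DEMAND_LEVELS)
              if not safety_invariant(e, d)]
assert not violations, f"Safety property violated at states: {violations}"
print("Safety invariant holds over the entire scoped context.")
```

Real deployment contexts are of course far larger and require symbolic methods rather than enumeration, but the structure of the guarantee — a property established for every admissible state of a declared context — is the same.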

The framework incorporates multiple layers: formal verification of algorithmic behavior, containment measures, and hardware-level security protections. A key output of this work is flexHEG (flexible hardware-enabled guarantees), which addresses how safety properties can be embedded into computing hardware itself — ensuring confidentiality, integrity, and availability without creating surveillance vulnerabilities. This work was published in 2025 as "Flexible Hardware-Enabled Guarantees for AI Compute," co-authored with J. Petrie, O. Aarne, and N. Ammann.4
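
As a loose software analogue of the integrity aspect of such guarantees (a sketch only: flexHEG is a hardware mechanism, and the key, measurement names, and allowlist below are hypothetical rather than drawn from the paper), the snippet admits a workload only if its measured hash appears on an allowlist whose entries verify against a device key.

```python
# Toy analogue of a hardware-rooted integrity check; all names/keys are invented.
import hashlib
import hmac

ATTESTATION_KEY = b"hypothetical-device-key"  # would reside in secure hardware

def sign_measurement(measurement: bytes) -> bytes:
    """Tag a workload measurement with the (hypothetical) device key."""
    return hmac.new(ATTESTATION_KEY, measurement, hashlib.sha256).digest()

# Allowlist of approved workload measurements, each with its integrity tag.
APPROVED_MEASUREMENTS = [hashlib.sha256(b"verified-controller-v1").digest()]
ALLOWLIST = {m: sign_measurement(m) for m in APPROVED_MEASUREMENTS}

def admit_workload(workload_bytes: bytes) -> bool:
    """Admit a workload only if its hash is allowlisted and the tag verifies."""
    measurement = hashlib.sha256(workload_bytes).digest()
    tag = ALLOWLIST.get(measurement)
    if tag is None:
        return False
    return hmac.compare_digest(tag, sign_measurement(measurement))

print(admit_workload(b"verified-controller-v1"))  # True
print(admit_workload(b"unverified-model"))        # False
```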

One concrete application domain for the ARIA programme is electricity grid optimization. According to programme descriptions, deploying verified AI algorithms in this domain could potentially save on the order of £3 billion annually in excess capacity costs in the UK alone — though Dalrymple himself notes the difficulty of such economic projections. The ARIA programme deliberately excludes frontier AI models from its near-term deployment targets, instead focusing on AI systems whose behavior can be formally analyzed.5

Formal Verification and Mechanistic Interpretability

Dalrymple has contributed to the broader AI safety research landscape through collaboration on frameworks for robust and reliable AI systems. In May 2024, he co-authored "Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems" with J. Skalse, Yoshua Bengio, Stuart Russell, and other collaborators.4 He is also listed among the contributors to "Open Problems in Mechanistic Interpretability," published in January 2025.4

His technical expertise spans applied category theory, interactive theorem proving, modal and substructural logic, measure-theoretic probability, stochastic processes, control theory, decision theory, game theory, and mechanism design — a broad formal toolkit drawn from several mathematical disciplines.1

Views on AI Timelines and Risk

Dalrymple has been notably cautious about making specific timeline predictions, emphasizing radical uncertainty in AI development. However, he has made several substantive public observations about the pace and character of AI progress. He has noted that, over a recent period, several successive frontier model releases quantitatively exceeded both METR's evaluations and his own expectations.6 He describes this as a pattern of AI progress outpacing forecasts, and has used it to motivate urgency in safety work.

He has argued that the "middle period" of AI development — in which systems are smarter than top humans but not yet superintelligent — may be sharply compressed by rapid scaling, potentially leading quickly to superintelligence without adequate time to develop safety measures. He has also publicly warned, including in coverage in The Guardian, that the science needed to ensure full AI reliability may not emerge quickly enough given economic pressures driving rapid capability deployment.7

He has discussed the potential economic stakes of this transition, framing it in terms of a possible 10% GDP growth from narrow AI systems weighed against the risks of a rapid and uncontrolled transition to superintelligence. A November 2025 pivot in the ARIA programme — shifting toward a broader foundational toolkit — was attributed in part to faster-than-expected frontier AI progress.6

Community Reception

Within AI safety and effective altruism communities, Dalrymple's work has been cited and discussed. His Open Agency Architecture has been discussed on LessWrong as a detailed technical framework for AI safety requirements, though some community members have continued to seek more precise formal definitions of concepts such as "boundaries" in the architecture.8 He served as a mentor in the MATS Winter 2023–24 program alongside researchers such as Evan Hubinger and Owain Evans.8 He has received funding through Manifund (a $30,000 grant for "Activation vector steering with BCI" work with Lisa Thiergart) and has been cited in reviews of technical AI safety alongside figures such as Yoshua Bengio and Stuart Russell.8

He has been a frequent visitor to and collaborator with both the Future of Humanity Institute and the Machine Intelligence Research Institute (MIRI), where he was a Summer Fellow in 2015.1

ARIA Programme

The ARIA Safeguarded AI programme, which Dalrymple has directed since 2023, is funded at £59 million by the UK government. Its stated goal is to develop mathematically verified AI controllers suitable for deployment in safety-critical systems. The programme has cited the April 2025 blackout in Spain and Portugal — which lasted approximately 10 hours, caused billions of euros in economic damage, and resulted in eight fatalities — as illustrative of the risks posed by inadequate safety guarantees in energy infrastructure.5

Dalrymple has given keynote addresses on the programme's themes at CODE BLUE in Tokyo (2024), the AI Security Forum in Paris (2025), the High-Confidence Software & Systems conference in Annapolis (2025), and SCAI 2025 in Singapore.4 In a January 2025 podcast, he discussed the economic trade-offs of different approaches to managing the transition to transformative AI, including the costs and benefits of delaying superintelligence development.6

The programme takes a specific position within the UK AI policy landscape, emphasizing formal mathematical proof as a component of AI safety assurance for high-stakes applications alongside empirical testing and behavioral evaluations — an approach that contrasts with the empirically-oriented alignment methods more prevalent at major AI labs.

Criticisms and Concerns

No significant personal controversies or criticisms of Dalrymple's research have been documented in available sources. Community discussions on LessWrong and EA Forum contain limited direct critique of his technical proposals. One noted area of ongoing clarification relates to his concept of "boundaries" in the Open Agency Architecture, with community members continuing to seek precise formal definitions.8

A broader skeptical question — applicable to the Safeguarded AI agenda generally — is whether formal verification techniques, which have historically struggled to scale to complex software systems, can be made to work for AI systems of the complexity and scale now being deployed. Dalrymple's ARIA programme explicitly addresses this concern by targeting scoped, well-defined deployment contexts rather than open-ended AI behavior, but it remains an open empirical question whether this scoping can be maintained as AI capabilities advance.

His stated view that the "safe AI development window" may be closing faster than anticipated is also a claim that is difficult to evaluate rigorously, as acknowledged in his own public comments about the difficulty of technological forecasting.7

Key Uncertainties

  • Whether formal verification methods can scale to match the complexity of frontier AI systems, even in scoped deployment contexts
  • The actual timeline compression of the "middle period" between current AI capabilities and potential superintelligence
  • Whether the ARIA Safeguarded AI programme's technical approach will produce deployable systems within its funding horizon
  • The degree to which hardware-level guarantees (flexHEG) can be made robust to adversarial conditions without creating new vulnerability surfaces

Footnotes

  1. Background research on David Dalrymple's career, education, and research history — biographical data from multiple research sections

  2. Research data on Protocol Labs and SYCO 5 presentation of dioptics framework

  3. Research data on Safeguarded AI framework description and ARIA programme goals

  4. Research data on publications: "Flexible Hardware-Enabled Guarantees for AI Compute" (arXiv, 2025); "Towards Guaranteed Safe AI" (FAR.AI, May 2024); "Open Problems in Mechanistic Interpretability" (FAR.AI, January 2025); conference keynote appearances

  5. Research data on ARIA programme funding, electricity grid application, and Spain/Portugal blackout reference

  6. Research data on METR evaluation observations, timeline views, and November 2025 ARIA programme pivot

  7. Research data on Guardian coverage and public warnings about AI safety window

  8. Research data on LessWrong/EA Forum community reception, MATS mentorship, and Manifund grant

Structured Data


Career History

Google DeepMind: Research Scientist (2020–2023)
Advanced Research and Invention Agency (ARIA): Programme Director (2023–present)
AI Objectives Institute: Researcher

Related Wiki Pages

Top Related Pages

Other

Max Tegmark, Interpretability, Evan Hubinger, Yoshua Bengio, Stuart Russell, Ben Goldhaber

Organizations

Future of Humanity Institute, Machine Intelligence Research Institute, Manifund, LessWrong, Google DeepMind, University of Maryland

Concepts

Transformative AI, Provable / Guaranteed Safe AI, Self-Improvement and Recursive Enhancement

Approaches

Formal Verification (AI Safety)