Longterm Wiki

Adam Gleave | FAR.AI

web

Credibility Rating

High (4/5)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: FAR AI

Author index page for Adam Gleave at FAR.AI; useful for finding his specific papers on adversarial policies and reward modeling rather than as a standalone resource.

Metadata

Importance: 30/100 · Type: homepage

Summary

Author page for Adam Gleave at FAR.AI (Foundational Research for AI Safety), listing his published research and contributions to AI safety. Gleave is a prominent AI safety researcher known for work on adversarial policies, reward modeling, and scalable oversight.

Key Points

  • Adam Gleave is a key researcher at FAR.AI focused on technical AI safety problems
  • His work spans adversarial robustness, reward learning, and evaluation of AI systems
  • FAR.AI is an independent AI safety research organization producing technical alignment research
  • This page serves as an index to his published papers and blog posts on AI safety topics

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| FAR AI | Organization | 76.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 20 KB



# Adam Gleave

Co-founder & CEO

FAR.AI

Adam Gleave is the CEO of FAR.AI. He completed his PhD in artificial intelligence (AI) at UC Berkeley, advised by [Stuart Russell](https://people.eecs.berkeley.edu/~russell/). His goal is to develop techniques necessary for advanced automated systems to verifiably act according to human preferences, even in situations unanticipated by their designer. He is particularly interested in improving methods for value learning, and robustness of deep RL. For more information, visit his [website](https://gleave.me/).

# News & Publications

[**Concept Influence: Leveraging Interpretability to Improve Performance and Efficiency in Training Data Attribution**](https://www.far.ai/news/concept-data-attribution-02-2026)

February 19, 2026

[**Prefill-level Jailbreak: A Black-Box Risk Analysis of Large Language Models**](https://www.far.ai/research/prefill-level-jailbreak-a-black-box-risk-analysis-of-large-language-models)

February 19, 2026

[**The Obfuscation Atlas: Mapping Where Honesty Emerges in RLVR with Deception Probes**](https://www.far.ai/research/the-obfuscation-atlas-mapping-where-honesty-emerges-in-rlvr-with-deception-probes)

February 17, 2026

[**Revisiting Frontier LLMs’ Attempts to Persuade on Extreme Topics: GPT and Claude Improved, Gemini Worsened**](https://www.far.ai/news/revisiting-attempts-to-persuade)

February 11, 2026

[**Large language models can effectively convince people to believe conspiracies**](https://www.far.ai/research/large-language-models-can-effectively-convince-people-to-believe-conspiracies)

January 9, 2026

[**AI in 2025: Faster Progress, Harder Problems**](https://www.far.ai/news/san-diego-2025-opening-remarks)

December 16, 2025

[**Frontier LLMs Attempt to Persuade into Harmful Topics**](https://www.far.ai/news/attempt-to-persuade-eval)

August 21, 2025

[**A Toolkit for Estimating the Safety-Gap between Safety Trained and Helpful Only LLMs**](https://www.far.ai/news/safety-gap-toolkit)

July 31, 2025

[**Jailbreak-Tuning: Models Efficiently Learn Jailbreak Susceptibility**](https://www.far.ai/research/jailbreak-tuning-models-efficiently-learn-jailbreak-susceptibility)

July 15, 2025


... (truncated, 20 KB total)
Resource ID: ca68437469b0fe97 | Stable ID: OTY2ZDE1YW