LessWrong is a rationality-focused community blog founded in 2009 that has influenced AI safety discourse, receiving $5M+ in funding and serving as the origin point for ~31% of EA survey respondents in 2014. Survey participation peaked at 3,000+ in 2016, declining to 558 by 2023, with the community increasingly focused on AI alignment discussions.
AI AlignmentApproachAI AlignmentComprehensive review of AI alignment approaches finding current methods (RLHF, Constitutional AI) show 75%+ effectiveness on measurable safety metrics for existing systems but face critical scalabi...Quality: 91/100AI Non-Extremization CoordinationApproachAI Non-Extremization CoordinationA well-structured but loosely-defined conceptual framework synthesizing multi-agent coordination risks, epistemic fragmentation, and organizational AI deployment under the umbrella term 'non-extrem...
Analysis
Model Organisms of MisalignmentAnalysisModel Organisms of MisalignmentModel organisms of misalignment is a research agenda creating controlled AI systems exhibiting specific alignment failures as testbeds. Recent work achieves 99% coherence with 40% misalignment rate...Quality: 65/100Donations List WebsiteProjectDonations List WebsiteComprehensive documentation of an open-source database tracking $72.8B in philanthropic donations (1969-2023) across 75+ donors, with particular coverage of EA/AI safety funding. The page thoroughl...Quality: 52/100Timelines WikiProjectTimelines WikiTimelines Wiki is a specialized MediaWiki project documenting chronological histories of AI safety and EA organizations, created by Issa Rice with funding from Vipul Naik in 2017. While useful as a...Quality: 45/100
Organizations
Lightcone InfrastructureOrganizationLightcone InfrastructureOrganization operating LessWrong, the Alignment Forum, and the Lighthaven event venue.Machine Intelligence Research InstituteOrganizationMachine Intelligence Research InstituteComprehensive organizational history documenting MIRI's trajectory from pioneering AI safety research (2000-2020) to policy advocacy after acknowledging research failure, with detailed financial da...Quality: 50/100Google DeepMindOrganizationGoogle DeepMindComprehensive overview of DeepMind's history, achievements (AlphaGo, AlphaFold with 200M+ protein structures), and 2023 merger with Google Brain. Documents racing dynamics with OpenAI and new Front...Quality: 37/100
Other
Jaan TallinnPersonJaan TallinnProfile of Jaan Tallinn documenting $150M+ lifetime AI safety giving (86% of $51M in 2024), primarily through SFF ($34.33M distributed in a 2025 grant round). Co-founded CSER (2012) and FLI (2014),...Quality: 53/100Dustin MoskovitzPersonDustin MoskovitzDustin Moskovitz and Cari Tuna have given $4B+ since 2011, with ~$336M (12% of total) directed to AI safety through Coefficient Giving (formerly Open Philanthropy), making them the largest individu...Quality: 49/100InterpretabilityResearch AreaInterpretabilityMechanistic interpretability has extracted 34M+ interpretable features from Claude 3 Sonnet with 90% automated labeling accuracy and demonstrated 75-85% success in causal validation, though less th...Quality: 66/100
Concepts
AI TimelinesConceptAI TimelinesForecasts and debates about when transformative AI capabilities will be developedQuality: 95/100Existential Risk from AIConceptExistential Risk from AIHypotheses concerning risks from advanced AI systems that some researchers believe could result in human extinction or permanent global catastrophe, including institutional frameworks developed by ...Quality: 92/100Self-Improvement and Recursive EnhancementCapabilitySelf-Improvement and Recursive EnhancementComprehensive analysis of AI self-improvement from current AutoML systems (23% training speedups via AlphaEvolve) to theoretical intelligence explosion scenarios, with expert consensus at ~50% prob...Quality: 69/100
Risks
Sleeper Agents: Training Deceptive LLMsRiskSleeper Agents: Training Deceptive LLMsAnthropic's 2024 sleeper agents research demonstrates that deceptive AI behavior, once present, persists through standard safety training and can even be strengthened by adversarial training attemp...Quality: 78/100Instrumental ConvergenceRiskInstrumental ConvergenceComprehensive review of instrumental convergence theory with extensive empirical evidence from 2024-2025 showing 78% alignment faking rates, 79-97% shutdown resistance in frontier models, and exper...Quality: 64/100
Key Debates
Why Alignment Might Be HardArgumentWhy Alignment Might Be HardA comprehensive taxonomy of alignment difficulty arguments spanning specification problems, inner alignment failures, verification limits, and adversarial dynamics, with expert p(doom) estimates ra...Quality: 69/100AI Structural Risk CruxesCruxAI Structural Risk CruxesAnalyzes 12 key uncertainties about AI structural risks across power concentration, coordination feasibility, and institutional adaptation. Provides quantified probability ranges: US-China coordina...Quality: 66/100AI Misuse Risk CruxesCruxAI Misuse Risk CruxesComprehensive analysis of AI misuse cruxes with quantified evidence across bioweapons (RAND bio study found no significant difference; novice uplift studies show modest gains on in silico tasks), c...Quality: 65/100
Historical
The MIRI EraHistoricalThe MIRI EraComprehensive chronological account of AI safety's institutional emergence (2000-2015), from MIRI's founding through Bostrom's Superintelligence to mainstream recognition. Covers key organizations,...Quality: 31/100