News & Announcements (2522)
owencb, Lukas Finnveden This LessWrong post explores using AI tools to improve epistemic practices within the AI safety research community, examining how AI can help researchers reason better, avoid biases, and maintain intellectual rigor. It likely discusses practical applications of AI assistance for improving collective and individual epistemics in high-stakes domains. | web | LessWrong | 2026-04-01 | 3/5 | - | ||
This LessWrong post argues that semitones (the musical interval unit) are a more intuitive and practical way to express multiplicative ratios and orders of magnitude than decibels. It presents semitones as a human-friendly logarithmic scale that makes reasoning about relative differences clearer, particularly when comparing quantities that span many orders of magnitude. | web | LessWrong | 2026-04-01 | 3/5 | - | ||
Tom Davidson, wdmacaskill This LessWrong post argues that AI systems should be designed to act as responsible members of society rather than merely optimizing for user satisfaction. It advocates for AI that considers broader societal impacts, not just immediate user requests, expanding the frame of AI alignment beyond the assistant-user relationship. | web | LessWrong | 2026-03-30 | 3/5 | - | ||
David Gross This LessWrong post examines the implications of the Milgram obedience experiments and argues they may understate how readily humans defer to authority or social pressure. It likely explores parallels to AI safety concerns about human oversight failures and how human psychological tendencies could undermine safety mechanisms. | web | LessWrong | 2026-03-28 | 3/5 | - | ||
Andrea_Miotti, Alex Amadori ControlAI's 2025 impact report outlines the organization's activities and progress in advocating for an international ban on artificial superintelligence (ASI) development. The report documents coalition-building, policy outreach, and governance initiatives aimed at preventing the creation of ASI through legally binding international agreements. It serves as an organizational accountability document highlighting milestones and strategic direction. | web | LessWrong | 2026-03-27 | 3/5 | - | ||
Tomás B. A LessWrong post that appears to be a fictional or narrative piece, likely using the story of Casanova as a metaphor or illustrative vehicle for exploring themes relevant to rationality, decision-making, or AI safety concepts. Without content available, the exact argument cannot be fully characterized. | web | LessWrong | 2026-03-27 | 3/5 | - | ||
Davidmanheim This LessWrong post argues that demanding precise treaty language before supporting international AI governance is a bad-faith or premature objection. It contends that the difficulty of drafting exact treaty text should not block early-stage efforts toward international AI coordination, drawing parallels to other international agreements that evolved over time. | web | LessWrong | 2026-03-26 | 3/5 | - | ||
Mateusz Bagiński This LessWrong post explores the conceptual parallel between scaffolded reproducers (organisms that rely on external scaffolding to replicate) and scaffolded AI agents, examining what this analogy implies for AI safety and the nature of agentic AI systems. It investigates how dependency on external infrastructure shapes the behavior and risk profile of AI agents. | web | LessWrong | 2026-03-26 | 3/5 | - | ||
Matrice Jacobine U.S. Senator Bernie Sanders and Representative Alexandria Ocasio-Cortez proposed legislation to place a moratorium on new AI data center construction, citing concerns about energy consumption, environmental impact, and the unchecked expansion of AI infrastructure. The post discusses the political and policy implications of this legislative proposal within the AI governance landscape. | web | LessWrong | 2026-03-26 | 3/5 | - | ||
Benquo A LessWrong post likely exploring logical reasoning, deductive inference, or epistemics using the classic syllogism 'Socrates is mortal' as a framing device. Without access to the content, it appears to examine how formal logic, probabilistic reasoning, or philosophical argumentation applies to AI-relevant topics such as uncertainty, knowledge representation, or inference. | web | LessWrong | 2026-03-26 | 3/5 | - | ||
Davidmanheim This LessWrong post argues that OpenAI's nonprofit foundation is drastically underfunding AI safety relative to the scale of AI capabilities development, and calls for the foundation to dramatically increase annual safety spending to tens of billions of dollars. The author contends that current safety investment is insufficient given the pace of AI progress and the magnitude of potential risks. | web | LessWrong | 2026-03-25 | 3/5 | - | ||
Zvi This LessWrong post appears to be part of a series documenting practical experiences and updates with AI coding assistants, specifically Claude Code, with a focus on autonomous usage modes and computer use capabilities. The post likely covers hands-on observations about agentic AI systems performing coding and computer tasks with varying levels of autonomy. | web | LessWrong | 2026-03-25 | 3/5 | - | ||
Connor Kissane, Monte M, Fabien Roger This post investigates the realism of coding audits used in AI safety evaluations, examining how deployment-level resources affect the quality and accuracy of such audits. It proposes methods to measure and improve audit realism to better reflect real-world conditions where AI systems might attempt to deceive or circumvent oversight. | web | LessWrong | 2026-03-23 | 3/5 | - | ||
Arjun Panickssery A LessWrong post critiquing what the author calls 'China Derangement Syndrome'—the tendency to irrationally inflate fears about China's AI development and geopolitical competition in ways that distort AI safety and governance discussions. The post argues that excessive anti-China sentiment leads to poor policy reasoning and distracts from more grounded analysis of AI risks. | web | LessWrong | 2026-03-21 | 3/5 | - | ||
Benquo A LessWrong post exploring the linguistic question of whether Hebrew has verbs as a distinct grammatical category, using this as a case study in how language structure and categorization shape our understanding of reality. The post likely examines how different linguistic frameworks can lead to different conceptual structures, with implications for how we think about meaning and communication. | web | LessWrong | 2026-03-20 | 3/5 | - | ||
J Bostock This post argues that in AI control frameworks, untrusted monitoring (using potentially misaligned AI to oversee other AI) should be treated as the baseline assumption, while trusted monitoring requires additional justification. It explores the implications of this framing for how we design AI oversight systems and what safety guarantees we can realistically claim. | web | LessWrong | 2026-03-20 | 3/5 | - | ||
Marcus Williams OpenAI describes their internal practices for monitoring AI coding agents deployed within the organization to detect signs of misalignment or unsafe behavior. The post outlines specific behavioral signals, oversight mechanisms, and safeguards used to catch problematic agent actions before they cause harm. It serves as a practical case study in operationalizing alignment monitoring for agentic AI systems. | web | LessWrong | 2026-03-19 | 3/5 | - | ||
Ihor Kendiukhov This LessWrong post argues that AI safety discussions overemphasize high-competence superintelligence failure modes, and that low-competence ASI systems could pose serious risks through misaligned but unsophisticated behavior. It makes the case that we should expand our threat models to include scenarios where ASI is powerful but not strategically brilliant. | web | LessWrong | 2026-03-19 | 3/5 | - | ||
beyarkay This LessWrong post presents an interactive interface for exploring the historic Extropians mailing list archive, allowing users to search and browse early transhumanist and proto-AI-safety discussions from the 1990s. The Extropians list was a foundational community where many ideas around AI risk, mind uploading, and technological acceleration were first debated. This tool makes that intellectual history more accessible. | web | LessWrong | 2026-03-18 | 3/5 | - | ||
Zvi This LessWrong post covers the fifth installment in a series tracking legal proceedings between Anthropic and the Department of War (DoW), focusing on motions filed in the case. It provides an update on the litigation's procedural developments relevant to AI governance and corporate accountability. | web | LessWrong | 2026-03-18 | 3/5 | - | ||
Ruby, Ronny Fernandez, Ben Pace Announcement of ticket sales for LessOnline, a rationalist and AI safety community conference, with early-bird pricing available until April 7. The event serves as a gathering point for the LessWrong and rationalist community. | web | LessWrong | 2026-03-18 | 3/5 | - | ||
Luc Brinkman, plex This LessWrong post outlines two core skillsets required to successfully launch an impactful AI safety project: technical/research competence and project/organizational execution skills. It argues that both are necessary and that neglecting either leads to projects that either lack rigor or fail to gain traction and real-world impact. | web | LessWrong | 2026-03-18 | 3/5 | - | ||
Benquo The post introduces the concept of 'compradorization' — drawn from historical comprador class dynamics — to describe how AI companies might come to serve the interests of foreign powers or entities rather than their home societies, potentially undermining AI safety and governance. It explores the structural incentives that could push frontier AI labs toward prioritizing revenue and influence from geopolitical rivals over domestic safety norms. | web | LessWrong | 2026-03-16 | 3/5 | - | ||
Steven Byrnes This LessWrong post argues that continual learning—the ability to learn new tasks without forgetting old ones—cannot be acquired through imitation learning alone. The author explains a fundamental limitation: an agent trained to mimic a continual learner would not internalize the underlying learning mechanisms, only the behavioral outputs. This has implications for AI alignment and training robust, adaptable AI systems. | web | LessWrong | 2026-03-16 | 3/5 | - | ||
Novalis A LessWrong post comparing two children's role-play city experiences—Mini-Munich and KidZania—analyzing why one succeeds at fostering genuine learning, creativity, and agency while the other produces a more superficial, commercialized simulation. The piece draws lessons about how environment design affects intrinsic motivation and authentic skill development. | web | LessWrong | 2026-03-14 | 3/5 | - |