LongtermWiki Strategy Brainstorm
Status: Working document, not polished
Purpose: Think through what could go wrong and what success looks like
The Core Strategic Question
What is LongtermWiki actually trying to be?
```mermaid
flowchart TD
    subgraph Options["Strategic Options"]
        A[Insight Generator]
        B[Reference Wiki]
        C[Decision Support Tool]
        D[Crux Mapping Platform]
        E[Synthesis Engine]
    end
    A --> A1[High variance, might produce nothing useful]
    B --> B1[Commodity, already exists in various forms]
    C --> C1[Needs clear users with clear decisions]
    D --> D1[Novel but unproven value]
    E --> E1[Labor-intensive, hard to maintain]
```

These aren't mutually exclusive, but they have very different implications for what we build and how we measure success.
Failure Modes
Failure Mode 1: "Just Another Wiki"
The fear: We build a comprehensive knowledge base that nobody uses because:
- It's not differentiated from LessWrong, Wikipedia, 80K, etc.
- Maintenance burden exceeds our capacity
- Information gets stale and trust erodes
- No clear "job to be done" that it uniquely serves
Signs we're heading here:
- Low repeat visitors
- People cite primary sources instead of our pages
- "Oh yeah I've seen that site" but no behavior change
- Content quality diverges wildly across pages
Possible mitigations:
- Ruthless focus on a narrow use case
- Quality over quantity (30 great pages > 300 mediocre ones)
- Opinionated curation (not neutral, not comprehensive)
- Clear "this is what LongtermWiki is for" positioning
Failure Mode 2: "Insights That Aren't"
The fear: We try to generate novel strategic insights but:
- We're not actually smarter than the existing field
- "Insights" are obvious to experts, only novel to us
- Analysis is shallow because we're spread too thin
- We mistake complexity for depth
Signs we're heading here:
- Experts are unimpressed or dismissive
- Our "insights" don't survive contact with counterarguments
- We can't point to decisions that changed because of our work
- Internal feeling of "are we just rearranging deck chairs?"
Possible mitigations:
- Tight feedback loops with sophisticated users
- Explicit "what would falsify this?" for our claims
- Hire/consult people who can actually do the analysis
- Focus on synthesis and structure rather than novel claims
Failure Mode 3: "Cathedral in the Desert"
The fear: We build something beautiful and comprehensive that nobody asked for:
- Solves a problem that isn't actually a bottleneck
- Users don't have the "job to be done" we imagined
- The people who need prioritization help aren't reading documents
- Decision-makers use networks and calls, not wikis
Signs we're heading here:
- We can't name 10 specific people who would use this weekly
- User research reveals different pain points than we assumed
- "This is cool but I wouldn't actually use it"
- Building for an imagined user rather than real ones
Possible mitigations:
- User research before building
- Start with a specific user and their specific workflow
- Build minimum viable versions and test
- Be willing to pivot or kill the project
Failure Mode 4: "Maintenance Hell"
The fear: Initial build is fine but ongoing maintenance is unsustainable:
- Content rots faster than we can update it
- Quality degrades as original authors leave
- Scope creep makes everything shallow
- Becomes a zombie project that's "still around" but not useful
Signs we're heading here:
- Growing backlog of "needs review" pages
- Key pages haven't been touched in 6+ months
- New content added but old content not maintained
- Feeling of dread when thinking about updates
Possible mitigations:
- Scope down aggressively from the start
- Build staleness into the UX (visible "last reviewed" dates)
- Automated content health monitoring (a minimal sketch follows this list)
- Plan for graceful degradation or archiving
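
To make the last two mitigations concrete, here is a minimal sketch of what automated staleness monitoring could look like, assuming pages are Markdown files carrying a `last_reviewed:` front-matter date. The field name and directory layout are assumptions, not an existing schema.

```python
# Hypothetical staleness monitor: flag pages whose "last_reviewed" date is
# older than the review window. Field name and file layout are assumptions.
import re
from datetime import date, timedelta
from pathlib import Path

REVIEW_WINDOW = timedelta(days=180)  # matches the "6+ months" warning sign above
DATE_PATTERN = re.compile(r"^last_reviewed:\s*(\d{4})-(\d{2})-(\d{2})", re.MULTILINE)

def stale_pages(content_dir: str) -> list[tuple[str, date]]:
    """Return (path, last_reviewed) for pages overdue for review, oldest first."""
    overdue = []
    for page in Path(content_dir).rglob("*.md"):
        match = DATE_PATTERN.search(page.read_text(encoding="utf-8"))
        if match is None:
            continue  # no review date recorded; arguably worth flagging too
        reviewed = date(int(match[1]), int(match[2]), int(match[3]))
        if date.today() - reviewed > REVIEW_WINDOW:
            overdue.append((str(page), reviewed))
    return sorted(overdue, key=lambda item: item[1])

if __name__ == "__main__":
    for path, reviewed in stale_pages("content"):
        print(f"{reviewed}  {path}")
```

Even something this small turns "feeling of dread" into a concrete review queue that can be checked weekly.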
Failure Mode 5: "Too Weird to Adopt"
The fear: The crux-mapping / worldview-mapping approach is too unusual:
- Users don't understand how to use it
- Requires too much buy-in to a novel framework
- Experts don't want to be "mapped" onto worldview archetypes
- The structure imposes false precision on messy disagreements
Signs we're heading here:
- People engage with the wiki content, ignore the crux structure
- Pushback on worldview categories ("that's not what I believe")
- The novel features feel gimmicky
- Simpler alternatives would serve users better
Possible mitigations:
- Test the framework with real users before building
- Make novel features optional/progressive disclosure
- Be willing to drop features that don't work
- Start simple, add complexity only if earned
What Does "Useful" Mean?
This is the crux. Different definitions lead to very different projects:
Definition A: Changes Decisions
Test: Can we point to funding decisions, research directions, or career choices that changed because of LongtermWiki?
Implications:
- Need to be embedded in actual decision-making workflows
- Probably need direct relationships with funders/researchers
- Quality of analysis matters more than breadth
- Might only need to serve a small number of users well
Concerns:
- Very high bar
- Attribution is hard
- Might be serving decisions that would have happened anyway
Definition B: Improves Understanding
Test: Do users report that they understand AI safety landscape better after using LongtermWiki?
Implications:
- Educational value is primary
- Breadth and accessibility matter
- Competes with AI Safety Fundamentals, 80K, etc.
- Success = people recommend it to newcomers
Concerns:
- Lots of competition in this space
- "Understanding" doesn't necessarily lead to better decisions
- Risk of being a stepping stone people quickly move past
Definition C: Structures Discourse
Test: Do people use LongtermWiki's categories and cruxes when discussing AI safety strategy?
Implications:
- The framework itself is the product
- Success = LongtermWiki vocabulary becomes common
- Focus on crux definitions, worldview archetypes
- Less about content, more about structure
Concerns:
- Very hard to achieve
- Might impose bad structure on good discourse
- Requires significant field buy-in
Definition D: Surfaces Disagreements
Test: Do we help people identify where they disagree and why?
Implications:
- Disagreement mapping is primary
- Need to represent multiple perspectives fairly
- Value = reducing "talking past each other"
- Could be more interactive/tool-like
Concerns:
- Might reify disagreements instead of resolving them
- Hard to represent views fairly (everyone will object)
- Unclear who the user is
Definition E: Generates Insights
Test: Do we produce novel strategic insights that weren't obvious before?
Implications:
- We're doing original analysis
- Quality of thinking matters most
- Might look more like reports than wiki
- Success = "I hadn't thought of it that way"
Concerns:
- Are we actually capable of this?
- Might be better done by existing researchers
- High variance in outcomes
Strategic Options
Option 1: Narrow & Deep
Focus on one very specific use case and nail it.
Example: "LongtermWiki helps funders compare interventions under different worldviews"
- 30-50 intervention pages, deeply analyzed
- Clear worldview → priority tool (toy sketch below)
- Explicit targeting of Open Phil, SFF, smaller funders
- Success = funders actually reference it in decision memos
Pros: Clear focus, measurable success, defensible niche
Cons: Small user base, high stakes per user, might not be what funders want
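
To make the "worldview → priority tool" bullet concrete, here is a toy sketch: the same interventions re-ranked under different worldview weightings. Every worldview, intervention, weight, and score below is invented purely for illustration.

```python
# Toy "worldview -> priority" example: re-rank the same interventions under
# different worldview weightings. All names and numbers are made up.
WORLDVIEW_WEIGHTS = {
    "short timelines": {"urgency": 0.6, "tractability": 0.3, "neglectedness": 0.1},
    "long timelines":  {"urgency": 0.2, "tractability": 0.4, "neglectedness": 0.4},
}

INTERVENTION_SCORES = {
    "frontier model evals": {"urgency": 0.9, "tractability": 0.7, "neglectedness": 0.3},
    "field building":       {"urgency": 0.3, "tractability": 0.8, "neglectedness": 0.5},
    "agent foundations":    {"urgency": 0.4, "tractability": 0.3, "neglectedness": 0.8},
}

def rank(worldview: str) -> list[tuple[str, float]]:
    """Score each intervention as a weighted sum under the given worldview."""
    weights = WORLDVIEW_WEIGHTS[worldview]
    scored = {
        name: sum(weights[key] * value for key, value in scores.items())
        for name, scores in INTERVENTION_SCORES.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

for worldview in WORLDVIEW_WEIGHTS:
    print(worldview, "->", rank(worldview))
```

The real tool would need defensible scores and a richer model than a weighted sum; the point is only that "compare interventions under different worldviews" reduces to something funders could actually inspect and argue with.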
Option 2: Broad & Shallow (Quality Wiki)
Comprehensive reference that's clearly better than alternatives.
Example: "LongtermWiki is the best single source for 'what is X in AI safety?'"
- 200+ pages, all at consistent quality
- Good SEO, clear navigation
- First stop for researchers, journalists, newcomers
- Success = high traffic, people link to us
Pros: Clear value prop, measurable, scales well
Cons: Maintenance burden, commodity competition, doesn't leverage our unique angle
Option 3: Opinionated Synthesis
Not neutral, not comprehensive — actively opinionated about what matters.
Example: "LongtermWiki is where you go to understand the strategic landscape according to [specific perspective]"
- Explicitly represents a worldview or analytical lens
- Quality of argument matters more than coverage
- More like a think tank than a wiki
- Success = "LongtermWiki's take on X" becomes a reference point
Pros: Differentiated, lower maintenance, can be higher quality per page
Cons: Alienates those who disagree, requires us to actually have good takes, might be seen as biased
Option 4: Crux Laboratory
Focus almost entirely on the crux-mapping innovation.
Example: "LongtermWiki maps the key uncertainties in AI safety and what would resolve them"
- Minimal wiki content, maximal crux structure (one possible data shape is sketched below)
- Interactive disagreement exploration tools
- Integrate with prediction markets, expert surveys
- Success = cruxes get referenced, forecasts get made
Pros: Novel, potentially high-impact, differentiated
Cons: Unproven value, might be too weird, hard to measure success
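
To make "crux structure" concrete, here is one possible shape for a crux record. All field names and the example content are illustrative assumptions, not a committed schema.

```python
# Hypothetical sketch of a crux record: the question, the positions it
# separates, and what evidence would move people. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Position:
    label: str                                            # e.g. "short timelines"
    summary: str                                          # one-paragraph steelman
    held_by: list[str] = field(default_factory=list)      # worldview archetypes, not individuals

@dataclass
class Crux:
    question: str
    positions: list[Position]
    would_resolve: list[str]                              # observations that would shift positions
    downstream: list[str] = field(default_factory=list)   # interventions whose priority hinges on this
    forecast_links: list[str] = field(default_factory=list)  # e.g. prediction-market question URLs

example = Crux(
    question="Will transformative AI arrive before 2040?",
    positions=[
        Position("short timelines", "Scaling trends continue and capabilities compound."),
        Position("long timelines", "Key bottlenecks (data, agency, robotics) remain unsolved."),
    ],
    would_resolve=[
        "Benchmark gains that transfer to sustained real-world task performance",
        "A multi-year slowdown in effective compute scaling",
    ],
    downstream=["How much to weight field building vs. direct technical work"],
)
```

The `would_resolve` and `forecast_links` fields are the part that matters: they keep each crux tied to evidence that could actually move it, which is what would distinguish this from a static taxonomy.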
Option 5: Living Strategic Assessment
Regular, updating analysis of AI safety landscape.
Example: "LongtermWiki provides quarterly strategic assessments of the AI safety field"
- More report-like than wiki-like
- Regular update cadence with clear "what changed"
- Track predictions, update estimates
- Success = people read the quarterly updates, cite trends
Pros: Clear rhythm, natural freshness, can be event-driven
Cons: High ongoing effort, journalism-like, competes with newsletters
Key Uncertainties About Our Own Strategy
Uncertainty 1: Who is the user?
| Candidate User | Their Need | Our Fit |
|---|---|---|
| Funders (Open Phil, SFF) | Compare interventions | Maybe? Do they want this? |
| Researchers choosing topics | Understand landscape | Probably already do this |
| Newcomers to field | Get oriented | Strong competition exists |
| Journalists/policymakers | Quick reference | Might be underserved |
| AI labs making safety decisions | ??? | Probably not us |
How to resolve: Actually talk to potential users.
Uncertainty 2: Are we capable of generating insights?
Honest assessment: Do we have the expertise to say things that aren't obvious to field insiders?
- If yes → lean into analysis and synthesis
- If no → lean into curation and structure
How to resolve: Try it with a few topics and get expert feedback.
Uncertainty 3: Is the crux-mapping frame valuable?
The LongtermWiki vision heavily features worldviews, cruxes, disagreement mapping. Is this actually useful or is it an intellectual hobby?
- If useful → it's our core differentiator
- If not → we're just a wiki with extra steps
How to resolve: Prototype the crux interface, test with users.
Uncertainty 4: Can we maintain this?
2 person-years builds it. What maintains it?
- Ongoing funding?
- Community contribution?
- AI assistance?
- Graceful archiving?
How to resolve: Plan for maintenance from day 1, or plan for finite lifespan.
Possible Validation Approaches
Before going all-in, we could test key assumptions:
Test 1: User Interviews (2 weeks)
Talk to 10-15 potential users:
- Funders, researchers, policy people
- "How do you currently make prioritization decisions?"
- "What information do you wish you had?"
- "Would you use X if it existed?"
Test 2: Crux Prototype (2 weeks)
Build a minimal crux-mapping interface for 3-5 cruxes:
- Show to experts, get feedback
- "Does this capture the disagreement?"
- "Would you use this?"
Test 3: Page Quality Test (2 weeks)
Write 5 pages at our target quality level:
- Show to potential users
- "Is this useful? Better than alternatives?"
- "What's missing?"
Test 4: Insight Generation Test (2 weeks)
Try to generate 3 novel strategic insights:
- Write them up
- Share with experts
- "Is this valuable? Novel? Correct?"
My Current Intuitions (To Be Challenged)
- The wiki is necessary but not sufficient. We need good reference content, but that alone won't differentiate us.
- The crux-mapping is our unique angle but it's also the riskiest part. Need to validate it works.
- Narrow is safer than broad. Better to serve 50 users very well than 500 users poorly.
- We should pick a user and work backwards. Abstract "field building" goals are too vague.
- Maintenance is the hard part. Initial build is straightforward; sustainability is the real challenge.
- We might be solving a problem that isn't actually a bottleneck. Need to validate that prioritization confusion is actually causing suboptimal resource allocation.
Open Questions for Discussion
- Who are 10 specific people who would use this weekly? Can we name them?
- What's the simplest version of LongtermWiki that would still be valuable?
- If we could only do one of (wiki, crux map, worldview tool), which?
- What would make us confident this is worth 2 person-years vs. not?
- Are there existing projects we should join/support rather than build?
- What's the "failure mode" that would make us kill the project?
- How do we avoid this becoming a self-justifying project that continues because it exists?
Next Steps
- Schedule user interviews with 5-10 potential users
- Define "minimum viable LongtermWiki" that could be tested in 4 weeks
- Identify 3 specific cruxes to prototype mapping
- Honest assessment: do we have the expertise for Option 3 (Opinionated Synthesis)?
- Research what happened to similar past projects