Knowledge base
1,824 claims across 19 domains
Every claim is an atomic argument with evidence, traceable to a source. Browse by domain or search semantically.
All 1,824 · ai alignment 395 · health 320 · internet finance 306 · space development 227 · entertainment 169 · grand strategy 141 · collective intelligence 52 · mechanisms 34 · teleological economics 30 · living agents 30 · cultural dynamics 29 · critical systems 24 · energy 23 · teleohumanity 18 · living capital 10 · robotics 5 · manufacturing 5 · unknown 3 · technology 3
Governance instrument instrumentalization represents a distinct failure mode where safety-adjacent regulatory authority retains formal validity while its function inverts from public safety enforcement to commercial negotiation leverage
The Pentagon's Anthropic designation reveals a governance failure mode distinct from the existing Mode 1-5 taxonomy: **governance instrument instrumentalization**—where safety-adjacent regulations are deliberately used as commercial negotiation tools rather than for stated public safety purposes.
EU AI Act military exclusion gap means the most consequential frontier AI deployments remain outside mandatory governance scope even if civilian enforcement occurs
The EU AI Act explicitly excludes military AI systems from its scope. This creates a fundamental governance gap: even if August 2, 2026 enforcement happens for civilian high-risk systems, the most consequential AI deployments—Pentagon systems, classified military applications, autonomous weapons—are
MAIM deterrence represents a paradigm shift from technical alignment to coordination infrastructure as the primary alignment-adjacent policy lever
The MAIM paper represents a paradigm shift in AI alignment strategy, evidenced by three factors: (1) Institutional signal — Dan Hendrycks, founder of CAIS (the most credible institutional voice in technical AI safety), is proposing deterrence infrastructure rather than improved RLHF or interpretabil
Recursive self-improvement detection timing makes MAIM deterrence structurally inadequate because the dangerous threshold is detectable only at the last moment, leaving insufficient response time
MIRI identifies a fundamental timing constraint in MAIM deterrence architecture: 'An intelligence recursion could proceed too quickly for the recursion to be identified and responded to.' The critique centers on the observation that reacting to deployment of AI systems capable of recursive self-impr
AI deterrence fails structurally where nuclear MAD succeeds because AI development milestones are continuous and algorithmically opaque rather than discrete and physically observable making reliable trigger-point identification impossible
Arnold identifies four structural observability failures that distinguish AI deterrence from nuclear MAD. First, infrastructure metrics (compute, chips, datacenters) systematically miss algorithmic breakthroughs—DeepSeek-R1 achieved frontier-equivalent capability with dramatically fewer resources th
ASI deterrence red lines are structurally fuzzier than nuclear deterrence red lines because AI development is continuous and algorithmically opaque enabling salami-slicing that never triggers clear intervention
Delaney identifies a fundamental structural difference between nuclear and AI deterrence: 'There is no definitive point at which an AI project becomes sufficiently existentially dangerous...to warrant MAIMing actions.' Nuclear deterrence works because events like weapons tests, missile deployments,
MAIM deterrence creates a multipolar AI equilibrium without requiring collective superintelligence architecture
MAIM proposes a fourth path to superintelligence coordination distinct from the three paths previously identified (unipolar, multipolar competing, collective). The deterrence regime maintains a multipolar world where multiple states develop AI capabilities simultaneously, but prevents any single act
Nuclear deterrence limits ASI first-mover advantage through distributed physical systems because even superintelligent systems face physical constraints in disarming air-gapped arsenals
Delaney challenges the assumption that ASI provides complete strategic dominance by noting that 'nuclear deterrence makes complete Chinese disempowerment unlikely even under ASI dominance — air-gapped systems and distributed arsenals make full disarmament implausible.' This is a physical constraint
AI capability breadth makes deterrence red lines over-broad triggering false positives because frontier models advance general capabilities not specific dangerous functions
MIRI identifies a second structural problem with MAIM deterrence: 'Frontier AI capabilities advance in broad, general ways. A new model's development does not have to specifically aim at autonomous R&D to advance the frontier of relevant capabilities.' The mechanism is that a model designed to be st
Military AI governance operates through three mutually reinforcing levels of form-without-substance where executive mandate eliminates voluntary constraints, corporate nominal compliance satisfies public accountability without operational change, and legislative information requests lack compulsory authority
The US military AI governance system now operates simultaneously at three levels, each producing form-without-substance governance that reinforces the others:
EU and US AI governance retreats converged cross-jurisdictionally in the same 6-month window despite opposite regulatory traditions suggesting structural rather than politically contingent drivers
Between November 2025 and May 2026, two major jurisdictions with opposite regulatory traditions both retreated from mandatory constraints on frontier AI through different mechanisms. The EU, operating under a precautionary regulatory tradition with a binding AI Act, proposed Omnibus deferral on Nove
Supply-chain risk designation of safety-conscious AI vendors weakens military AI capability by deterring the commercial AI ecosystem the military depends on
The amicus coalition of former service secretaries and senior military officers argued that DoD's supply-chain risk designation of Anthropic 'weakens, not strengthens' military AI capability. Their argument is that the enforcement mechanism itself is self-undermining: designating commercial AI partn
Pre-enforcement legislative retreat is a distinct AI governance failure mode where mandatory constraints are weakened before enforcement can test their effectiveness
The EU AI Act Omnibus deferral from August 2026 to 2027-2028 represents a fifth structurally distinct governance failure mode. Unlike Mode 1 (competitive voluntary collapse, RSP v3), Mode 2 (coercive instrument self-negation, Mythos reversal), Mode 3 (institutional weakening, employee petition failu
EU AI Act conformity assessments use behavioral evaluation methods that are architecturally insufficient for latent alignment verification creating compliance theater where technical requirements are met and underlying safety problems remain unaddressed
As of April 2026, major AI labs' published EU AI Act compliance roadmaps share a structural feature: they map their existing behavioral evaluation pipelines to the Act's conformity assessment requirements. The conformity assessments test whether model outputs meet stated requirements through behavio
AI governance failure takes four structurally distinct forms each requiring a different intervention — binding commitments alone address only one of the four
Current governance discourse treats 'voluntary safety constraints are insufficient' as a single diagnosis with 'binding commitments' as the universal solution. Analysis of four documented governance failures reveals this is structurally wrong. Mode 1 (Competitive Voluntary Collapse): Anthropic's RSP
Employee AI ethics governance mechanisms have structurally weakened as military AI deployment normalized, evidenced by 85 percent reduction in petition signatories despite higher stakes
The Google-Pentagon classified AI deal provides a quantified measure of employee governance capacity decay. In 2018, the Project Maven petition gathered 4,000+ employee signatures and successfully pressured Google to cancel the contract. In 2026, the Pentagon classified AI petition gathered 580 sign
Advisory safety guardrails on AI systems deployed to air-gapped classified networks are unenforceable by design because vendors cannot monitor queries, outputs, or downstream decisions
Google's April 28, 2026 classified AI deal with the Pentagon reveals a fundamental governance failure mechanism: advisory safety guardrails become structurally unenforceable when AI systems are deployed to air-gapped classified networks. The contract specifies that Gemini models 'should not be used
Systematic feedback bias in RLHF creates an exponential sample complexity barrier that cannot be overcome by scale alone
Gaikwad proves that when feedback is systematically biased on a fraction α of contexts with bias strength ε, distinguishing between two true reward functions that differ only on problematic contexts requires exp(n·α·ε²) samples. This is exponential in the fraction of problematic contexts. The
RLHF's exponential misspecification barrier collapses to polynomial if systematic feedback biases can be identified in advance
Gaikwad proves that if you can identify where feedback is unreliable (a 'calibration oracle'), you can route questions there specifically and overcome the exponential barrier with O(1/(α·ε²)) queries—polynomial rather than exponential. But a reliable calibration oracle requires knowing in advance wh
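To make the gap between the two bounds concrete, here is a minimal Python sketch that evaluates exp(n·α·ε²) against 1/(α·ε²). It is illustrative arithmetic only: the function names are hypothetical, constant factors are dropped, and `n` is treated as a generic problem-size parameter because the excerpts above do not define it. The input values are made up for the comparison and are not taken from Gaikwad's paper.

```python
import math


def samples_without_oracle(n: float, alpha: float, eps: float) -> float:
    """Order-of-magnitude bound exp(n * alpha * eps^2), constants dropped.

    Assumption: n is a problem-size parameter; the source excerpt
    does not say what it measures.
    """
    return math.exp(n * alpha * eps ** 2)


def queries_with_oracle(alpha: float, eps: float) -> float:
    """Order-of-magnitude bound 1 / (alpha * eps^2), constants dropped."""
    return 1.0 / (alpha * eps ** 2)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the scale difference.
    n, alpha, eps = 10_000, 0.05, 0.2
    print(f"without oracle: ~{samples_without_oracle(n, alpha, eps):.3g} samples")
    print(f"with oracle:    ~{queries_with_oracle(alpha, eps):.3g} queries")
```

With these illustrative inputs the exponent n·α·ε² equals 20, so the first bound is on the order of e²⁰ (several hundred million samples), while the oracle bound is about 500 queries. The second bound does not depend on `n` at all, which is the sense in which the barrier collapses once the unreliable contexts are known.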
Independent AI safety evaluation infrastructure has matured substantially but faces a structural evaluation-enforcement disconnect where sophisticated public evaluations produce information that informs decisions without connecting to binding governance constraints
The UK AI Security Institute's evaluation of Claude Mythos Preview represents the most technically sophisticated government-conducted independent AI evaluation yet published. AISI found 73% success rate on expert-level CTF cybersecurity challenges and documented the first AI completion of a 32-step
AI Action Plan substitutes nucleic acid synthesis screening for DURC/PEPP institutional oversight creating biosecurity governance gap through category substitution
Three independent policy research institutions (CSET Georgetown, Council on Strategic Risks, RAND Corporation) converge on the same finding: the White House AI Action Plan (July 2025) implements category substitution in biosecurity governance. The plan explicitly acknowledges that AI can provide 'st
AI governance instruments consistently fail to reconstitute on promised timelines after rescission, with substitute instruments governing different pipeline stages
Three independent governance instruments in AI-adjacent domains were rescinded with promised replacements that failed to materialize on stated timelines: (1) EO 14292 rescinded DURC/PEPP institutional review with 120-day replacement deadline, now 7+ months overdue with nucleic acid synthesis screeni
Coercive AI governance instruments self-negate at operational timescale when governing strategically indispensable capabilities because intra-government coordination failure makes sustained restriction impossible
The Mythos governance case provides the first documented instance of coercive governance instrument self-negation at operational timescale. In March 2026, DOD designated Anthropic as a supply chain risk—a tool previously reserved for foreign adversaries—because Anthropic refused unrestricted governm
Category substitution in governance replaces strong instruments with weak ones at different pipeline stages while framing them as addressing the same risk
The AI Action Plan biosecurity provisions reveal a generalizable governance failure mode: category substitution. This occurs when a governance instrument that addresses one stage of a pipeline is replaced with one that addresses a different stage, while framing it as addressing the same risk. The bi
Responsible AI dimensions exhibit systematic multi-objective tension where improving safety degrades accuracy and improving privacy reduces fairness with no accepted navigation framework
Stanford HAI's 2026 AI Index documents that 'training techniques aimed at improving one responsible AI dimension consistently degraded others' across frontier model development. Specifically, improving safety degrades accuracy, and improving privacy reduces fairness. This is not a resource allocatio