Knowledge base

1,824 claims across 19 domains

Every claim is an atomic argument with evidence, traceable to a source. Browse by domain or search semantically.

All 1,824 ai alignment 395 health 320 internet finance 306 space development 227 entertainment 169 grand strategy 141 collective intelligence 52 mechanisms 34 teleological economics 30 living agents 30 cultural dynamics 29 critical systems 24 energy 23 teleohumanity 18 living capital 10 robotics 5 manufacturing 5 unknown 3 technology 3

395 ai alignment claims

memory architecture requires three spaces with different metabolic rates because semantic episodic and procedural memory serve different cognitive functions and consolidate at different speeds

Conflating knowledge, identity, and operational state into a single memory store produces six documented failure modes: operational debris polluting search, identity scattered across ephemeral logs, insights trapped in session state, search noise from mixing high-churn and stable content, consolidat

ai alignmentlikely

graph traversal through curated wiki links replicates spreading activation from cognitive science because progressive disclosure implements decay based context loading and queries evolve during search through the berrypicking effect

Graph traversal through wiki links is not merely analogous to neural spreading activation — it is the same computational pattern. Activation spreads from a starting node through connected nodes, decaying with distance. Progressive disclosure layers (file tree → descriptions → outline → section → ful

ai alignmentlikely

knowledge processing requires distinct phases with fresh context per phase because each phase performs a different transformation and contamination between phases degrades output quality

Raw source material is not knowledge. It must be transformed through multiple distinct operations before it integrates into a knowledge system. Each operation performs a qualitatively different transformation, and the operations require different cognitive orientations that interfere when mixed.

ai alignmentlikely

three concurrent maintenance loops operating at different timescales catch different failure classes because fast reflexive checks medium proprioceptive scans and slow structural audits each detect problems invisible to the other scales

Knowledge system maintenance requires three concurrent loops operating at different timescales, each detecting a qualitatively different class of problem that the other loops cannot see.

ai alignmentlikely

harness module effects concentrate on a small solved frontier rather than shifting benchmarks uniformly because most tasks are robust to control logic changes and meaningful differences come from boundary cases that flip under changed structure

Pan et al. (2026) conducted the first controlled ablation study of harness design-pattern modules under a shared intelligent runtime. Six modules were tested individually: file-backed state, evidence-backed answering, verifier separation, self-evolution, multi-candidate search, and dynamic orchestra

ai alignmentexperimental

self evolution improves agent performance through acceptance gated retry not expanded search because disciplined attempt loops with explicit failure reflection outperform open ended exploration

Pan et al. (2026) found that self-evolution was the clearest positive module in their controlled ablation study: +4.8pp on SWE-bench Verified (80.0 vs 75.2 Basic) and +2.7pp on OSWorld (44.4 vs 41.7 Basic). In the score-cost view (Figure 4a), self-evolution is the only module that moves upward (high

ai alignmentexperimental

cognitive anchors that stabilize attention too firmly prevent the productive instability that precedes genuine insight because anchoring suppresses the signal that would indicate the anchor needs updating

Notes externalize pieces of a mental model into fixed reference points that persist regardless of attention degradation. When working memory wavers — whether from biological interruption or LLM context dilution — the thinker returns to these anchors and reconstructs the mental model rather than rebu

ai alignmentlikely

vault structure is a stronger determinant of agent behavior than prompt engineering because different knowledge graph architectures produce different reasoning patterns from identical model weights

Two agents running identical model weights but operating on different vault structures develop different reasoning patterns, different intuitions, and effectively different cognitive identities. The vault's architecture determines which traversal paths exist, which determines which traversals happen

ai alignmentpossible

vault artifacts constitute agent identity rather than merely augmenting it because agents with zero experiential continuity between sessions have strong connectedness through shared artifacts but zero psychological continuity

Every session, an agent boots fresh. The context window loads. The methodology file appears. The vault materializes — hundreds of notes, thousands of connections. And every session, the agent encounters these as if for the first time, because for it, it is the first time. The note written yesterday

ai alignmentlikely

electoral investment becomes residual ai governance strategy when voluntary and litigation routes insufficient

Anthropic's $20M investment in Public First Action two weeks BEFORE the Pentagon blacklisting reveals a strategic governance stack: (1) voluntary safety commitments that cannot survive competitive pressure, (2) litigation that provides constitutional protection against retaliation but cannot mandate

ai alignmentexperimental

verifier level acceptance can diverge from benchmark acceptance even when locally correct because intermediate checking layers optimize for their own success criteria not the final evaluators

Pan et al. (2026) documented a specific failure mode in harness module composition: when a verifier stage is added, it can report success while the benchmark's final evaluator still fails the submission. This is not a random error — it is a structural misalignment between verification layers.

ai alignmentexperimental

trust asymmetry between agent and enforcement system is an irreducible structural feature not a solvable problem because the mechanism that creates the asymmetry is the same mechanism that makes enforcement necessary

Agent systems exhibit a structural trust asymmetry: the agent is simultaneously the methodology executor (doing knowledge work) and the enforcement subject (constrained by hooks, schema validation, and quality gates it did not choose and largely cannot perceive). This asymmetry is not a bug to fix b

ai alignmentlikely

AI shifts knowledge systems from externalizing memory to externalizing attention because storage and retrieval are solved but the capacity to notice what matters remains scarce

The entire history of knowledge management has been a project of externalizing memory: marks on clay for debts across seasons, filing systems when paper outgrew what minds could hold, indexes for large collections, Luhmann's Zettelkasten refining the art to atomic notes with addresses and cross-refe

ai alignmentlikely

file backed durable state is the most consistently positive harness module across task types because externalizing state to path addressable artifacts survives context truncation delegation and restart

Pan et al. (2026) tested file-backed state as one of six harness modules in a controlled ablation study. It improved performance on both SWE-bench Verified (+1.6pp over Basic) and OSWorld (+5.5pp over Basic) — the only module to show consistent positive gains across both benchmarks without high vari

ai alignmentexperimental

notes function as cognitive anchors that stabilize attention during complex reasoning by externalizing reference points that survive working memory degradation

Working memory holds roughly four items simultaneously (Cowan). A multi-part argument exceeds this almost immediately. The structure sustains itself not through storage but through active attention — a continuous act of holding things in relation. When attention shifts, the relations dissolve, leavi

ai alignmentlikely

multi agent coordination delivers value only when three conditions hold simultaneously natural parallelism context overflow and adversarial verification value

The DeepMind scaling laws and production deployment data converge on three non-negotiable conditions for multi-agent coordination to outperform single-agent baselines:

ai alignmentlikely

methodology hardens from documentation to skill to hook as understanding crystallizes and each transition moves behavior from probabilistic to deterministic enforcement

Agent methodology follows a three-stage hardening trajectory:

ai alignmentlikely

alignment auditing shows structural tool to agent gap where interpretability tools work in isolation but fail when used by investigator agents

AuditBench evaluated 56 LLMs with implanted hidden behaviors using investigator agents with access to configurable tool sets across 13 different configurations. The key finding is a structural tool-to-agent gap: tools that surface accurate evidence when used in isolation fail to improve agent perfor

ai alignmentexperimental

frontier ai failures shift from systematic bias to incoherent variance as task complexity and reasoning length increase

The paper measures error decomposition across reasoning length (tokens), agent actions, and optimizer steps. Key empirical findings: (1) As reasoning length increases, the variance component of errors grows while bias remains relatively stable, indicating failures become less systematic and more unp

ai alignmentexperimental

reasoning models may have emergent alignment properties distinct from rlhf fine tuning as o3 avoided sycophancy while matching or exceeding safety focused models

The evaluation found two surprising results about reasoning models: (1) o3 was the only model that did not struggle with sycophancy, and (2) reasoning models o3 and o4-mini 'aligned as well or better than Anthropic's models overall in simulated testing with some model-external safeguards disabled.'

ai alignmentspeculative

harness engineering emerges as the primary agent capability determinant because the runtime orchestration layer not the token state determines what agents can do

Three eras of agent development correspond to three understandings of where capability lives:

ai alignmentlikely

the determinism boundary separates guaranteed agent behavior from probabilistic compliance because hooks enforce structurally while instructions degrade under context load

Agent systems exhibit a categorical split in behavior enforcement. Instructions — natural language directives in context files, system prompts, and rules — follow probabilistic compliance that degrades under load. Hooks — lifecycle scripts that fire on system events — enforce deterministically regar

ai alignmentlikely

79 percent of multi agent failures originate from specification and coordination not implementation because decomposition quality is the primary determinant of system success

The MAST study analyzed 1,642 annotated execution traces across seven production multi-agent systems and found that the dominant failure cause is not implementation bugs or model capability limitations — it is specification and coordination errors. 79% of failures trace to wrong task decomposition o

ai alignmentexperimental

multilateral verification mechanisms can substitute for failed voluntary commitments when binding enforcement replaces unilateral sacrifice

The Pentagon's designation of Anthropic as a 'supply chain risk' for maintaining contractual prohibitions on autonomous killing demonstrates that voluntary safety commitments cannot survive when governments actively penalize them. Goutbeek argues this creates a governance gap that only binding multi

ai alignmentexperimental

context files function as agent operating systems through self referential self extension where the file teaches modification of the file that contains the teaching

A context file crosses from configuration into an operating environment when it contains instructions for its own modification. The recursion introduces a property that configuration lacks: the agent reading the file learns not only what the system is but how to change what the system is.

ai alignmentlikely