AI Funnel to Judgement: HRM (Sapient), Attention with CoT (Google), and Action (Doolittle)
- LLMs (with CoT): Compression is linguistic and sequential. The model linearizes a huge search space into a token-by-token micro-grammar (the “chain”). Yield: transparent steps, but high token cost and brittleness. (Background on CoT brittleness and overhead is standard; not re-cited here.)
- HRM (Sapient): Compression is hierarchical and latent. A fast “worker” loop solves details under a slow “planner” context; the system iterates to a fixed point, then halts. You get deep computation with small parameter counts and tiny datasets; no text-level chains are required.
- Referential problems (math/physics/computation): demand constructive proofs/programs. LLM path: generate a program/derivation, run/check with a tool, and return the artifact + pass/fail. HRM path: add a trace-projector head that emits the minimal operational skeleton (state transitions, invariants, halting reason). Co-train on checker feedback so the latent plan compresses toward checkable constructions rather than pretty narratives. (Speculative but feasible.)
- Action problems (law/econ/ethics): demand constructive procedures (roles, rules, prices) rather than opinions. LLM: force outputs into procedures (frames, tests, and remedies). HRM: condition the planner on a procedure schema (who/what/harm/evidence/tests/remedy) so the fixed point equals a completed procedure, not merely a belief vector.
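The referential checker-feedback loop above can be sketched as follows. This is a minimal illustration, not the article’s implementation: the convention that a candidate construction defines `solve()`, and the `exec`-based harness, are assumptions standing in for a sandboxed tool call.

```python
def checker_feedback(program_src: str, tests: list[tuple]) -> dict:
    """Run a candidate construction against external test cases and return
    the pass/fail artifact that would serve as a co-training signal."""
    env: dict = {}
    exec(program_src, env)  # illustrative only; a real system sandboxes untrusted code
    solve = env["solve"]    # hypothetical convention: the construction defines solve()
    failures = [(arg, want) for arg, want in tests if solve(arg) != want]
    return {"passed": not failures, "failures": failures}
```

Training then rewards constructions whose artifact passes the checker, pushing the latent plan toward checkable form rather than narrative form.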
- Detect the grammar type of the query: referential vs. action.
- If referential: attempt constructive proof/execution; if success → TRUE; if blocked → fall back to probabilistic accounting with explicit error bounds.
- If action: build a Reciprocity Ledger (parties, demonstrated interests, costs, externalities, warranties, enforcement). Produce a rule, price, or remedy, not a “take.”
- Attach liability/warranty proportional to scope and stakes.
- TRUE = constructed, closed, test-passed.
- POSSIBLY TRUE + WARRANTY = best cooperative action under quantified uncertainty and explicit insurance.
- ABSTAIN/REQUEST = undecidable without violating reciprocity (your boycott option).
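The ternary decision at the end of this protocol can be sketched as a small function. The three boolean inputs (construction passed, uncertainty bounded, warranty posted) are a simplification I am assuming for illustration; the article leaves the exact predicates open.

```python
from enum import Enum

class Verdict(Enum):
    TRUE = "constructed, closed, test-passed"
    POSSIBLY_TRUE_WITH_WARRANTY = "best cooperative action under quantified uncertainty"
    ABSTAIN = "undecidable without violating reciprocity"

def decide(construction_passed: bool,
           uncertainty_bounded: bool,
           warranty_posted: bool) -> Verdict:
    """Ternary decision: TRUE / POSSIBLY TRUE + WARRANTY / ABSTAIN."""
    if construction_passed:
        return Verdict.TRUE                          # constructive closure achieved
    if uncertainty_bounded and warranty_posted:
        return Verdict.POSSIBLY_TRUE_WITH_WARRANTY   # quantified risk, insured
    return Verdict.ABSTAIN                           # the boycott option
```

Note that ABSTAIN is the default: without both bounded uncertainty and posted warranty, the system declines rather than opines.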
- Closure debt (C): failed proof/run, unmet halting condition (HRM), or unresolved procedure.
- Uncertainty mass (U): residual entropy after evidence; posterior spread or equilibrium variance.
- Externality risk (E): expected unpriced harms on non-consenting parties.
- Description length (D): MDL of the constructive trace (shorter = better compression, subject to correctness).
- Warranty debt (W): liability not covered by proposed insurance/escrow/enforcement.
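One minimal way to combine the five components above into a single score Δ* is a weighted sum. The article does not fix a functional form, so the linear aggregation and unit weights below are assumptions for illustration only.

```python
def delta_star(C: float, U: float, E: float, D: float, W: float,
               weights: tuple = (1.0, 1.0, 1.0, 1.0, 1.0)) -> float:
    """Aggregate closure debt (C), uncertainty mass (U), externality risk (E),
    description length (D), and warranty debt (W) into one score Δ*.
    Linear form and unit weights are placeholders, not the article's spec."""
    return sum(w * x for w, x in zip(weights, (C, U, E, D, W)))
```

Each term must be normalized to a common scale before summing; that normalization is domain-specific and left open here, as it is in the article.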
- LLM training: add RLHF-style reward on low Δ*, with automatic checkers for C and D, Bayesian evaluators for U, and policy simulators for E/W.
- HRM training: add an auxiliary head to estimate Δ*; use it both as a halting criterion and as a shaping reward so the latent fixed point is the compressed optimum. (Speculative but directly testable.)
- Hierarchical planner <-> our “grammar within grammar”: H sets permitted dimensions/operations; L executes lawful transforms; the fixed point = closure.
- Adaptive halting <-> decidability: HRM’s learned halting acts as a mechanical decision to stop when a bounded construction is achieved. Attach the Δ* head to make that halting normatively correct, not just numerically stable.
- Small data / strong generalization <-> epistemic compression: HRM’s near-perfect Sudoku and large-maze results with ~1k samples indicate genuine internal compression rather than memorized chains; use your constructive + reciprocity scaffolds to push from puzzles → institutions (law/policy).
- ARC-AGI results <-> paradigm fit: HRM’s ARC gains suggest it’s learning transformation grammars, not descriptions. That aligns with your operationalism (meaning = procedure).
- Router: classify the prompt as referential vs. action.
- Constructive toolchain: Referential → code/solver/prover; return artifact + pass/fail. Action → instantiate the Reciprocity Ledger; run scenario sims; produce a rule/price/remedy.
- Warrant pack: attach artifacts, ledger, uncertainty bounds, and Δ*.
- Ternary decision: TRUE / POSSIBLY TRUE + WARRANTY / ABSTAIN.
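The router at the head of this pipeline can be stubbed with keyword matching. The keyword lists below are hypothetical placeholders; a deployed router would be a trained classifier, and the default-to-referential fallback is my assumption, not the article’s.

```python
import re

# Hypothetical keyword cues; a real router would be a trained classifier.
REFERENTIAL = re.compile(r"\b(prove|compute|solve|integral|algorithm|derive)\b", re.I)
ACTION = re.compile(r"\b(should|policy|law|price|liable|contract|remedy)\b", re.I)

def route(prompt: str) -> str:
    """Classify a prompt as 'referential' (math/physics/computation)
    or 'action' (law/econ/ethics). Defaults to 'referential'."""
    if ACTION.search(prompt) and not REFERENTIAL.search(prompt):
        return "action"
    return "referential"
```

The routing decision matters because it selects the downstream toolchain: code/solver/prover for referential prompts, the Reciprocity Ledger and scenario sims for action prompts.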
- Schema-conditioned planning: feed H the grammar schema (dimensions, ops, closure tests).
- Aux heads: (a) trace projector (compressed state-transition sketch); (b) warranty head producing Δ*; (c) halting reason code.
- Training signals: correctness + checker feedback (closure), MDL regularizer (compression), reciprocity penalties from simulators (externalities), and insurance coverage bonuses (warranty).
- Deployment: emit the operational result + trace + warranty; gate release on Δ* ≤ τ.
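The deployment gate Δ* ≤ τ can be sketched in a few lines. The threshold τ = 0.1 and the reason strings are illustrative defaults I am assuming; the article specifies only the inequality, not the values.

```python
def gate_release(delta: float, tau: float = 0.1) -> tuple:
    """Release gate: ship the result only when total debt Δ* <= τ;
    otherwise hold with a halting-reason code.
    τ = 0.1 is an illustrative default, not a value from the article."""
    if delta <= tau:
        return ("RELEASE", "closure achieved; residual debt within warranty")
    return ("HOLD", "residual debt exceeds threshold; escalate, insure, or abstain")
```

In the HRM framing, the same Δ* head doubles as the halting criterion, so the gate is checked both at inference-time fixed points and at release.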
- From narrative coherence to constructive warranty.
- From alignment-only to reciprocity-and-liability.
- From binary truth to ternary, operational decidability.
- For action domains, do you want the default abstention to be boycott (no action) or a default rule (e.g., “status-quo with escrow”) when Δ* is above threshold? (OPEN QUESTION)
- For referential domains, should we treat MDL minimization as co-primary with correctness (Occam pressure), or strictly secondary to checker-verified closure? (OPEN QUESTION)
- arXiv HTML view (same paper).
- ARC Prize blog: “The Hidden Drivers of HRM’s Performance on ARC-AGI” (analysis/overview).
- GitHub: sapientinc/HRM (official repo).
- BDTechTalks explainer on HRM (context, quotes, and positioning beyond CoT).
Source date (UTC): 2025-08-22 20:35:15 UTC
Original post: https://x.com/i/articles/1958991378220032093