Theme: AI

  • My argument would be that the innovations in Chinese open-source models are a matter of reducing compute without significant degradation of signal, but the innovations in closed-source models are in computation and computability.


    Source date (UTC): 2025-12-03 01:05:49 UTC

    Original post: https://twitter.com/i/web/status/1996023031857381566

  • THE RUNCIBLE GOVERNANCE LAYER
    (Technical Version, ML Research Audience)

    The governance layer operates as a deterministic control system wrapped around an arbitrary foundation model.

    The architecture is LLM-agnostic; it can govern any sufficiently large model whose parameter count supports high-dimensional reasoning, stable abstraction, and long-horizon dependency tracking.

    The base model is treated as an untrusted but extremely capable function approximator, and the governance layer provides the closure, constraint, and decidability conditions the model itself cannot satisfy.

    Architecture:
    The system architecture consists of the following components, each playing a necessary and non-substitutable role:

    Loader (Execution Engine):
    A lightweight runtime that intercepts every request, parses protocol definitions, orchestrates tool calls, enforces constraints, and mediates all interaction with the underlying LLM. It standardizes inference behavior across models.
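
    As a minimal sketch of such a loader, with every name (Protocol, Loader, run) invented for illustration and a trivial word count standing in for real token accounting:

      # Illustrative loader: intercept a request, enforce the protocol's
      # context limit, make the single mediated LLM call, and apply the
      # protocol's output checks before anything is returned.
      from dataclasses import dataclass
      from typing import Callable

      @dataclass
      class Protocol:
          name: str
          max_context_tokens: int
          required_checks: list[str]

      class Loader:
          def __init__(self, llm: Callable[[str], str], protocol: Protocol):
              self.llm = llm            # untrusted function approximator
              self.protocol = protocol  # deterministic rules around it

          def run(self, request: str) -> str:
              if len(request.split()) > self.protocol.max_context_tokens:
                  raise ValueError("request exceeds protocol context limit")
              draft = self.llm(request)  # the only path to the model
              for check in self.protocol.required_checks:
                  if check not in draft.lower():  # placeholder constraint
                      raise ValueError(f"output failed check: {check}")
              return draft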

    Protocols (YAML/JSON as Semantic Code):
    Protocols provide the operational grammar. They specify context limits, constraint logic, adversarial checks, decomposition steps, permissible transformations, verification sequences, and required outputs. YAML functions as human-legible code defining the permissible state transitions in the reasoning process.
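
    A hypothetical protocol in that spirit, expressed as YAML and parsed with PyYAML; every field name here is an invented example, not Runcible’s schema:

      # Hypothetical protocol definition; requires PyYAML (pip install pyyaml).
      import yaml

      PROTOCOL_YAML = """
      name: causal-analysis-v1
      context:
        max_tokens: 4096
      constraints:
        - every claim must cite the corpus
        - adversarial check on each step
      steps:
        - decompose the question into testable claims
        - retrieve supporting references
        - verify each claim against the retrieved references
      outputs:
        - verdict
        - evidence list
      """

      protocol = yaml.safe_load(PROTOCOL_YAML)
      assert protocol["outputs"] == ["verdict", "evidence list"]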

    RAG Layer (Structured Research Index):
    A retrieval system populated exclusively with the organization’s validated research corpus. It provides references, reduction paths, causal models, logical dependencies, and formal definitions. The RAG enforces epistemic locality: the LLM can only draw from adjudicated knowledge.
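
    One way to read “epistemic locality” in code: a retriever that can, by construction, only return documents from the adjudicated corpus it was built with. A toy keyword index, all names assumed:

      # Toy retrieval layer enforcing epistemic locality: every answer
      # is assembled from the validated corpus and nothing else.
      class ValidatedIndex:
          def __init__(self, corpus: dict[str, str]):
              self.corpus = corpus  # doc id -> adjudicated text, only

          def retrieve(self, query: str, k: int = 3) -> list[str]:
              terms = set(query.lower().split())
              scored = sorted(
                  self.corpus.items(),
                  key=lambda kv: -len(terms & set(kv[1].lower().split())),
              )
              return [doc_id for doc_id, _ in scored[:k]]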

    Truth Corpus (Certificate Ledger):
    A versioned repository of certificates—fully resolved and verified problem–solution pairs that have survived decidability tests, reciprocity tests, and falsification attempts. Certificates provide high-value training anchors: they encode working solutions to genuinely difficult problems.
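
    A certificate record might look like the following sketch; the fields are assumptions read off the description above, with a content hash standing in for versioning:

      # Sketch of a certificate: a resolved problem-solution pair plus
      # the tests it survived, identified by a content hash.
      import hashlib
      import json
      from dataclasses import dataclass, asdict

      @dataclass(frozen=True)
      class Certificate:
          problem: str
          solution: str
          tests_passed: tuple[str, ...]  # e.g. decidability, reciprocity
          version: int

          def content_hash(self) -> str:
              payload = json.dumps(asdict(self), sort_keys=True)
              return hashlib.sha256(payload.encode()).hexdigest()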

    Training Agent (Certificate Compiler):
    A transformation engine that converts certificates into structured training modules. The agent then submits these modules into the fine-tuning process to update the foundation model in bounded increments. This creates a closed feedback loop: research → certificate → training → improved inference → more certificates.
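
    The certificate-to-training step could be as simple as the sketch below, reusing the Certificate sketch above and emitting the prompt/completion shape most fine-tuning pipelines accept; the schema is an assumption, not Runcible’s format:

      # Compile certificates into supervised fine-tuning examples.
      def compile_certificates(certs) -> list[dict]:
          # certs: iterable of Certificate records (see ledger sketch)
          modules = []
          for cert in certs:
              modules.append({
                  "prompt": cert.problem,
                  "completion": cert.solution,
                  "metadata": {"tests": list(cert.tests_passed),
                               "version": cert.version},
              })
          return modules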

    Attention Modifiers (Token-Economy Optimization):
    Low-level control over attention masks, routing heuristics, and context allocation reduces compute by forcing the model to attend only to causally necessary information. This implements “possibility filters” and “truth-reciprocity filters” inside the attention mechanism.
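
    A toy version of such a “possibility filter”, as a boolean mask applied to attention logits before the softmax; purely illustrative, not the described implementation:

      # Mask out positions not flagged as causally necessary, so the
      # model can only attend to the allowed subset of the context.
      import numpy as np

      def masked_attention(scores: np.ndarray, allowed: np.ndarray) -> np.ndarray:
          # scores: (seq, seq) raw attention logits
          # allowed: (seq,) boolean flags for causally necessary tokens
          masked = np.where(allowed[None, :], scores, -1e9)
          weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
          return weights / weights.sum(axis=-1, keepdims=True)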

    The core innovation is the formalization of machine decidability: a method for converting the human concepts of possibility, testifiability, reciprocity, and liability into computable constraints the model must satisfy before it may emit a result. This converts an LLM from a probabilistic language model into a constrained reasoning engine with predictable failure bounds.
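
    In code, that gating idea reduces to something like the sketch below; the four predicates are placeholders, since the source does not specify how possibility, testifiability, reciprocity, or liability are computed:

      # Emission gate: a result is released only if every decidability
      # constraint passes; otherwise the system refuses to answer.
      from typing import Callable

      Constraint = Callable[[str], bool]

      def gated_emit(candidate: str, constraints: dict[str, Constraint]) -> str:
          failures = [name for name, test in constraints.items()
                      if not test(candidate)]
          if failures:
              raise ValueError(f"refused to emit; failed: {failures}")
          return candidate

      checks = {
          "possibility":    lambda s: bool(s.strip()),
          "testifiability": lambda s: "because" in s.lower(),  # placeholder
          "reciprocity":    lambda s: True,                    # placeholder
          "liability":      lambda s: True,                    # placeholder
      }

    The design point is that the gate sits outside the model: failing a constraint blocks emission entirely rather than merely lowering a score.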

    The theory behind machine decidability requires careful study, but the implementation becomes relatively direct once the dependency graph—protocols → constraints → corpus → certificates → training—is understood.


    Source date (UTC): 2025-12-02 18:29:13 UTC

    Original post: https://twitter.com/i/web/status/1995923221678620911

  • All language consists of measurement. (yes)
    There should be no reason that, if something is described in language, it can’t be modeled. The question is whether the LLM can be constrained to an operational model using language, or whether it must use a tool (shell out) to do so, as it does with programming. To some degree we should treat programming as the equivalent of humans using any measurement tool.
    In our work we force high-dimensionality questions into operational prose, sequences of tests, and distinct outputs (a sketch of that shape follows below). I can’t yet fully test its operationalization against the ternary logic hierarchy, since I need to finish what I’m working on first. But the partial tests work fine.
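
    A minimal sketch of that shape in Python, with every name invented for illustration (the source does not specify an implementation):

      # Toy rendering of "operational prose, sequences of tests, and
      # distinct outputs": a question becomes an ordered list of named
      # tests whose results are collected as distinct named outputs.
      from typing import Callable

      def run_test_sequence(question: str,
                            tests: list[tuple[str, Callable[[str], object]]]) -> dict:
          outputs = {"question": question}
          for name, test in tests:  # tests run in their declared order
              outputs[name] = test(question)
          return outputs
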
    But asking it how to fix a ’64 Ford carburetor, or something of that nature, is wholly dependent upon existing text. Which is true for anything in that real-world category.
    I don’t consider any of that very challenging. The robotics folks are tearing up the universe already. So between self-driving (perception and navigation), robotics (manipulation and transformation), and LLMs (concepts and language), it’s just a matter (just? 😉) of representing and interfacing the three domains. And we have data models and languages for doing so.
    Regardless of what others think, IMO the hard problem has always been language, and attention was the revolutionary leap that made it possible. Language is the system of measurement for humans at human scale.


    Source date (UTC): 2025-11-28 23:58:27 UTC

    Original post: https://twitter.com/i/web/status/1994556524400971860

  • We know how to solve the problem of computability using LLMs. I would argue that the foundation model producers don’t understand the problem, which is why they can’t solve it.
    We did. And it’s really, really hard.


    Source date (UTC): 2025-11-26 22:40:15 UTC

    Original post: https://twitter.com/i/web/status/1993812071356747997

  • I would argue that’s not quite true. The brain is possible to understand, at least functionally. If we treat LLMs as the language faculty, recognize that we’re brute-forcing the LLM’s world models via language, and note that we haven’t yet created the prefrontal cortex or consciousness, then every LLM behavior is obvious and predictable. The impediment to completing that circuit is that it dramatically increases costs.


    Source date (UTC): 2025-11-26 22:25:45 UTC

    Original post: https://twitter.com/i/web/status/1993808423214043355

  • In our opinion (our organization’s), this is true. The value of any AI is dependent upon the capacity of individuals to leverage extant AI. For the .001% of us, the value is infinite. But that value doesn’t scale enough to pay for the absurd cost of compute.

    I don’t know if “architectures” is the right frame; I might argue it’s contexts. One must know enough to ask the meaningful question. And the AI must know the context in order to meaningfully respond to it.


    Source date (UTC): 2025-11-26 22:22:41 UTC

    Original post: https://twitter.com/i/web/status/1993807650879098983

  • @dwarkesh_sp

    Unfortunately, you don’t know me or my organization, but in simple terms: the evals are measuring low-dimensionality, easy-closure domains (math, programming, tests) with non-existent liability, which we consider puzzles, whereas most problems are in high-dimensionality, hard-closure domains with attached liability. Ergo the evals overestimate the value of the AI in anything that is revenue-producing. 😉

    I work, and my organization works, in high-dimensional closure (real-world problems), which is where liability exists and the revenue to pay for AI exists. And oddly, there is no one else even vaguely in the space.

    So the evals are not indicative of the value of AI outside of easy closure (mathematics, programming, combinatorics).

    Cheers
    Curt Doolittle

    http://Runcible.com


    Source date (UTC): 2025-11-26 22:20:02 UTC

    Original post: https://twitter.com/i/web/status/1993806983120765323

  • You are incorrect. It is absolutely possible. We have done it. You are, like many in and out of the field, confusing LLMs as the equivalent of the human language faculty with LLMs as including the prefrontal cortex’s regulation.
    We create a governance layer that constrains navigation through the latent space as a set of tests through that space, and then emits a narrative of the result.
    This is what brains do.
    Over time the feedback from these outputs will constrain the latent space without human intervention.
    We are early in the development of AI.
    After our solution, we still need to solve episodic memory in a way that is not prohibitively expensive.
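
    A loose sketch of the loop described above (sample a candidate, test it, emit a narrative, keep passing outputs as future training signal), with every name an assumption:

      # Hypothetical governance loop: generate a candidate, run the
      # test battery over it, emit a narrative of the result, and log
      # passing outputs as anchors for later fine-tuning.
      def govern(llm, question, tests, log):
          # llm: callable str -> str; tests: list of (name, predicate)
          candidate = llm(question)
          results = {name: test(candidate) for name, test in tests}
          if all(results.values()):
              log.append((question, candidate))  # future training anchor
              return f"Verified: {candidate}"
          failed = [name for name, ok in results.items() if not ok]
          return f"Rejected draft; failed tests: {failed}"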

    Cheers


    Source date (UTC): 2025-11-24 18:02:10 UTC

    Original post: https://twitter.com/i/web/status/1993017312551878842

  • No. Our governance layer prohibits hallucination.
    Cheers


    Source date (UTC): 2025-11-24 01:27:46 UTC

    Original post: https://twitter.com/i/web/status/1992767064206160356

  • Disagree. They mirror the human language faculty with extraordinary accuracy. The fact that we must refine the constraints on the traversal and build a better context before navigating is merely a reflection of our stage of development.

    Curt Doolittle
    Runcible Inc.


    Source date (UTC): 2025-11-24 01:21:56 UTC

    Original post: https://twitter.com/i/web/status/1992765596078129347