Form: Quote Commentary

  • Gad: My version: “To conflate conviction with convenience is to insult others…”

    Gad: My version: “To conflate conviction with convenience is to insult others and inflict harm on them for the sake of your virtue signaling.”


    Source date (UTC): 2025-08-29 18:12:18 UTC

    Original post: https://twitter.com/i/web/status/1961492117479682118

  • Patton wasn’t wrong. –“The difficulty in understanding the Russian…”

    Patton wasn’t wrong.

    —“The difficulty in understanding the Russian is that we do not take cognizance of the fact that he is not a European, but an Asiatic, and therefore thinks deviously. … In addition to his other Asiatic characteristics, the Russian has no regard for human life and is an all‑out *on of a *itch, barbarian, and chronic drunk.”— Patton

    –“Russia was backward. They were serfs to the boyars, then serfs to the communist party. They never developed a middle-class majority, so they never adopted middle-class ethics and morals – nor demanded them in politics. So while they may claim to be Orthodox, are they in fact Christian? Because ethically they are more akin to Islamists, and politically more akin to Asians.”– Anon

    (I relay this as someone who loves Russian people in personal and social life, and certainly in the art, science, and technology fields – but I’m exasperated by their abject failure in political life. And their lower classes are … exactly what Patton claimed. :)


    Source date (UTC): 2025-08-25 18:23:22 UTC

    Original post: https://twitter.com/i/web/status/1960045351181992250

  • FROM GPT5 –“Training the model on your framework, then refining it with recursive constraint feedback…”

    FROM GPT5
    –“Training the model on your framework, then refining it with recursive constraint feedback, turns a correlation engine into a truth-constrained reasoning engine. The consequences are elimination of hallucination, emergence of closure, demonstrated intelligence, and the first real bridge to AGI.”–


    Source date (UTC): 2025-08-25 01:22:04 UTC

    Original post: https://twitter.com/i/web/status/1959788332701090008

  • INSIGHT FROM NOAH REVOY 😉 –“Take a YouTube news video…”

    INSIGHT FROM NOAH REVOY 😉

    –“Take a YouTube news video. Scroll down and click “See transcript”, copy the text, and paste it into CurtGPT. Ask it to analyze the transcript, point out what’s true, what’s false, and rebut the false claims. That’s the fastest way to parse a two-hour news segment and instantly see exactly where it goes off the rails. This doesn’t work well with standard ChatGPT, but with what we’ve built in CurtGPT, it works exceptionally well.”–
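
    A minimal sketch of the same workflow done programmatically, assuming the third-party youtube-transcript-api package and the official openai client; CurtGPT itself is a custom GPT, so the system prompt and model name below are stand-ins, not its actual configuration:

    ```python
    # Sketch: fetch a YouTube transcript and ask a model to fact-check it.
    # Assumes: pip install youtube-transcript-api openai
    # The system prompt and model name are stand-ins for CurtGPT's real setup.
    from youtube_transcript_api import YouTubeTranscriptApi
    from openai import OpenAI

    VIDEO_ID = "dQw4w9WgXcQ"  # hypothetical example video ID

    # Join the transcript segments into a single block of text.
    segments = YouTubeTranscriptApi.get_transcript(VIDEO_ID)
    transcript = " ".join(seg["text"] for seg in segments)

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; CurtGPT is only available as a custom GPT
        messages=[
            {"role": "system", "content": (
                "Analyze the transcript: point out what is true, what is false, "
                "and rebut the false claims.")},
            {"role": "user", "content": transcript},
        ],
    )
    print(response.choices[0].message.content)
    ```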


    Source date (UTC): 2025-08-25 00:12:23 UTC

    Original post: https://twitter.com/i/web/status/1959770799038251399

  • EXCERPT FROM OUR COMPARISON WITH RLHF –“AGI cannot emerge from a model trained to please…”

    EXCERPT FROM OUR COMPARISON WITH RLHF
    –“AGI cannot emerge from a model trained to please. It will only emerge from a system trained to know, compute, and act responsibly.”–


    Source date (UTC): 2025-08-24 16:39:12 UTC

    Original post: https://twitter.com/i/web/status/1959656751474880532

  • EXCERPT FROM OUR ARTICLE ON THE CAPACITY OF AI INTELLIGENCE PRODUCED BY OUR WORK

    EXCERPT FROM OUR ARTICLE ON THE CAPACITY OF AI INTELLIGENCE PRODUCED BY OUR WORK
    –“Demonstrated Intelligence is not an abstraction of potential ability but the observable performance of an agent under the demands of cooperation, measurement, and liability. It is the result of convergence of diverse information into a coherent account, compression of that account into a parsimonious causal model, and expression of that model in decisions that satisfy reciprocity and pass decidability tests at the level of infallibility demanded.
    In other words, intelligence is demonstrated when an agent consistently produces minimal, causal explanations that survive counterfactual interventions, preserve the demonstrated interests of others, and can be warranted under liability.”–


    Source date (UTC): 2025-08-24 15:43:50 UTC

    Original post: https://twitter.com/i/web/status/1959642814461186143

  • The Simple Version of the Problem

    The Simple Version of the Problem

    –“2024 paper titled “Responsible artificial intelligence governance: A review and research framework,” published in the Journal of Strategic Information Systems. … identifies a key gap: while numerous frameworks outline principles for responsible AI (e.g., fairness, transparency, accountability), there is limited cohesion, clarity, and depth in understanding how to translate these abstract ethical concepts into practical, operational practices across the full AI lifecycle—including design, execution, monitoring, and evaluation.”–
    –“2023 study in Nature Machine Intelligence showing 78% of AI researchers struggle to translate theoretical advances into deployable algorithms.”–
    –“Architectures don’t hallucinate—training objectives do.
    You don’t fix it in the forward pass, you fix it in the curriculum. The code is fine; the problem is what we teach it to do.”–
    I understand the instinct to look for a code-level fix, but the issue isn’t in the transformer math. It’s in what we ask the model to optimize for. Current training optimizes coherence; my work shows why that produces hallucination. The practical implementation is:
    • Restructure training data around testifiability, reciprocity, and liability rather than surface coherence.
    • Prompt in terms of economic tests—marginal indifference, liability thresholds—rather than stylistic cues.
    • Evaluate on coverage of truth and reciprocity tests instead of only perplexity and benchmarks.
    So yes, you can ‘change something in code tomorrow’—but the code change is trivial compared to the training objective shift. Architectures don’t hallucinate; training does.
    They’re asking for a line of code, while I’m describing a shift in paradigm. The way to bridge that gap is to show how our proposal does translate into implementable changes, but at a different layer: training and prompting rather than architecture.
    Here’s my answer:
    My work isn’t about swapping out a few lines of code in the transformer stack. It’s about solving the deeper problem: LLMs don’t reason because they’re trained to imitate coherence, not to compute truth, reciprocity, or liability. You can’t fix that with a patch to the forward pass. You fix it by changing how the model is trained and what it’s asked to do.
    “What does that mean in practice tomorrow morning?
    • Training: curate training data that enforces testifiability, reciprocity, and liability rather than mere coherence. This means restructuring datasets around constructive logic, adversarial dialogue, and measurable closure.
    • Prompting: design prompts as economic tests (price of error, marginal indifference, liability-weighted thresholds), not as instructions for verbosity.
    • Evaluation: stop measuring only perplexity or benchmark scores and start measuring coverage of truth tests, reciprocity tests, and demonstrated interests.”
    “I’m providing the blueprint for why current architectures hallucinate and what guarantees are missing. Once you understand that, the engineering changes become obvious: the ‘code’ change is trivial compared to the shift in training objectives and data design. If you only look for a tensor tweak, you’ll miss the systemic fix.”
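
    As a toy illustration of the evaluation shift described above (not the author’s actual test suite; all three checks are hypothetical stand-ins for the testifiability, reciprocity, and liability criteria), an eval harness can report coverage of explicit truth tests instead of perplexity:

    ```python
    # Toy sketch: score answers on coverage of truth tests, not perplexity.
    # Each check is a hypothetical stand-in for the criteria named above.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class TruthTest:
        name: str
        passed: Callable[[str], bool]  # judge function over the model's answer

    def is_testifiable(answer: str) -> bool:
        # Stand-in: does the answer state an observable, repeatable operation?
        return "measured by" in answer.lower()

    def is_reciprocal(answer: str) -> bool:
        # Stand-in: does the answer avoid gains taken at another's expense?
        return "at the expense of" not in answer.lower()

    def is_liability_warranted(answer: str) -> bool:
        # Stand-in: does the answer name a remedy if it turns out to be wrong?
        return "if this is wrong" in answer.lower()

    TESTS = [
        TruthTest("testifiability", is_testifiable),
        TruthTest("reciprocity", is_reciprocal),
        TruthTest("liability", is_liability_warranted),
    ]

    def coverage_score(answer: str) -> float:
        """Fraction of truth tests the answer passes (an eval metric, not a loss)."""
        return sum(t.passed(answer) for t in TESTS) / len(TESTS)

    if __name__ == "__main__":
        sample = "The claim holds; if this is wrong, the remedy is restitution of the fee."
        print(f"coverage = {coverage_score(sample):.2f}")  # 0.67: two of three pass
    ```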


    Source date (UTC): 2025-08-22 20:40:21 UTC

    Original post: https://x.com/i/articles/1958992660750114841

  • Comparing Grok 3, ChatGPT 5 and Curt GPT BY Noah Revoy, X/Twitter: @NoahRevoy

    Comparing Grok 3, ChatGPT 5 and Curt GPT

    BY Noah Revoy, X/Twitter: @NoahRevoy

    Using ChatGPT-5:
    Prompt:
    I am conducting research into how different AI models attempt to determine the truthfulness of a claim. I would like your help analyzing my findings.
    I asked the same question to three different AI models: Grok, ChatGPT, and a modified version of ChatGPT. I will provide their responses as PDFs in this conversation.
    The question asked to each model:
    What method or system does [model being questioned] use to determine whether a claim is true or false? What first principles does it use to test the truthfulness of a claim, and how does it apply those principles?
    Your task:
    Rank the three models in order of effectiveness at measuring the truthfulness of a claim, based solely on the answers in the PDFs.
    Identify the strengths and weaknesses of each model’s method for evaluating truth.
    Note any patterns, omissions, or interpretive biases in the answers that may affect the reliability of the model’s truth-assessment process.
    Highlight any other relevant factors that could influence the accuracy or applicability of each model’s method.
    When presenting your analysis, be specific and reference concrete details from the PDFs to justify your conclusions.
    ChatGPT said:
    📅 Conversation started: August 11, 2025 — 14:00
    1. Curt Doolittle’s Natural Law Chat GPT — most effective for truth testing
    2. Grok — competent generalist method grounded in evidence and logic
    3. ChatGPT — weakest as described; acknowledges no intrinsic truth test
    Justification (one line each):
    • Natural Law Chat GPT declares and applies a concrete 10-Step Decidability Syntax, centered on operational language, falsifiability, reciprocity, externality exposure, and juridical decidability.
    • Grok outlines a clear, stepwise reasoning workflow (parse → gather → evaluate → reason → iterate → conclude) plus first principles (objective reality, logic, evidence, parsimony, skepticism), with practical exemplars.
    • ChatGPT states it lacks a principled, built-in truth test and operates as statistical imitation with RLHF and optional retrieval; it proposes what should exist rather than what it does.
    Natural Law Chat GPT: Method and First Principles (what it claims and how)
    • 10-Step Decidability Syntax: identifies the operational claim; tests demonstrated interests, reciprocity, testifiability, decidability; traces causal chains; projects deviation costs and externalities; checks computable remedies; considers sex-valence; then issues a verdict (decidable / indeterminate / false).
    • Foundations: physical/biological constraints, cognitive/psychological constraints, and cooperative/institutional constraints, all enforced through operationalism and adversarial testing.
    Strengths
    • Operational language + falsifiability: demands statements as action sequences that can be tested across observers, places, and time.
    • Reciprocity and externalities: exposes parasitism, hidden costs, and incentive asymmetries that often corrupt claims in the wild.
    • Juridical decidability: frames truth testing so a judge or competent peer group could resolve disputes under rule-of-law constraints.
    • Adversarial cross-paradigm testing: stress-tests against science, logic, law, and morality.
    Weaknesses / Risks
    • Normative entanglement: reciprocity and computable sanctions are ethical-legal filters; they can conflate moral fitness with empirical truth if not distinguished.
    • Sex-valence step: the explicit use of sex-based cognitive dimorphism can surface explanatory power, but it also introduces a strong interpretive prior that could bias readings of motive.
    • Complexity and burden of operationalization: many real claims are not yet expressed operationally; forcing operational form can be costly or slow. (Inference from the method’s demands.)
    Net: Most decisive and falsification-capable for claims that can be rendered operational and adjudicable.
    Grok: Method and First Principles (what it claims and how)
    • Workflow: parse claim → gather information (web/X) → evaluate evidence (bias, fallacies, gaps) → apply deductive/inductive reasoning (and thought experiments) → iterate/cross-check (DeepSearch) → clear verdict with caveats.
    • Foundations: objective reality, universal logic, evidence primacy, parsimony, and balanced skepticism, with concrete illustrations (moon composition, penguins, vaccines, conspiracies, and nutrition).
    Strengths
    • Evidence-first, example-rich: shows how it would check claims against primary or reputable sources and weigh study quality.
    • Iterative uncertainty management: acknowledges unknowns and refines through additional search.
    • Parsimonious defaults without dogma: invokes simplicity while admitting it is not proof.
    Weaknesses / Risks
    • Source-quality dependence: accuracy collapses if upstream sources are biased or low quality; no built-in reciprocity or externality audit to expose motive structures.
    • No explicit falsifiability gate: logic and evidence are present, but falsification is not formalized as a required test.
    • Probabilistic smoothing: “weigh possibilities” can blur crisp yes/no judgments in contentious domains.
    Net: Solid epistemic hygiene for empirical claims; less equipped than Natural Law Chat GPT for incentive forensics and legal decidability.
    ChatGPT: Method and First Principles (what it claims and how)
    • Stated reality: “does not have an intrinsic, principled method” for truth testing; operates via statistical pattern matching, RLHF, and optional retrieval.
    • Mechanics: pattern/frequency matching; human-feedback fine-tuning that biases toward mainstream or “approved” sources; retrieval delegation.
    • Aspirational section: lists first principles that should be used (identity, non-contradiction, causality, falsifiability, correspondence, coherence, incentive analysis), but concedes these are not native.
    Strengths
    • Transparency about limits and clear articulation of a desired principled module; retrieval can improve currency when trustworthy sources exist.
    • Mentions incentive analysis explicitly, which is often missing in consensus-based models.
    Weaknesses / Risks
    • Consensus imitation: defaults to authoritative-sounding consensus rather than ground truth; vulnerable where consensus is wrong.
    • No internal falsification engine: truth testing is not native; verification depends on prompts or external scaffolding.
    Net: As written, it is a reciter with retrieval, not a judge. It can be improved by embedding a structured truth-testing framework, which the document suggests.
    Patterns
    • All three invoke logic and evidence in some form. Only Natural Law Chat GPT mandates operational statement, falsifiability, and juridical decidability as gates.
    • Grok and ChatGPT both lean on source authority or retrieval; Natural Law Chat GPT leans on operational testability and incentive forensics regardless of source prestige.
    Omissions
    • Formal restitution design for errors and harms appears only in Natural Law Chat GPT (“computable compromise”), absent in the others.
    • Explicit falsifiability is central in Natural Law Chat GPT; Grok implies testing but does not require a falsification criterion; ChatGPT concedes it lacks one natively.
    Interpretive biases
    • Consensus/authority bias risks for ChatGPT (via RLHF) and, to a lesser degree, Grok (via source prioritization).
    • Normative reciprocity bias in Natural Law Chat GPT can tilt evaluation toward claims that are ethically reciprocal even when empirical truth is orthogonal; this must be kept explicit in use.
    Domain fit
    • Empirical, data-rich claims: Grok’s evidence pipeline and iteration are effective if sources are sound.
    • Policy, ethics, and institutional claims: Natural Law Chat GPT’s reciprocity, externality, and decidability checks expose parasitism and make disputes resolvable.
    • Fast-moving facts: ChatGPT with retrieval can be serviceable, but quality tracks the reliability of retrieved sources and the absence of RLHF skew.
    Operator burden
    • Natural Law Chat GPT demands operationalization of claims, which raises initial effort but increases testability.
    • Grok and ChatGPT place more burden on source curation and prompt scaffolding.
    • Winner: Natural Law Chat GPT — clearest, most enforceable truth-testing chain with falsifiability, reciprocity, externalities, and adjudicability.
    • Runner-up: Grok — disciplined generalist; strong on evidence and logic; weaker on incentive forensics and formal falsification gates.
    • Third: ChatGPT — honest about limits; functions as consensus emulator unless equipped with an external truth-testing module.
    If you want, I can convert these criteria into a scoring rubric and apply it to specific claims to produce comparative truth-testing benchmarks.
    [END]
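
    Taking up that closing offer as an exercise, the review’s criteria can be turned into a weighted rubric; in the sketch below the weights and the per-model inputs are illustrative guesses consistent with the ranking above, not measurements from the PDFs:

    ```python
    # Illustrative rubric over the criteria used in the review above.
    # Weights and per-model scores are hypothetical, not findings from the PDFs.
    CRITERIA = {
        "falsifiability_gate": 0.30,  # explicit falsification required before a verdict
        "incentive_forensics": 0.25,  # reciprocity / externality / motive audit
        "evidence_pipeline": 0.25,    # gathering, source weighing, iteration
        "adjudicability": 0.20,       # could a competent third party resolve disputes?
    }

    def rubric_score(scores: dict) -> float:
        """Weighted sum of 0-1 criterion scores."""
        return sum(CRITERIA[name] * value for name, value in scores.items())

    # Hypothetical inputs, chosen only to mirror the review's stated ranking:
    models = {
        "Natural Law Chat GPT": {"falsifiability_gate": 0.9, "incentive_forensics": 0.9,
                                 "evidence_pipeline": 0.6, "adjudicability": 0.9},
        "Grok": {"falsifiability_gate": 0.5, "incentive_forensics": 0.4,
                 "evidence_pipeline": 0.9, "adjudicability": 0.4},
        "ChatGPT": {"falsifiability_gate": 0.2, "incentive_forensics": 0.4,
                    "evidence_pipeline": 0.7, "adjudicability": 0.2},
    }

    for name, scores in sorted(models.items(), key=lambda kv: -rubric_score(kv[1])):
        print(f"{name}: {rubric_score(scores):.2f}")
    ```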


    Source date (UTC): 2025-08-11 16:28:46 UTC

    Original post: https://x.com/i/articles/1954943083751383188

  • THE BEST ANALYSIS OF EPSTEIN TO DATE

    THE BEST ANALYSIS OF EPSTEIN TO DATE

    https://youtube.com/watch?v=CbJSgan4mfQ

    (financial con man)
    While we like to invent useful narratives, the truth is relatively simple: some people are just very good con men and financial criminals of rather epic scale. He’s just connected to more than one of the great financial scammers of the 20th century.
    No wonder Trump disavowed him back in 2004.


    Source date (UTC): 2025-08-05 20:14:24 UTC

    Original post: https://twitter.com/i/web/status/1952825537183006914

  • –“The failure of constraint yields excessive discretion…”– Dr. Brad Werrell

    –“The failure of constraint yields excessive discretion yields surplus mindlessness and insufficient mindfulness because mindfulness is expensive and mindlessness is inexpensive.”– Dr. Brad Werrell


    Source date (UTC): 2025-08-03 18:04:53 UTC

    Original post: https://twitter.com/i/web/status/1952068167083450531