@dwarkesh_sp
Unfortunately, you don’t know me or my organization, but in simple terms: the evals measure low-dimensionality, easy-closure domains (math, programming, tests) with nonexistent liability, which we consider puzzles, whereas most real problems sit in high-dimensionality, hard-closure domains with attached liability. Ergo the evals overestimate the value of AI in anything that is revenue-producing. 😉
I work, and my organization works, in high-dimensional, hard-closure domains (real-world problems), which is where liability exists and where the revenue to pay for AI exists. And oddly, there is no one else even vaguely in the space.
So the evals are not indicative of the value of AI outside of easy-closure domains (mathematics, programming, combinatorics).
Cheers
Curt Doolittle
http://Runcible.com
Source date (UTC): 2025-11-26 22:20:02 UTC
Original post: https://twitter.com/i/web/status/1993806983120765323