Category: AI, Computation, and Technology

  • I should have mentioned the fourth innovation, but (foolishly) didn’t consider i

    I should have mentioned the fourth innovation, but (foolishly) didn’t consider it important at the time I posted:

    iv) Limiting weights to eight bits reduces memory requirements without (apparently) affecting the outputs.

    So Experts, Update Range, Tokens, and Bits, combined with the reduction (synthesis) of OpenAI’s data, and in most cases further reduction into other frameworks, compress the work effort of inference.
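    Point iv, storing weights in eight bits rather than 32-bit floats, can be illustrated with a minimal symmetric-quantization sketch. This is my own toy example for illustration, not code from any of the models discussed:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization of float weights to 8 bits."""
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 storage."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller than float32, and the per-weight reconstruction
# error is bounded by half a quantization step (scale / 2).
```

    The point being that memory shrinks fourfold while each weight is recovered to within half a quantization step, which is why outputs are (apparently) unaffected.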


    Source date (UTC): 2025-01-28 18:29:26 UTC

    Original post: https://twitter.com/i/web/status/1884307815801761793

    Replying to: https://twitter.com/i/web/status/1884059118534811825



    IN REPLY TO:

    Unknown author

    (Doolittle on AI)
    RE: Deepseek Nonsense
    OK, I’ve been through the code that’s available. Not only is it obvious that the training code isn’t shared; from what I’ve gathered, they are afraid or ashamed to share it, for good reason.

    1) There are three innovations in the code that Deepseek used to save compute:
    i) A mixture of experts divides the problem vertically into silos.
    ii) Limiting the network hierarchy that’s updated (reinforced) divides the problem scope horizontally.
    iii) Predicting phrases instead of tokens reduces the network numerically (and, I suspect, produces more semantic value per byte, so to speak).

    2) They used existing code from Meta’s Open Source LLM, and slightly modified it.

    3) I am not positive, but given that the code thinks it’s OpenAI, the absence of the training code, and the similarity of the results, it appears that they either got a copy of the OpenAI weights, OR they traversed the OpenAI graph using multiple accounts instead of ‘training’ Deepseek from source data.

    4) In other words, yes, there are innovations, but they are micro-innovations on existing work and another example of intellectual property theft.
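    The first of the techniques above, routing each token to a small subset of experts, can be sketched as a toy mixture-of-experts forward pass. This is an illustrative simplification under my own assumptions, not DeepSeek’s actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d_model, top_k = 8, 16, 2

# Each "expert" is a small feed-forward weight matrix. Only top_k of the
# n_experts run per token, so compute scales with top_k, not n_experts.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x: np.ndarray) -> np.ndarray:
    logits = x @ router                    # score every expert for this token
    top = np.argsort(logits)[-top_k:]      # keep only the top_k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                   # normalized gate weights over top_k
    # Weighted sum of the selected experts' outputs; the others stay idle.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
```

    With 8 experts and top-2 routing, each token touches a quarter of the expert parameters, which is the vertical “silo” division described in point i.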

    Ergo: DeepSeek is little more than a raid on OpenAI’s intellectual property under the pretense of replicating the work effort. In the end, it does convert OpenAI’s private intellectual property into an open source that others can use, IF we can recreate the training code so that we can tweak the model and add new verticals (Experts) to it.

    I don’t have time for this kind of skullduggery, but our company’s future depends upon access to an AI we can train by adding an expert to it, a chain of thought for that expert, and an API consisting of a set of prompts to feed that chain of thought.

    Cheers
    CD

    Original post: https://x.com/i/web/status/1884059118534811825

  • RT @ThruTheHayes: IT’S ALL MOOT COMPUTE TO BOOT Is humanity going to use AI to s

    RT @ThruTheHayes: IT’S ALL MOOT COMPUTE TO BOOT

    Is humanity going to use AI to solve for something other than speeding the transfer of wea…


    Source date (UTC): 2025-01-28 15:42:07 UTC

    Original post: https://twitter.com/i/web/status/1884265709548757421

  • (Doolittle on AI) RE: Turns out it’s not complicated. So there is no moat. I sho

    (Doolittle on AI)
    RE: Turns out it’s not complicated. So there is no moat.
    I should probably write something about why there is no moat for AI: it takes advantage of the intrinsic data structure of sentences (language), and once the LLM strategy of brute-forcing an n-dimensional manifold representation was invented, the simplicity of the architecture for AI was exposed and its logic became just a commodity.

    Once you solve the interface problem – which LLMs do very well by using ordinary language as a programming language – the entire stack of human cognitive faculties is simply a matter of recursive filtering with recursive wayfinding.

    I’ve been writing about the simplicity of consciousness for years now, and it’s going to emerge as rather obvious even for those nonsense-philosophers who lack knowledge of neuroscience and cognitive science such that they are trapped in the fantastic just as their supernatural and theological peers remain.

    (Repost)
    Curt
    Doolittle


    Source date (UTC): 2025-01-28 03:05:05 UTC

    Original post: https://twitter.com/i/web/status/1884075194861645824

  • (Doolittle on AI) RE: Deepseek Nonsense Ok, been through the code that’s availab

    (Doolittle on AI)
    RE: Deepseek Nonsense
    OK, I’ve been through the code that’s available. Not only is it obvious that the training code isn’t shared; from what I’ve gathered, they are afraid or ashamed to share it, for good reason.

    1) There are three innovations in the code that Deepseek used to save compute:
    i) A mixture of experts divides the problem vertically into silos.
    ii) Limiting the network hierarchy that’s updated (reinforced) divides the problem scope horizontally.
    iii) Predicting phrases instead of tokens reduces the network numerically (and, I suspect, produces more semantic value per byte, so to speak).

    2) They used existing code from Meta’s Open Source LLM, and slightly modified it.

    3) I am not positive, but given that the code thinks it’s OpenAI, the absence of the training code, and the similarity of the results, it appears that they either got a copy of the OpenAI weights, OR they traversed the OpenAI graph using multiple accounts instead of ‘training’ Deepseek from source data.

    4) In other words, yes, there are innovations, but they are micro-innovations on existing work and another example of intellectual property theft.
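    The compute saving in point iii (predicting multi-token phrases rather than single tokens) reduces to simple arithmetic: if the model emits k tokens per forward pass instead of one, generating n tokens takes ceil(n/k) passes. A sketch of my own, not DeepSeek’s method:

```python
def decoding_passes(n_tokens: int, tokens_per_step: int) -> int:
    """Forward passes needed to emit n_tokens when the model predicts
    tokens_per_step tokens at each decoding step (ceiling division)."""
    return -(-n_tokens // tokens_per_step)

# Classic next-token decoding: one forward pass per token.
passes_single = decoding_passes(1000, 1)   # 1000 passes
# Predicting 4-token "phrases" per step cuts the pass count fourfold.
passes_phrase = decoding_passes(1000, 4)   # 250 passes
```

    The suspected gain in “semantic value per byte” is harder to quantify, but the reduction in forward passes alone is a direct inference saving.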

    Ergo: DeepSeek is little more than a raid on OpenAI’s intellectual property under the pretense of replicating the work effort. In the end, it does convert OpenAI’s private intellectual property into an open source that others can use, IF we can recreate the training code so that we can tweak the model and add new verticals (Experts) to it.

    I don’t have time for this kind of skullduggery, but our company’s future depends upon access to an AI we can train by adding an expert to it, a chain of thought for that expert, and an API consisting of a set of prompts to feed that chain of thought.

    Cheers
    CD


    Source date (UTC): 2025-01-28 02:01:12 UTC

    Original post: https://twitter.com/i/web/status/1884059118367113216

  • That feeling when you need to ask GROK to explain a joke

    That feeling when you need to ask GROK to explain a joke.


    Source date (UTC): 2025-01-27 19:06:43 UTC

    Original post: https://twitter.com/i/web/status/1883954810845905156

  • Well, it would be nice if the source was open. Without the training code the pro

    Well, it would be nice if the source were open. Without the training code, the inference code is intellectually curious but useless for the purpose of modifying the model.


    Source date (UTC): 2025-01-27 18:58:26 UTC

    Original post: https://twitter.com/i/web/status/1883952723403616584

    Reply addressees: @marcb_xyz @percyliang @deepseek_ai

    Replying to: https://twitter.com/i/web/status/1883698222453121302

  • These things take days to produce. If you’re disciplined you can do it with a se

    These things take days to produce. If you’re disciplined, you can do it with a sequence of prompts from GPT4o (general knowledge base) and then augment with Perplexity. I can show you how to do it. I have a particular gift (talent, or curse maybe) in that I perceive existence through this lens. So I only need to fill in the holes I see.
    I’ve noticed (we all have) that those who become good at the method (NL) all begin to ”see” things as spectra: what they were, what they are, and what they will be. Getting people to see the field of those possibilities is just a matter of practice. Other than myself, Luke and Brad are probably the best at it. Brad so much so that he finds it somewhat disturbing. 😉
    Like mathematics, you develop an intuition after a while. It’s the practice that produces the intuition that delivers the value. Because it rapidly discounts the cost of all reasoning in every field, just as math does in mathematics.

    Reply addressees: @RichardArion1


    Source date (UTC): 2025-01-24 19:25:45 UTC

    Original post: https://twitter.com/i/web/status/1882872434187292672

    Replying to: https://twitter.com/i/web/status/1882870345918726236

  • RT @curtdoolittle: @garytoobock @varunrdeshpande @pmarca –“Weird to call it Ope

    RT @curtdoolittle: @garytoobock @varunrdeshpande @pmarca –“Weird to call it OpenAI when it’s not open source?”–

    Which is Musk’s complain…


    Source date (UTC): 2025-01-24 19:20:15 UTC

    Original post: https://twitter.com/i/web/status/1882871052336083128

  • “Weird to call it OpenAI when it’s not open source?”– Which is Musk’s complaint

    –“Weird to call it OpenAI when it’s not open source?”–

    Which is Musk’s complaint. But then, in the end, we can almost guarantee that this was private sector money chasing what it thought would remain a private sector asset, one that will instead become an asset of the commons, because once understood it’s not even complicated. It’s just costly on current generations of hardware, which process vectors in parallel using ungodly numbers of watts. By contrast, while we are slower and more limited, we do it at something on the order of twenty watts.

    I mean, we went through the desktop boom, then the web boom, then the smartphone boom, and now the AI boom, but the exploitation-of-opportunity curve for each was much shorter than originally anticipated.

    Reply addressees: @garytoobock @varunrdeshpande @pmarca


    Source date (UTC): 2025-01-24 19:20:10 UTC

    Original post: https://twitter.com/i/web/status/1882871029665804288

    Replying to: https://twitter.com/i/web/status/1882870105706729542

  • I should probably write something about why there is no moat for AI, because it’

    I should probably write something about why there is no moat for AI: it takes advantage of the intrinsic data structure of sentences (language), and once the LLM strategy of brute-forcing an n-dimensional manifold representation was invented, the simplicity of the architecture for AI was exposed and its logic became just a commodity.

    Once you solve the interface problem – which LLMs do very well by using ordinary language as a programming language – the entire stack of human cognitive faculties is simply a matter of recursive filtering with recursive wayfinding.

    I’ve been writing about the simplicity of consciousness for years now, and it’s going to emerge as rather obvious even for those nonsense-philosophers who lack knowledge of neuroscience and cognitive science such that they are trapped in the fantastic just as their supernatural and theological peers remain.

    Curt Doolittle
    NLI

    Reply addressees: @varunrdeshpande @pmarca


    Source date (UTC): 2025-01-24 16:38:13 UTC

    Original post: https://twitter.com/i/web/status/1882830273941065728

    Replying to: https://twitter.com/i/web/status/1882731658849530106