MORE ON ASSOCIATIVE CAPACITY IN LLMs
I should add that while we are self-impressed by the successes in math (other than counting), programming, and the search-and-synthesis composition of writing, we forget that we are discussing testable (closed) systems in mathematics and programming (which are relatively simple), and an untestable (unclosed) system in writing; while intuiting and reasoning have no closure except demonstration in the mind or in the world itself, both of which require embodiment, spatio-temporal models, and operational models.
As such I’d assume the operational world model of cars and androids would need to be combined with the linguistic model in order to produce the same, which is why world modeling (simulation) is so effective at training AIs, especially under time compression.
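To make the time-compression point concrete, here is a minimal sketch under assumed conditions: the toy simulated world, the crude value-update rule, and the episode counts are hypothetical placeholders, not any particular robotics or training stack. The point is only that a simulator can be stepped thousands of times per wall-clock second, so the learner accumulates far more state-transition experience than a physically embodied system could in the same time.

    import random

    # Hypothetical toy world: an agent moves along a line and is rewarded for reaching a goal.
    class ToyWorld:
        def __init__(self, goal=10):
            self.goal = goal
            self.pos = 0

        def step(self, action):
            # action is -1 (left) or +1 (right)
            self.pos += action
            reward = 1.0 if self.pos == self.goal else 0.0
            done = self.pos == self.goal
            return self.pos, reward, done

    # Tabular value estimates keyed by position; a stand-in for any learned policy/world model.
    values = {}

    def run_episode(world):
        world.pos = 0
        done = False
        while not done:
            # Mostly move right, with occasional random exploration.
            action = random.choice([-1, 1]) if random.random() < 0.2 else 1
            pos, reward, done = world.step(action)
            # Crude incremental update toward the observed reward.
            values[pos] = values.get(pos, 0.0) + 0.1 * (reward - values.get(pos, 0.0))

    # Time compression: thousands of simulated episodes per second of wall-clock time,
    # versus one slow real-world trajectory at a time for an embodied system.
    world = ToyWorld()
    for _ in range(10_000):
        run_episode(world)

    print(f"estimated value of the goal state: {values.get(world.goal, 0.0):.3f}")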
The question then is the bridge between language and action. Can the LLM model evolve (emergence) using language as a system of representation and measurement of both embodiment and space-time? I can’t see how without information density as high as what a simulation provides.
That doesn’t mean I’m right, however. 😉 All words may be measurements, but can LLMs evolve (emergence) a pseudo-language of their own to reflect the information density of simulations? You’d think so.
Cheers
Curt Doolittle
NLI
Reply addressees: @SCTempo @dwarkesh_sp
IN REPLY TO:
THE TRANSFORMER LIMITATION IN NEUROSCIENTIFIC PROSE:
Correct. Or stated in neuroscience (apologies if this is too dense to easily interpret): the prompt (language) invokes a set of relations (the text equivalent of episodic memories), but its network (auto-associative memory) of referents is of lower resolution than that of humans (facets, objects, spaces, places, locations, actors, generalizations, sequences, abstractions, causal relations, valences). It is limited to the relations present in the language of the prompt (the word-world model), not the human intuitionistic model (the sense-perception-embodiment world model), in which abstractions (first principles, logical associations) drawn from the entire corpus of extant and yet unstated or unknown abstractions (causal relations, valences) and first principles (logical relations in the sense-perception world model) are associated at every level, from neural microcolumns to regions to networks to a continuous stream of network adaptations.
As such, the world model of language (the word-world model) is one of low precision: it lacks the embodiment, spatio-temporal, and operational world models (precision) necessary for pattern identification (logical association) of that which is yet UNSTATED in language in sufficient density to cause association with the model (word-world model) produced in the LLM by the prompt.
I work on this issue, and this is why the prompt must include the logical relation you’re asking the LLM to consider: it cannot make that connection on its own.
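A minimal sketch of the difference, with a hypothetical send_to_llm() helper standing in for whichever client is actually used; the prompt wording is illustrative only, not a prescribed method:

    # Hypothetical helper; stands in for whichever LLM client is actually used.
    def send_to_llm(prompt: str) -> str:
        raise NotImplementedError("wire this to a real LLM client")

    # Implicit version: the causal relation is left unstated, and the model must
    # associate it on its own, which is exactly the weak point described above.
    implicit_prompt = (
        "The warehouse floor was wet and the forklift operator was in a hurry.\n"
        "What is the most likely outcome?"
    )

    # Explicit version: the logical relation the model is asked to use is stated
    # outright, so the word-world model invoked by the prompt already contains it.
    explicit_prompt = (
        "Premise 1: Wet floors reduce traction for forklifts.\n"
        "Premise 2: Reduced traction combined with haste raises the chance of a skid.\n"
        "Fact: The warehouse floor was wet and the forklift operator was in a hurry.\n"
        "Using only the premises above, state the most likely outcome and why."
    )

    # answer = send_to_llm(explicit_prompt)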
Now, on one hand I see this as a scaling problem, meaning one of the necessity of embodiment, spatio-temporal, and operational abstractions in the model, and of attention that is recursive (wayfinding) in order to cover the field of associations that the hierarchical temporal memory of the brain traverses so easily.
On the other hand, whether this problem is solvable within the LLM model by further increases in the emergence we’ve seen of late is hard for me to predict. In the meantime we are left with prompts and traditional pseudocode or software (chain of thought) to control what the model cannot control on its own, as it is otherwise still limited to the equivalent of a synthesizing search engine.
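A minimal sketch of what that external control can look like, assuming a hypothetical send_to_llm() helper and an arbitrary three-step decomposition: the control flow lives in ordinary software, and each step’s output is carried forward explicitly in the next prompt rather than left to the model’s own associations.

    # Hypothetical helper, as in the earlier sketch; replace with a real client call.
    def send_to_llm(prompt: str) -> str:
        raise NotImplementedError("wire this to a real LLM client")

    def controlled_reasoning(question: str) -> str:
        # The chain-of-thought scaffold is ordinary software: a fixed sequence of
        # sub-tasks, each appended to the running context before the next call.
        steps = [
            "List the entities and relations named in the question.",
            "State any causal or logical relations the question leaves unstated.",
            "Using the relations above, answer the question in one paragraph.",
        ]
        context = f"Question: {question}"
        for step in steps:
            context = f"{context}\n\nTask: {step}\n\nAnswer:"
            context = context + "\n" + send_to_llm(context)
        return context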
Cheers
Curt Doolittle
NLI
Original post: https://x.com/i/web/status/1889199816489783305