Prediction error isn't enough. A model can be locally coherent and globally blind. On frame revision, metacognition, and what AI still can't do.
[https://t.co/ZJDUWoy9OP]
[5/5]
Full writeup with all 4 experiments (latent structure, degradation, surprise under artifact injection, action counterfactual).
Post: https://t.co/rPtfHb1o3w
Code: https://t.co/a8uADnu5jr
Always open to collaborating on world models / embodied AI. Reach out!
[1/5]
I ran 4 diagnostic probes on LeWM - https://t.co/6XBCSUnbeL
Read the thread to find out how far you can trust this JEPA predictor for planning.
🧵...
[4/5]
Ran an action counterfactual: same rollout, three conditions - real actions, shuffled actions, all zeros.
Real actions give ~30-40% lower error at steps 1–5. Gap closes by step 10.
The predictor does condition on actions, but only within a short window of roughly 5 steps. Beyond that it effectively ignores them.
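For anyone who wants to try the same probe on another predictor, here is a minimal sketch of the three-condition comparison. Everything in it is an assumption for illustration: `toy_step` is a made-up linear latent dynamics standing in for the real predictor, and `rollout_error` / `action_counterfactual` are hypothetical helper names, not LeWM's API or the code linked in the thread.

```python
import numpy as np

def rollout_error(step_fn, z0, actions, targets):
    """Roll the predictor forward open-loop and return the per-step L2 error."""
    z, errs = z0, []
    for a, target in zip(actions, targets):
        z = step_fn(z, a)
        errs.append(float(np.linalg.norm(z - target)))
    return np.array(errs)

def action_counterfactual(step_fn, z0, actions, targets, seed=0):
    """Same rollout under three conditions: real, shuffled, and all-zero actions."""
    rng = np.random.default_rng(seed)
    shuffled = actions[rng.permutation(len(actions))]  # same actions, wrong order
    zeros = np.zeros_like(actions)
    return {
        "real": rollout_error(step_fn, z0, actions, targets),
        "shuffled": rollout_error(step_fn, z0, shuffled, targets),
        "zeros": rollout_error(step_fn, z0, zeros, targets),
    }

# Hypothetical stand-in for a learned latent predictor (assumption, not LeWM).
def toy_step(z, a):
    return 0.9 * z + a

# Build a ground-truth latent trajectory from the same toy dynamics.
rng = np.random.default_rng(1)
T, d = 10, 4
actions = rng.normal(size=(T, d))
z0 = rng.normal(size=d)
targets, z = [], z0
for a in actions:
    z = toy_step(z, a)
    targets.append(z)
targets = np.array(targets)

results = action_counterfactual(toy_step, z0, actions, targets)
for name, errs in results.items():
    print(name, np.round(errs, 3))
```

In this toy the predictor matches the dynamics exactly, so the real-action error is zero at every step and the gap to the other two conditions never closes. A predictor that ignores actions past a short window would instead show the three curves converging at longer horizons.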
A more elegant, possibly hybrid-symbolic, approach would not need this scaling trickery to solve the problem. It makes you wonder whether the engineering tricks used in modern LLMs are really just forcing an unsuitable solution to work rather than finding a better approach.
Reproducing Marcus' experiment from The Algebraic Mind on the inability of multiple-nodes-per-variable neural networks to generalize universally quantified one-to-one mappings (UQOTOM)
Never having seen a training example with a 1 in the last digit of the array, the model quickly learns to force the last digit to 0 ...
It turns out we can! Without changing anything except the scaling of the input data, we can improve the validation results on the final digit. This solution is not elegant and does not solve the underlying problem Marcus is pointing at: backprop being local ...
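The baseline failure described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the code from the thread: it uses 4-bit inputs, a single sigmoid layer trained by plain gradient descent on the identity mapping, and zero-initialized weights so the run is deterministic. The input-scaling fix is not shown, since the thread does not say which rescaling was used.

```python
import numpy as np

N_BITS = 4  # assumed array width, chosen for illustration

def to_bits(n, width=N_BITS):
    """Binary digits of n, most significant first, as floats."""
    return np.array([(n >> i) & 1 for i in reversed(range(width))], dtype=float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Train on even numbers only (last digit always 0); test on odd numbers.
X_train = np.stack([to_bits(n) for n in range(0, 2**N_BITS, 2)])
X_test = np.stack([to_bits(n) for n in range(1, 2**N_BITS, 2)])

# One sigmoid layer trained to copy its input (identity target).
W = np.zeros((N_BITS, N_BITS))
b = np.zeros(N_BITS)
lr = 1.0
for _ in range(2000):
    P = sigmoid(X_train @ W + b)
    grad_logits = (P - X_train) / len(X_train)  # cross-entropy gradient
    W -= lr * X_train.T @ grad_logits
    b -= lr * grad_logits.sum(axis=0)

test_pred = sigmoid(X_test @ W + b)
print("weights out of the last input:", W[-1])
print("predicted last digit on odd inputs:", np.round(test_pred[:, -1], 3))
```

Because the last input is 0 in every training example, the weights leaving it never receive a gradient, while the last output unit is only ever pushed toward 0. The other digits copy over fine on odd inputs; the last digit stays near 0 even though the input there is 1.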
@pmddomingos Wouldn't it be their whole "conscious" life, since we may not remember things from adolescence? Or do you believe those early histories still affect us subconsciously?
What's the point of following every twist and turn of the OpenAI drama in real time? Let's go back to tweeting about science and wait for the recap in three days, after the dust settles.