New Preprint! https://t.co/qbiPEwZAIh
Question: Are humans doing next-token generation like LLMs?
One important source of behavioral evidence comes from surprisal: when LLMs are used to estimate how unexpected each word is in context, those surprisal values turn out to predict human reading times remarkably well. This convergence is consistent with the idea that humans, like LLMs, are running something like next-token prediction.
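For readers new to the term, surprisal is conventionally defined as the negative log probability of a word given its context. A minimal sketch follows; the function name and the example probabilities are hypothetical, and a real analysis would take the probabilities from an LLM's output distribution.

```python
import math

def surprisal_bits(prob: float) -> float:
    """Surprisal in bits: -log2 of the word's in-context probability."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)

# Hypothetical in-context probabilities, for illustration only.
example_probs = {"expected_word": 0.5, "unexpected_word": 0.03125}
example_surprisals = {w: surprisal_bits(p) for w, p in example_probs.items()}
```

Lower probability means higher surprisal, which is the quantity found to correlate with longer reading times.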
But surprisal only captures the final probability distribution the model outputs. It tells us what the model concludes about the next word, not the rich sequential computation that produced that conclusion. If we want to test whether humans and LLMs share something deeper than convergent output statistics, we need a measure of what the model is doing inside.
In this paper, I introduce trajectory extrapolation error: a measure that captures a model's internal representational geometry as it processes each word. Rather than asking what the model predicts, it asks how the model's internal state is moving, and how much each new word disrupts the trajectory it had established.
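This thread does not spell out the formula, so the following is only one plausible reading of the idea, not the paper's definition. It assumes each word has a hidden-state vector, extrapolates the next state linearly from the previous two, and measures the Euclidean distance between that extrapolation and the state actually reached. The function name, the linear extrapolation rule, and the toy vectors are all assumptions.

```python
import math
from typing import List, Sequence

def extrapolation_errors(states: Sequence[Sequence[float]]) -> List[float]:
    """Per-word error of a linear extrapolation of the hidden-state path.

    Assumption (not from the paper): the predicted state at step t is
    h[t-1] + (h[t-1] - h[t-2]), and the error is the Euclidean distance
    between that prediction and the actual h[t]. The first two steps
    have no error because two prior states are needed.
    """
    errors = []
    for t in range(2, len(states)):
        prev2, prev1, actual = states[t - 2], states[t - 1], states[t]
        # Continue the most recent step in a straight line.
        predicted = [2 * b - a for a, b in zip(prev2, prev1)]
        sq_dist = sum((p - x) ** 2 for p, x in zip(predicted, actual))
        errors.append(math.sqrt(sq_dist))
    return errors

# Toy 2-D "hidden states": the path moves right twice, then turns.
toy_states = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [2.0, 1.0]]
toy_errors = extrapolation_errors(toy_states)
```

In the toy path, the third state continues the established direction and the fourth departs from it, so the fourth gets the larger error. That is the sense in which a new word "disrupts the trajectory."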
I found that this measure independently predicts human reading times beyond surprisal, across multiple datasets and model architectures.
Why this matters: trajectory extrapolation error gives us a window into the model's actual sequential processing, not just its output. The fact that this internal geometry tracks human reading behavior is much stronger evidence for human/LLM correspondence than surprisal alone could provide.
Upcoming work shows that trajectory extrapolation error is also a better predictor of brain activity during language processing.
Artificial intelligences do not undergo experiences, do not possess a body, do not feel joy or pain, do not mature through relationships, and do not know from within what love, work, friendship or responsibility mean. Nor do they have a moral conscience, since they do not judge good and evil, grasp the ultimate meaning of situations, or bear responsibility for consequences. They may imitate or even simulate, but they do not understand what they produce, for they lack the affective, relational, and spiritual perspective through which human beings grow in wisdom. #MagnificaHumanitas