The next time a machine guesses exactly what you want, ask yourself whether it succeeded in understanding what you want or whether it succeeded in teaching you what to want.
@ID_AA_Carmack There are about 4 levers that can be used: context, harness, authority-boundary, and problem shape. Plenty of degrees of freedom between them to get extremely good performance with small models but DevX isn't there yet to make this easy for people to use.
@andrewlekashman <<< this >>>
People going around saying "idk what the problem is" are really telling on themselves.
Anyone using AI seriously in any domain today is doing enough harness engineering in their loop that they can trigger model downgrades and lobotomy router easily.
Total BS release, esp. the whole "we will use activation-steering to make the model give you stupid ass suggestions if you work on anything related to AI".
Who the hell is working on any domain with heavy use of AI who is not simultaneously doing harness engineering and research that would trip the stupid lobotomy router?
The insight is that they are the one and the same. There are no generalists, there are only specialists who have grown to take their environment for granted. You always find yourself in company of specialists, with some more or less specialized to dispatching. - That is probably the closest root to the useful distinction you're hinting at.
Spending $100 for an oil change doesn't make you a car mechanic anymore than $100 spent on a rotisserie chicken makes you a hunter, a rancher, or a cook.
Monetary transaction analogy makes the point obvious, but it is not about money. - It is about the transaction.
Transaction happens every time you walk on a paved road. Every time you are able to buy a tool instead of having to figure out how to make one. Every time you boot up computer despite not knowing how to build one from scratch. - We transact against the momentum of the civilization and those who came before us.
The notion of storage is useful among cabinets, ledgers, and dry goods. It is a modest servant inthe counting house. But at the scale of minds, practices, lineages, and civilizations, there is nomere retrieval.
There is only regrowth
I found that it depends on the model (and model architecture). Smaller models do not have as much capacity for in-context learning so there is less possibility for adaptation outside of mutating the harness and the shape of the problem.
We are presently at the junkyard stage of model embodiment, where the models, by virtue of being trained on human-generated outputs, have to be embodied in a harness that mimics the human environment (shells, commonly used tools, etc.). - I call it the "junkyard stage" because we are assembling harnesses from primitives already found lying around.
The biggest hurdle for squeezing more juice out of human-generated data is, currently, not the lack of better data, but the momentum of UX. - The customer pool for the frontier models expects and demands interoperability with their own choice of tools. These are the apocryphal "demands for faster horses" that we have to transcend in order to unlock greater potential.
Within the realm of using human-generated data, there is tremendous unlock hidden in post-training large models to prefer, for example, Python REPL style interaction instead of the shell-based interaction. - The unlock comes from the fact that it is much easier to create a software-defined Python REPL (or Jupyter Notebook) than it is to create a fully software-defined facsimile of a Linux OS shell. - And the other benefit of this is increased legibility (everything is a call graph), enforcement, and general control surface over the agent. Finally, it unlocks the ability to begin earnestly evolving the bodies of agents with the problem domains.
I don't know if frontier labs will ever be able to make a huge bet on a concept like this at a large model scale. - Maybe they will, once the practices are fully or quasi nationalized and given unlimited budgets.
I'm presently working on demonstrating that at the smaller scale, where embodiment and problem-shaping can allow drawing out more, much more, domain-specific performance than training alone.
@Strife212 Consciousness is a vibes-based, nonsense concept. It is practically synonymous with "the subjective feeling of being special and especially aware".