But is computer use (with real clicks) the right approach? I envision a world where apps expose more and more of their public methods (e.g. via MCP or whatever), so we could get rid of all, or at least large chunks, of the UI interactions. AI systems would then only need to call the specific functions.
Just for fun, I built a C# attribute framework on top of MCP so any UI method or event can be invoked directly by an LLM.
Results either get pulled via follow-up calls or, in the Claude Code CLI, passed straight back through a channel to the model, which can then kick off the next task.
Naturally, it also doubles as a regular unit testing toolkit.
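To make the idea concrete, here is a rough sketch of the pattern, not the actual framework: it is written in Python with a decorator standing in for the C# attribute, and it leaves out the MCP transport entirely. All names (`llm_invocable`, `SettingsForm`, `collect_tools`, `dispatch`) and the request shape are made up for illustration.

```python
import inspect


def llm_invocable(description):
    """Mark a method as callable by a model (stand-in for a C# attribute)."""
    def mark(fn):
        fn._llm_description = description
        return fn
    return mark


class SettingsForm:
    """Toy UI class; method names are hypothetical."""

    def __init__(self):
        self.theme = "light"

    @llm_invocable("Switch the colour theme")
    def set_theme(self, name):
        self.theme = name
        return f"theme set to {name}"

    def _repaint(self):
        # Not marked, so it is never exposed to the model.
        pass


def collect_tools(obj):
    """Find every marked method on obj and return {name: bound method}."""
    tools = {}
    for name, member in inspect.getmembers(obj, predicate=inspect.ismethod):
        if hasattr(member, "_llm_description"):
            tools[name] = member
    return tools


def dispatch(tools, request):
    """Run one tool call of the assumed shape {"name": ..., "arguments": {...}}."""
    fn = tools.get(request["name"])
    if fn is None:
        return {"ok": False, "error": f"unknown tool: {request['name']}"}
    result = fn(**request.get("arguments", {}))
    return {"ok": True, "result": result}
```

The same `dispatch` call works from a test runner, which is why a setup like this can double as a unit testing toolkit: a test just sends the request a model would send and asserts on the result.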
But what do we expect from a next-level OS? Maybe true bidirectional voice chat, like with a real person. Some BCI support? E.g. colors and wallpapers selected based on your mood?
Maybe a big change would be getting rid of settings menus. Everything would just be managed through chats and agents.
@TolentinoTeach I have never heard the "complicated" argument. But the "inefficient" argument, a lot. People around me want to speed up education, but this is not really possible.
@espeed @GaryMarcus That works great for a moderate number of tables. Up to hundreds of tables it works much better than one might think. However, we also tested a really large and badly structured DB with close to 2500 tables, and it was a complete disaster. That was with GPT 5.2 at the time.