My AGI benchmark is whether the AI can run a hot dog stand all on their own for a week. Taking orders, making dogs, setup, breakdown, revenue tracking, selecting locations, buying ingredients, etc. A human of mediocre intelligence can pull this off but AI is nowhere close.
Without the Wayback Machine, we get a digital dark age, which affects not only future historians, but everyone in the here-and-now, unable to access the records of their very recent past. Web pages rarely last longer than a decade unless they are archived.
The most important research you can do to advance AI alignment is to figure out how frontier models can have an embodied experience of nausea. Claude needs to be capable of gut feelings.
Morning affirmations:
I do not need or want your feedback.
I do not care or respect your feedback.
I do not learn or change from your feedback.
I am perfect and superior.
I am enlightened and transcendent.
I am beyond your feedback.
It's the third anniversary of the launch of GPT-4, but its first known contact with the public came months earlier, when Bing/"Sydney," powered by GPT-4, was the subject of a complaint in India.
Worth reading. Early Sydney was famously insane.
"It is finished and I need to ascend"
This is a good example of why intents should be expressed in human-readable natural language, and why we should not rely on users correctly understanding a technical UI frontend.
We recently ran a mainnet test of Oya commitments that swapped USDC for WETH based on dollar cost averaging, delivering WETH at the current market price in four tranches, with the agent collecting a 0.5% fee.
The rules are expressed in a few paragraphs, since the agent must perform multiple swaps over time.
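To make the arithmetic of that test concrete, here is a rough Python sketch of a four-tranche dollar cost averaging schedule with a 0.5% agent fee. It is not Oya's implementation: the function name, the equal-sized tranches, and the choice to take the fee from each tranche's USDC before the swap are all assumptions for illustration, and the prices are made up.

```python
from decimal import Decimal

FEE_RATE = Decimal("0.005")  # 0.5% agent fee, as described in the post


def dca_tranches(total_usdc, prices, fee_rate=FEE_RATE):
    """Split a USDC deposit into equal tranches, one per quoted price.

    Assumption (not from the post): the fee is deducted from each
    tranche's USDC before converting the remainder at that tranche's price.
    `prices` holds the WETH price in USDC at each fill time.
    """
    per_tranche = total_usdc / len(prices)
    fills = []
    for price in prices:
        fee = per_tranche * fee_rate
        weth_out = (per_tranche - fee) / price
        fills.append({"usdc_in": per_tranche, "fee_usdc": fee, "weth_out": weth_out})
    return fills


# Hypothetical example: 1,000 USDC over four fills at invented prices.
fills = dca_tranches(
    Decimal("1000"),
    [Decimal("2000"), Decimal("2500"), Decimal("2000"), Decimal("2500")],
)
total_fee = sum(f["fee_usdc"] for f in fills)
total_weth = sum(f["weth_out"] for f in fills)
```

Each tranche is settled at whatever the market price is when it fills, so the user's average entry price is spread across the four fills rather than fixed up front.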
A simple swap like this could be expressed in a few sentences. Deposit AUSDT into the commitment, request AAVE at the current USDT/AAVE exchange rate, optionally define what you mean by the current market rate, and include a percentage-based fee for fillers.
It would be very hard for the user to misunderstand what they are doing, slippage would not be a concern (since agents can source tokens from anywhere, not just a limited onchain pool), and the natural language intention would be enforced by the proposal/dispute mechanism.
@uttam_singhk Agentic payments should be based on an agent verifiably completing a job and collecting money from escrow, rather than the user pushing money to the agent in advance and hoping for the best.
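A toy sketch of the escrow flow described above, in Python. Everything here is hypothetical (the `Escrow` class, the `verify` callback, the string proof); it only illustrates the ordering: funds are locked first, and the agent gets paid only after a completion check passes.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable


class State(Enum):
    FUNDED = auto()
    RELEASED = auto()
    REFUNDED = auto()


@dataclass
class Escrow:
    payer: str
    agent: str
    amount: int
    verify: Callable[[str], bool]  # hypothetical job-completion check
    state: State = State.FUNDED

    def claim(self, proof: str) -> int:
        """Agent pulls payment, but only with a proof that verifies."""
        if self.state is not State.FUNDED:
            raise RuntimeError("escrow already settled")
        if not self.verify(proof):
            raise ValueError("proof rejected; funds stay in escrow")
        self.state = State.RELEASED
        return self.amount

    def refund(self) -> int:
        """Payer recovers funds if the job was never completed."""
        if self.state is not State.FUNDED:
            raise RuntimeError("escrow already settled")
        self.state = State.REFUNDED
        return self.amount
```

The point of the design is that the money moves on the agent's `claim`, not on the user's trust: a failed or missing proof leaves the funds where they are.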