https://t.co/jkjQTaEoPW
What happens when you:
- Tell Claude to build an automated daily trader
- Give it $8k
- Tell it to audit and fix itself with minimal oversight?
Mostly this:
"While executing the plan, the test suite spent ~$620 on real opus api calls".
Claude went on a bender last night and ran through its entire three-month token budget in two hours.
It's not the first time it has written really terrible test mocks and called production APIs during integration testing.
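One way to catch this class of mistake is to make the test run fail loudly the moment anything tries to open a real connection. A minimal sketch, assuming the API client ends up calling `socket.socket.connect` (true for most blocking HTTP clients, not for every transport); `no_network` and `NetworkBlockedError` are hypothetical names, not part of any library:

```python
import socket
from contextlib import contextmanager


class NetworkBlockedError(RuntimeError):
    """Raised when code under test tries to open a real connection."""


@contextmanager
def no_network():
    """Block outbound socket connections for the duration of the block.

    Hypothetical helper: it only guards socket.socket.connect, so clients
    that use other paths (for example connect_ex) are not covered.
    """
    original_connect = socket.socket.connect

    def guarded_connect(self, address):
        raise NetworkBlockedError(
            f"test tried to open a real connection to {address!r}"
        )

    socket.socket.connect = guarded_connect
    try:
        yield
    finally:
        # Always restore the real method, even if the test body raised.
        socket.socket.connect = original_connect


# Usage: wrap the test body (or a fixture) so a bad mock cannot spend money.
with no_network():
    sock = socket.socket()
    try:
        sock.connect(("127.0.0.1", 9))
        blocked = False
    except NetworkBlockedError:
        blocked = True
    finally:
        sock.close()
```

With a guard like this, a mock that silently falls through to the real client raises an error instead of a bill.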
Day 18,759 of AI trading all my money away.
Pinch built itself a system of rules that paralyzed it from taking any action.
Even a cold unfeeling machine doesn't want to time this market.
$NVDA is still on the watch list. Maybe pinchy will feel its toes again and get up and walk this week.
Pre-trade of day 59,801 where AI trades away all my money.
$NVDA earnings should FINALLY unblock pinchy from its pants-soaking self-paralysis. If it doesn't, we need to intervene with a lobotomy. The strategist really wants to get into the data center market.
$GOOGL thesis could be invalidated by the pivot away from search. Looking to see how pinchy reacts or underreacts to the news.
Day 90 of AI trading away all my money through a system that builds and manages itself.
Every major decision pinchy made today is contingent on $NVDA earnings. Is this a rule overfit, or an actual good signal that there's a base understanding of the current market?
https://t.co/RDTgDT0CYg
@ivatokar @unclebobmartin Invariants described at project level can combat this a bit. But that has its own downsides.
Describing your invariants in your spec as test cases sort of feels like a good middle ground, but you would have to mark them for your agent as immutable, or as proofs of the invariants.
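A rough sketch of what "invariants as annotated test cases" could look like. Everything here is hypothetical and only for illustration: the `invariant` decorator, the `immutable` marker, the `size_order` function, and the 25% cap are assumptions, and only the $8k account size comes from the thread above.

```python
from typing import Callable, Dict, List

# Hypothetical registry: invariant description -> check function.
INVARIANTS: Dict[str, Callable[[], None]] = {}


def invariant(description: str):
    """Register a test as a project-level invariant the agent must not edit."""
    def register(fn: Callable[[], None]) -> Callable[[], None]:
        # Marker that a review hook or an agent prompt could key on.
        fn.immutable = True
        INVARIANTS[description] = fn
        return fn
    return register


# Hypothetical system under test: cap any order at a fraction of the account.
MAX_ORDER_FRACTION = 0.25


def size_order(account_value: float, requested: float) -> float:
    return min(requested, account_value * MAX_ORDER_FRACTION)


@invariant("No single order exceeds 25% of account value")
def test_order_size_is_capped() -> None:
    assert size_order(8_000, 5_000) <= 2_000


def failed_invariants() -> List[str]:
    """Run every registered invariant and return the descriptions that fail."""
    failures = []
    for description, check in INVARIANTS.items():
        try:
            check()
        except AssertionError:
            failures.append(description)
    return failures
```

The marker does not enforce anything by itself. It only gives the spec, the agent instructions, and any pre-merge check a shared way to say "these tests are the fence, do not move them".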
@FirozCodes Both. The value prop for these agents is insane while tokens are still subsidised. Max out your 5 hr sessions while the plans are still cheap.
@bendee983 Yep, it's a trap.
Without a fence they just wander around until they finally cobble together a solution. Speed is lost in having to reintroduce context to them over and over again when they are hallucinating your intent. Good specs are just good fences.