Two kinds of senior engineers right now:
One is dabbling with AI and waiting to see how it all shakes out.
The other has already pushed the agents to their breaking point, has hard-won opinions about where they fail, and is shipping in hours what used to take weeks.
We’re hiring the second one. Fully remote. https://t.co/GGAKVHvXOk
What "smart" model routers do you know of? Services or vendors that take queries and route each one to the model they deem most efficient, saving cost.
I sense there is massive demand for these, and there will be even more...
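For anyone wondering what "routing to the most efficient model" could look like, here is a minimal sketch. Everything in it is an assumption for illustration: the model names, the prices, the capability tiers, and the keyword heuristic in `estimate_difficulty` are all made up, and real routers typically use a trained classifier rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str              # hypothetical model name
    cost_per_query: float  # assumed flat cost in dollars
    capability: int        # assumed capability tier, higher = stronger


# Hypothetical catalog; names and prices are invented for this example.
CATALOG = [
    Model("small", 0.001, 1),
    Model("medium", 0.01, 2),
    Model("large", 0.10, 3),
]


def estimate_difficulty(query: str) -> int:
    """Toy heuristic: reasoning-heavy or long queries need a stronger tier."""
    hard_words = ("prove", "refactor", "debug", "architecture")
    q = query.lower()
    if any(word in q for word in hard_words):
        return 3
    if len(q.split()) > 30:
        return 2
    return 1


def route(query: str, catalog=CATALOG) -> Model:
    """Pick the cheapest model whose capability meets the estimated difficulty."""
    needed = estimate_difficulty(query)
    eligible = [m for m in catalog if m.capability >= needed]
    return min(eligible, key=lambda m: m.cost_per_query)
```

The cost saving comes from the `min` over eligible models: easy queries never reach the expensive tier. The hard part in practice is the difficulty estimate, which this sketch only fakes.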
NEW: malware developers added nuclear & biological weapons text to their spyware.
Goal? To trigger LLM safety refusals... so that their spyware wouldn't be analyzed by an AI security scanner.
Cleanest practical example I can think of for why over-indexing on first-order safety alignment is risky.
When closed (and open) models ship with aggressive refusals, they will be sprinkled with second-order blind spots that attackers will discover... and exploit.
We are only in the earliest days of attackers leveraging these features, and it wouldn't surprise me if users whose systems need to handle complex cybersecurity issues demand that models be less safety-blunted.
In the weeds: @SocketSecurity's post also shows why intention matters in how you design a malware analysis pipeline to avoid prompt manipulation.
H/T to colleagues that shared this with me https://t.co/f3Aj9TYxU4
- Composer 2.5: for $1 it scored 65%
- Fable: for $12 it scored 70%
Why would I use Fable for only a 5-point increase while paying 12x the price?
Am I missing something? @jediahkatz
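A quick back-of-the-envelope check of the trade-off, using only the two price and score figures quoted above. The function names are hypothetical helpers, and "cost per point" is just one possible way to frame the comparison; it says nothing about whether the harder last few points are worth more.

```python
# Figures quoted in the post: (price in dollars, score in percent).
results = {
    "Composer 2.5": (1.0, 65.0),
    "Fable": (12.0, 70.0),
}


def cost_per_point(price: float, score: float) -> float:
    """Average dollars paid per percentage point scored."""
    return price / score


def marginal_cost_per_point(cheap, pricey) -> float:
    """Extra dollars paid per extra percentage point when upgrading."""
    (cheap_price, cheap_score), (pricey_price, pricey_score) = cheap, pricey
    return (pricey_price - cheap_price) / (pricey_score - cheap_score)


composer = results["Composer 2.5"]
fable = results["Fable"]

price_ratio = fable[0] / composer[0]   # how many times more expensive
score_gain = fable[1] - composer[1]    # percentage points gained
marginal = marginal_cost_per_point(composer, fable)

print(f"price ratio: {price_ratio:.0f}x, score gain: {score_gain:.0f} points")
print(f"marginal cost: ${marginal:.2f} per extra point")
```

On these numbers the upgrade costs $11 for 5 extra points, so about $2.20 per extra point, against roughly 1.5 cents per point on average for the cheaper option.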
One of my personal favorite features announced at WWDC will, I suspect, be a sleeper hit: container machines, allowing your Mac to run a lightweight, persistent Linux environment with your home directory and repos automatically mounted: https://t.co/dOBdfOOVxC
“On June 23, we’ll remove Fable 5 from those plans. Using it after that will require usage credits.”
Ah, I see the era of subsidized plans is coming to an end.
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.
Its capabilities exceed those of any model we’ve ever made generally available.
I’m pretty AI-pilled. This loop stuff is slop. I respect @steipete for his innovation, but OpenClaw is a bloated, unstable pile of garbage because of stuff like this.
I’m all for loops of crons and webhooks where an AI agent wakes up and performs some task, like cleaning up, updating the docs, or triaging errors. I think these are great for standard, well-defined tasks with a fairly deterministic route (a.k.a. workflows).
I think what these guys are talking about now is jumping the gun. The models need to be guided, and you want to at least skim their output so you don’t end up with slop. Humans are far better planners and architects than models.
You absolutely shouldn’t delegate away prompting and reviews, in my opinion. This encourages the creation of crappy, buggy, unsafe software that actually hurts adoption.
Too many developers don't understand what "compounding slop" is.
A loop that prompts agents is a great way to automate slop creation. Constrain the state-action space so the loop can't drift, then automate inside it.
Human-in-the-loop = feature, not bottleneck.
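One way to read "constrain the state-action space, then automate inside it", sketched under loud assumptions: the `Action` names, the `run_loop` helper, and the `approve` callback are all hypothetical, and the agent is stood in for by a plain list of proposed action strings. The point is only the shape: a closed set of allowed actions, a step cap, and a human gate.

```python
from enum import Enum
from typing import Callable, Iterable


class Action(Enum):
    """Closed set of things the loop may do (hypothetical examples)."""
    UPDATE_DOCS = "update_docs"
    TRIAGE_ERROR = "triage_error"
    CLEAN_UP = "clean_up"


def run_loop(
    proposals: Iterable[str],
    approve: Callable[[Action], bool],
    max_steps: int = 10,
):
    """Apply only allowlisted, human-approved actions, up to max_steps proposals."""
    applied, rejected = [], []
    for step, raw in enumerate(proposals):
        if step >= max_steps:
            break  # hard cap so the loop cannot run away
        try:
            action = Action(raw)  # anything outside the enum raises ValueError
        except ValueError:
            rejected.append(raw)
            continue
        if approve(action):  # human-in-the-loop gate
            applied.append(action)
        else:
            rejected.append(raw)
    return applied, rejected
```

For example, `run_loop(["update_docs", "rewrite_auth"], approve=lambda a: True)` applies the docs update and rejects `"rewrite_auth"`, because that action is not in the allowed set. In a real setup `approve` would be a person reviewing a diff, not a lambda.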
Reminder that OpenClaw abusing flat-rate subscriptions by commandeering OAuth tokens is why the industry has moved to token-based billing and ruined things for everyone without deep, unlimited pockets.
I need Google Docs but just for markdown files.
Multiplayer comments, with syncing and resolving.
Suggestion mode
Edit mode
Edit history
Maybe some sense of multi edits.
Easy CLI access.
Y’all must be merging absolutely monstrous shit code if you’re running agents “24/7”
The models are more powerful than ever before but I’m still directing and touching their work constantly. I don’t think I’ve ever actually “one-shot” anything of substance
Bitcoin isn't crashing below $60k because Saylor sold 32 BTC.
It's crashing because $19 trillion of new AI market cap got created in 12 months... 13x the size of Bitcoin.
The most liquid risk asset on earth is being drained to fund the biggest IPO cycle since 2000.
I know people have fed https://t.co/F8etT392fz into LLMs and said "talk and think like this", and as a result have cut token consumption and gotten pushback on overly complicated features.
Optimism is not the belief that everything will be fine, but the belief that problems are SOLVABLE, combined with the willingness to actually go solve them. That thesis has built every good thing we have.
@ShrekOverflow My experience is that these things happen because of a failure to _actually_ prioritize, or because of repeatedly prioritizing short-term work. If you truly set a priority, and stick with it, it won't be a mystery to anyone.