@RhysSullivan straight to the docs
I want to be able to do a quick scan to orient myself and jump in wherever makes sense
If I'm sending stuff to an agent it's also just a link to the docs
Codex can now deploy and host websites for you using Sites!
This includes storage for data and files using D1 and R2 🔥
We started rolling out to Business and Enterprise customers before rolling out more broadly.
More details:
@staysaasy I've gone back to writing by hand a bunch - trying to be a lot more deliberate about flexing the brain muscles. Similarly, I find myself reading a lot more, explicitly trying to prioritize time to read & reflect
@chythram1 would love to make this a one-click Deploy to Cloudflare thing. Would be my first time figuring out how to do that so will keep some notes on friction etc.
One thing I think will be problematic is generating an AI Gateway. That's admin dash only now, right?
Tried using Gemini in Google Sheets to convert a cell F4 from one currency to another in F5 since I couldn't remember the formula
First, it answered in the chat with the wrong exchange rate
Second, it only reached for the live exchange rate `=GOOGLEFINANCE("CURRENCY:GBPDKK")` when prompted by me. Even then it only printed the correct formula back in the chat instead of applying it in F5.
Verdict: Slop
it's in gemini, just create it in ai studio. oh, that's for your personal google one account. for workspace you need gemini business. no, not gemini advanced, that's ai pro now. unless you need ai ultra. oh agents? you do that in spark actually. no, not gemini api managed agents, that's different. for coding use jules. unless you mean the agentic ide, that's antigravity. no, that's the old antigravity, download the new one. actually gemini cli is being deprecated, use antigravity cli. no the flash model is smarter than the pro model. unless you need pro. if it's video, use flow. no, flow uses veo. no, nano banana is images. actually that's in gemini now. unless you're in search, then it's ai mode. no, research is notebooklm. anyway it's all very simple.
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
@benjitaylor any chance you could do something about the bookmarked folder flow? on mobile for example I'd love to be able to tap and hold the bookmark to jump to folder picker
on desktop it is also clunky, multi-step, wait for the toast, click add to folder etc
Got a chuckle out of reading Brian's Coinbase layoff memo where he says every manager will have 15 direct reports and also be an IC
This is what I mean when I say that every tech executive has AI psychosis
🚨 New Experiment: Everyone thinks AI firms will look like little companies. A manager model decomposes the task and worker models do subtasks. The manager red-teams, revises, and recombines. A seemingly simple org chart.
But when I ran the experiment, the current in-vogue org setup, manager-subagent, cost 4x more and performed worse than letting a rather simple market do the trick.
I tested 3 ways to organize multiple AI models:
1. Solo: One frontier model does everything itself
2. Hub-Spoke: A "manager" model splits tasks, delegates, red-teams, revises
3. Market: Models bid on tasks, winner gets the job, reputation updates
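A rough sketch of what the market setup in (3) could look like. The thread doesn't say how bids are formed or how reputation is updated, so everything here is an assumption for illustration: the `Agent` class, the `skill` table, bids scored as confidence times reputation, and the moving-average update are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    skill: dict[str, float]   # hypothetical self-assessed competence per task type, in [0, 1]
    reputation: float = 0.5   # assumed neutral starting reputation

    def bid(self, task_type: str) -> float:
        # Assumption: an agent bids its self-assessed confidence for the task type.
        return self.skill.get(task_type, 0.0)


def run_auction(agents: list[Agent], task_type: str) -> Agent:
    # The winner is the agent with the highest reputation-weighted bid.
    return max(agents, key=lambda a: a.bid(task_type) * a.reputation)


def update_reputation(agent: Agent, success: bool, rate: float = 0.2) -> None:
    # Moving average: reputation drifts toward 1.0 on success, 0.0 on failure.
    target = 1.0 if success else 0.0
    agent.reputation += rate * (target - agent.reputation)


agents = [
    Agent("a", {"coding": 0.9, "reasoning": 0.2}),
    Agent("b", {"coding": 0.4, "reasoning": 0.8}),
]
winner = run_auction(agents, "coding")    # "a" wins on the stronger coding bid
update_reputation(winner, success=False)  # a failed job lowers its future bids
```

No central model has to know what the subtasks are or who is good at what; the bids and the reputation history carry that information.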
I also tested 3 types of tasks - Coding, Reasoning and Synthesis.
- Coding required most "global state" management, which the solo model did best at. In future @a1zhang's RLM will probably do even better here
- Reasoning is the hardest to cleanly decompose, and the market worked the best here
- Synthesis too, the market beat hub-spoke as the framing could be ambiguous
The reason is, a hub isn't a "manager" as we know it. It's a model that must somehow know:
- What the subtasks are
- What good recomposition looks like
And if either fails, as it does for complex or not-easily-decomposable tasks, competent workers still produce garbage.
As we move from coding to letting multi-agent systems do work across the entire economy we'll end up with more not-easily-verifiable tasks with ambiguous settings and uncertain payoffs. In those, we won't be able to use the factory approach to get work done.
The Coasean argument is that firms will get smaller, and the smaller firms will transact more, since the organisational premium reduces with AI. But how? Through central hubs, or markets? The fact is, Coase here needs Hayek. Setting up markets is not trivial, as @AndreyFradkin and I looked at in our recent paper.
Essay: https://t.co/kK3gMQfbCs