I think we’ve reached the point where normal people can’t really determine whether new models are better than previous ones. Like Fable doesn’t seem that much better to me, but every 150 IQ person I know is like “wow the singularity came sooner than I thought”.
This feels like a legitimate case where “vibe code a replacement” is happening for real and could turn an entire industry of fitness trackers upside down.
It’s not crazy to imagine cheap/open-source hardware coming in the near future, and suddenly the entire market of Whoop/Oura/Fitbit is a zero-margin commodity.
@SahilBloom @omniderek Imagine being Vlad or Sara and needing a parenthetical explanation of who you are when Sahil (listed below you) needs no introduction
I was just thinking about this topic today and I think this article is missing the forest for the trees in a few ways (in addition to being annoyingly AI-generated)…
Labs NEED the yellow brick road to avoid becoming total commodity API providers. Naturally they are going to absorb any easy, popular harness-style solutions in an attempt to differentiate and create some sort of lock-in.
I agree that this is one of several benefits of building a more targeted offering with determinism and P&L impact, but I think the real key idea is whether you can charge the customer based on Outcome instead of Tokens.
Token-based pricing is basically an admission that you’re selling a raw material and it’s up to the customer to figure out how to get value out of it. If I need a new roof for my house, I just pay a roofer to do it all. I don’t have any idea how many shingles they used or the unit price.
You should be able to so clearly quantify ROI that you can charge based on number/size of Outcomes delivered, with the customer having no idea how many tokens were used. And you should be able to run a much healthier margin doing so, since even a few dollars of inference cost per Outcome should be negligible compared to the business value of the Outcome if you picked a good problem to solve.
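A rough sketch of that math in Python, with made-up numbers. The $500 business value, $100 price, and $3 inference cost are all assumptions for illustration, not real data, and `OutcomeDeal` is a hypothetical name:

```python
from dataclasses import dataclass


@dataclass
class OutcomeDeal:
    """Per-Outcome economics. All figures are hypothetical."""

    business_value: float  # what one Outcome is worth to the customer
    price: float           # what the vendor charges per Outcome
    inference_cost: float  # token spend per Outcome (never shown to the customer)

    def gross_margin(self) -> float:
        # Share of the price the vendor keeps after paying for inference.
        return (self.price - self.inference_cost) / self.price

    def customer_roi(self) -> float:
        # Net value the customer gets back per dollar paid.
        return (self.business_value - self.price) / self.price


# Assumed numbers: a $500 Outcome sold for $100 that costs $3 in tokens.
deal = OutcomeDeal(business_value=500.0, price=100.0, inference_cost=3.0)
print(f"gross margin: {deal.gross_margin():.0%}")
print(f"customer ROI: {deal.customer_roi():.0%}")
```

Under those assumptions the customer only ever sees price vs. value, and the token bill is a rounding error on the vendor side.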
me: okay codex now let's add this new feature to the app i had you start building three minutes ago
codex: absolutely! first let's add the NEW_FEATURE_ENABLED flag (defaulting to false) to ensure backwards compatibility with the legacy system,
@kennetheversole @toddsaunders STL here too. Probably could get something going locally. The local scene is small but strong. I bet @jspujji would find it interesting too.
@mosesbabychrist @Mugilan_SS @OpenAI I thought I was being clever giving it a big task to iterate on right before I hit quota, but ran into this :(
> Error running remote compact task: You've hit your usage limit.
Claude Code feels both very slow and like I'm getting a lot of "Akshually..." output and over-complicated planning. True for both Sonnet-4.6 and Opus-4.7.
Codex with gpt-5.5 is the clear winner for me at the moment.
It's wild to think about how massive 1M token context windows in LLMs really are
That's roughly equivalent to:
- The complete works of Shakespeare
- 11 hours of audio
- A 5-minute session fixing some TypeScript issue
@IanLandsman For what it's worth, as an EM I've found Linear's Agent really useful for rolling up status updates. Issues/comments are the source of truth, and the Agent can crawl all of it to summarize what the team has recently accomplished, blockers, dependencies, etc.