@ctatedev I thought this too, but after experimenting with giving agents much more structured code info it doesn’t actually help much at all
They’re just really good at figuring anything out with plain text and making changes just with text, which is almost disappointing :D
Amazing. I had the exact same problem asking 5.4 not too long ago to come up with some benches for my VM using my lower level IR before I had that compiler phase properly working.
GPT produced some very plausible looking benches but it turns out they weren’t actually exercising anything interesting and sneakily papered over some actual VM bugs by “simplifying” _failing_ fixtures.
@VictorTaelin Nothing wrong with bootstrapping to C first, that’s exactly what Static Hermes is doing also.
So, no, no strong reason for the first iteration. Do whatever works fastest and easiest :)
Seems like an unlikely prediction to me, especially now that we’re over 6 months in and have seen what an absolute train wreck heavily agent authored code bases turn into at an astonishing and terrifying pace.
I don’t think delegating everything to agents will work even in the medium term for anything but basic apps, there’s just too much lost in translation for “last mile fixes” to work well (unfortunately!).
I mean you still have the IDE here, so you’ll be fine. But I think we need much more than just “manage fleets of agents better”, that’s just not a problem any experienced dev I know actually has.
@RhysSullivan I have been longing for Cursor to acquire Zed and just get a really good autocomplete in a really fast IDE for exactly this reason
I would much rather drive one very fast agent + autocomplete than 5 terminal tabs. Agent code review is torturous
Building apps has never been easier.
With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL.
Rolling out to Business and Enterprise plans, before expanding more broadly.
I once failed to specify that the compiler task was specific to one phase, and because it noticed that the old inference engine was stubbed out, Codex proceeded to clank out an entire new TS-grade inference phase into a single giant ungodly file that almost kind of worked.
Completely unusable, of course, but man, I was impressed. The goblins just don’t care.
@samuelcolvin This is genuinely surprising to me. I know you guys are actual professional engineers, so that helps, but even those few of us who still take code seriously make mistakes.
Maybe we just get to do more novel mistakes that AI can't find yet? Did anyone try the mythical Mythos?
Anthropic has confidentially submitted a draft S-1 registration statement to the Securities and Exchange Commission.
Pending completion of SEC review, this gives us the option to pursue an initial public offering.
Read more: https://t.co/onGZAhRLvD
@samuelcolvin Yeah, but you’re using this power tool responsibly like a professional, so you’re already not part of the discussion :D
Seriously though it depends on the work. Sometimes I try many approaches, or do hungry side tasks, or yeet out large bindings files, and that adds up on those days
@thsottiaux Fairly new bug but when it renders markdown tables (which, yay, great feature) it often starts truncating other parts of the text while streaming in the CLI. Once the message is finished it restores fine so it’s not a big deal, but it is slightly annoying :)
Hmm, not actually sure if this is true. It already feels only semi true and more economic / integration / contractual lock in based than actual merit based. Long term even more so probably.
The future is unknowable ofc, but for the work I do models have been smart enough for like half a year at least and not much has changed other than my own approach, except maybe some better instruction following.
We've raised $65 billion in Series H funding at a $965 billion post-money valuation, led by @AltimeterCap, Dragoneer, @Greenoaks, and @sequoia.
This investment will help us advance our research and expand our capacity to meet growing demand for Claude.
ZurichNLP#21 will be on Monday, June 8th at the @ETH_AI_Center!
Jannis Vamvas (University of Zurich) on the challenge of Romansh and Eric Chen (@EPFL) on reasoning as test time learning.
RSVP:
https://t.co/pjdh3ewyoc
@ThePrimeagen It is kind of amazing they had enough AI capability to do a Zig > Rust rewrite of an entire _JS runtime_ but _not_ enough to get their most profitable billion dollar product that is still slow and buggy 6 months in rewritten in a language that doesn’t require >1GB RAM / session.
@dhh For basic CRUD apps sure, do whatever you want, for anything non trivial I thought we had long settled that, yes, you definitely do want as much static correctness as you can get.