i used it about 12 hours a day for every day it was available. only in claude code.
it excelled at operating given ambiguity. give it a problem, even an underspecified one, and it would just figure it out.
it seemed to be able to think at a higher level of abstraction than gpt5.5 or opus.
it very clearly did a better job at solving tough architectural problems in code than other models.
it did, however, produce some of the most inane writing i've ever seen. it would make up shortcut words for concepts, and the sentences would become borderline unintelligible. usually i could fight through it, but maybe 30% of the time i genuinely couldn't figure out what it was saying.
@debreuil this took me a bit to parse but i'm reading it as "let ai reason about the idea rather than the implementation" and doing that by means of building a mathematical way of representing the relevant ideas. is that about right? curious to see where it goes!
@rexsalisbury @Austin_Federa the amount of mental exhaustion associated with a diy home reno is substantially worse than any company i've ever been a part of.
@troy_mccann awesome - i added a toy implementation of pcb design to a harness i'm building but i'd love to learn more about what it takes to do what you guys are doing - care to chat?
@notsikan I've seen references talking about Poetic's "english-defined deterministic programming language" - why is it that this is so useful? The implication seems to be that it's easier for models to write, but models are pretty good at writing code in existing programming languages.
@ryanbrewer this feature buys you a lot of goodwill. it's obvious that's what's going on (to me at least) and every time it does happen i think "thank you for not making me remedy the half-assed mess that would've otherwise been the result"
@patrickc my harness does the bulk of that. multi-user. custom processes. uses multiple providers simultaneously. general purpose (i've used it for designing PCBs). all artifacts have lightweight version control. needs a bit more ui polish but it's getting there.
I built a similar thing a few months back and found the painful part to be bid pricing. I'm having trouble following exactly how bid generation works in your model.
It looks like bids aren't generated by the LLM, but rather a new agent gets a bid above the frontier, and then that bid gets frozen. If using the fixed_with_eps scheme, does that mean a newly introduced agent will keep winning whenever it's eligible (until another new agent appears)?
I also saw the holland bid scheme in the code, which sounds like it would avoid this problem, but I don't see it discussed in the paper.
Any clarification would be helpful. Thanks!
@thsottiaux Codex is refusing to do any more work because I've hit my limits until a reset later tonight, but the status section shows I have plenty of usage left. Is this the cause?
Local minima are rare in high dimensions because a strict local minimum has to curve upward in every direction, so all Hessian eigenvalues must be positive.
In a D-dimensional toy model where eigenvalue signs are independent, that's a 2^(-D) event. In GOE-like random matrix models, positive definiteness is even rarer, roughly exp(-cD^2).
So as dimension grows, random critical points are much more likely to be saddles than minima. This is one reason high-dimensional optimization is often a saddle-escape problem, not a bad-local-minimum problem.
Wrote up some of the math here: https://t.co/vkaVqVD64N
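A rough way to check the claim numerically: sample GOE-like symmetric matrices, count how often they come out positive definite, and compare against the 2^(-D) toy baseline. This is only a sketch under assumptions of mine, not the code from the write-up. The helper names (`sample_goe`, `is_positive_definite`, `minimum_fraction`), the unnormalized variance convention (2 on the diagonal, 1 off it), and the trial count are all hypothetical choices.

```python
import math
import random


def sample_goe(d, rng):
    """Symmetric d x d Gaussian matrix (assumed convention: diagonal variance 2, off-diagonal 1)."""
    m = [[0.0] * d for _ in range(d)]
    for i in range(d):
        m[i][i] = rng.gauss(0.0, math.sqrt(2.0))
        for j in range(i + 1, d):
            m[i][j] = m[j][i] = rng.gauss(0.0, 1.0)
    return m


def is_positive_definite(m):
    """Try a Cholesky factorization; it succeeds exactly when m is positive definite."""
    d = len(m)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot: at least one eigenvalue is <= 0
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


def minimum_fraction(d, trials=20000, seed=0):
    """Monte Carlo estimate of the fraction of sampled matrices that look like a minimum's Hessian."""
    rng = random.Random(seed)
    hits = sum(is_positive_definite(sample_goe(d, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for d in range(1, 6):
        # Columns: dimension, GOE estimate, independent-signs baseline 2^(-D)
        print(d, minimum_fraction(d), 2.0 ** -d)
```

The GOE estimate should drop below the 2^(-D) column almost immediately, because eigenvalues of a random symmetric matrix repel each other and rarely all land on the same side of zero. Past D of 6 or so, plain sampling stops seeing any positive definite draws at this trial count, which is the point.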
@levie @HungLee also i think that the job is wildly difficult to productize given the fact that improvements in AI models are constantly changing how tasks should be solved, from a structural level. i wrote my own harness and it needs constant redesign in order to effectively leverage new models