andrew

Verified account

@armillspaugh

Seattle, WA

Joined October 2012

4.6K Following

637 Followers

4.4K Posts

5 days ago

i used it about 12 hours a day for every day it was available. only in claude code. it excelled at operating given ambiguity. give it a problem, even an underspecified one, and it would just figure it out. it seemed to be able to think at a higher level of abstraction than gpt5.5 or opus. it very clearly did a better job at solving tough architectural problems in code than other models. it did, however, produce some of the most inane writing i've ever seen. it would make up shortcut words for concepts, and the sentences would become borderline unintelligible. usually i could fight through it, but maybe 30% of the time i genuinely couldn't figure out what it was saying.

0

0

0

0

34

6 days ago

@jared_western yikes

1

0

0

0

164

7 days ago

@debreuil this took me a bit to parse but i'm reading it as "let ai reason about the idea rather than the implementation" and doing that by means of building a mathematical way of representing the relevant ideas. is that about right? curious to see where it goes!

1

1

0

0

11

7 days ago

@tamaybes occasionally it'll write a sentence that i'll read ten times over and still have no clue what it's saying.

0

3

0

0

491

7 days ago

@debreuil are you building a product or is this just for fun?

1

0

0

0

8

7 days ago

@plsoulie @fabledoteng fwiw, the product looks great!

0

1

0

0

23

7 days ago

@rexsalisbury @Austin_Federa the amount of mental exhaustion associated with a diy home reno is substantially worse than any company i've ever been a part of.

0

1

0

0

172

7 days ago

@troy_mccann awesome - i added a toy implementation of pcb design to a harness i'm building but i'd love to learn more about what it takes to do what you guys are doing - care to chat?

1

0

0

0

6

7 days ago

@kaledzenkov dm me! building a hardware-in-the-loop harness with my sights set on factories.

0

0

0

0

37

7 days ago

@notsikan I've seen references talking about Poetic's "english-defined deterministic programming language" - why is it that this is so useful? The implication seems to be that it's easier for models to write, but models are pretty good at writing code in existing programming languages.

0

0

0

0

29

7 days ago

@l2k this is awesome! what do you think you could've done differently to reduce the likelihood of failure? better simulation?

0

0

0

0

401

8 days ago

@levie I think that making "the untrainable" easy to build and iterate on is going to be a big unlock. basically a platform for designing skills.

0

0

0

0

142

11 days ago

@ryanbrewer this feature buys you a lot of goodwill. it's obvious that's what's going on (to me at least) and every time it does happen i think "thank you for not making me remedy the half-assed mess that would've otherwise been the result"

0

0

0

0

182

12 days ago

@patrickc my harness does the bulk of that. multi user. custom processes. uses multiple providers simultaneously. general purpose (i've used it for designing PCBs). all artifacts have lightweight version control. needs a bit more ui polish but it's getting there.

Photo: https://t.co/VVY33FlR6f

0

0

0

2

399

12 days ago

I built a similar thing a few months back and found the painful part to be bid pricing. I'm having trouble following exactly how bid generation works in your model. It looks like bids aren't generated by the LLM, but rather a new agent gets a bid above the frontier, and then that bid gets frozen. If using the fixed_with_eps scheme, does that mean a newly introduced agent will keep winning whenever it's eligible (until another new agent appears)? I also saw the holland bid scheme in the code, which sounds like it would avoid this problem, but I don't see it discussed in the paper. Any clarification would be helpful. Thanks!

0

0

0

0

30
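The concern in the bid-pricing question above can be made concrete with a toy model. This is only one reading of the scheme as the post describes it, not the paper's implementation: `ToyAuction`, `add_agent`, `winner`, and the `eps` default are all hypothetical names and values chosen for illustration. Each new agent gets a bid just above the current highest bid (the "frontier"), the bid is then frozen, and the highest eligible bid wins.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyAuction:
    """Hypothetical fixed-bid auction: bids are set once on entry and never updated."""

    eps: float = 0.01  # assumed margin above the frontier
    bids: dict[str, float] = field(default_factory=dict)

    def add_agent(self, name: str) -> None:
        # The frontier is the highest bid seen so far (0.0 if there are no agents yet).
        frontier = max(self.bids.values(), default=0.0)
        # The new agent's bid sits just above the frontier and is frozen from here on.
        self.bids[name] = frontier + self.eps

    def winner(self, eligible: list[str]) -> str:
        # The highest frozen bid among the eligible agents wins.
        return max(eligible, key=lambda name: self.bids[name])


if __name__ == "__main__":
    auction = ToyAuction()
    for name in ("a", "b", "c"):
        auction.add_agent(name)
    print(auction.winner(["a", "b", "c"]))  # the most recently added agent
    print(auction.winner(["a", "b"]))       # the newest among those eligible
```

Under these assumptions the newest eligible agent always wins, which is the behavior the question asks about. A scheme that adjusts bids after each round would not necessarily share it.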

14 days ago

@thsottiaux Codex is refusing to do any more work because I've hit my limits until a reset later tonight, but the status section shows I have plenty of usage left. Is this the cause?

0

1

0

0

651

armillspaugh retweeted

Grant Stenger (hiring)

19 days ago

Local minima are rare in high dimensions because a strict local minimum has to curve upward in every direction, so all Hessian eigenvalues must be positive. In a D-dimensional toy model where eigenvalue signs are independent, that’s a 2^(-D) event. In GOE-like random matrix models, positive definiteness is even rarer, roughly exp(-cD^2). So as dimension grows, random critical points are much more likely to be saddles than minima. This is one reason high-dimensional optimization is often a saddle-escape problem, not a bad-local-minimum problem. Wrote up some of the math here: https://t.co/vkaVqVD64N

34

2K

192

2K

304K
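The 2^(-D) figure in the retweeted post above can be checked numerically for the toy model it describes, where each Hessian eigenvalue is independently positive with probability 1/2. The sketch below covers only that toy model; it does not attempt the GOE case, and the function names are made up for illustration.

```python
import random


def prob_all_positive(dim: int) -> float:
    """Exact probability that `dim` independent fair signs are all positive."""
    return 0.5 ** dim


def estimate_all_positive(dim: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability, using a seeded generator."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A "minimum" in the toy model: every one of the dim signs comes up positive.
        if all(rng.random() < 0.5 for _ in range(dim)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for dim in (1, 2, 5, 10, 20):
        exact = prob_all_positive(dim)
        estimate = estimate_all_positive(dim, trials=100_000)
        print(f"D={dim:2d}  exact={exact:.3e}  estimate={estimate:.3e}")
```

At larger D the estimate will often come out as zero for a fixed number of trials, which is the point: an all-positive draw becomes too rare to sample.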

28 days ago

@levie @HungLee also i think that the job is wildly difficult to productize given the fact that improvements in AI models are constantly changing how tasks should be solved, from a structural level. i wrote my own harness and it needs constant redesign in order to effectively leverage new models

0

0

0

0

165

about 1 month ago

@vytalow @ycombinator @ParasmaAI congrats! what's the soon-to-be product?

0

1

0

0

48
