@econoadabsurdam @TomRed43 This doesn’t disprove OP’s post. I personally know people who have negatively geared aggressively enough to tip their >$400k salaried job into being net negative. (I mean, you can find examples of articles on the same)
@damian_b @bernhardsson if one is really maxxing out the usage of the 1m context window frontier models, and has like the fifteen claude code sessions running at once, sub-agents etc etc i feel like it's not that hard
This is the opposite of what we've seen at https://t.co/jfUt9MiLQ7. Our most experienced SWE is taking the most advantage. It's also in disagreement with the Hotz post linked:
"A trait you find in all high performing people is the ability to error correct, and they have mostly been good at seeing when slop is slop."
(The data for this analysis was gathered by Claude. Normally I would disclaim this and suggest you be skeptical, but VC data is sufficiently hallucinated already that it really can't be much worse to have an LLM make it up instead of some guy at Pitchbook)
i love the idea of being a modern day anthropologist -- i feel like i see so much good stuff online totally by chance (e.g. a chinese cheat sheet on the pubbattlegrounds subreddit)
It’s 2018 and your coworker just sent you a 400 line pull request.
You get a cup of coffee and sit down to review it.
It’s beautiful. Elegant micro-refactors. Crispy method names.
You catch a few things, but that’s ok. It’s part of the dance. They didn’t consider extensibility on part of their API. Here’s a comment buddy.
They respond in an hour saying they think we should do one piece differently than your comment. Hey let’s jump into a room and figure it out. We can’t just agree to disagree, this code is too important.
The PR merges and goes to prod. You feel a shared sense of ownership and accomplishment.
That night you go to sleep and dream of that code. You can still see the shapes of it on the backs of your eyelids, your IDE syntax highlighting sparking neurons in your reptile brain.
You go to work the next day ready to go. You understand the system. N is your foundation. Time to build n+1.
Surely a model pre-trained on the web would fare much better?
Yes, and no. We also fine-tune their web-pretrained model, and observe a modest +1% solve-rate on SWE-bench, achieving 5.7% pass@1 compared to 4.5%.
Surprisingly little seems to be lost by throwing away the internet.
spent a few years reading through leonardo's notebooks and have often wondered what he'd be doing if he were around today
whatever kat is doing is my best guess
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
It's kinda sad how DeepSeek saved everyone billions and billions of dollars by inventing GRPO and captured exactly 0 of that value. Maybe open source is unsustainable