girlfriend happy when I make things
I make things
claude happens
I make more things (bigger, they're some of the biggest. bigger than anything you've ever seen, they're so big)
girlfriend moar happy
this relationship thing easy
norway, okay, tough, haaland is tough to beat. but you can't fault them, of course they failed to beat bosnia AND herzegovina, anyone would've lost! the 2v1 is inherently stacked against them
They lost 0-3 and 1-4 to Norway in their group. Then they failed to beat Bosnia and Herzegovina in a playoff. They haven't been at a WC since 2014. Perhaps the rankings are hiding a lot.
code that LLMs write is really slop. I hate it, as an experienced programmer.
but it works.
I've started using dynamic workflows with opus 4.8 agents (rip fable!) to critique every LoC change that the main agent does, much better results.
10x more token expensive though lol
actually I have started watching the Wealth and Poverty course published by (former) berkeley professor reich
if you look over my shoulder looks like we're BOTH getting educated, hah, bet you don't want that?
they keep making numbers bigger, but I can't feel any difference.
literally fable 5 ("mythos-class") struggles to find performance enhancements (v. similar to cybersec) that don't break existing functionality. it struggles to add blurry edges to images (which is ezpz)
big #...
@max_spero_ saw a good blog about a high school kid complaining abt speeches at their graduation being ai generated, very similar idea in a concrete way. worth a read: https://t.co/1Wrrag2D0d
opus 4.8 xhigh reasoning performs about as well on my codebase as sonnet 4.5 did...
have felt a significant decrease in ability from opus 4.6 to opus 4.8; am quite hesitant to trust benchmark numbers.
congrats on the bigger number, bro
feels like they are just RLHF-maxxing
not a fan of ai-generated docs @AnthropicAI website. really quite difficult to understand & feels low-effort, like the feature doesn't matter
could not explain what a dynamic workflow is if I tried after reading https://t.co/tG01hggAJs
(only 90% because I hit my pangram limit)
sometimes you have a good idea for your website and then it's like
I can't find an svg
let me just trace one
like trust me it's a good idea
???????????