I use high or xhigh normally but I agree that on xhigh and above it seems to just end up using more tokens. Even if it gets more things right at first, it'll end up getting confused before you're finished with the session which is really annoying.
The ultracode effort setting, with the workflows it can run, is wild. It's the kind of thing that goes badly if you don't think it through and give it a good prompt that will, you know, utilize an agent swarm well without overloading the orchestrator agent's context (what I call the main agent, idk if other people say that). I've gotten some overthinky, shoddy results from it too, but multiple times it has really struck gold to an almost unbelievable degree. It burns A LOT OF TOKENS THOUGH.
I have some friends who have also said this including similar complaints about 4.7. That's part of why I'm convinced that the strengths and weaknesses are really very nuanced, and that depending on your style some models will perform better or worse for you.
Me personally, my style is very "dark factory". I've been automating all sorts of complex workflows with specific gates/reviews and tons of established conventions etc etc etc. At work I still pay attention at the line-by-line level to ensure everything is good, but on personal projects, I barely look at the code.
These last 2 models have been magnificent performers for my work style. Not sure if that means anything or is just variance that will regress to the mean.
I don't think it's possible to be objective about this stuff (which is so weird), but my attempt at objectivity is "these models can be better or worse in ways that are impossible to fully understand and hard to even kind of understand, but they're definitely continuing to get better"
@flowersslop It's just like "Wow! That was amazing, worth a few hundred bucks just to spend a few weeks feeling like a kid again. Alas, this is not actually an enjoyable way to play Skyrim yet"
I think you're genuinely onto something with the whole "this is great! it just isn't good enough for the value prop to actually work at scale" angle
That's how I felt about VR when I had an Oculus 10 years ago, so I sold it.
Doesn't seem like the overall experience has changed much yet.
Plus far too much of what is unacceptable ab him is in the present and recent past.
Plus every other thing.
Plus he is DSA slop.
Plus he’s retarded.
Look, I think there are tons of things in life that aren’t hard concrete proof that ought to be ignored because it’s not proof. I also am a real person who doesn’t at the end of the day wholly view the world through a courtroom or journalistic lens, and I see what I see and I question people who don’t see it. It doesn’t really matter if all these things are true or prove the specific points anyone is making. I don’t care.
I see it *clearly enough*, him being a piece of shit is not a conspiracy theory no matter how many unproven elements there are Mr Tracy.
My advice: spend less time trying to make LLMs figure everything out and spend more time figuring out a super low-friction way for you to easily digest the agent’s output and respond to it with inline annotations. What your agent is trying to get out of you is the #1 priority
While you’re busy optimizing yourself OUT of your agentic coding workflows…
Human-intent throughput and precision should be the heuristic you’re after, not merely hours sans human.
The latter is bolstered by the former.
Take a moment to consider the actually astronomical differences in the realities of these crimes. The personal stakes of the perpetrator, the visibility, and the occurrence rate (for rape, the real rate must be a multiple of known numbers).
Your premise isn’t fundamentally wrong, and it’s entirely possible he’s not a rapist. But you comfort yourself with this comparison by pretending there’s not an enormous gulf between the two, specifically that my statement actually is an order of magnitude more feasible.
Me personally? I’ve paid enough attention to not be able to shield my eyes from seeing Graham Platner. He might not be a rapist, but I think you’re wrong for these reasons and more nuanced reasons I can’t explain in a tweet. Think whatever you want. I think he’s almost definitely a rapist and if not a disgusting person all around top to bottom.
I stand by my opinion and if my life was on the line I’d feel confident
@devahaz That all makes sense. Question though: do you think this is a good thing, or do you think it is a bad thing? Not that the Republicans aren't good enough to win in CA. I mean that, per your prediction, nothing is going to substantially change?
@tracewoodgrains Yeah I don’t think there’s much of a chance that a person who had not raped anyone would say these words. Like what ARE the chances. Try to pick a percentage. Just try.
…
Nah I think we can actually collectively argue that the worst policy ideas of all time combined with somebody with decorum as bad as Trump is a lose-lose for the country. I mean. Maybe eventually, but I think nobody will have a rational idea or memory of Trump by then anyway, and they’ll reshape it around whatever they want at any point in time.
Of course you’re right it will be a theme in many ways
@kasratweets LLMs are allowing us to observe and think about ourselves in ways that we never could or did
we've modeled something about ourselves we've never observed in something non-human, whatever you want to call it, and conversations like THIS are just going to get more mainstream
@PerceptualPeak I am using macOS but luckily my alternative method of hoping it doesn't happen much is working pretty well now. Was happening a lot yesterday, mostly good today!