@DimitrisPapail @Samhanknr @yoavgo I think that Claude is dumber because the compaction summary is not good.
I don't think the KV cache ever persists across turns; they cache a prompt prefix. Once your next turn's prefix changes, you lose everything from T1, T2, etc.
@DimitrisPapail @_sholtodouglas I mean to say - I think the TTL affects the prompt cache - it clears it, so you pay more when you run a turn without it. But I think the built-in KV cache "flushes" on every turn, as I understand it.
@DimitrisPapail @_sholtodouglas this assumes the prefill lives past turns? Doesn't every turn flush the KV cache, and then it only loads if the next turn's prefill byte-matches the prompt KV cache?
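A toy sketch of the prefix-matching behavior I'm describing, assuming the cache is keyed on exact token prefixes. `ToyPrefixCache` and its methods are made-up names for illustration, not any provider's API, and real serving stacks likely cache in blocks with a TTL rather than per token:

```python
from typing import Sequence


class ToyPrefixCache:
    """Toy model of prompt-prefix caching (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # Every prompt prefix seen so far, stored as a tuple of tokens.
        self._prefixes: set[tuple[str, ...]] = set()

    def store(self, tokens: Sequence[str]) -> None:
        # Remember each leading slice of the prompt as reusable.
        for end in range(1, len(tokens) + 1):
            self._prefixes.add(tuple(tokens[:end]))

    def reusable_tokens(self, tokens: Sequence[str]) -> int:
        # Count how many leading tokens exactly match a stored prefix.
        hit = 0
        for end in range(1, len(tokens) + 1):
            if tuple(tokens[:end]) not in self._prefixes:
                break
            hit = end
        return hit


cache = ToyPrefixCache()
cache.store(["sys", "T1", "T2"])

# Appending a new turn keeps the whole earlier prefix reusable.
appended = cache.reusable_tokens(["sys", "T1", "T2", "T3"])  # 3

# Replacing T1 and T2 with a compaction summary only matches up to "sys".
compacted = cache.reusable_tokens(["sys", "summary", "T3"])  # 1

# Changing the very first token means nothing is reused.
changed = cache.reusable_tokens(["sys-v2", "T1", "T2", "T3"])  # 0
```

In this toy model the match is all-or-nothing from the front: anything after the first differing token has to be prefilled again, which is the "you lose everything from T1 and T2" case.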
For my friends who are still using uv and might be a little wary about recent compromises to PyPI packages, stick this in your pyproject.toml.
You can let all of those pip users find and report the compromises...
I painstakingly ran all 20 EsoLang-Bench hard problems through the Claude web UI. It solved 20/20 (100%).
No specialized scaffolding, no expert prompting, no few-shot examples, it just solves them natively.
This benchmark just suffocated the models with constrictive scaffolding.
being an adult means confronting the reality that the answer to “how much whipped cream will you spray directly into your mouth because no one can stop you” is, depressingly, not that much
The risk of AI for education is not students cheating in exams, it is people in general cheating themselves into believing they understand things they don’t.
We are both seeing the same behavior, but I'm expecting it and they are thinking it's "introspection"???
Like you injected a thought and now you are surprised or impressed it talked about it?
How is that "introspection"?
latex - it does the formatting for you - cause it knows better. don't use [h] for figures! latex knows.
....
ok... so to get it on the page I want.. I just copy paste it over and over moving it farther and farther up hoping it shows up where I want it. yay. this is good
@albfresco @ChatGPTapp yea it's there for me. in projects chat. but it's under the prompt you type in - not below the response. Sorry I didn't notice your screenshot was the response before.
you have to hover the mouse as well to see it - it's invisible otherwise.
@emollick I feel like this is still the case - unless you mean plus users never changed their setting from 4o (which is true, I regularly teach people). So I guess for plus users who didn't know to switch, this is a win. But for free users - hard to see that with the limits
It just seems weird. I thought the whole pitch was "now all users will experience a good model" - but only once a day and also, they probably will never know when or how?
I guess OpenAI is counting on most people just not knowing enough to notice or care.
Or perhaps they really trusted the model router - but the problem there is, how to get the users to trust it? It has no transparency. I will always doubt it unless I see "thinking"
it's really hard to see this as a model launch. This feels like the Cursor moment - "no more free lunch" - but it's being presented as "GPT5!!! Amazing everyone gets it!" when really the opposite is true.
I do understand the API is cheap though.