@yacineMTB I did some napkin math recently for this, and for a significant reduction you need a square kilometer of solar panels and some way to prevent wind from bringing in more heat from the sides, unfortunately :/
@thsottiaux The current benchmarks for coding all seem to be flawed as they don't test what most people are doing :/ It feels like we would need more specific benchmarks, so everyone could figure out the benchmark that actually works for them and look at those numbers.
@thsottiaux Tbh I think if you keep shipping better models this quickly, offering a gpt-5 model name that resolves to the latest version (for apps that don't have super model-customized prompts) + sunsetting old models in the api faster would be fine.
@theo I like your videos, but does this post mean that reviewing new models after a few days/weeks is just not possible? Or at least that it's not really useful to give very strong opinions on new models after only trying them for a short while?
@karrisaarinen The feature set that GitHub has is fine. It doesn't really need much more of anything. They really have to get the basics right again though: reliability, performance and UX. All 3 are currently bad, which makes it horrible to use.
@mattpocockuk Great idea! I just wonder how it would change the way the agent writes code. I think a huge benefit right now is that the agent passively learns how code is written in the codebase because it has all of this code in context, but maybe that could be replaced by a separate step.
@habibislop @mark_k Yeah I see what you mean. Maybe the final image is even a composition of actual text and image elements. I can't get over how perfect the text rendering is.
Images V2 is better than any other model I've tried so far at generating character art for retro games. Its prompt adherence is incredible, loving it 🤩