One of my favourite ways to get a vibe for an AI model's coding ability (particularly its UI preferences) is via @lmarena_ai. They have their WebDev arena, which pits two AIs against each other blindly, and you pick which result you like the most.
You begin to learn the styles each AI uses, and it helps you understand why people generally prefer Anthropic outputs. Both of their 3.5 models, Sonnet & Haiku, share a similar design style, which is more aesthetically pleasing for sure.
I was also impressed with some of the new Gemini 2.0 Pro and Qwen Max outputs; these are definitely a leap up compared to previous-gen models. But still just behind Sonnet 3.5 overall, generally.*
I was generally underwhelmed by the o3-mini-high outputs (particularly UI).
*It's a very useful exercise to show you that there isn't one "best" model, even though it's easy to say "Claude is the best". Each prompt is different, and getting opinions from other models is very useful.
@_imdawon No idea who is in charge of their search but they gotta go. So annoying haha, I don't want to have to install Raycast to find a fcking application.
I quite like the model; it's performing quite well in our harness compared with the preview model. However, the 3x price is hard to justify and changes the utility of the model significantly.
If the pricing hadn't been so much higher, I think the reception from people would have been a lot better.
@matvelloso Agree it's a painful process - surprisingly easy with Google though, via @GoogleAIStudio. They integrated their AI auth / db services quite well.
@joshwoodward @OfficialLoganK Why are you only able to revert / edit your most recent prompt / message in the Gemini app? You also aren't able to roll the conversation back to a previous point. Seems like an easy win to give users more control, like in AI Studio.
@codewithantonio @opencode I moved when Claude Code had a bug with their compact feature that made the model stupid - haven't gone back, as opencode is doing the trick.
@OpenAI Please update your docs and in-app experience regarding building apps with a free vs paid account.
There's no mention that a paid account is required. Developer mode can be enabled on a free account, and building one in-app presents no error; it simply doesn't create a new app. Please throw an error letting the user know they need to upgrade, and mention it in the docs - wasted enough time!
@MoonDevOnYT More likely you'd be wanting two of these at least (this is AUD not USD), but you'd be well above 1K USD a month to start handling the bigger models with larger contexts.