@mattpocockuk I just do "please resolve this merge conflict" and it just works most of the time with GPT 5.5. I'd be curious to know what else you add.
@ryancarson I use the older flash lite versions for structured data extraction tasks, but for agent work I find the tool-call reliability unacceptable.
@gilbert_jc ARR is driven by corporate decisions. It doesn't necessarily reflect model quality. Jira has a very high ARR and is a disaster of a product. It just appeals to CTOs.
@enjojoyy I rarely get it to do anything longer than 15m. But tbh, I'm pretty much in the loop, so I don't mind not having to go through 10k lines of code.
@JuiceSharp @mattpocockuk Agree very much that writing tests and code in the same session is bad. I find that forcing the agent to do proper TDD helps a lot to get better tests.
It tends to assume that the original test was correct and the code needs fixing way more than if it does all the tests at the end.
@masonddudley @theo I created an AI slop skill which is more or less documentation for the frontend components of the design system we use, and it started producing great results. Create a quick Excalidraw sketch of what you want and get a ready-to-use component.
@KaiXCreator I have preferences at points in time, and I pick the best model I can get my hands on. I'd recommend leaving the fan things for team sports.
@rxhit05 My theory is that it's like football: you stay with your team no matter what.
Also, for corporate environments, DeepSeek feels riskier than Anthropic because there is no big company backing it and because of China. Not very rational considering the number of rugs pulled by Anthropic.