@sudoingX Tested GPT 5.5 xhigh vs Qwen3.6 27b on the same task. ChatGPT rated Qwen 7/10. Consistent flaw: lacks code depth, misses key details. Tried varied system prompts, still always 7/10. So not 90% but 70%, still good.
@stevibe Have you tried mimo m2.5 pro? It turned out to be unexpectedly good in my experiments. Plus, the 1 million context works well up to roughly 300-400k.
@sudoingX Don't get your hopes up. In my experience, and according to posts on Reddit, when the cache of Qwen3.6 27b is quantized, the model starts processing old messages in a loop. It took me a while to figure out what was going on, but that was probably the cause.