Qwen 3.6 is frontier-class for local inference.
It also thinks forever.
I tried a dumb inference-time trick: make its <think> block obey a tiny grammar.
Results:
- HumanEval+: 22x fewer think tokens, no accuracy loss
- LiveCodeBench public slice: +14% pass@1, ~5x fewer total tokens
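
For anyone wondering what "obey a tiny grammar" can look like mechanically, here is a toy sketch of grammar-constrained decoding. It is not the grammar or code behind the numbers above: the GOAL/PLAN line format, the 4-word cap, the whole-token vocabulary, and `rambling_model` are all made up for illustration. In a real setup the same masking would happen on the model's logits inside the inference engine.

```python
from typing import Callable, Dict, List

# Hypothetical grammar as a finite-state machine over whole tokens:
# state -> {allowed token -> next state}. Not the grammar from the post.
FSM: Dict[str, Dict[str, str]] = {
    "start":     {"<think>": "goal_key"},
    "goal_key":  {"GOAL:": "goal_text"},
    "goal_text": {"word": "goal_text", "\n": "plan_key"},
    "plan_key":  {"PLAN:": "plan_text"},
    "plan_text": {"word": "plan_text", "\n": "close"},
    "close":     {"</think>": "done"},
}
MAX_WORDS_PER_LINE = 4  # assumed cap that keeps each line short


def allowed_tokens(state: str, words_on_line: int) -> List[str]:
    """Tokens the grammar permits next, given the state and line length."""
    options = list(FSM[state])
    if words_on_line >= MAX_WORDS_PER_LINE:
        # Line is full: free-text tokens are no longer legal.
        options = [t for t in options if t != "word"]
    return options


def constrained_decode(score_fn: Callable[[List[str]], Dict[str, float]]) -> List[str]:
    """Greedy decoding where illegal tokens are masked out at every step."""
    state, words, out = "start", 0, []
    while state != "done":
        scores = score_fn(out)
        legal = allowed_tokens(state, words)
        # Pick the highest-scoring token among the legal ones only.
        token = max(legal, key=lambda t: scores.get(t, float("-inf")))
        out.append(token)
        words = words + 1 if token == "word" else 0
        state = FSM[state][token]
    return out


def rambling_model(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a model that always wants to keep thinking."""
    return {"word": 5.0, "\n": 1.0, "GOAL:": 0.5, "PLAN:": 0.5,
            "<think>": 0.1, "</think>": 0.1}


tokens = constrained_decode(rambling_model)
print(len(tokens), tokens)
```

Even though the toy model always prefers another "word", the mask forces it to close each line and then the think block, so the think length is bounded by the grammar rather than by the model's inclination to stop.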