@0xSero I'm disabling reasoning more often than I anticipated.
Running Qwen3.5-397B at 10t/s without reasoning gets me the answer much faster than running Qwen3.6-27B with reasoning at 35t/s.
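Rough arithmetic behind that, as a sketch: the 10 t/s and 35 t/s figures are from above, but the token counts are made-up placeholders (not measurements), and prompt processing time is ignored.

```python
def seconds_to_answer(tokens_per_second, answer_tokens, reasoning_tokens=0):
    """Generation time: every generated token (reasoning + answer) divided by decode speed."""
    return (answer_tokens + reasoning_tokens) / tokens_per_second

# Hypothetical token counts, chosen only for illustration.
ANSWER_TOKENS = 300
REASONING_TOKENS = 4000

# Big model, reasoning off: only the answer tokens are generated.
big_no_reasoning = seconds_to_answer(10, ANSWER_TOKENS)

# Small model, reasoning on: reasoning tokens come before the answer.
small_with_reasoning = seconds_to_answer(35, ANSWER_TOKENS, REASONING_TOKENS)

print(f"397B, no reasoning: {big_no_reasoning:.1f} s")
print(f"27B, with reasoning: {small_with_reasoning:.1f} s")
```

Under these assumptions the faster model only wins if its reasoning stays under 2.5x the answer length (35/10 - 1), which a long chain of thought blows past quickly.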
@antirez @AMD I daily drive ROCm for inference. Ubuntu 24.04 is the only first-class customer right now. Fedora 44 works but there's always some minor catch.
@mikepat711 @3lectricBrawl +1 for Air Pump and Jumper battery. The frunk is the only part of the car you can easily access if the low-voltage battery goes!
@shashankgoyal95 @OpenRouter Encourage more support for hosted quantized models (not just labeling fp4/q4 or 'turbo' versions). I would love to see "does Q5 of model xyz still work and provide speed/cost savings?" without renting a bunch of H100s. That service doesn't exist today.
@GameNativeApp @Synaelle OOS will fight its users to the bitter end if they try to do anything high-performance (or with several forked processes) that's not installed from the Play Store.
Source: OnePlus 12 user that's lost this battle