With 4x you could run 8 x qwen36 32b at Q8. Models in that range consume about 40-50 GB of VRAM each and aren't bad at all combined with opencode.
So I'd imagine you'd keep several of these smaller models running constantly on the 4x to handle most demand. The extra 4x you can send to me, or use for testing new models before you replace the older "always on" models.
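For a rough sanity check on the 40-50 GB figure, here's a back-of-the-envelope sketch. It assumes Q8 is about 8 bits per weight and lumps KV cache, activations and runtime buffers into a single overhead number. The function name and the 8-18 GB overhead range are my own guesses chosen to match the range above, not measurements.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float, overhead_gb: float) -> float:
    """Rough VRAM estimate: quantized weight size plus a flat overhead allowance."""
    # 1 billion params at 8 bits is about 1 GB, so scale by bits / 8.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


# Assumed overhead range (KV cache, context, buffers); adjust for your context length.
per_model_low = estimate_vram_gb(32, 8, overhead_gb=8)
per_model_high = estimate_vram_gb(32, 8, overhead_gb=18)

num_models = 8
total_low = num_models * per_model_low
total_high = num_models * per_model_high

print(f"per model: {per_model_low:.0f}-{per_model_high:.0f} GB")
print(f"{num_models} models: {total_low:.0f}-{total_high:.0f} GB")
```

Longer contexts or more parallel requests push the overhead up, so treat the upper end as the safer number when planning.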