I should be clear: I'm not saying Qwen3.5-9B is bad. I'm saying that benchmarks, as they exist right now, are a terrible way to decide what model to use. And the hype around this particular set of ...