nvidia/Qwen3.6-35B-A3B-NVFP4 running on vLLM nightly on my NVIDIA GB10 is actually insane: 50 tok/s each across 4 concurrent generations, 200 tok/s total. Ideal for spawning subagents or working in parallel. Its tool-calling behavior is very good as well. I will be giving it a test drive on an OpenClaw instance and will keep you posted. More details on the NVIDIA forum: