Current average generation speeds for local DeepSeek-V4-Flash-Q2, highest to lowest: | Onur Solmaz blog

@onusoz · 2026/06/15 · 07:00 AM

Current average generation speeds for local DeepSeek-V4-Flash-Q2, highest to lowest:

Mac Studio M3 Ultra: 32 tok/s
MacBook Pro M5 Max: 30 tok/s
Apple ??? M4 Max: 25 tok/s
MacBook Pro M3 Max: 24 tok/s
Mac Studio M2 Ultra: 22 tok/s
NVIDIA DGX Spark / GB10: 13 tok/s

It seems the Macs' higher memory bandwidth is contributing here, though I'm not sure whether GB10 performance could be improved (I do hope so, I have one!)
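To put the gap in numbers, here is a rough sketch that expresses each machine's speed as a multiple of the GB10's, using only the figures listed above. The helper names `relative_to` and `bandwidth_bound_tok_s` are made up for illustration. The second one encodes the usual back-of-the-envelope assumption that decoding is memory-bound (speed is at most bandwidth divided by the bytes streamed per token); its inputs are placeholders, not measurements.

```python
# Average generation speeds (tok/s) exactly as listed in the post.
speeds = {
    "Mac Studio M3 Ultra": 32,
    "MacBook Pro M5 Max": 30,
    "Apple ??? M4 Max": 25,
    "MacBook Pro M3 Max": 24,
    "Mac Studio M2 Ultra": 22,
    "NVIDIA DGX Spark / GB10": 13,
}


def relative_to(baseline: str, data: dict) -> dict:
    """Return each machine's speed as a multiple of the baseline machine's speed."""
    base = data[baseline]
    return {name: round(tok_s / base, 2) for name, tok_s in data.items()}


def bandwidth_bound_tok_s(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Hypothetical upper bound on decode speed, ASSUMING generation is purely
    memory-bound: every token requires streaming gb_read_per_token from memory."""
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    # Print every machine's speed relative to the GB10.
    for name, ratio in relative_to("NVIDIA DGX Spark / GB10", speeds).items():
        print(f"{name}: {ratio}x")
```

Under the memory-bound assumption, a machine with twice the usable bandwidth would top out at roughly twice the tok/s, which is why the bandwidth guess is plausible. It does not rule out software headroom on the GB10.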

@antirez · Jun 14, 2026

If you need AI to do a search for you in the real world, ds4-agent is basically SOTA, because it can access websites without any limitations, given that it uses your local Chrome browser (no, not in headless mode, that's the trick...), and DeepSeek v4 is great at search.