Current average generation speeds for local DeepSeek-V4-Flash-Q2, highest to lowest:

- Mac Studio M3 Ultra: 32 tok/s
- MacBook Pro M5 Max: 30 tok/s
- Apple ??? M4 Max: 25 tok/s
- MacBook Pro M3 Max: 24 tok/s
- Mac Studio M2 Ultra: 22 tok/s
- NVIDIA DGX Spark / GB10: 13 tok/s

It seems the Macs' higher memory bandwidth is contributing here, though I'm not sure whether GB10 performance could be improved (I do hope so, I have one!)
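One rough way to sanity-check the memory bandwidth guess is to divide each measured speed by the machine's peak memory bandwidth and compare the ratios. The sketch below does that for the three machines whose bandwidth does not depend on the chip bin. The bandwidth figures are vendor-published peak numbers as I recall them, not measurements and not part of the benchmark data, so treat them as assumptions; the names `PEAK_BANDWIDTH_GBS`, `tok_per_gbs`, and `bandwidth_efficiency` are made up for illustration.

```python
# Measured generation speeds from the list above (tok/s).
MEASURED_TOK_S = {
    "Mac Studio M3 Ultra": 32,
    "Mac Studio M2 Ultra": 22,
    "NVIDIA DGX Spark / GB10": 13,
}

# ASSUMPTION: peak memory bandwidth in GB/s, taken from vendor spec sheets
# as remembered. Real sustained bandwidth is lower and varies by workload.
PEAK_BANDWIDTH_GBS = {
    "Mac Studio M3 Ultra": 819,
    "Mac Studio M2 Ultra": 800,
    "NVIDIA DGX Spark / GB10": 273,
}


def tok_per_gbs(tok_s: float, bandwidth_gbs: float) -> float:
    """Tokens per second achieved per GB/s of peak memory bandwidth."""
    return tok_s / bandwidth_gbs


def bandwidth_efficiency() -> dict[str, float]:
    """Return the tok/s-per-GB/s ratio for every machine in MEASURED_TOK_S."""
    return {
        name: tok_per_gbs(MEASURED_TOK_S[name], PEAK_BANDWIDTH_GBS[name])
        for name in MEASURED_TOK_S
    }


if __name__ == "__main__":
    # Print machines from most to least tokens per unit of bandwidth.
    ranked = sorted(bandwidth_efficiency().items(), key=lambda kv: kv[1], reverse=True)
    for name, ratio in ranked:
        print(f"{name}: {ratio:.4f} tok/s per GB/s")
```

If the ratios come out similar across machines, bandwidth probably explains most of the gap. If the GB10's ratio is not lower than the Macs', that would suggest it is already close to its bandwidth limit and software tuning has less room to help than I'd like.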