@onusoz · 2026/06/23 · 04:50 PM
If you are interested in running such demos, look into --demo mode in my local model swiss army knife localpi https://t.co/LyjwJWDjmi

Thank you @googlegemma for the shoutout

@googlegemma · Jun 23, 2026
16 parallel runs of Gemma 4 26B A4B on a single NVIDIA DGX Spark! Pushing 18 tok/s per instance and a 300 tok/s aggregate. It can even hit 32 parallel runs. This level of concurrency highlights how efficient the architecture is.