Onur Solmaz blog
Post
  1. @onusoz · 2026/06/23 · 04:50 PM
    If you are interested in running such demos, look into the --demo mode of localpi, my Swiss Army knife for local models: https://t.co/LyjwJWDjmi Thank you, @googlegemma, for the shoutout.
    @googlegemma · Jun 23, 2026
    16 parallel runs of Gemma 4 26B A4B on a single NVIDIA DGX Spark! Pushing 18 tok/s per instance and a 300 tok/s aggregate. It can even hit 32 parallel runs. This level of concurrency highlights how efficient the architecture is.
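
As a quick sanity check on the quoted figures, here is a minimal sketch of the arithmetic. It assumes every instance sustains the same rate and that both numbers in the post are rounded. The function names are hypothetical and are not part of localpi.

```python
def aggregate_throughput(instances: int, per_instance_tok_s: float) -> float:
    """Total tokens/s, assuming every instance sustains the same rate."""
    return instances * per_instance_tok_s


def per_instance_rate(instances: int, aggregate_tok_s: float) -> float:
    """Average tokens/s per instance implied by an aggregate figure."""
    return aggregate_tok_s / instances


# Figures quoted in the post: 16 runs, 18 tok/s each, 300 tok/s aggregate.
print(aggregate_throughput(16, 18))  # 288
print(per_instance_rate(16, 300))    # 18.75
```

The two quoted numbers agree to within rounding: 16 runs at 18 tok/s give 288 tok/s, and a 300 tok/s aggregate implies 18.75 tok/s per run.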
  • Onur Solmaz
  • osolmaz
  • onusoz

Explorations in software, agentic systems, math, languages and more.