We ran into this problem at work: k8s limits the container's CPU to a few processors, but the Go application still sees all of them (GOMAXPROCS gets set to the maximum number of CPUs available), and Go's scheduler goes crazy because of that under high load (like stress tests).
The cause is that the runtime schedules for, let's say, 40 processors while you only have CPU time for 2 of them.
This leads to gaps in your application's CPU utilization: the 2-CPU limit gets spread across all 40 CPUs and is used up in a very short time.
That’s what I’m talking about
That's because in k8s you don't actually get 2 real CPUs out of 40 dedicated to your app; you get 2/40 of the total processing time.
So with a time quantum of 100ms, for the first 5ms we use all 40 CPUs, and for the remaining 95ms the application waits for CPU.
And we see a lot of timeouts in the logs.
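One possible way around it is to cap GOMAXPROCS at the container's CPU limit on startup. The sketch below is only an illustration, not what we actually deployed: it assumes cgroup v2 with the limit exposed at `/sys/fs/cgroup/cpu.max` (the path can differ on your nodes), and `procsFromCPUMax` is a made-up helper name.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// procsFromCPUMax (hypothetical helper) converts the contents of a cgroup v2
// cpu.max file ("<quota> <period>" or "max <period>") into a GOMAXPROCS value.
// It returns ok=false when there is no limit or the input is malformed.
func procsFromCPUMax(s string) (procs int, ok bool) {
	fields := strings.Fields(s)
	if len(fields) != 2 || fields[0] == "max" {
		return 0, false
	}
	quota, err1 := strconv.ParseFloat(fields[0], 64)
	period, err2 := strconv.ParseFloat(fields[1], 64)
	if err1 != nil || err2 != nil || quota <= 0 || period <= 0 {
		return 0, false
	}
	// Round down, but never go below one processor.
	return int(math.Max(1, math.Floor(quota/period))), true
}

func main() {
	// Assumed path for cgroup v2; if it is missing we keep the default.
	if data, err := os.ReadFile("/sys/fs/cgroup/cpu.max"); err == nil {
		if n, ok := procsFromCPUMax(string(data)); ok {
			runtime.GOMAXPROCS(n)
		}
	}
	// GOMAXPROCS(0) only reports the current value without changing it.
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0))
}
```

Setting the `GOMAXPROCS` environment variable in the pod spec to match the CPU limit should have the same effect without any code.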
There's a great talk in Russian about this and related stuff.