It'd be cool to look at more signal statistics from the CPU plot.
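For what it's worth, here is a rough sketch of the kind of summary I have in mind, assuming the plot's data points were available as a plain list of CPU percentages. The names `CpuStats`, `percentile`, and `summarize` are made up for illustration, the 5th percentile is just one possible definition of the "floor", and the samples in `main` are invented, not read off the actual graph.

```rust
/// Summary statistics for a series of CPU utilization samples (percent).
#[derive(Debug, PartialEq)]
struct CpuStats {
    floor: f64,  // 5th percentile, used here as a robust stand-in for the "CPU floor"
    median: f64, // 50th percentile
    p99: f64,    // 99th percentile, sensitive to periodic spikes
    peak: f64,   // maximum sample
    mean: f64,   // arithmetic mean
}

/// Nearest-rank percentile over an already sorted, non-empty slice; `p` is in [0, 100].
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    // Clamp so that p = 0 maps to the first element and p = 100 to the last.
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Returns `None` for an empty series, otherwise the summary statistics.
fn summarize(samples: &[f64]) -> Option<CpuStats> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mean = sorted.iter().sum::<f64>() / sorted.len() as f64;
    Some(CpuStats {
        floor: percentile(&sorted, 5.0),
        median: percentile(&sorted, 50.0),
        p99: percentile(&sorted, 99.0),
        peak: sorted[sorted.len() - 1],
        mean,
    })
}

fn main() {
    // Invented samples: a flat 20% floor with two spikes to 35%.
    let mut samples = vec![20.0; 18];
    samples.extend([35.0, 35.0]);
    match summarize(&samples) {
        Some(stats) => println!("{:?}", stats),
        None => println!("no samples"),
    }
}
```

With numbers like these, a mean alone would hide the difference between a low floor with tall spikes and a slightly higher but flat line, which is why the percentiles and the peak seem worth reporting separately.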
It appears that Go has a lower CPU floor, but it's killed by the GC spikes, presumably due to the large cache mentioned by the author.
This is interesting to me. It suggests that Rust is better at scale than Go. I would have thought that Go, with its mature concurrency model and implementation, would have been optimized for such cases, while Rust would shine in smaller services with CPU-bound problems.
My first guess for the slightly higher CPU floor of the Rust version is that the Rust code has to do slightly more work per request, since it frees memory as values get dropped. The Go code doesn't do any freeing per request, but then gets hit with the periodic spike every two minutes, when the entire heap has to be traversed for GC.
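To make the "frees memory as it gets dropped" part concrete, here is a minimal sketch. It is not taken from the service discussed in the post: `RequestBuffer`, `handle_request`, and the `FREED` counter are hypothetical names, and the counter exists only to make the per-request free observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter: total bytes released by drops so far.
static FREED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical per-request allocation.
struct RequestBuffer {
    data: Vec<u8>,
}

impl Drop for RequestBuffer {
    // Runs when a RequestBuffer goes out of scope; the Vec's heap memory
    // is released right after this method returns.
    fn drop(&mut self) {
        FREED.fetch_add(self.data.len(), Ordering::Relaxed);
    }
}

// Simulates handling one request that needs a scratch buffer.
fn handle_request(payload_len: usize) -> usize {
    let buf = RequestBuffer {
        data: vec![0u8; payload_len],
    };
    let len = buf.data.len();
    len
    // `buf` is dropped here, so the cost of freeing is paid inside the request.
}

fn main() {
    for _ in 0..3 {
        handle_request(1024);
    }
    println!("bytes freed by drops: {}", FREED.load(Ordering::Relaxed));
}
```

In a garbage-collected runtime the equivalent buffer would simply become unreachable at the end of the request, and the reclamation work would be deferred to a later collection cycle. That would be consistent with a slightly lower floor plus periodic spikes, though this is only my guess at the mechanism, not a measurement.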
Tokio 0.1 was definitely less efficient. When we compare Go to Tokio 0.2, Tokio consistently uses less CPU, even when compared against a cluster of the same size almost a year later, with all of our growth since we switched over.
Go's CPU floor is lower than that of the naive Rust port (roughly 20% vs. 23%, from eyeballing the graph). Their optimized Rust version is shown in the next series of graphs at ~12%.
Great post!