
Replatforming to solve this problem was a bit silly in my opinion. The solution to the problem was "do fewer allocations" which can be done in any language.


A) They had spent a lot of time optimizing the Go service

B) They weren't allocating a lot, and Go was enforcing a GC sweep every 2 minutes, and it was spending a lot of time on their LRU cache. To "reduce allocations" they had to cut their cache down, which negatively impacted latency.


Your reply misses the point. We were already doing so few allocations that the GC only ran because it "had to" every 2 minutes. The issue was the large heap of many long-lived objects.


Did you try to change that interval to a much larger time?


When we investigated, we could not find a way to change it, short of compiling Go from source (something we could have done, but wanted to avoid).


Yes, you have to rebuild Go, but that is literally done in a minute. It would also be interesting to see, if you happen to have some conclusive benchmarks, how the latest Go runtime performs in this respect.


I don't get why this is downvoted without comments. Compared to a rewrite, this would have been a minuscule change. Furthermore, considering that you wrote that long blog post (which I quite appreciate, as it contains interesting information), it would have been important to know whether that parameter's setting was the real culprit. If it was, that would be a good reason to ask the Go implementors to look closer at it.


All I'm going to say is that if you think maintaining your own version of a compiler is the reasonable option compared to a rewrite in another language, you are probably deeply invested in the former language. This also applies to kernels and databases.


Well, in this case, "maintaining your own version of a compiler" means changing a single value in the code base. At the least, as I wrote, it should have been tried in order to identify the root cause of the observed behavior. If this "fix" had significantly improved the behavior, it would have been a good data point for reaching out to the Go developers to resolve the issue.

The difficulty in getting to the core of these issues is finding test cases. It seems that neither the Go developers nor many other people have run into this as an issue. I only remember noticing the regular GC some years ago, and it was not a problem for me. Since they have a real-life test case exposing this problem, they are possibly the only ones who could verify a potential fix.

So, while it is great that they identified the problem and wrote a thorough blog piece about it, the only thing we learn from this is that in Go 1.9 there was a latency issue every 2 minutes with their style of application and heap usage. Unfortunately, we don't know whether this problem was already addressed in later Go versions and, if not, whether there should be a way to control the automatic GC interval to address it.


You haven't read the post carefully. Their garbage collection in Go was spiking every 2 minutes precisely because they were doing too few allocations to have it run more often.


They addressed this in the article:

> These latency spikes definitely smelled like garbage collection performance impact, but we had written the Go code very efficiently and had very few allocations. We were not creating a lot of garbage.

The problem was due to the GC scanning all of their allocated memory and taking a long time to do so, regardless of it all being necessary and valid memory usage.


I wonder if they attempted manual memory allocation in Go?

In many languages with GC you can actually do manual memory management relatively easily with a few helper functions. You write your own allocate() and free() functions/methods. When you allocate, you check the free list first; if nothing is available, you do a normal allocation. When you call free, you add the object to the free list. If your manual memory management leaks, the GC picks up what you missed.

Usually you only need to do that kind of thing in a few places and a few data structures to cut GC work by 90%.
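
For what it's worth, a minimal sketch of that allocate()/free() idea in Go could look like the following. The `entry` type and its fields are invented for the example, and the list is not safe for concurrent use as written.

```go
package main

import "fmt"

// entry is a hypothetical cache record; the fields are illustrative only.
type entry struct {
	key   uint64
	value [64]byte
	next  *entry // links released entries together while on the free list
}

// freeList hands out entries, reusing released ones before asking the Go
// allocator for new memory. Guard it with a mutex if it is shared.
type freeList struct {
	head      *entry
	allocated int // how many entries came from the Go allocator
}

// allocate checks the free list first and falls back to a normal allocation.
func (f *freeList) allocate() *entry {
	if e := f.head; e != nil {
		f.head = e.next
		*e = entry{} // clear stale data before reuse
		return e
	}
	f.allocated++
	return new(entry)
}

// free pushes the entry onto the free list instead of dropping it.
func (f *freeList) free(e *entry) {
	e.next = f.head
	f.head = e
}

func main() {
	var fl freeList
	a := fl.allocate()
	fl.free(a)
	b := fl.allocate()         // reuses the entry that was just released
	fmt.Println(a == b, fl.allocated) // prints "true 1"
}
```

Go's standard library also has `sync.Pool` for a similar purpose. Note that a free list cuts down on new garbage, but the entries stay reachable, so it does not shrink the live heap the collector has to scan, which is the issue raised upthread.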



