There was a smiley at the end, you know... My point was that these "tests" really show nothing. It's a very specific scenario, and if you read the comment from jlouis you would know that his scenario was very different and would yield a different result.
You're trying to dismiss the benchmark because you don't like the result. But the tests actually do show max latency under high concurrency, which is the question at hand, and AFAICS the only thing not covered is a minutes-long run that might surface any GC impact. So they are an excellent base for comparing performance.
Again, if you or anyone can prove a significant GC impact when serving up text or JSON, definitely hit up the TechEmpower folks and make a case for extending the test length. It might help improve Erlang's standing, which right now utterly sucks in comparison to Java performance (probably because of the maturity of the JVM).