If you want a fast initial load, there is packet.city [0], "the greatest website to ever fit in a single TCP packet." :-) It also has a few other optimizations to make the response as fast as possible, which are discussed on its GitHub. [1]
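For a rough feel of what "fits in a single TCP packet" means, here is a back-of-the-envelope sketch. It assumes a 1500-byte Ethernet MTU, a 20-byte IPv4 header, and a 20-byte TCP header with no options; those defaults and the function name are my own illustration, not anything taken from packet.city, and real paths can differ (IPv6, TCP timestamps, tunnels).

```python
def fits_in_one_segment(response: bytes,
                        mtu: int = 1500,
                        ip_header: int = 20,
                        tcp_header: int = 20,
                        tcp_options: int = 0) -> bool:
    """Return True if the whole response fits in one TCP segment.

    All header sizes are assumptions: IPv4 without options and a
    bare TCP header. Pass tcp_options=12 to model TCP timestamps.
    """
    max_payload = mtu - ip_header - tcp_header - tcp_options
    return len(response) <= max_payload


# Hypothetical tiny HTTP response: headers plus body must fit together.
tiny = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
print(fits_in_one_segment(tiny))
print(fits_in_one_segment(b"x" * 1461))
```

Under those assumptions the budget is 1460 bytes for headers and body combined, so the HTTP response headers eat into it as well.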
That was deprecated in 2017. The sysctl is still there, but doesn't actually do anything in newer kernels. There are a few other ways to lower latency at the expense of throughput but I will let folks research that themselves.
It's a nitpick, but the article talks a lot about packets in flight, yet TCP doesn't actually care about the number of packets in flight so much as the number of bytes in flight, and conflating the two just irks me.
Edit: maybe I'm confusing the receive window with something else.
I think you're confusing the receive window (advertised by the receiver and measured in bytes) with the congestion window (maintained by the sender and based on acknowledged segments).
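To make the distinction concrete, here is a minimal sketch of how the two windows combine on the sender side. It assumes a stack that tracks the congestion window in segments and converts it to bytes using the MSS; the function and parameter names are made up for illustration. The sender may only have min(cwnd, rwnd) bytes outstanding at once.

```python
def sendable_bytes(cwnd_segments: int,
                   mss: int,
                   rwnd_bytes: int,
                   bytes_in_flight: int) -> int:
    """How many more bytes the sender may put on the wire right now.

    cwnd_segments: congestion window, counted in segments (assumption).
    mss:           maximum segment size in bytes.
    rwnd_bytes:    receive window advertised by the receiver, in bytes.
    """
    cwnd_bytes = cwnd_segments * mss
    # The effective send window is the smaller of the two limits.
    window = min(cwnd_bytes, rwnd_bytes)
    return max(0, window - bytes_in_flight)


# Congestion-limited: 10 segments of 1460 bytes, large receive window.
print(sendable_bytes(10, 1460, 65535, 0))   # 14600
# Receiver-limited: the receiver only advertises 8000 bytes.
print(sendable_bytes(10, 1460, 8000, 0))    # 8000
```

Either way the limit that ends up being enforced is a byte count, which is why counting packets in flight is only an approximation.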