Considering scalability from the start does not just mean optimizing for million...

halostatue · on Oct 24, 2016

I chose my software stack with scalability in mind: team scalability and rapid iteration (we started with Rails, displacing a Node.js implementation that was poorly designed and messily implemented). Because of that previous proof-of-concept implementation we needed to replace, we were forced into a design (multiple services with SSO) that complicated our development and deployment, but has given us the room to manoeuvre while we work out the next growth phase (which will be a combination of several technologies, including Elixir, Go, and more Rails).

One thing we didn’t choose up front, because it’s generally a false optimization (it looks like it will save you time, but in reality it hurts you unless you really know you need it), is anything NoSQL as a primary data store. We use Redis heavily, but only for ephemeral or reconstructable data.
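To illustrate what "ephemeral or reconstructable" can mean in practice, here is a minimal sketch of a read-through cache. It is not the code described above: `EphemeralCache` is a hypothetical in-memory stand-in for a Redis-like store with per-key expiry, and `read_through` and `load_from_primary` are made-up names. The point is only that the cache can be wiped at any time and rebuilt from the primary (SQL) store.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class EphemeralCache:
    """Hypothetical in-memory stand-in for a Redis-like store with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped; nothing is lost, the primary still has it.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)

    def flush(self) -> None:
        # Simulates losing the whole cache (restart, eviction, failover).
        self._data.clear()


def read_through(
    cache: EphemeralCache,
    key: str,
    load_from_primary: Callable[[str], str],
    ttl_seconds: float = 60.0,
) -> str:
    """Return the cached value, or rebuild it from the primary store on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_primary(key)  # the primary store is the source of truth
    cache.set(key, value, ttl_seconds)
    return value
```

If the cache disappears, the next read simply falls through to the primary store, which is what makes the cached data safe to treat as disposable.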

The reality is, though, that you have to learn what scalability your system actually needs, and you can only do that properly by growing it, without making wrong assumptions up front or taking on more than you are ready for. (For me, the benchmark was a fairly small 100 req/sec, which was an order of magnitude or two beyond the Node.js system we replaced, and we doubled the benchmark on our first performance test. We also reimplemented everything we needed to reimplement in about six months, which was the other important thing. My team was awesome; most of them had never used Rails before but were pleased with how much more it gave them.)

kevan · on Oct 24, 2016

I think the main argument for (distributed) NoSQL as a primary data store is availability, but there are other ways to achieve that too.

softawre · on Oct 24, 2016

NoSQL is not easy to use. At least not easy to use correctly in failure conditions, if your data has any complexity to it at all.

erikwitt · on Oct 24, 2016

You're right, NoSQL systems tend to be more complex, and failure scenarios in particular are hard to comprehend. In most cases, however, this comes from being a distributed datastore, where tradeoffs, administration and failure scenarios are simply much more complex. I think some NoSQL systems do an outstanding job of hiding nasty details from their users.

If you compare using a distributed database to building sharding yourself for, say, your MySQL-backed architecture, NoSQL will most certainly be the better choice.
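To make the "building sharding yourself" side of that comparison a bit more concrete, here is a minimal sketch assuming the most naive scheme: route each key to a shard by hash modulo the shard count. The function names `shard_for` and `keys_that_move` are hypothetical and not taken from any real system. It shows one of the problems you inherit: changing the number of shards relocates most keys.

```python
import hashlib
from typing import Iterable, List


def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard by stable hash modulo shard count (naive routing)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # hashlib gives the same result across processes, unlike Python's built-in hash().
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def keys_that_move(keys: Iterable[str], old_count: int, new_count: int) -> List[str]:
    """List the keys whose shard changes when the shard count changes."""
    return [k for k in keys if shard_for(k, old_count) != shard_for(k, new_count)]


if __name__ == "__main__":
    user_keys = [f"user:{i}" for i in range(1000)]
    moved = keys_that_move(user_keys, old_count=4, new_count=5)
    print(f"{len(moved)} of {len(user_keys)} keys change shard going from 4 to 5 shards")
```

With modulo routing, going from 4 to 5 shards is expected to move roughly four in five keys, and that is before dealing with rebalancing under load, cross-shard queries, or failover. Schemes such as consistent hashing reduce the movement, but you would have to build and operate them yourself.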

I'll admit, though, that dealing with NoSQL when you come from a SQL background isn't easy. Even finding the database that fits your needs is tough. We have a blog post dedicated to this challenge: https://medium.baqend.com/nosql-databases-a-survey-and-decis...