
Well, maybe for non-critical data. Multi-regional Kafka clustering is not easy. There are much better and cheaper data storage options that can provide eventual consistency.


MirrorMaker is fine at small scale, or you can upgrade to https://github.com/uber/uReplicator ... region-homing data is generally more practical.

You should provide examples of better or cheaper options to substantiate your claim.


What tech would you use to store critical append-only data, if not Kafka?


Perhaps store it out to HDFS/blob storage, then use some kind of unified processing layer to reconstruct history, or parts of it, when you need it? E.g.: https://news.ycombinator.com/item?id=20059006#20062821

Or use a durable unbounded solution like Pravega instead of Kafka? https://news.ycombinator.com/item?id=20059006#20062821

edit: this could be problematic especially when data privacy laws are involved. See the comments from gopalv below.


Kafka topics compacted to some sort of disk storage? It really depends on what you need to do with it. All of our Kafka topics have a TTL on messages that's less than or equal to 7 days. If the data is critical, then that topic has an archiver job that writes the messages to files in HDFS.
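
For illustration, here is a minimal sketch of what such an archiver job might do with a batch of consumed messages. It is not our actual job: the record fields (topic, partition, offset, timestamp_ms, value), the directory layout, and the archive_batch name are all hypothetical, and it writes to a local directory as a stand-in for HDFS. The only Kafka-specific fact assumed is that a 7-day TTL corresponds to a topic-level retention.ms of 604800000.

```python
import json
import os
import tempfile
from collections import defaultdict
from datetime import datetime, timezone

# Seven days in milliseconds: the value a topic's `retention.ms`
# would need for the 7-day TTL described above.
RETENTION_MS = 7 * 24 * 60 * 60 * 1000


def archive_batch(records, root_dir):
    """Append records to one JSON-lines file per (topic, UTC day, partition).

    `records` is an iterable of dicts with hypothetical keys:
    topic, partition, offset, timestamp_ms, value.
    Returns a dict mapping each file path to the number of records written.
    """
    grouped = defaultdict(list)
    for rec in records:
        day = datetime.fromtimestamp(
            rec["timestamp_ms"] / 1000, tz=timezone.utc
        ).strftime("%Y-%m-%d")
        path = os.path.join(
            root_dir, rec["topic"], day, f"partition-{rec['partition']}.jsonl"
        )
        grouped[path].append(rec)

    written = {}
    for path, recs in grouped.items():
        os.makedirs(os.path.dirname(path), exist_ok=True)
        # Sort by offset so each file keeps per-partition order within the batch.
        recs.sort(key=lambda r: r["offset"])
        with open(path, "a", encoding="utf-8") as f:
            for r in recs:
                f.write(json.dumps(r) + "\n")
        written[path] = len(recs)
    return written


if __name__ == "__main__":
    # Local temp directory used here in place of an HDFS mount.
    root = tempfile.mkdtemp()
    batch = [
        {"topic": "orders", "partition": 0, "offset": 2,
         "timestamp_ms": 1559347200000, "value": "b"},
        {"topic": "orders", "partition": 0, "offset": 1,
         "timestamp_ms": 1559347200000, "value": "a"},
    ]
    print(archive_batch(batch, root))
```

A real job would also need to commit consumer offsets only after the files are safely written, and to finish well inside the retention window so nothing expires before it is archived.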

Postgres, Spanner, etc. come to mind as options if you need to store critical append-only data and also have it be queryable in real-time.


Postgres is a bit of a different use case, isn't it? Kafka is mainly about scalability; it has nice partitioning out of the box. And I would not say that it's not reliable, because you always replicate data with it.



This is the first time I've read that Kafka has durability and consistency issues... but the article also doesn't say what those durability and consistency issues are.


Apache Cassandra.


OK, in what areas is Cassandra better than Kafka? I am reading this: https://stackoverflow.com/questions/35711129/how-to-stream-d...

According to https://issues.apache.org/jira/browse/CASSANDRA-8844, CDC seems a lot less capable than Kafka:

> Cassandra would only write to the CDC log, and never delete from it.
> Cleaning up consumed logfiles would be the client daemon's responsibility

On the other hand, there is the Cassandra CDC Kafka Connector, which seems to do some of that work...

In the end, I still don't see why I would want to use Cassandra over Kafka for storing data when I care about durability and streaming of append-only data.



