Nice article! Updates, especially from CDC, are still a challenging area for ClickHouse. We've used ReplacingMergeTree many times for this. It takes a little bit of getting used to, as it's essentially a lock-free distributed algorithm for handling updates. However, it can be very performant once you understand how to use it.
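For anyone new to the idea, here is a rough Python sketch of the behavior as I understand it, not ClickHouse code. An update is just another insert with a higher version, duplicates linger until a background merge runs, and a FINAL-style read has to dedupe at query time. The `Row` type, the function names, and the sample data are all made up for illustration.

```python
from typing import NamedTuple


class Row(NamedTuple):
    key: int       # stands in for the ORDER BY (sorting key) columns
    version: int   # stands in for the version column; highest wins
    value: str


def merge_parts(parts):
    """Collapse several parts into one, keeping the newest row per key."""
    latest = {}
    for part in parts:                 # parts are visited in insertion order
        for row in part:
            kept = latest.get(row.key)
            # On a version tie, the later-inserted row replaces the earlier one.
            if kept is None or row.version >= kept.version:
                latest[row.key] = row
    return [sorted(latest.values())]


def select_all(parts):
    """Plain read: returns whatever is on disk, duplicates included."""
    return [row for part in parts for row in part]


def select_final(parts):
    """FINAL-style read: dedupe at query time without changing the parts."""
    return merge_parts(parts)[0]


# Each insert creates a new part; nothing is locked or rewritten in place.
unmerged = [
    [Row(1, 1, "created")],
    [Row(1, 2, "updated"), Row(2, 1, "created")],
]

rows_before_merge = select_all(unmerged)   # still contains the stale row for key 1
rows_final = select_final(unmerged)        # one row per key, newest version
merged = merge_parts(unmerged)             # what a background merge would leave behind
rows_after_merge = select_all(merged)

print(len(rows_before_merge), len(rows_final), len(rows_after_merge))
```

The point of the sketch is only that a plain read and a FINAL-style read disagree until the merge has happened, which is the part that takes getting used to.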
Totally. For us, the challenge lay in getting used to the async aspect of it, and architecting in such a way that queries that need to eliminate dupes remain performant...
Our initial approach was using ReplacingMergeTree but we couldn't get aggregation queries to run quickly enough... Any solution for that?
Yes, here's a blog post that you might not have seen. Key factors in performance: partitioning so that ClickHouse does not need to run SELECT FINAL over a large number of parts, and having a good ORDER BY. MergeTree queries are sensitive to ordering.
disclaimer: I work for Altinity.
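To make the partitioning point concrete, here is a toy Python model, again not ClickHouse code. It assumes that a query filtered to one partition only has to dedupe the parts in that partition, so the FINAL-style work shrinks with the number of parts touched. The partition names, row layout, and `select_final` helper are hypothetical.

```python
# Hypothetical layout: each part belongs to one partition and holds
# (key, version, value) rows. Updates arrive as new rows in new parts.
parts = [
    ("2024-01", [(1, 1, "a"), (2, 1, "b")]),
    ("2024-01", [(1, 2, "a2")]),
    ("2024-02", [(3, 1, "c")]),
    ("2024-02", [(3, 2, "c2"), (4, 1, "d")]),
]


def select_final(parts, partition=None):
    """Return (deduplicated rows, number of parts read).

    With a partition filter, only parts in that partition are read and
    deduplicated; without one, every part has to be considered.
    """
    touched = [rows for name, rows in parts if partition is None or name == partition]
    latest = {}
    for rows in touched:
        for key, version, value in rows:
            if key not in latest or version >= latest[key][1]:
                latest[key] = (key, version, value)
    return sorted(latest.values()), len(touched)


rows_all, parts_read_all = select_final(parts)
rows_feb, parts_read_feb = select_final(parts, partition="2024-02")

print(parts_read_all, parts_read_feb)
```

This only works as a model if every version of a given key lands in the same partition, which is something the partitioning scheme has to guarantee.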