I’m a “true believer” in CRDTs, which I have some experience in. You can implement a useful CRDT for simple applications in under 100 lines if all you care about are standard database objects - like maps, sets and values. List CRDTs are where they get complicated, but most applications aren’t collaborative text editors.
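To make that "under 100 lines" claim concrete, here's a minimal sketch of a last-writer-wins map CRDT in Python. The class name and the (timestamp, actor) tagging scheme are my own choices for illustration, not any particular library's API:

```python
import time

class LWWMap:
    """Last-writer-wins map CRDT: each key stores (value, timestamp, actor).
    Merge keeps the entry with the highest (timestamp, actor) tag, so the
    actor id breaks timestamp ties and every replica converges to the
    same state regardless of merge order."""

    def __init__(self, actor):
        self.actor = actor
        self.entries = {}  # key -> (value, timestamp, actor)

    def set(self, key, value, now=None):
        ts = now if now is not None else time.time()
        self.entries[key] = (value, ts, self.actor)

    def get(self, key):
        entry = self.entries.get(key)
        return entry[0] if entry else None

    def merge(self, other):
        # Commutative, associative, idempotent: per key, keep the
        # entry with the greatest (timestamp, actor) tag.
        for key, theirs in other.entries.items():
            ours = self.entries.get(key)
            if ours is None or (theirs[1], theirs[2]) > (ours[1], ours[2]):
                self.entries[key] = theirs
```

Two replicas that write the same key concurrently and then merge in either order end up with the same value, which is the whole convergence guarantee in miniature.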
The promise of CRDTs is that unlike most conflict resolution systems, you can layer your application over a CRDT library and basically ignore all the implementation details. Applications can (and should) mostly ignore how a CRDT works when building up the stack.
The biggest roadblock to their use is that they’re poorly understood. Well, that and implementation maturity. Automerge-rs merged a PR the other day which brought a 5 minute benchmark run down to 2 seconds. Bit by bit we’re getting there.
I agree with your assessments here - CRDT is the way forward for most applications; no user wants to fiddle with a merge UI or picking versions like with iCloud. I think RxDB’s position here is from their CouchDB lineage.
> The biggest roadblock to their use is that they’re poorly understood. Well, that and implementation maturity.
I certainly have more understanding to do. My biggest open question is how to design my centralized server side storage system for CRDT data. To service writes from old clients I need to retain a total history of a document, but I don’t want to force new clients to download the entire history nor do I want these big histories in my cache, so I end up wanting a hot/cold system; and building that kind of thing and dealing with the edge cases seems like more than 100 lines of code.
It seems like the Yjs authors also recognize that CRDT storage on the server is an area to address; there was some work on a custom database back in 2018. My thinking, though, is more about how to retrofit text CRDTs into my existing, very conservative production cloud software stack than about writing to block storage.
> no user wants to fiddle with a merge UI or picking versions like with iCloud.
For prose text, what do you think about combining a document-scale CRDT, with fine-grained locking — e.g. splitting the document into a "list of lines/sentences", where lines have identity, and then only allowing one person to be modifying a given line at a time?
I've always felt like this was under-explored, given that in prose text it's almost always semantically incoherent for multiple people to be trying to change a single sentence in different ways at the same time anyway (i.e. they would each have a series of edits they want to do to line A to turn it into either A' or A"; but any result that's not either purely A' or A" will very likely be nonsensical. One could say that the A → A' transformation a user does to a sentence is intentionally transactional.)
I almost thought Notion would be a good example of this, but apparently not (https://www.notion.so/Real-time-collaboration-20a1873baf334d...) — they actually do allow multiple users to be editing the same leaf-node content block at the same time, and so have taken on the full scope of the CRDT problem.
> but I don’t want to force new clients to download the entire history nor do I want these big histories in my cache, so I end up wanting a hot/cold system; and building that kind of thing and dealing with the edge cases seems like more than 100 lines of code.
Yes, but these needs are able to be cleanly abstracted away on the backend — there are internally-complex infrastructure components (like CouchDB, or Kafka) that expose a pure CQRS model to clients, but internally are doing fancier stuff involving reducing changes onto snapshots and then exposing new CQRS streams that begin history with atomic "introduce all this stuff from a snapshot"-typed 'changes'.
There's also some convergent evolution happening here with non-replay sync strategies in blockchains (which can be more interesting to look at if you care about the serverless p2p Operational Transformation type use-case of CRDTs.)
My vision of this wouldn’t involve not accepting edits; but rather, the document would react to you trying to edit a locked block the same way macOS reacts to you trying to edit a locked document: offer to duplicate the block and then let you edit the duplicate. Both versions of the block would then appear in the document, perhaps with some annotation that they’re sibling forks of the same original text. (Compare and contrast: books that compare-and-contrast works with subtle changes between different versions, e.g. the four Gospels of the Bible.)
Also, I’m speaking about locking that’s fine-grained temporally as well as spatially: a line would only need to be locked with a ~10s TTL when a user begins to type in that line. Think of it like the user composing a transaction of modifications to the same line, and then committing it. A lot like typing a message into a chat program. Just the user having their cursor on the line wouldn’t imply that the line is locked; it would only lock when they start typing.
This is already how group-chat apps work, mind you; if you’re an admin who can edit other people’s message lines, you nevertheless can’t edit someone else’s message line while they’re editing it. But they’re only considered to be editing it while they’re actively typing, and for a few seconds after that. If they go idle, someone else edits their message-line, and then you come back and try to submit your edit, it will be rejected. (Of course, that behaviour makes perfect sense for group chat software, where the only other people who can edit your text are moderators, and so moderation actions should “trump” user actions. In a p2p collaboration context, IMHO adding resistance/intentionality to per-line forking, but nevertheless allowing it, makes the most sense.)
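A rough sketch of that TTL-based advisory locking, for illustration only - the class name, API, and the 10-second TTL are hypothetical, and (as replies below point out) a real distributed version would also need to handle clock skew and partitions:

```python
import time

class SoftLineLock:
    """Advisory per-line lock with a TTL. A lock is just (user, expiry);
    it expires on its own, so a stalled or disconnected client can never
    hold a line forever. This is UI-level advisory state, not a
    correctness mechanism."""

    def __init__(self, ttl=10.0):
        self.ttl = ttl
        self.locks = {}  # line_id -> (user, expiry)

    def try_lock(self, line_id, user, now=None):
        now = now if now is not None else time.time()
        holder = self.locks.get(line_id)
        if holder and holder[1] > now and holder[0] != user:
            return False  # someone else is actively typing on this line
        # Acquire, or refresh our own lock while we keep typing.
        self.locks[line_id] = (user, now + self.ttl)
        return True
```

Each keystroke would call try_lock again, sliding the expiry forward; once the user goes idle past the TTL, anyone else can take the line.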
Group chat apps don’t care about preserving functionality when the network drops. But a document editor does. And editing a message is relatively rare in chat apps so they can afford to lock; editing a line is common in documents and the chance of a lock conflict is higher.
Your suggestion also assumes that the network is reliable. What happens if a user takes a lock and there is a partition? If the document is P2P, there is no central authority; when should the other participants override the lock? How much overhead does that add to the protocol?
The main point is that there is no notion of a central clock in a distributed system; hence “lock temporally” is not precise. Relative to which participant? And what happens when messages are dropped? (Even the lock message might be dropped!) A distributed lock implementation is non-trivial.
> My vision of this wouldn’t involve not accepting edits; but rather, the document would react to you trying to edit a locked block the same way macOS reacts to you trying to edit a locked document: offer to duplicate the block and then let you edit the duplicate. Both versions of the block would then appear in the document, perhaps with some annotation that they’re sibling forks of the same original text. (Compare and contrast: books that compare-and-contrast works with subtle changes between different versions, e.g. the four Gospels of the Bible.)
If you're using a system where you're guaranteed to have knowledge of what other people are editing at all times, there's really no need to use CRDTs in the first place.
The locking in Quip is more of a UI concern - it's not a guarantee, and I don't know how (or if) Quip handles concurrent offline edits. As a user of Quip (while at Airbnb) I was pretty frustrated by the lock UI, although it improved once they added the "steal this lock" button.
While I am sympathetic to the idea that users don't like merges, they also hate idiotic combinations of data without their oversight. So, bit of a rock and a hard place.
If you want collaboration between people, you have to structure it in a way that makes it a conversation, I believe.
I could almost see an idea that you could pattern it after musicians playing together, but that is a very particular kind of rehearsal that has not been done in any other practice, as far as I am aware. Improv may come close, but even that has very specific techniques that really don't make sense in a CRDT landscape.
Oh cool! I've wanted something like notion for years. Ideally on top of CRDTs (so I own my own data). I really appreciate all the work your company is doing! Feel free to get in touch if you want to have a proper chat about this stuff.
> My biggest open question is how to design my centralized server side storage system for CRDT data. To service writes from old clients I need to retain a total history of a document, but I don’t want to force new clients to download the entire history nor do I want these big histories in my cache, so I end up wanting a hot/cold system; and building that kind of thing and dealing with the edge cases seems like more than 100 lines of code.
Yeah, definitely more than 100 lines of code. I'm sad to report that in diamond types (my own CRDT) I've spent ~12000 lines of code in an attempt to solve some of these problems. I could probably get that down under 3000 loc in a rewrite if I'm happy to throw away some of my optimizations. Doing so would dramatically lower the size of the compiled wasm bundle too - though the wasm bundle is still comfortably under 100kb over the wire, so maybe it's fine?
Regarding history, I have a lot of thoughts. The first is that with the right approach, historical data compresses really well. Martin Kleppmann's automerge-perf data set has 260k edits ending in 100kb of text. The saving system I'm working on can store the entire editing history (enough to merge changes from any version) in this example with just 23kb of overhead on disk. I think that resulting data set might only need to be accessed in the case of concurrent changes, and then only back as far as the common ancestor. But I haven't implemented that optimization yet.
And yeah; I've been thinking a lot about what a CRDT-native database could look like too. There's way too many interesting and useful problems here to explore.
I was also a "true believer" in CRDTs for a long time, implementing my first ones in Erlang about 9 years ago[1], but my opinion of where they fit has changed significantly.
The one issue with CRDTs that I find is rarely mentioned and often ignored is the case where you've deployed these data structures that include merge logic to a set of participating nodes that you can't necessarily update at will. Think phones that people don't update, or IoT/sensor devices like electric meters or other devices "in the wild".
When you include merge logic – really any code or rules that dictate what happens when the data of 2 or more CRDTs are merged – and you have bugs in this code running on devices you can never update, this can be a huge mess. Sure you can implement simple counters easily (like the ones I linked to), and you can even use model checking to validate them. But what about complex tree logic like for edits made to a document? Conflict resolution logic? Distributed file system operations? These are already very complex and hard to get right without multiple versions involved and unfixable bugs causing mayhem.
Having to deal with these bugs in the context of a fleet of participants on a wide range of versions of the code, the combinatorial explosion of the number of possible interactions and effects of these differing versions and bugs taken together can really become impossible to manage.
I'd be interested to hear from folks who have experience with these kinds of issues and how they have dealt with them, especially if they are still convinced that CRDTs were the right choice.
> When you include merge logic – really any code or rules that dictate what happens when the the data of 2 or more CRDTs are merged – and you have bugs in this code running on devices you can never update, this can be a huge mess.
This is a really important point, and it's a problem with distributed systems in general. Moxie Marlinspike (the guy behind Signal) gave a great talk on this a few years ago that's well worth a watch. He argues that distributed systems will always be outcompeted by their centralised counterparts because they can't add features as quickly.
We've seen this playing out in real time watching git slowly upgrade its hashing function. The migration is taking years.
I don't know a general solution, but a partial solution in my mind is that we need to knuckle up and make these base layers be correct. Model testing and fuzz testing can absolutely iron out bugs in this stuff, no matter how complex it seems. Bugs in application software usually aren't a big deal, but CRDTs fall in the same category as databases and compilers - we need to prioritise correctness so you can deploy this stuff and forget about it.
It's not as bad as it sounds. The models which underpin even complex list CRDTs are still orders of magnitude simpler than what compiler authors deal with daily. Correctness in a CRDT is also very easy to test for because there just aren't that many edge cases to find. You can count on one hand the number of ways you can modify a list, and the invariants are straightforward to validate.
We used fuzz testing for json-ot a few years ago and never ran into a single bug in production which was in the domain of things the fuzzer tested for. (Though the fuzzer was missing some obscure tests around updating cursor positions.)
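As a toy version of that approach, here's a sketch of a fuzzer (my own, not json-ot's) checking the algebraic invariants every CRDT merge function must satisfy, using a trivial LWW register as the structure under test:

```python
import random

def merge(a, b):
    """LWW register merge: a register is (value, timestamp, actor_id);
    the register with the highest (timestamp, actor_id) wins, with the
    actor id breaking timestamp ties."""
    return a if (a[1], a[2]) >= (b[1], b[2]) else b

def fuzz(iterations=10_000, seed=42):
    rng = random.Random(seed)
    for _ in range(iterations):
        # Three replicas with distinct actor ids, random values and
        # deliberately colliding timestamps to exercise tie-breaking.
        a, b, c = [(rng.randint(0, 9), rng.randint(0, 5), actor)
                   for actor in "abc"]
        # The invariants any CRDT merge must satisfy:
        assert merge(a, b) == merge(b, a)                      # commutative
        assert merge(merge(a, b), c) == merge(a, merge(b, c))  # associative
        assert merge(a, a) == a                                # idempotent
    return True
```

A real harness for a list CRDT would additionally generate random insert/delete histories and check that all interleavings converge, but the shape of the test is the same.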
Because those other options abstract the decision making away from the ‘can never be updated’ part, usually into some part of the system where the consequences of a problem / interest in the right solution are collocated with someone who can do something about it (an API server with ossified clients, or a network switch with buggy NICs wired into it, etc.)
If truly peer to peer, that is a lot less clear - do you end up in a collaborative p2p document model forking the documents between ‘new rev’ clients and ‘old rev’ clients? Who ‘wins’? What is the consequence of losing?
At least an API server can clearly reject the client and give an error message - if it’s a CRDT, how does that work?
I might be missing something, but I have trouble seeing how CRDTs can work for regular CRUD style applications.
Just the first example that pops into my head: edit A sets an invoice status to paid, edit B changes the invoice amount from 100 to 120. The merge is a paid invoice with an incorrect amount.
A workaround would be to record a separate PaidInvoice that won't be changed by the application logic.
But that's just a really trivial example that only involves scalar fields inside a single object, and also relies on the application logic considering all ways that the CRDT might behave.
There are countless ways to end up with data that violates the constraints of the domain.
Is there any theoretical groundwork happening on how CRDTs can preserve domain semantics?
I know nothing about CRDTs, but I feel the ultimate problem with your invoice example might be a representation issue.
> edit A sets an invoice status to paid, edit b changes the invoice amount from 100 to 120. The merge is a paid invoice with an incorrect amount.
There is no such thing as "paid status" of an invoice. That's a simplification. If you unroll it, what you get is accounts that must balance. The invoice starts with "accounts payable"[0] set to $100, and no payment covering it. Edit A adds a transaction from the company account to AP to the value of $100. Edit B changes the liability in AP to $120. The result of a merge should be a transaction record for $100 and a liability of $120, meaning a partially-paid invoice.
--
[0] - Or whatever it is called, I keep getting confused by this terminology, as it's not something regular people use in my part of the world.
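A sketch of that unrolled model (class and method names are hypothetical): the invoice becomes grow-only sets of ledger entries, "paid" is derived rather than stored, and merging is plain set union, so the contradictory paid-with-wrong-amount state from the parent comment can't arise:

```python
class LedgerInvoice:
    """Invoice as grow-only sets of ledger entries. 'Paid' is never
    stored; it's computed from the entries, so a merge can never
    produce an inconsistent paid flag."""

    def __init__(self):
        self.charges = set()   # (entry_id, amount)
        self.payments = set()  # (entry_id, amount)

    def charge(self, entry_id, amount):
        self.charges.add((entry_id, amount))

    def pay(self, entry_id, amount):
        self.payments.add((entry_id, amount))

    def merge(self, other):
        # Union of grow-only sets: commutative, associative, idempotent.
        self.charges |= other.charges
        self.payments |= other.payments

    def balance(self):
        return sum(a for _, a in self.charges) - sum(a for _, a in self.payments)

    def status(self):
        if self.balance() <= 0:
            return "paid"
        return "partially paid" if self.payments else "unpaid"
```

Replaying the example: replica A records a $100 payment, replica B concurrently adds a $20 charge adjustment, and after merging both replicas agree the invoice is partially paid with $20 outstanding.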
That's a great example. In this specific case, you'd want to associate the invoice status (paid) with the point in time when that change was made, so concurrent changes to the invoice result in merge conflicts.
I don't know anyone who's building a CRDT like that; but it's definitely possible. Basho's Riak database had a CRDT implementation that would provide a grounding to handle cases like that. When concurrent changes happen, the database simply stored each conflicting value. It was up to the application to decide how the conflicting values merged together.
On top of something like that, you can imagine a bunch of different merging strategies which would be appropriate in different situations: Optimistic (recursively merge all children), Arbitrary-writer-wins, or Conflict-and-bother-the-user. In this case, generating a conflict state is appropriate. Conflicts also make sense when merging branches in git; and I'd love to see a CRDT which managed to reproduce that functionality.
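In case it helps, here's a small sketch of that "store each conflicting value" idea - a multi-value register using version vectors, in the spirit of Riak's siblings but not Riak's actual code:

```python
def dominates(va, vb):
    """True if version vector va causally supersedes vb."""
    return va != vb and all(va.get(k, 0) >= n for k, n in vb.items())

class MVRegister:
    """Multi-value register: concurrent writes are kept as siblings, and
    it's up to the application how to resolve them (arbitrary-writer-wins,
    recursive merge, or conflict-and-bother-the-user)."""

    def __init__(self, actor):
        self.actor = actor
        self.clock = {}      # merged version vector: actor -> counter
        self.siblings = []   # list of (version_vector, value)

    def write(self, value):
        # A local write supersedes every sibling this replica has seen.
        self.clock[self.actor] = self.clock.get(self.actor, 0) + 1
        self.siblings = [(dict(self.clock), value)]

    def merge(self, other):
        # Deduplicate, then keep only values no other sibling dominates.
        seen = {(tuple(sorted(v.items())), x): (v, x)
                for v, x in self.siblings + other.siblings}
        combined = list(seen.values())
        self.siblings = [(v, x) for v, x in combined
                         if not any(dominates(w, v) for w, _ in combined)]
        for k, n in other.clock.items():
            self.clock[k] = max(self.clock.get(k, 0), n)

    def read(self):
        return sorted(x for _, x in self.siblings)
```

A causally newer write collapses the register back to one value; concurrent writes surface as multiple siblings for the application to resolve.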
But no, I don't know of any work happening in that space. Maybe the inventor could be you!
> When concurrent changes happen, the database simply stored each conflicting value.
Yes, this is the key! 1. store everything, 2. give the domain modelers an array of strategies to merge particular cases, with the default being conflict-and-bother-the-user. Every existing merge strategy can be represented on top of this model, so it should be the base layer. The conflict resolution strategies can be sliced up, layered, hidden, or exposed to the user according to the design criteria because conflict resolution is part of the problem domain.
I would go so far as to say that assuming one particular merge strategy for all data structures as 'correct' is actively harmful. What is the worst case outcome in the example above? That nobody looks at it. In a general case like this choosing any default action at all is the wrong call because both edits were made with different assumptions and contexts, and that's the level where the conflict should be resolved, not down here in the invoice.
> Is there any theoretical groundwork happening on how CRDTs can preserve domain semantics?
Effectively using CRDTs requires a fundamentally different model of state. Typical CRUD stuff simply doesn't work. If you're unwilling or unable to refactor your state layer, then CRDTs are gonna seem totally infeasible.
I'm interested to see how this can apply to "regular" database apps (like invoicing). I need to add sync/offline to my main app and want something solid to build upon (I've done homemade things before) and have wondered if CRDTs could be applied, but how?
My current understanding is that CRDTs merely guarantee deterministic merging of updates to some basic data structures, where deterministic means that the outcome is well defined and always the same regardless of the order in which updates are merged.
That doesn't mean the outcome makes any sense at all in terms of the sort of application level requirements and constraints you would typically find in a transactional database application. Conflicts may still arise on that level.
So I think what we need to really fulfill the promise of CRDTs is a way to express those application level constraints on top of them.
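A grow-only counter (a standard textbook CRDT, sketched here from memory) illustrates exactly that gap: the merged total is deterministic whatever the merge order, but nothing in the structure stops that total from violating a domain rule like a stock limit or a credit cap:

```python
class GCounter:
    """Grow-only counter: one slot per replica; merge is pointwise max.
    The merged value is the same in any merge order - but 'deterministic'
    says nothing about whether the total respects your domain constraints."""

    def __init__(self, actor):
        self.actor = actor
        self.counts = {}  # actor -> count contributed by that replica

    def increment(self, n=1):
        self.counts[self.actor] = self.counts.get(self.actor, 0) + n

    def merge(self, other):
        for k, v in other.counts.items():
            self.counts[k] = max(self.counts.get(k, 0), v)

    def value(self):
        return sum(self.counts.values())
```

If two offline replicas each decrement remaining stock (modeled as incrementing a "sold" counter) the merge is perfectly deterministic - and can still deterministically oversell the last item, which is the application-level conflict the structure can't see.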
I wonder why there isn't some open source engine based on this at least for CRUD apps since it has a lot of potential and it is really "simple" to implement and even understand.
I'm a CRDT newbie, but I'll take a stab, hopefully someone can correct me if I'm getting it wrong.
After a quick read of "CRDTs go brrr" and the Wikipedia page, I think a CRDT gives us a mathematical strategy for resolving conflicts. It doesn't mean that the end result will make sense.
The Wikipedia article gives an example of merging an event flag represented by a boolean variable. So the var in this case means that "someone observed this event happening". So the rule for merging this var from different sources is simple, if any source of data reports the var as true, the merged result should be true as well.
The implication is it matters what the data represents, not just whether it is a boolean or a string, etc.
I'm guessing that a collaborative notes field, or a "did someone call this customer" boolean, might benefit from a CRDT more so than keeping track of bank account values.
Just read your "crdts-go-brrr" post from the other commenter and just wanted to say thanks for writing all that up! Great insight and great information.
Diamond types will beat yjs on a lot of benchmarks when I have it all working, but I'm still missing important functionality (like loading and saving, and data types other than plain text). The automerge devs have been working on incorporating a bunch of the changes I talked about in my blog post, but they have more work to do before those optimizations have all landed. If I were starting a new project today, I'd use Yjs.
But I suspect in a year or so from now there'll be several high quality CRDTs available to choose from. The space is heating up.
That is a bold claim and no supporting evidence. This forum can be so valuable and yet so maddening.
Please do share your experiences that support your opinion.
Gun.js can seem ‘magical’ but then again the 100 lines of code (iirc it’s less) are there for all to see.
I’ve found the time invested to learn about gun (the sync protocol) and RAD (storage abstraction layer) quite worth it. The addition of SEA (security encryption authorization) wrapping around gun enables a lot of implementation options which some find frustrating as they’d rather not have to think about topology before building ‘an app’. Could that be a reason for your comment?
Because when the author writes or speaks about the project, they consistently demonstrate a fundamental lack of understanding of the domain. There is a reason that GunDB is never mentioned in literature or papers.
Snake oil needs to be purchased and it has hidden ingredients. Where are those in gun? Free and OSS. If you don’t like it don’t use it but why the hate? Serious question.
I started trying to build a non-PHP Wiki for a hobby, and the notion that I was going to have to implement my own version control on top of it stymied me, so CRDTs looked good to me when I first started to get familiar with the idea.
Trac (project management) does some stuff behind the scenes stored in svn, which it uses for edit history. I always liked that idea. Why have two? I have wished for some time that someone did a Trac for git. I just don't want to be the one to write it.
I've also wished for some time that someone would make a new git, not designed by barbarian cannibals, potentially based on CRDTs.
Somehow these notions melted together and now making progress hinges on whether there's a CRDT out there that's up to the task of managing source code - and written in a language I can handle. So far those two haven't appeared, and based on the number of corner cases I've heard described when I watch lectures on CRDTs, I'm pretty sure if I tried to write one myself I'd never see daylight again, but might see the inside of a padded room. Do you think it's safe to say we're still in the distillation phase of invention where CRDTs are concerned? Is this accidental complexity we are seeing or is most of it intrinsic?
I'm instead spending a lot of my free time on the hobby instead of on writing collaborative tools and/or accidentally writing a Trac replacement.
It's not really high-scale / high-performance, but Apple's Notes.app uses state-based CRDTs internally for conflict resolution. It is perhaps high-profile, since a lot of people are using it.
Can you point to a good introduction to Conflict-free replicated data types (CRDT)?
It's crossed my (short attention span) radar a couple times and seems interesting. I really love the idea about being able to use it, but also "forget" about it while developing.
Does "leaky abstraction" apply here? If you constrain things enough, can a dev really use it and forget about the details?
I would very much love to see the "simple CRDT" implementation described above, seems like it would be a great learning tool and/or foundation on which to build something more complicated!
The CRDT I was referencing was Shelf by Greg Little. He's given a few talks about it at the braid meetups. When he first showed it off, Kevin Jahns (the Yjs author) was also there and was as impressed as I was:
Don't make the same mistake as the Google Wave team. Collaborative editing is a great intellectual challenge to work on, but in reality users don't care much about documents that auto-update in real time. In fact, it can be annoying.