Some purposed options for what consistency requirements we can pick for Crux. Currently Crux has full ACID capabilities which are most likely the easiest to build correct applications on-top of. But having these having full ACID support is also brings significant trade offs in terms of scaling.
So its worth discussing what type of applications we actually want to support. And what options we have.
Full ACID Single timeline
This is the model of consistency we currently have.
How Crux currently does this is with a single kafka topic and partition. The partition contains updates to the global state all updates are deterministic. The current supported updates are writes and retracts. And also CAS witch lets you coordinate with other writes.
This model can also be extended to support arbitrary transaction functions to run on the current state of the database. But it does require functions to be completely deterministic based on the position in the transaction log and the input of the function.
The scaling limit of this model is the smallest bottleneck of the writing to a single partition in kafka and the processing/indexing of the partition on Crux instances.
Per Entity/document update model
In this model we support all of the "Full ACID Single timeline" features but we only do it per document.
So you can do CAS and transaction functions but every transactions only concerns itself with a single document/entity.
This lets us have multiple partitions where any given document has all transactions that relate to it on a single partition.
This lets you scale horizontally. As long as your writes and coordination’s works well being spread somewhat evenly across the documents/entities.
Any change that needs to happen across documents in this model won’t be atomic and will have to require multiple transaction steps to perform. And for any complicated multi document updates the user of the database might also have to write cleanup code for how to backtrack the individual steps when the requirements turns out to not be met anymore. Something that is handled for you in "Full ACID Single timeline"
ActorDB shows a model that can coordinate between documents (or independent databases in ActorDB) when the user requires it. And gives you automatically then also.
Eventual consistent model with last write wins
In this model we have no guarantees as to timeline of anything. Each instance of Crux has its current view of either a partition or the entire database state.
Each write will be written against a version of the documents/entities. The requirement is that when a instance has seen all writes on a document they will reach the same conclusion of what the state of that document is. So that means trowing out writes that got written later than another write. One way to do that is by timestamps.
Its unclear if this model has any benefits on-top of Kafka (or a system like that) Since that type of tool is explicitly designed to be able to route messages based on partitioning of keys. The whole point is to maintain a timeline or multiple timeline of events. While a model like this lets you replicate data without single points of coordination.
But in this model the Crux instances can write the updates directly to its index and give a OK response back to the client even before synchronising it with other instances or with Kafka. Giving you a level of availability that other the other options does not have.
Possible other modes
Bring your other options you can think of to the phase2 discussions
Can you improve this documentation?Edit on GitHub
cljdoc is a website building & hosting documentation for Clojure/Script libraries
× close