@AlexMillerDB@penberg Leader-based protocols probably have an easier time enforcing a nice property here once a command reaches the leader, but I imagine that quorum-like optimisations for eg performing reads from followers *probably* fall prey to these same issues.
@AlexMillerDB@penberg Skimmed paper. It doesn’t seem their definition is much better than the best I’ve come up with; it would be nice to have a better formalisation than “its outcome is decided soon, preferably before the process crashes”. I like the idea of “strict strict serializablility” though…
@AlexMillerDB@penberg It’s worth noting that we have partially addressed this in Accord - if a transaction is recovered and known not to have been agreed then we abort it. If the “following” operation does not witness it then it is aborted, for instance, however “following” is ambiguous…
@AlexMillerDB@penberg Indeed, leaderless consensus protocols are worse as a write can submarine forever, whereas here the next write will resolve the situation. Personally I think this is a flaw in the definition of linearizability, but I’m not sure the best way to formalise an improvement.
@AlexMillerDB@penberg I think timeout is essentially an operation that has no defined end, the server indicates the operation’s status is unknown. This is also true of most leaderless consensus protocols, which are considered linearizable.
@AlexMillerDB@nevgeniev@jorandirkgreef@DominikTornow@ifesdjeen By data race here, I mean a pure simple local race on some shared memory location two threads are accessing concurrently. Not a general race condition where things just happen in the wrong order, of which of course this technique has found many.
@AlexMillerDB@nevgeniev@jorandirkgreef@DominikTornow We haven’t employed this capability much yet outside of whole-system testing for distributed consensus, but this has found at least one data race that I recall. Probably others I forget, too. I think @ifesdjeen also used it to demonstrate a race we had found through other means.
@AlexMillerDB@nevgeniev@jorandirkgreef@DominikTornow We don’t do anything super clever but we are able to explore a subset of data races. We byte weave in “nemesis” points and randomly stop/start threads at these points. This won’t find most bugs that rely on incorrect usage of the Java memory model, eg missing volatile keyword
@eatonphil The paper doesn’t take a position on this; a coordinator simply does not need to be a replica, which is an important property for protocol analysis. But in practice in Cassandra coordinators are replicas. I believe DataStax intend to disaggregate coordinators from replicas.
@eatonphil EPaxos has particularly poor properties under contention, as the exact same fast-path quorum must witness every contending transaction in the exact same order for the fast-path to be taken.
@jorandirkgreef@eatonphil We plan to, but I would prefer more crc32/64 (ie covering small chunks) as they provide hard guarantees for bit flip pattern detection that cryptographic hashes do not (a single bit flip may cause undetected corruption). Koopman is a great resource on CRCs https://t.co/68EduLjKy6
@AlexMillerDB@maximecaron@penberg Cluster config can be implemented on top of Accord itself, of course, but for now we will be using Cassandra’s (which is, uh, implemented on top of Paxos for now 🙈)
@AlexMillerDB@maximecaron@penberg We’d absolutely be open to contributions providing baseline implementations of these for use cases that want to just plug it in. The library itself today only comes with non-persistent implementations of everything, or in the case of cluster config only toy ones for testing
@cetico@_Felipe Most of the time your transaction still completes successfully in one round-trip. If there were a competing transaction on the same key that the reorder buffer failed to order correctly though, it just takes one additional round trip to complete.