Benedict Elliott Smith 🇺🇦

@_belliottsmith

Apache Cassandra @apple

Joined March 2014

1.1K Following

205 Followers

1K Posts

Pinned Tweet

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 4 years ago

Fast general purpose transactions with optimal global latency in Apache Cassandra (and making today’s LWTs faster) https://t.co/BoyrQCaATp

Benedict Elliott Smith 🇺🇦 @_belliottsmith

about 2 years ago

@AlexMillerDB @penberg I think the phrasing “log position 42 is X” is broadly equivalent to having a leader-based protocol, however it is implemented.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

about 2 years ago

@AlexMillerDB @penberg Leader-based protocols probably have an easier time enforcing a nice property here once a command reaches the leader, but I imagine that quorum-like optimisations for eg performing reads from followers *probably* fall prey to these same issues.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

about 2 years ago

@AlexMillerDB @penberg Skimmed paper. It doesn’t seem their definition is much better than the best I’ve come up with; it would be nice to have a better formalisation than “its outcome is decided soon, preferably before the process crashes”. I like the idea of “strict strict serializability” though…

Benedict Elliott Smith 🇺🇦 @_belliottsmith

about 2 years ago

@AlexMillerDB @penberg It’s worth noting that we have partially addressed this in Accord - if a transaction is recovered and known not to have been agreed then we abort it. For instance, if the “following” operation does not witness it then it is aborted; however, “following” is ambiguous…

Benedict Elliott Smith 🇺🇦 @_belliottsmith

about 2 years ago

@AlexMillerDB @penberg Indeed, leaderless consensus protocols are worse as a write can submarine forever, whereas here the next write will resolve the situation. Personally I think this is a flaw in the definition of linearizability, but I’m not sure the best way to formalise an improvement.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

about 2 years ago

@AlexMillerDB @penberg I think a timeout is essentially an operation that has no defined end; the server indicates that the operation’s status is unknown. This is also true of most leaderless consensus protocols, which are considered linearizable.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 2 years ago

@AlexMillerDB @nevgeniev @jorandirkgreef @DominikTornow @ifesdjeen By data race here, I mean a pure simple local race on some shared memory location two threads are accessing concurrently. Not a general race condition where things just happen in the wrong order, of which of course this technique has found many.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 2 years ago

@AlexMillerDB @nevgeniev @jorandirkgreef @DominikTornow We haven’t employed this capability much yet outside of whole-system testing for distributed consensus, but this has found at least one data race that I recall. Probably others I forget, too. I think @ifesdjeen also used it to demonstrate a race we had found through other means.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 2 years ago

@AlexMillerDB @nevgeniev @jorandirkgreef @DominikTornow This technique should catch simple increments that don’t use eg atomic increment.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 2 years ago

@AlexMillerDB @nevgeniev @jorandirkgreef @DominikTornow We don’t do anything super clever but we are able to explore a subset of data races. We bytecode-weave in “nemesis” points and randomly stop/start threads at these points. This won’t find most bugs that rely on incorrect usage of the Java memory model, eg a missing volatile keyword.
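
Editor's sketch of the idea in the tweet above, not the Cassandra simulator itself: the real technique weaves pause points into Java bytecode, whereas this toy Python model uses generator `yield`s as stand-in "nemesis" points and a seeded random scheduler to stop/start two logical threads. All names here (`unsafe_increment`, `run_schedule`, `find_lost_update`) are hypothetical. It shows how a plain read-then-write increment loses updates once the scheduler is allowed to switch between the read and the write.

```python
import random

def unsafe_increment(counter):
    """Non-atomic increment: read, pause at a nemesis point, then write."""
    value = counter["n"]          # read the shared value
    yield                         # hypothetical nemesis point: scheduler may switch here
    counter["n"] = value + 1      # write back a possibly stale value

def worker(counter, times):
    """A logical thread that performs `times` unsafe increments."""
    for _ in range(times):
        yield from unsafe_increment(counter)

def run_schedule(seed, workers=2, times=50):
    """Interleave the workers at nemesis points using a seeded random scheduler."""
    rng = random.Random(seed)
    counter = {"n": 0}
    live = [worker(counter, times) for _ in range(workers)]
    while live:
        task = rng.choice(live)   # randomly pick which logical thread resumes
        try:
            next(task)            # run it until its next nemesis point
        except StopIteration:
            live.remove(task)     # this worker has finished
    return counter["n"]

def find_lost_update(max_seeds=100, workers=2, times=50):
    """Return the first seed whose schedule loses an update, or None."""
    expected = workers * times
    for seed in range(max_seeds):
        if run_schedule(seed, workers, times) != expected:
            return seed
    return None

seed = find_lost_update()
print("seed with lost update:", seed)
if seed is not None:
    print("final count:", run_schedule(seed), "expected:", 2 * 50)
```

Because the scheduler is seeded, a failing interleaving can be replayed exactly by rerunning with the same seed. As the tweet notes, this style of exploration only controls scheduling; it does not model memory-visibility problems such as a missing volatile.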

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 2 years ago

@eatonphil The paper doesn’t take a position on this; a coordinator simply does not need to be a replica, which is an important property for protocol analysis. But in practice in Cassandra coordinators are replicas. I believe DataStax intend to disaggregate coordinators from replicas.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 2 years ago

@eatonphil EPaxos has particularly poor properties under contention, as the exact same fast-path quorum must witness every contending transaction in the exact same order for the fast-path to be taken.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 2 years ago

@eatonphil This section actually suggests that existing leaderless protocols perform poorly under contention, not that leader-based protocols do.

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 3 years ago

@jorandirkgreef @eatonphil We plan to, but I would prefer more crc32/64 (ie covering small chunks) as they provide hard guarantees for bit flip pattern detection that cryptographic hashes do not (a single bit flip may cause undetected corruption). Koopman is a great resource on CRCs https://t.co/68EduLjKy6
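
Editor's sketch of the guarantee mentioned above, under stated assumptions: it uses Python's standard `zlib.crc32` (the common CRC-32 polynomial) and a hypothetical 64-byte chunk, and exhaustively checks that every 1-bit and every 2-bit flip within the chunk changes the checksum. The helper names `flip_bits` and `undetected_flips` are invented for illustration, and the chunk size is arbitrary rather than anything Cassandra uses.

```python
import zlib
from itertools import combinations

def flip_bits(data: bytes, positions) -> bytes:
    """Return a copy of `data` with the given bit positions inverted."""
    buf = bytearray(data)
    for pos in positions:
        buf[pos // 8] ^= 1 << (pos % 8)
    return bytes(buf)

def undetected_flips(chunk: bytes, flips: int) -> int:
    """Count bit-flip patterns of the given weight that leave CRC-32 unchanged."""
    good = zlib.crc32(chunk)
    nbits = len(chunk) * 8
    return sum(
        1
        for positions in combinations(range(nbits), flips)
        if zlib.crc32(flip_bits(chunk, positions)) == good
    )

chunk = bytes(range(64))  # hypothetical small chunk covered by one checksum
print("undetected 1-bit flips:", undetected_flips(chunk, 1))
print("undetected 2-bit flips:", undetected_flips(chunk, 2))
```

The check is exhaustive only for this one small chunk; how many flipped bits a given CRC polynomial is guaranteed to detect depends on the polynomial and the data length, which is what Koopman's tables document.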

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 3 years ago

@AlexMillerDB @maximecaron @penberg Cluster config can be implemented on top of Accord itself, of course, but for now we will be using Cassandra’s (which is, uh, implemented on top of Paxos for now 🙈)

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 3 years ago

@AlexMillerDB @maximecaron @penberg We’d absolutely be open to contributions providing baseline implementations of these for use cases that want to just plug it in. The library itself today only comes with non-persistent implementations of everything, or in the case of cluster config only toy ones for testing

Benedict Elliott Smith 🇺🇦 @_belliottsmith

over 3 years ago

@cetico @_Felipe Most of the time your transaction still completes successfully in one round-trip. If there were a competing transaction on the same key that the reorder buffer failed to order correctly though, it just takes one additional round trip to complete.
