@EliteDangerous I saw it on the BBC Micro and was blown away. I bought it for my Commodore 64 and played it all summer until I reached the rank of Elite. Not quite Elite in ED yet, but I still love playing it in VR when I have time. #EliteRuby
Want to learn what it takes to cache billions of tiny objects?
I really like this paper because it not only presents a clever algorithmic solution to an important systems problem, but also thoroughly evaluates it on real-world data.
The basic challenge here is that many systems (especially, but not only, in social media) want to cache billions of tiny objects (like new posts/messages) on SSDs to improve serving performance. However, existing cache strategies don't work well.
Log-structured caches write objects sequentially and index them in memory, but for tiny objects that index grows too large to fit in memory. Set-associative caches hash objects into "sets" so you don't need an index--you can look up an object's page by its hashed key--but every update requires an entire page write which rapidly degrades the SSD (you can only write to an SSD so many times before it wears out).
This paper's clever idea is to combine the two cache strategies to get their advantages without their disadvantages. Incoming writes are buffered in a small log-structured cache, which writes to the SSD efficiently (the writes are sequential, so each one fills a whole page) but doesn't need much memory (it's small, so its index is small too). Periodically, objects are moved from the log into a much larger set-associative cache in batches grouped by set, so each page rewrite carries several objects instead of one, which avoids degrading the SSD.
When a read comes in, it first checks the log-structured cache, then goes to the larger set-associative cache.
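Here's a rough sketch of that write and read path, in case code helps. It's a toy in-memory model, not the paper's implementation: the class name `HybridCache`, the parameters, the flush-when-full trigger, and the FIFO eviction are all my own assumptions for illustration, and Python dicts stand in for the SSD log and SSD pages.

```python
import hashlib


class HybridCache:
    """Toy model: a small log in front of a large set-associative cache.

    Hypothetical names and policies throughout; dicts stand in for SSD storage.
    """

    def __init__(self, num_sets=4, set_capacity=4, log_capacity=8):
        self.num_sets = num_sets
        self.set_capacity = set_capacity
        self.log_capacity = log_capacity
        self.log = {}  # the small log plus its in-memory index
        self.sets = [dict() for _ in range(num_sets)]  # one dict per "page"
        self.page_writes = 0  # counts simulated page rewrites in the big cache

    def _set_index(self, key):
        # Hash the key to pick its set, so the big cache needs no index.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.num_sets

    def put(self, key, value):
        # Writes land in the log first; flush once it fills up (assumed trigger).
        self.log[key] = value
        if len(self.log) >= self.log_capacity:
            self.flush()

    def flush(self):
        # Group logged objects by destination set, then rewrite each page once.
        batches = {}
        for key, value in self.log.items():
            batches.setdefault(self._set_index(key), {})[key] = value
        for idx, batch in batches.items():
            page = self.sets[idx]
            page.update(batch)
            while len(page) > self.set_capacity:
                page.pop(next(iter(page)))  # evict the entry inserted earliest
            self.page_writes += 1
        self.log.clear()

    def get(self, key):
        # Reads check the small log first, then the hashed set.
        if key in self.log:
            return self.log[key]
        return self.sets[self._set_index(key)].get(key)
```

The thing to notice is `page_writes`: flushing a full log costs at most one page rewrite per set, however many objects were buffered, whereas a plain set-associative cache would pay one page rewrite per object.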
This design produces a cache that's fast, doesn't require much memory, and doesn't degrade SSDs. The authors back this up with an extensive evaluation on production Facebook traces, checking each of these objectives.
One big takeaway--there are only so many ways you can optimize a system, no matter how large or complex. Caching and buffering are basic strategies, but if used cleverly are very effective!
the tigerbeetle repository is literally a GOLD MINE of alpha -- like this document on their approach to style, elegance, performance, caching, memory safety etc
it's filled to the brim with gems:
https://t.co/fiLiJSx555
a data structure book that teaches you how to design a data structure by looking at the underlying hardware, workload and access patterns, locality - sounds awesome .. (via @eatonphil)
https://t.co/MFC3iEYKP5
"After reading this book, you will be able to reason about which existing data structure will perform best given a workload and the underlying hardware. In addition, you will be able to design new and possibly hybrid data structures to handle workloads with different composition, locality, and access patterns."