DDIA (Designing Data-Intensive Applications) IS THE BIBLE OF BACKEND ENGINEERING
Martin Kleppmann spent years at LinkedIn & in distributed systems research
then wrote everything down in one book
600 pages of how real systems fail, scale & survive
most people buy it...few finish it
this dev did & turned every chapter into clean github notes
replication & sharding..transactions & consensus...batch vs stream processing...why databases quietly lie to you
the exact knowledge that separates a junior who codes from a senior who architects
one repo no excuses
https://t.co/9UlMCljPBQ
The only thing that speaks louder than your actions is your intent, and it is quite visible in everything you do. If you don't have the right intent, people eventually notice.
Software engineers don't get paid to write code; they get paid to solve problems.
The faster you realize this, the sooner you'll stop being afraid that AI will replace you and the better your career will be.
You're debugging a production API and notice something strange.
Your rate limiter says 100 requests per minute, but the logs show 200 requests hitting in a single second. Nothing is technically "wrong", but the system still gets slammed.
This usually comes down to how the rate limit is implemented. Different algorithms enforce the same rule in very different ways.
Take a simple analogy: a nightclub with one rule, max 10 people per minute.
With a Fixed Window, the bouncer counts entries per minute block. If 10 people enter at 10:00:59 and another 10 at 10:01:00, the system allows it. Technically correct, but you just let 20 people in within a second.
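The boundary burst can be reproduced in a few lines. This is a minimal sketch, not a production limiter: `FixedWindowLimiter` and `demo_boundary_burst` are hypothetical names, and time is passed in as plain seconds so the behaviour is deterministic.

```python
class FixedWindowLimiter:
    """Allow at most `limit` entries in each fixed block of `window` seconds."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.current_window = None  # index of the block currently being counted
        self.count = 0

    def allow(self, now):
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # A new block started: the counter resets to zero.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def demo_boundary_burst():
    """10 arrivals at 10:00:59 and 10 more at 10:01:00 (seconds 59 and 60)."""
    limiter = FixedWindowLimiter(limit=10, window=60.0)
    arrivals = [59.0] * 10 + [60.0] * 10
    return sum(limiter.allow(t) for t in arrivals)
```

All 20 arrivals are admitted, because the counter resets exactly between the two groups.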
A Sliding Window fixes this by checking the last 60 seconds in real time. It's more accurate, but requires tracking timestamps for every entry, which becomes expensive at scale.
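A sketch of the timestamp-tracking variant (often called a sliding window log), assuming one in-memory deque per client. `SlidingWindowLog` is a hypothetical name, and the memory cost mentioned above is visible here: one stored timestamp per admitted request.

```python
from collections import deque


class SlidingWindowLog:
    """Allow at most `limit` entries in any trailing `window` seconds."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.timestamps = deque()  # admission times, oldest first

    def allow(self, now):
        # Forget admissions that are `window` seconds old or older.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

With the same arrivals as the nightclub example (10 at second 59, 10 at second 60), only the first 10 get in, and the next entry is admitted once the oldest timestamp is 60 seconds old.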
Then there's Token Bucket. Tokens refill gradually and each entry consumes one. Small bursts are allowed (up to the bucket's capacity), but sustained traffic gets throttled.
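A minimal token bucket sketch, assuming the bucket starts full and refills continuously. `TokenBucket`, `capacity`, and `refill_rate` are illustrative names rather than any library's API.

```python
class TokenBucket:
    """Bursts up to `capacity`; sustained rate limited to `refill_rate` per second."""

    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # assumption: the bucket starts full
        self.last = now

    def allow(self, now):
        # Refill for the time that has passed, never beyond capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # each admitted entry consumes one token
            return True
        return False
```

With `capacity=5` and `refill_rate=1.0`, a burst of 7 requests at the same instant admits 5; two seconds later, two more tokens are available.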
Finally, Leaky Bucket smooths everything. Requests queue up and are processed at a constant rate; once the queue is full, new requests are dropped.
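One way to sketch a leaky bucket is to give every accepted request a scheduled processing time, spaced a fixed interval apart. This is a simplified model under stated assumptions: `LeakyBucket` is a hypothetical name, `capacity` counts requests still waiting, and a full queue drops the request.

```python
from collections import deque


class LeakyBucket:
    """Queue requests and release them at one per `interval` seconds."""

    def __init__(self, capacity, interval):
        self.capacity = capacity              # max requests waiting in the queue
        self.interval = interval              # seconds between processed requests
        self.pending = deque()                # scheduled times of waiting requests
        self.last_scheduled = float("-inf")   # nothing scheduled yet

    def submit(self, now):
        """Return the time the request will be processed, or None if dropped."""
        # Requests whose scheduled time has passed have already left the queue.
        while self.pending and self.pending[0] <= now:
            self.pending.popleft()
        if len(self.pending) >= self.capacity:
            return None  # bucket is full
        scheduled = max(now, self.last_scheduled + self.interval)
        self.pending.append(scheduled)
        self.last_scheduled = scheduled
        return scheduled
```

Five requests arriving together with `capacity=3` and `interval=1.0` are processed at 0, 1, 2, and 3 seconds, and the fifth is dropped: the burst is flattened into a steady output rate.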
Good engineers don't just add rate limits.
They choose the algorithm that matches the system's traffic patterns.
Self-balancing Binary Search Trees are one of the most important data structures that many of us tend to skip while doing competitive programming. But in real-world stateful systems, they're extremely useful because they keep data sorted while guaranteeing O(log n) time for insert, delete, and search.
In a normal BST, these operations take O(height) time. If the tree becomes skewed (when height == no. of elements), this can degrade to O(n). Self-balancing BSTs avoid this by automatically rebalancing the tree after insertions or deletions, keeping the height around log n while preserving all BST properties.
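The skew problem is easy to see with a plain, unbalanced BST. The sketch below uses hypothetical helper names (`Node`, `insert`, `height`, `build`) and iterative code so that a 1000-level chain does not hit Python's recursion limit.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insert with no rebalancing; returns the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                break
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                break
            node = node.right
        else:
            break  # duplicate keys are ignored
    return root


def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    best = 0
    stack = [(root, 1)] if root else []
    while stack:
        node, depth = stack.pop()
        best = max(best, depth)
        for child in (node.left, node.right):
            if child is not None:
                stack.append((child, depth + 1))
    return best


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


sorted_tree = build(range(1000))        # keys arrive in sorted order
shuffled = list(range(1000))
random.Random(0).shuffle(shuffled)
shuffled_tree = build(shuffled)         # same keys, random order
```

Inserting keys in sorted order produces a chain whose height equals the number of elements, while the same keys in random order give a much shallower tree.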
Think about this from first principles: how would you design a BST that automatically rebalances itself after insertions and deletions so that the tree always remains balanced?
Some common implementations include:
- AVL Trees
- Red-Black Trees
- B and B+ Trees (not binary trees, but also self-balancing trees that guarantee O(log n) operations)
These data structures power many real-world systems, such as:
- Databases: Index structures (like B/B+ Trees)
- Language standard libraries: e.g., ordered maps/sets (like TreeMap or std::map) are typically implemented using Red-Black Trees.
- Memory management systems: Used to track free/allocated memory blocks efficiently.
- Event scheduling systems: Like operating system schedulers that must always access the next smallest timestamped event efficiently.
These are just a few examples; there are many more practical applications, because a self-balancing tree gives you insert, delete, and search all in O(log n).
Other data structures tend to make one operation fast at the cost of another:
- Array: Fast access O(1), slow insert/delete O(n).
- Linked List: Fast insert/delete O(1) at ends, slow search O(n).
- Self-Balancing BST: All operations (search, insert, delete) O(log n), keeps data sorted.
How to Design a Distributed Cache System (Redis)
Design a highly available, low-latency in-memory data store that can scale horizontally across multiple nodes, providing sub-millisecond response times while handling massive concurrent read/write operations and automatic failover.
The system operates on a distributed architecture with hash slot partitioning. Redis Cluster splits the keyspace into 16,384 hash slots, each assigned to a primary node. When a client requests a key, the system calculates `CRC16(key) mod 16384` to determine the responsible node, enabling automatic request routing and redirection.
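The slot calculation can be sketched with Python's standard library. This assumes the CRC16 in question is the CCITT/XMODEM variant (polynomial 0x1021, initial value 0), which is what `binascii.crc_hqx` computes when given an initial value of 0. `key_slot` is a hypothetical helper name, and hash tags are ignored here for simplicity.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key):
    """Map a key to a hash slot: CRC16(key) mod 16384 (no hash-tag handling)."""
    crc = binascii.crc_hqx(key.encode("utf-8"), 0)
    return crc % HASH_SLOTS
```

A cluster-aware client would then look the slot up in its slot-to-node table to pick the node to contact.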
The platform's core consists of fault-tolerant, decentralized components:
- Primary Nodes: Hold data shards and process client requests for assigned hash slots.
- Replica Nodes: Maintain near real-time copies of primary nodes, providing read scalability and automatic failover.
- Cluster Manager: Runs on each node, using a gossip protocol to monitor health and coordinate cluster state.
- Proxy Layer: Routes client operations to the correct shards, abstracting cluster topology. (Open-source Redis Cluster has no built-in proxy; cluster-aware clients follow MOVED/ASK redirections instead.)
Behind the scenes, a robust high-availability mechanism ensures continuous operation. Nodes exchange periodic PING/PONG heartbeats; if a primary fails, one of its replicas is automatically promoted once a majority of primary nodes agree on the failure and vote for it (so more than 50% of primaries must be reachable). Sentinel provides monitoring and failover for non-clustered deployments.
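The majority rule behind failover comes down to simple arithmetic. The sketch below is not Redis code: the function names are hypothetical, and it assumes the voters are the primary nodes.

```python
def majority(total_voters):
    """Smallest number of votes that is strictly more than half."""
    return total_voters // 2 + 1


def tolerated_failures(total_voters):
    """How many voters can be lost while a majority is still reachable."""
    return total_voters - majority(total_voters)


def can_promote(votes_received, total_voters):
    """A replica is promoted only if it collects a majority of the votes."""
    return votes_received >= majority(total_voters)
```

Three voters tolerate one failure and so do four, which is why odd cluster sizes are usually preferred: the extra even node adds cost without adding fault tolerance.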
This scale demands aggressive optimization techniques. Multi-key operations require hash tags (`{user:1000}.profile`) to force keys into the same slot. Pipelining batches commands to reduce network round trips, while connection pooling ensures efficient resource reuse.
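Hash tags work by hashing only the part of the key between the first `{` and the next `}`, provided that part is non-empty. The sketch below follows that rule; `hash_tag` and `key_slot` are hypothetical helper names, and `binascii.crc_hqx` with an initial value of 0 is assumed to match the CRC16 variant used for slot assignment.

```python
import binascii


def hash_tag(key):
    """Return the substring that gets hashed for this key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end > start + 1:
            return key[start + 1:end]  # non-empty tag between the braces
    return key  # no usable tag: the whole key is hashed


def key_slot(key):
    """Slot for a key, taking hash tags into account."""
    return binascii.crc_hqx(hash_tag(key).encode("utf-8"), 0) % 16384


profile_slot = key_slot("{user:1000}.profile")
orders_slot = key_slot("{user:1000}.orders")
```

Both keys share the tag `user:1000`, so they land in the same slot and can be used together in a multi-key operation.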
Critical Design Principles: 1) Sharded Architecture with 16384 hash slots, 2) Automated Failover via replica promotion, 3) Shared-Nothing Design eliminating single points of failure, 4) Cluster Quorum requiring a majority of primaries (minimum 3, odd counts preferred), 5) Tunable Consistency prioritizing availability and low latency over strong consistency (replication is asynchronous, so acknowledged writes can be lost during a failover).
Technical Architecture Stack:
- Core Engine: C (single-threaded event loop for command execution)
- Clustering: Redis Cluster, Redis Sentinel
- Partitioning: 16384 hash slots (CRC16 of the key mod 16384), not consistent hashing
- Data Structures: Strings, Hashes, Lists, Sets, Sorted Sets, Bitmaps, HyperLogLog
- Modular Capabilities: RedisJSON, RediSearch, RedisTimeSeries, RedisAI
- Persistence: RDB snapshots, AOF logs
- Operations: Kubernetes, Docker, redis-cli --cluster (formerly redis-trib), Prometheus + Grafana
- Cloud Offerings: AWS ElastiCache, Azure Cache for Redis, Google Cloud Memorystore
Learn more in The Modern System Design Handbook: https://t.co/2LauJpfbk4
Grab the Master System Design Case Studies: https://t.co/ujlZXVdc0g
How Java's Garbage Collector Reclaims Memory
You write Java source code and create objects using the new keyword.
These objects are stored in the Heap memory managed by the Java Virtual Machine (JVM).
As your program runs, some objects become unused when no references point to them anymore.
At runtime:
→ The Garbage Collector (GC) identifies unreachable objects by analyzing reference chains starting from GC Roots (such as stack variables, static fields, and active threads).
→ Objects that are no longer reachable are marked for removal.
→ The GC removes (sweeps) those unused objects to free up heap space.
→ In generational GC, memory is divided into Young Generation and Old Generation, and objects are collected differently based on their lifespan.
→ The JVM may compact memory after collection to reduce fragmentation and improve allocation efficiency.
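The reachability idea in the steps above can be illustrated with a toy mark-and-sweep pass. This is a conceptual model only, not how any JVM collector is implemented: `Obj`, `mark`, `sweep`, and `collect` are hypothetical names, the "heap" is a plain list, and generations and compaction are left out.

```python
class Obj:
    """A toy heap object that holds references to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Mark everything reachable from the GC roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Keep marked objects, drop the rest, and reset marks for the next cycle."""
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False
    return survivors


def collect(heap, roots):
    mark(roots)
    return sweep(heap)


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)   # a -> b, reachable from the root
c.refs.append(d)   # c and d reference each other ...
d.refs.append(c)   # ... but nothing reachable points to them
heap = collect([a, b, c, d], roots=[a])
survivor_names = [obj.name for obj in heap]
```

Because collection is based on reachability from the roots rather than reference counts, the `c`/`d` cycle is reclaimed even though each object still has an incoming reference.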
The result: Java automatically reclaims unused memory, reduces memory leaks, and keeps applications running efficiently without manual memory deallocation.
Want to master Java internals and memory management in depth? Check out this ebook:
Java: The Complete Handbook
https://t.co/yZQtybLiEX
Sharding is how you scale databases.
In yesterday's stream we took a detour into when to use sharding and how it works.
Hint: do this instead of using 50 read replicas.
Nobody cares what course you bought.
Build something.
• A Rate Limiter → understand real backend control
• A Job Queue → learn async like a grown engineer
• A Mini Search Engine → indexing > tutorials
• A CLI Budget Tool → edge cases will humble you
• A Feature Flag System → think like a product dev
• A Log Parser → patterns, timestamps, real data
• A Simple Cache Layer → performance mindset
• A Cron Email Script → automation > motivation
This is how you become dangerous.
Not by watching. By shipping.
Bookmark this. Come back in 6 months.
No Experience! No Problem
Role: Data Entry
Est Salary: $17 - $27 per hour
Location: Remote
- Enter, update, and maintain data
- Verify accuracy of data
Let us know if you are interested