Ever waited an eternity for a simple query to complete? Partition pruning might be your secret weapon.
PARTITIONED TABLE & SQL QUERY
. Think of a table partitioned by year: 2022, 2023, 2024
. Run a query asking for data just from 2023
. Without optimization, all partitions scanned, wasting time
. Goal: Only read partitions that could contain relevant data
QUERY PLANNER TO THE RESCUE
. Query hits the database's Query Planner first
. Planner examines `WHERE` clause, e.g., `Year = 2023`
. Decides which partitions are worth reading
. Makes informed choices to avoid unnecessary reads
PARTITION PRUNING IN ACTION
. Planner skips irrelevant partitions like 2022 and 2024
. Only the 2023 partition is scanned
. Reduces the scanned data, minimizing I/O operations
. Significantly speeds up the query execution time
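The planner step above can be sketched in a few lines. This is a toy model, not how any particular database implements it: the `partitions` dict, `prune`, and `run_query` are hypothetical names, and each partition is assumed to hold exactly one year.

```python
# Toy model: each partition holds rows for exactly one year (assumed layout).
partitions = {
    2022: [{"year": 2022, "amount": 10}, {"year": 2022, "amount": 20}],
    2023: [{"year": 2023, "amount": 30}, {"year": 2023, "amount": 40}],
    2024: [{"year": 2024, "amount": 50}],
}


def prune(partitions, wanted_years):
    """Planner step: keep only partitions whose key can satisfy the filter."""
    return [year for year in partitions if year in wanted_years]


def run_query(partitions, wanted_years):
    """Scan only the surviving partitions; return rows and how many were read."""
    scanned = 0
    rows = []
    for year in prune(partitions, wanted_years):
        for row in partitions[year]:
            scanned += 1
            rows.append(row)
    return rows, scanned


rows, scanned = run_query(partitions, {2023})
total_rows = sum(len(p) for p in partitions.values())
print(f"scanned {scanned} of {total_rows} rows")  # scanned 2 of 5 rows
```

The filter never touches the 2022 or 2024 rows, which is the whole point: the saving comes from work that is skipped, not from work done faster.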
The tradeoff with partition pruning is between setup complexity and runtime efficiency. Properly partitioning your tables can lead to drastic speed improvements, especially with large datasets. However, bad partitioning strategies can do more harm than good, leading to unevenly distributed data and skewed workload. The trick is to choose partition keys that align with typical query patterns.
Common pitfalls in partition pruning:
1. Misaligning partition keys with query patterns. If the WHERE clause doesn't filter on the partition key, the planner can't prune and every partition is scanned.
2. Over-partitioning. Too many partitions can increase overhead, leading to slower queries instead of faster ones.
3. Ignoring partition statistics. Outdated statistics mislead the planner, resulting in poor pruning decisions.
4. Failing to update partitions. As data grows, partitions must evolve to stay optimal.
5. Over-reliance on partition pruning. It doesn't replace the need for good indexes and query tuning.
Think of partition pruning like using a well-organized filing cabinet. Instead of rifling through every drawer, you go straight to the one labeled with the correct year. This reduces the workload and makes operations faster.
Bookmark this for when your database queries start taking longer than your coffee breaks.
Ever wonder why consistent hashing is a cornerstone in distributed systems?
It's not just about hashing. It's about preventing a single point of failure from becoming a single point of overload.
CONSISTENT HASHING
. Servers placed on a conceptual ring, each assigned a hash
. Data keys hashed and placed on the ring
. Data stored on the next server clockwise from its key's position
. Avoids complete data relocation if a server joins or leaves
PROBLEM WITH A BASIC RING (NO VIRTUAL NODES)
. Loss of a server shifts its entire load to the next one
. Hotspots form, overloading that single server
. System stability compromised under server churn
. No smooth transition of load among servers
VIRTUAL NODES
. Physical server gets multiple positions (virtual nodes) on the ring
. Load distributed across these virtual nodes
. Failure of a server spreads load across multiple servers
. Minimizes disruption, achieves load balancing
The tradeoff here is clear: while consistent hashing with virtual nodes introduces complexity, it dramatically enhances resilience. Systems experience graceful degradation rather than catastrophic failure, allowing for smoother scalability and reliability.
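A minimal sketch of a ring with virtual nodes, to make the failure behavior concrete. The `HashRing` class, the server names, and the choice of 100 virtual nodes per server are all assumptions for illustration; MD5 is used only to spread positions around the ring, not for security.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(value):
    # Stable across runs, unlike Python's built-in hash() for strings.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each physical server gets `vnodes` positions on the ring.
        self._points = sorted(
            (ring_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def owner(self, key):
        # First position clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._hashes, ring_hash(key)) % len(self._points)
        return self._points[idx][1]


keys = [f"key-{i}" for i in range(10_000)]
before = HashRing(["s0", "s1", "s2", "s3"])
after = HashRing(["s0", "s1", "s2"])  # s3 has failed

# Which servers inherited the keys that used to live on s3?
inherited = Counter(after.owner(k) for k in keys if before.owner(k) == "s3")
print(inherited)
```

Keys that were not on s3 stay exactly where they were, and s3's share is split among the survivors instead of landing on a single neighbor.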
Common pitfalls include:
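1. Not using enough virtual nodes. With only a handful of positions per server, arc sizes vary widely and load is uneven. More virtual nodes per server smooth the distribution, at the cost of a larger ring to store and search.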
2. Ignoring network topology. Virtual nodes should consider physical proximity to reduce latency and bandwidth costs.
3. Treating all data equally. Some data might need more redundancy or faster access. Adjust replication strategies accordingly.
4. Overlooking key distribution. Hash functions should evenly distribute keys to prevent clustering.
5. Failing to monitor node health. Regularly check node availability to preemptively handle failures.
Consistent hashing is not just a method; it's a mindset. It asks you to think in terms of rings and nodes, not just servers. To design systems that can gracefully suffer inevitable failures, consider how your data might dance around the ring.
Bookmark for when your distributed system doesn't scale as planned.
Hot partitions aren't just bugs; they're signs of your system's success. But what do you do when one shard melts under pressure?
HOT PARTITION
. Partitioned system works until one key dominates
. All requests for dominating key hash to same partition
. Partition becomes the bottleneck, latency spikes
. Cluster is underutilized, but this partition's overloaded
WHY PARTITIONING FAILS
. Hash partitioning spreads keys evenly; it does not spread traffic evenly
. It guarantees ordering per key, not protection from skew
. One dominant key means one overloaded partition
The architecture of partitioned systems thrives on load distribution by key, maintaining order per key. However, it doesn't inherently balance traffic across partitions. When a single key dominates, that shard bears the brunt, causing localized overload. Fixing requires spreading, shielding, or smoothing the load.
1. Salt the key. Instead of "user#42", use "user#42#0" ... "user#42#N". This spreads load across multiple partitions, preventing any single shard from melting. Trade-off: reads must aggregate across every sub-key, and strict per-key ordering is lost because writes for the same logical key now land on different partitions.
2. Split or merge partitions. Detect a hot partition and split it into multiple or redistribute part of its key range. This approach is operationally complex, involving rebalancing and routing updates, and is best for persistent hotness.
3. Cache the hottest keys. Deploy a cache to handle reads for hot keys, shielding the backend from read storms. Fastest win for read-heavy hot keys. Trade-off: Cache invalidation can lead to possible staleness.
4. Buffer or batch writes. Instead of writing every event, batch writes per key or partition. Smooths out spikes but introduces higher latency and eventual consistency. Useful for write-heavy hot keys.
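Key salting (fix 1) is the easiest of these to show in code. A rough sketch, assuming 8 partitions and 4 salt buckets; `partition_for`, `salted_key`, and `read_keys` are hypothetical helpers, not any store's API.

```python
import hashlib
import random
from collections import Counter

NUM_PARTITIONS = 8  # assumed cluster size
SALT_BUCKETS = 4    # assumed number of sub-keys per hot key


def partition_for(key):
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def salted_key(key, rng):
    # Writers pick a random suffix, so one hot key becomes several sub-keys.
    return f"{key}#{rng.randrange(SALT_BUCKETS)}"


def read_keys(key):
    # Readers must fan out to every sub-key and merge the results.
    return [f"{key}#{i}" for i in range(SALT_BUCKETS)]


rng = random.Random(42)
unsalted = Counter(partition_for("user#42") for _ in range(1000))
salted = Counter(partition_for(salted_key("user#42", rng)) for _ in range(1000))
print("unsalted:", dict(unsalted))
print("salted:  ", dict(salted))
print("read fan-out:", read_keys("user#42"))
```

Unsalted, all 1000 writes hit one partition. Salted, they spread over at most `SALT_BUCKETS` partitions (fewer if two sub-keys happen to hash to the same one), and every read pays for the fan-out.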
When you see a hot shard in production, your move should depend on the workload. Sometimes the simplest fix is the right one, but persistent issues might require more involved solutions.
Remember: hot keys lead to hot partitions, resulting in localized overload. Your job is to spread, shield, or smooth the load. Save this for when your partitions start smoking.
Everyone knows ACID, but what does each letter truly defend against?
ACID isn't just a catchy acronym. It's a framework defining four distinct failure boundaries essential for reliable databases.
ATOMICITY
. All-or-nothing principle in transactions
. Groups multiple changes into a single unit
. Ensures full commit or rollback, never partial
. Protects against half-completed transactions during failures
CONSISTENCY
. Enforces database invariants and rules
. Guarantees all committed states are valid
. Examples: Foreign keys, unique constraints, non-negative balances
. Prevents rule violations despite application errors
ISOLATION
. At the strictest level (serializable), transactions behave as if executed one at a time
. Weaker levels trade that away and may allow dirty, non-repeatable, or phantom reads
. Isolation levels determine permissible anomalies
. Shields one transaction's mess from another's view
DURABILITY
. Once committed, data persists despite failures
. Uses write-ahead logging (WAL), flushes to stable storage, and replicas for recovery
. Survives process or machine crashes
. Guarantees data remains intact post-commit
ACID provides a mental model where each component addresses a specific failure class: Atomicity ensures in-transaction crash safety, Consistency maintains correctness, Isolation offers concurrency control, and Durability secures crash recovery. These properties aren't synonymous with slowness or obsolescence but are inherent even in modern, distributed systems, albeit implemented differently.
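Atomicity and consistency are easy to see with SQLite from Python's standard library. A small sketch, assuming a hypothetical `accounts` table with a non-negative balance rule; the `transfer` helper is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # invariant: no negative balances
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically; returns True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # valid transfer
failed = transfer(conn, "alice", "bob", 500)  # would push alice below zero
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)  # True False {'alice': 70, 'bob': 30}
```

In the failed transfer, bob's credit is applied first and then undone by the rollback. The CHECK constraint catches the rule violation (consistency), and the transaction boundary makes sure no half-transfer survives (atomicity).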
Common pitfalls arise when practitioners misinterpret or overlook these properties:
1. Misunderstanding atomicity as requiring synchronous execution. It's about transaction boundaries, not speed.
2. Equating database consistency with CAP consistency. They're different: ACID consistency means committed data obeys the database's rules, while CAP consistency means every read sees the most recent write across replicas (linearizability).
3. Neglecting isolation levels, leading to unintended concurrency issues. Choose and configure levels based on application needs.
4. Assuming durability guarantees without proper storage configuration. Durability depends on reliable hardware and software setup.
5. Believing ACID guarantees are irrelevant in distributed systems. Most modern systems still rely on these principles, adapting them for scale.
ACID isn't a relic of old databases; it's the bedrock of data integrity. Share this with anyone who thinks ACID is outdated or only for single-node systems.
Most devs know REST, but can you name all 8 API styles and when to use each?
API STYLES OVERVIEW
. REST: Stateless, resource-based, uses HTTP methods like GET, POST
. GraphQL: Flexible queries, single endpoint, client defines data structure
. gRPC: Binary protocol, HTTP/2, supports streaming, strong typing
. SOAP: XML-based, protocol-heavy, built-in error handling
. WebSockets: Full-duplex communication, persistent connection, real-time updates
. RPC: Function call abstraction, simple, tightly-coupled systems
. HATEOAS: Hypermedia-driven constraint of REST, guides client through application state
. CQRS: Architectural pattern that segregates read/write paths, commands for writes, queries for reads
Each style carries tradeoffs between flexibility, complexity, and performance. REST is ubiquitous but can over-fetch or under-fetch, forcing extra round trips for simple screens. GraphQL offers flexibility but pushes complexity to the server, where query cost and caching get harder to control. gRPC provides efficiency in microservices but requires tighter client/server integration through shared schemas and generated code.
Common pitfalls include:
1. Using REST for everything. REST isn't one-size-fits-all. For real-time, consider WebSockets.
2. Neglecting data versioning in GraphQL. Changes can break clients. Schema evolution is key.
3. Misusing gRPC in high-latency environments. Its binary protocol is efficient but can struggle over unreliable networks.
4. Over-engineering with SOAP for simple tasks. Its verbosity might add unwarranted complexity.
5. Ignoring client coupling with RPC. Simple interactions can become tightly bound, affecting scalability.
6. Skipping state navigation in HATEOAS. Hypermedia controls require careful design to guide clients effectively.
7. Misapplying CQRS in systems without complex read/write needs. Sometimes a simple CRUD suffices.
8. Underestimating the learning curve of GraphQL and gRPC. Both demand upfront learning and infrastructure investment.
Understanding when to apply each style is key to architecting scalable, maintainable systems. Match your API style to your system's specific needs and constraints.
Share this with the dev who's stuck in REST land.
Access tokens and refresh tokens are misunderstood more often than you'd think. They each have distinct roles, and mixing them up can lead to security gaps.
ACCESS TOKEN
. Short-lived (5 to 15 minutes), limits potential damage if exposed
. Sent in Authorization header with every API request
. Stateless when issued as a signed token (e.g., a JWT): validated by signature without DB lookup
. Exposure impact limited by rapid expiration
REFRESH TOKEN
. Long-lived (days to months)
. Only sent to auth server, not to APIs
. Stateful, must be stored in a DB for revocation
. Sole job: exchange for new access tokens when needed
The architecture balances two needs: minimal overhead per API call, and limited access in case of credential leaks. Short-lived access tokens offer quick validation and cut down unauthorized access time. Refresh tokens allow long-term sessions without frequent user interruptions.
Common pitfalls:
1. Sending refresh tokens to API endpoints. This exposes long-lived credentials. They should only be exchanged at /auth/refresh.
2. Storing access tokens in localStorage. Susceptible to XSS attacks. Use httpOnly cookies or in-memory storage.
3. Handling refresh tokens like access tokens. If you can't revoke a session by deleting a DB entry, you've got it wrong.
4. Ignoring refresh token rotation. Proper rotation invalidates old tokens upon new ones being issued, mitigating replay attacks.
5. Skipping reuse detection. Presenting a refresh token twice signals a potential breach. Invalidate the session if detected.
Misunderstanding these roles can lead to persistent unauthorized access. Know the boundaries: access tokens are fast and disposable; refresh tokens are durable and revocable.
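Rotation and reuse detection (pitfalls 4 and 5) can be sketched with an in-memory store. This is a simplified illustration, not a full OAuth implementation: `RefreshTokenStore` and its methods are hypothetical, and a real system would persist tokens (ideally hashed) in a database.

```python
import secrets


class RefreshTokenStore:
    """In-memory stand-in for the DB table that backs refresh tokens."""

    def __init__(self):
        self._active = {}   # token -> session id
        self._retired = {}  # rotated-out token -> session id

    def issue(self, session_id):
        token = secrets.token_urlsafe(32)
        self._active[token] = session_id
        return token

    def revoke_session(self, session_id):
        # Deleting the stored entries is what makes the session revocable.
        self._active = {t: s for t, s in self._active.items() if s != session_id}

    def rotate(self, token):
        """Exchange a refresh token for a new one; returns None if rejected."""
        if token in self._retired:
            # An already-rotated token came back: treat it as theft.
            self.revoke_session(self._retired[token])
            return None
        session_id = self._active.pop(token, None)
        if session_id is None:
            return None
        self._retired[token] = session_id
        return self.issue(session_id)


store = RefreshTokenStore()
first = store.issue("session-1")
second = store.rotate(first)         # normal refresh: old token retired
replay = store.rotate(first)         # old token presented again
after_breach = store.rotate(second)  # the whole session is now dead
print(second is not None, replay, after_breach)  # True None None
```

Once the old token is replayed, even the legitimate newest token stops working. That is intentional: the server can't tell which party is the attacker, so the session ends and the user logs in again.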
Share this with a dev who's struggling to draw the OAuth flowchart in their mind.
Adding a server shouldn't nuke your cache. So why does it happen so often in naive setups?
The issue lies in how we traditionally hash. When using hash(key) % N, any change in N reshuffles most keys. This results in cache misses, hot shards, and a painful re-balance.
NAIVE HASHING
. hash(key) % N where N is the number of servers
. Change in N remaps many keys
. Leads to significant cache misses
. Causes hotspots and uneven load distribution
CONSISTENT HASHING
. Hash servers onto a ring structure
. Hash keys onto the same ring
. Key is owned by the next clockwise server
. Node addition/removal only affects adjacent keys
Consistent hashing solves the remapping problem by limiting how many keys move. As nodes come and go, only the keys in the affected arc need remapping (roughly 1/N of them on average), minimizing disruption. This method is vital for scaling services like CDNs, distributed caches, and sharded databases.
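To put numbers on the difference, here is a small experiment that grows a cluster from 4 to 5 servers under both schemes. The helper names, server names, and the 100 virtual nodes per server are assumptions. In theory modulo hashing moves about 80% of keys for this change, while the ring moves about one fifth; exact figures depend on the hash values.

```python
import bisect
import hashlib


def h(value):
    # Stable hash; Python's built-in hash() is randomized per process for strings.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def modulo_owner(key, servers):
    return servers[h(key) % len(servers)]


def build_ring(servers, vnodes=100):
    return sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))


def ring_owner(key, ring):
    # Next position clockwise from the key's hash, wrapping around.
    idx = bisect.bisect_right(ring, (h(key), "")) % len(ring)
    return ring[idx][1]


old = ["s0", "s1", "s2", "s3"]
new = old + ["s4"]
keys = [f"key-{i}" for i in range(10_000)]

moved_modulo = sum(modulo_owner(k, old) != modulo_owner(k, new) for k in keys)

old_ring, new_ring = build_ring(old), build_ring(new)
moved_keys = [k for k in keys if ring_owner(k, old_ring) != ring_owner(k, new_ring)]

print(f"modulo: {moved_modulo / len(keys):.0%} of keys moved")
print(f"ring:   {len(moved_keys) / len(keys):.0%} of keys moved")
```

Every key that moves on the ring moves to the new server, so existing servers never trade keys with each other. Under modulo hashing, keys shuffle between servers that didn't change at all.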
Where teams stumble with consistent hashing:
1. Forgetting virtual nodes. Without them, uneven distribution leads to hotspots. Virtual nodes smooth out the distribution, ensuring load balance.
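2. Choosing a poor hash function. A weak choice can result in clustering or uneven spread. Use a function with good distribution: fast non-cryptographic hashes like MurmurHash or xxHash are common, and MD5 or SHA-1 also work here because the goal is uniformity, not security.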
3. Overcomplicating node addition/removal logic. Keep it simple: only adjust the immediate slice. Complexity leads to errors and unexpected behaviors.
4. Ignoring the impact of node failure. If a node goes down, have a strategy to quickly redistribute its keys without full reshuffle.
5. Overlooking monitoring and metrics. Track key distribution and node load actively to catch imbalances early.
Consistent hashing is a must-know for anyone building distributed systems. It offers a graceful way to handle scale without the chaos. Bookmark this for when your cache needs to grow without the growing pains.
Think handling high concurrency is just about adding more servers? Not even close. Traffic spikes change load dynamics, and your architecture needs layers of defense to cope.
FRONT-DOOR PROTECTION
. Load balancers spread requests, preventing node overload
. Enables rolling deploys and autoscaling
. Rate limiting caps request rate to prevent system overload
. Circuit breakers handle downstream failures, failing fast to avoid cascading timeouts
WORK DEFERRAL
. Asynchronous processing offloads non-critical tasks
. Moves work like notifications, emails, and image processing off the main path
. Results in faster p99 latency, happier users
. Load leveling through peak shaving and valley filling stabilizes throughput, avoiding overload during spikes
DATA SHARDING
. Distributes data across shards to scale writes and storage
. Horizontal partitioning splits data by user, region, etc.
. Vertical partitioning separates different data types or tables
. Reduces single database bottlenecks, improves resilience
High concurrency isn't solved with a single technique. It requires a multi-layered approach combining protection, deferral, and scaling strategies. Each layer addresses specific challenges, and together they ensure system stability under peak loads.
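Rate limiting is the simplest of these layers to sketch. Below is a token bucket, one common way to cap request rate while still allowing short bursts. The `TokenBucket` class is hypothetical, and time is passed in explicitly so the example is deterministic; a real limiter would read a monotonic clock.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, then a steady `rate` requests per second."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a caller would typically answer with HTTP 429


bucket = TokenBucket(rate=2, capacity=5)
burst = [bucket.allow(now=0.0) for _ in range(7)]  # 7 requests at t=0
later = [bucket.allow(now=1.0) for _ in range(3)]  # 3 requests one second later
print(burst)  # [True, True, True, True, True, False, False]
print(later)  # [True, True, False]
```

The burst of 7 gets 5 through and sheds 2. A second later the bucket has refilled by 2 tokens, so 2 of the next 3 requests pass.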
Common pitfalls:
1. Ignoring rate limiting. Without it, a flood of requests can overwhelm your system, leading to degraded performance or downtime.
2. Misconfiguring circuit breakers. Set thresholds too high, and your system might crash before activating. Too low, and you might unnecessarily block healthy services.
3. Over-relying on cache. A cache miss can still hit your database hard if it's not well-tuned to handle spikes in demand.
4. Incorrect sharding strategy. Poor partition keys can lead to unbalanced loads, where one shard becomes a bottleneck.
5. Not accounting for spike duration. Short bursts might just need load leveling, but sustained high traffic requires more robust scaling strategies.
When the load spikes, it's not just about more servers. It's about intelligent design that anticipates and mitigates stress. Save this for your next system design interview or when you're up against a sudden traffic flood.
Elasticsearch feels like magic until you realize it's all about two distinct paths: indexing and searching. Let's break them down.
INDEXING PATH
. JSON document is the starting point
. Analyzers break text into terms, ready for indexing
. Terms go into inverted index (term → list of docs/positions)
. Distributed across primary shards and replicas for scale and availability
SEARCH PATH
. Client sends a query in Query DSL to a coordinating node
. Coordinator distributes query to relevant shards (primary or replica)
. Each shard returns its top-K results
. Coordinator merges the per-shard hits, re-ranks them globally, and returns the final top-K
These paths highlight a fundamental tradeoff in Elasticsearch: balancing write and read operations. Indexing needs to be efficient to handle large data volumes quickly, while searching must retrieve and rank results accurately and promptly. The shard mechanism supports scaling but isn't a free lunch. Too many shards can lead to overhead, while too few can bottleneck performance.
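Both paths fit in a toy model. This sketch is nothing like Elasticsearch's real internals: the `Shard` class, the whitespace analyzer, the term-count scoring (real engines use BM25), and routing by `doc_id % 2` are all simplifying assumptions.

```python
import heapq
from collections import defaultdict


def analyze(text):
    # Stand-in analyzer: lowercase and split on whitespace.
    return text.lower().split()


class Shard:
    def __init__(self):
        self.inverted = defaultdict(set)  # term -> ids of docs containing it

    def index(self, doc_id, text):
        for term in analyze(text):
            self.inverted[term].add(doc_id)

    def search(self, query, k):
        # Score = number of query terms the doc contains.
        scores = defaultdict(int)
        for term in analyze(query):
            for doc_id in self.inverted.get(term, ()):
                scores[doc_id] += 1
        return heapq.nlargest(k, ((s, d) for d, s in scores.items()))


def coordinate(shards, query, k):
    # Scatter to every shard, gather each local top-k, keep the global top-k.
    partial = [hit for shard in shards for hit in shard.search(query, k)]
    return [doc_id for _, doc_id in heapq.nlargest(k, partial)]


docs = {
    1: "fast search engine",
    2: "distributed search",
    3: "fast distributed search engine",
    4: "coffee break",
}
shards = [Shard(), Shard()]
for doc_id, text in docs.items():
    shards[doc_id % len(shards)].index(doc_id, text)

result = coordinate(shards, "fast distributed search", k=2)
print(result)  # [3, 2]
```

Each shard only needs to return its own top-K, because a document outside a shard's local top-K can never make the global top-K. That is what keeps the fan-out cheap.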
Common pitfalls:
1. Over-sharding. More shards mean more overhead. Aim for a balance that matches your data size and query rate.
2. Ignoring shard replication. It ensures availability and reliability. Skipping it risks data loss during failures.
3. Misunderstanding the coordinator's role. It's the linchpin during high QPS. Monitor CPU, heap, and fan-out costs to avoid bottlenecks.
4. Overlooking query complexity. Complex queries increase latency. Simplify or precompute when possible.
5. Neglecting shard rebalancing. Data distribution changes; rebalancing avoids hot shards and helps maintain performance.
Remember, Elasticsearch's power lies in how you configure and monitor these paths. The index-write and search-read flows are distinct but interdependent. Misconfiguring one often impacts the other, creating unexpected bottlenecks and inefficiencies.
Bookmark this for when Elasticsearch's magic feels more like a black box.
Serverless architecture sounds like magic: no servers, no problems, right?
Not quite. "Serverless" means your architecture is the glue between managed services, not that you’re free from architectural concerns. Misunderstanding this leads to operational headaches.
SERVERLESS COMPONENTS
. Clients (web/mobile) hit an API
. API publishes work as events (async, not inline)
. Queue buffers spikes, protecting downstream systems
. Serverless functions scale horizontally to process messages
. Data storage receives results, can trigger more events
ARCHITECTURAL BENEFITS
. Reliability: async + retries > fragile synchronous chains
. Scalability: queue absorbs bursts, functions scale independently
. Cost control: pay-per-use, rate-limit at the queue
It's tempting to think "serverless" means you don't worry about infrastructure. In reality, the architecture demands different concerns like idempotency, dead letter queues (DLQs), and observability. They're critical for maintaining the reliability and scalability serverless promises.
Common pitfalls in serverless design:
1. Overlooking idempotency. Duplicates will happen, especially during retries. Every function and API call must handle duplicates gracefully.
2. Ignoring DLQ and replay strategy. Failures will occur. Without a DLQ, you can't capture failed messages for retries or analysis.
3. Skipping observability. Tracing across API → queue → functions is non-negotiable. Without it, debugging becomes a nightmare.
4. Mismanaging backpressure limits. Concurrency caps, batch sizes, and timeouts are essential to avoid overwhelming any part of your system.
5. Neglecting async vs. sync tradeoffs. Determine where immediate feedback is needed (sync) versus where resilience and scalability are prioritized (async).
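Pitfalls 1 and 2 often get solved together. A rough sketch of an idempotent handler with a retry budget and a dead letter queue, using in-memory stand-ins; the message shape, `MAX_ATTEMPTS`, and all names are assumptions rather than any cloud provider's API.

```python
MAX_ATTEMPTS = 3  # assumed retry budget before a message is parked

processed_ids = set()  # stand-in for a durable dedupe table
ledger = []            # stand-in for the real side effect
dead_letters = []      # stand-in for a DLQ


def handle(message):
    """Idempotent handler: a redelivered message is acknowledged, not re-applied."""
    if message["id"] in processed_ids:
        return
    if message["amount"] < 0:
        raise ValueError("invalid amount")
    ledger.append(message["amount"])
    processed_ids.add(message["id"])


def consume(queue):
    for message in queue:
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                handle(message)
                break
            except ValueError:
                if attempt == MAX_ATTEMPTS:
                    dead_letters.append(message)  # keep it for analysis and replay


queue = [
    {"id": "m1", "amount": 10},
    {"id": "m1", "amount": 10},  # duplicate delivery of m1
    {"id": "m2", "amount": -5},  # poison message that always fails
    {"id": "m3", "amount": 7},
]
consume(queue)
dead_ids = [m["id"] for m in dead_letters]
print(ledger, dead_ids)  # [10, 7] ['m2']
```

The duplicate is applied once, the poison message is parked after three attempts instead of blocking the queue, and m3 still gets processed.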
Serverless isn't a panacea. It requires careful design choices and an understanding of how managed services interact. Missteps in these areas lead to failed promises of scalability and cost control. Save this for when you're optimizing your serverless system at 3am.
If microservices fail, it's not because they're inherently flawed. It's because teams ignore the mundane but critical rules that keep them working as intended.
SEPARATE DATA STORES
. Each service should own its data store exclusively
. Sharing a DB? You've coupled services, defeating microservices' core benefit
. Allows for independent scaling and schema evolution
. Reduces blast radius of data corruption or breach
SINGLE RESPONSIBILITY
. Service should do one thing well
. Monolithic services with multiple responsibilities = complexity and latency
. Easier to test, deploy, and understand
. Simple services adapt better to changing requirements
STATELESS SERVERS
. Persist state in databases or caches, not local disk
. Enables horizontal scaling: replace nodes without data loss
. Stateless servers reduce "works on my machine" issues
. Facilitates auto-scaling and failover
CONTAINERS
. Portable, consistent environment for microservices
. Simplifies deployment and rollback
. Abstracts OS-level dependencies, enhancing CI/CD
. Supports microservices' need for fast, frequent releases
The real challenge isn't applying one rule; it's integrating all of them into your process. The benefits of microservices (agility, scalability, fault isolation) only emerge when the whole architecture aligns. Each decision impacts the others: shared data stores undermine single responsibility, stateful designs make containers hard to replace, and so on.
Common pitfalls teams encounter:
1. Ignoring data store separation. Centralized DBs lead to bottlenecks and complex migrations. Each service should own its data.
2. Overloading services with multiple roles. This leads to tangled dependencies and increases failure points. Stick to single responsibility.
3. Forgetting server statelessness. A misplaced session state can bring down auto-scaling. Use external caches or databases for state.
4. Deploying without containers. Manual deployments introduce inconsistency. Containers standardize environments and streamline updates.
5. Misjudging when to use micro frontends. They enable UI independence but add complexity if adopted prematurely. Evaluate your team's readiness first.
The hardest part isn't adopting microservices; it's maintaining them as your teams, objectives, and technology stack evolve. Consistency is key. Share with the dev who thinks microservices are just "small services."
Scaling Kafka isn't magic. It's math and coordination. Get one wrong and you're debugging at 2am.
PARTITIONS
. Partitions set the ceiling on parallelism within a consumer group
. Each partition is read by at most one consumer per group at any time
. More partitions raise the ceiling; consumers beyond the partition count sit idle
. Rebalances can pause the whole group (stop-the-world) under eager assignment; cooperative rebalancing shortens the pause
CONSUMER GROUPS
. Consume from topics in parallel, spreading load across partitions
. Each group gets its own offsets, independent of others
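. Scale throughput by adding consumers to a group, up to the partition count; adding groups only adds independent readers of the same data
. When a consumer joins, leaves, or fails, the group rebalances its partition assignments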
OFFSET COMMITS
. Enable resume after failure from the last committed point
. Should be frequent to minimize data reprocessing
. Consistency issues if not managed correctly (e.g., consumer crash)
. Crucial for maintaining data integrity across failures
The tradeoffs in scaling Kafka revolve around understanding these elements' interplay. Increasing partitions raises throughput but complicates management and storage. Scaling consumers without partitions leads to idle resources. Effective offset management ensures reliability but adds overhead. Each design decision impacts latency, fault tolerance, and cost.
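The "idle resources" point is just arithmetic. A simplified sketch of assignment within one consumer group; `assign` is a hypothetical round-robin helper, and Kafka's real assignors (range, sticky, cooperative) distribute partitions differently but hit the same ceiling.

```python
def assign(partitions, consumers):
    """Round-robin assignment within one consumer group (simplified)."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


partitions = [0, 1, 2, 3]
small_group = assign(partitions, ["c1", "c2"])
big_group = assign(partitions, ["c1", "c2", "c3", "c4", "c5", "c6"])
idle = [c for c, ps in big_group.items() if not ps]

print(small_group)  # {'c1': [0, 2], 'c2': [1, 3]}
print(idle)         # ['c5', 'c6']
```

With 4 partitions, the fifth and sixth consumers get nothing to read. They cost money and add rebalance churn without adding throughput.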
Common pitfalls include:
1. Increasing consumers without adding partitions. Once consumers in a group outnumber partitions, the extras sit idle, not "faster". Add partitions first if you need more parallelism.
2. Mismanaging rebalance strategy. Poor planning here results in visible latency spikes during reassignments. Avoid unnecessary rebalances with careful partition design.
3. Ignoring offset commit frequency. Too infrequent, and you risk large reprocessing windows. Too frequent, and you incur overhead. Balance is key.
4. Treating partitions as interchangeable. Each has a unique role in load distribution and fault tolerance. Design it right or face uneven load.
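5. Assuming more consumer groups always improves performance. Each group reads the full stream independently, so extra groups multiply reads rather than divide the work. Throughput for one workload scales with consumers inside a group, capped by partition count.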
Kafka "feels unreliable" when partition design is wrong. Correct the design, and reliability follows. Bookmark for when your Kafka cluster needs a midnight miracle.
Two-Phase Commit seems straightforward until a critical failure strikes. Why do so many systems shy away from 2PC despite its promise of atomicity?
PHASE 1: PREPARE
. Client requests commit from the coordinator
. Coordinator issues PREPARE to all participants
. Participants log intent, lock resources, and respond VOTE-YES or VOTE-NO
. No participant commits until a global decision is reached
PHASE 2: COMMIT / ABORT
. All YES votes → coordinator logs COMMIT
. Coordinator instructs participants to COMMIT or ABORT
. Participants finalize operations, release locks upon successful commit
The tradeoff with 2PC is its inherent blocking risk. If the coordinator crashes after participants vote YES but before they learn the decision, the system freezes. Prepared participants must hold their locks until the coordinator recovers, because they cannot safely commit or abort on their own. The blocking problem underscores why many systems avoid 2PC for high-availability scenarios.
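The two phases and the blocking case can be simulated in a few lines. This is a single-process sketch with no real logging, locking, or networking; `Participant`, `two_phase_commit`, and the `crash_before_phase_2` flag are hypothetical names used only to show the state each node ends up in.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        # Log intent and lock resources, then vote.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def finish(self, decision):
        self.state = decision  # "COMMITTED" or "ABORTED"; locks are released here


def two_phase_commit(participants, crash_before_phase_2=False):
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    decision = "COMMITTED" if all(votes) else "ABORTED"
    if crash_before_phase_2:
        return None  # coordinator dies; nobody learns the decision
    for p in participants:  # phase 2: broadcast the decision
        p.finish(decision)
    return decision


happy = [Participant("orders"), Participant("payments")]
happy_result = two_phase_commit(happy)

vetoed = [Participant("orders"), Participant("payments", can_commit=False)]
vetoed_result = two_phase_commit(vetoed)

stuck = [Participant("orders"), Participant("payments")]
stuck_result = two_phase_commit(stuck, crash_before_phase_2=True)

print(happy_result, [p.state for p in happy])    # COMMITTED ['COMMITTED', 'COMMITTED']
print(vetoed_result, [p.state for p in vetoed])  # ABORTED ['ABORTED', 'ABORTED']
print(stuck_result, [p.state for p in stuck])    # None ['PREPARED', 'PREPARED']
```

The last case is the blocking problem: both participants are stuck in PREPARED, holding locks, with no safe way to move forward on their own.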
Where teams misunderstand 2PC:
1. Assuming participants can decide independently post-crash. They can't. Without the coordinator's directive, you're stuck.
2. Ignoring the impact of long-running transactions. 2PC's blocking nature exacerbates throughput issues, especially under load.
3. Overlooking network partitions. In distributed environments, network failures can mimic coordinator crashes, triggering the blocking problem.
4. Misconfiguring timeouts. Too short, and aborts happen prematurely. Too long, and resources remain locked, killing availability.
5. Not considering alternatives. Sagas and consensus protocols (Raft/Paxos) can offer better resiliency and performance in distributed systems requiring availability.
2PC is the go-to for strict atomicity but trades off availability. When designing systems, the choice between 2PC and alternatives hinges on your priority: atomicity or availability.
Share with the dev who's convinced every distributed system needs strict consistency.
Think Kafka durability is just on or off? It's a nuanced dance between latency and safety, and understanding it can save you from data loss disasters.
IN-SYNC REPLICAS (ISR)
. Set of replicas (the leader plus followers) that are caught up with the leader
. If leader fails, an in-sync follower becomes new leader
. More ISRs mean better durability, but slower writes
. ISR changes as followers fall behind or catch up
HIGH WATERMARK (HW)
. Offset of last message replicated across all ISRs
. Consumers read only up to HW, ensuring data consistency
. Advances only when all ISRs have replicated new data
. Acts as a safety boundary for committed and visible data
ACKS
. acks=1: Leader writes, sends ack to producer
. acks=all: Waits for all ISRs before acking
. acks=1: Low latency, high risk of data loss on leader failure
. acks=all: Higher latency, data more resilient to leader crashes
Kafka's durability settings revolve around a tradeoff: quick acknowledgments versus data safety. Choosing acks=1 offers speed, but risks losing messages if the leader crashes before followers sync. With acks=all, you secure your data at the cost of increased latency as you wait for all ISRs to confirm.
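The relationship between ISR, high watermark, and acks is easier to see with numbers. A simplified model, not broker code: `high_watermark` and `ack_ready` are hypothetical helpers, and each replica is reduced to its log end offset (the offset of the next record it will write).

```python
def high_watermark(log_end_offsets, isr):
    """HW is the smallest log end offset among the in-sync replicas."""
    return min(log_end_offsets[r] for r in isr)


def ack_ready(acks, offset, log_end_offsets, isr, leader):
    """Can the leader acknowledge the record at `offset` to the producer?"""
    if acks == "1":
        return log_end_offsets[leader] > offset
    if acks == "all":
        return all(log_end_offsets[r] > offset for r in isr)
    raise ValueError(f"unsupported acks value: {acks}")


leo = {"broker-1": 10, "broker-2": 8, "broker-3": 5}
leader = "broker-1"
isr_all = {"broker-1", "broker-2", "broker-3"}
isr_shrunk = {"broker-1", "broker-2"}  # broker-3 fell too far behind

hw_all = high_watermark(leo, isr_all)
hw_shrunk = high_watermark(leo, isr_shrunk)
fast_ack = ack_ready("1", 9, leo, isr_all, leader)
safe_ack = ack_ready("all", 9, leo, isr_all, leader)

print(hw_all, hw_shrunk)   # 5 8
print(fast_ack, safe_ack)  # True False
```

The record at offset 9 exists only on the leader. With acks=1 the producer already got its ack, so a leader crash at this moment loses it. With acks=all the producer is still waiting, which is the latency you pay for safety.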
Common pitfalls and lessons:
1. Misconfiguring acks for critical data. Don't risk payment transactions with acks=1; they're safer with acks=all.
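2. Losing messages due to insufficient ISR members. If the ISR shrinks to just the leader, acks=all is satisfied by the leader alone, and losing that broker loses the data. Set min.insync.replicas to 2 or more so such writes are rejected instead of silently under-replicated.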
3. Ignoring HW in monitoring. Consumers stuck behind HW indicate followers aren't keeping up, risking data availability.
4. Overloading ISRs. Too many ISRs may lead to higher replication lag, compromising throughput for durability.
5. Failing to adjust configurations under load. As traffic spikes, reevaluating ISR and acks settings prevents bottlenecks.
For logging or analytics, acks=1 suffices. For financial transactions or critical state changes, opt for acks=all. The right configuration depends on your data's value and the cost of potential loss.
Bookmark this to revisit when you're balancing speed and safety in your Kafka setup.
Two-Phase Commit (2PC) blocks, so just add a phase, right? Enter Three-Phase Commit (3PC). But don’t get too comfortable: real networks don’t care about your assumptions.
TWO-PHASE COMMIT (2PC)
. Simple, blocking protocol
. Participants vote YES, then wait indefinitely if coordinator crashes
. Preserves consistency, sacrifices availability
. Blocking can lock up resources indefinitely
THREE-PHASE COMMIT (3PC)
. Adds PreCommit phase to avoid indefinite blocking
. CanCommit → PreCommit → DoCommit structure
. PreCommit informs participants of inevitable commit
. If coordinator dies post-PreCommit, nodes commit independently
3PC sounds like a better solution, but its guarantees depend on unrealistic assumptions: bounded message delays and no network partitions. Real distributed systems face unpredictable delays and partitions, making 3PC's assumptions fragile. That’s why consensus protocols like Raft and Paxos, which can handle these realities, dominate production environments.
Key lessons from practice:
1. Assuming no network partitions is risky. In real systems, networks can and will partition. 3PC's assumptions make it unsuitable for these environments.
2. Forgetting to handle message delays. Unbounded delays make 3PC unreliable, leading to potential inconsistency if messages arrive late.
3. Overlooking resource locking. Even with 3PC, participants hold locks while they wait. A delayed or lost PreCommit extends that window until timeouts fire.
4. Ignoring consensus protocols. Raft and Paxos offer practical, fault-tolerant solutions without needing 3PC's assumptions.
5. Failing to consider alternatives. Techniques like Sagas and compensation avoid global transactions, offering non-blocking paths for distributed systems.
If you’re designing a distributed system, ask yourself: Would you accept blocking, add unrealistic assumptions, or rethink the model entirely? The answer might guide you toward modern consensus protocols or transaction-free models.
Bookmark this for the next time you’re stuck pondering commit protocols at 3am.
A message like this means a lot.
When I started building Codemia, the goal was never just to create another interview prep platform.
I wanted to build something that actually helps people understand difficult concepts deeply:
👉 system design
👉 object-oriented design
👉 DSA
👉 distributed systems
👉 AI / agentic systems
Not just memorize answers, but truly see how things work.
Reading that someone used Codemia to break into big tech and change their career path is honestly one of the most motivating things I can receive as a founder.
Building a product is hard.
There are bugs, rewrites, failed experiments, late nights, confusing product decisions, and a lot of moments where you wonder whether the work is actually making a difference.
Then a message like this shows up.
And it reminds me:
The diagrams matter.
The explanations matter.
The details matter.
The effort matters.
Thank you to everyone who has supported Codemia, shared feedback, reported bugs, and trusted the platform as part of your learning journey.
We’re still just getting started. 🚀
Two-phase commit (2PC) might block your system, and three-phase commit (3PC) makes assumptions. Real-world systems often use Sagas for distributed transactions. So, what's a Saga?
SAGA TRANSACTIONS
. Break a global transaction into local steps + compensations
. No global locking, improving availability
. Eventual consistency instead of immediate consistency
. Two main styles: Orchestration and Choreography
ORCHESTRATION
. Central coordinator drives transaction flow
. Services follow the orchestrator's directions
. On failure, orchestrator triggers compensating actions
. Easier to reason about but creates a single point of control
CHOREOGRAPHY
. No central coordinator; services react to events
. Each service publishes events to which others react
. On failure, compensating events are published
. More scalable, but harder to debug and reason about
Sagas offer high availability and avoid global locks. They embrace eventual consistency by design, which means you don't roll back; you undo. This "undo" must be explicit, idempotent, and well-designed to handle retries and failures gracefully.
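An orchestrated saga boils down to "run steps in order, undo the finished ones in reverse on failure". A minimal in-process sketch; `run_saga` and the order-flow steps are hypothetical, and a real orchestrator would persist its progress and call remote services.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"done:{name}")
            completed.append((name, compensate))
        except Exception:
            log.append(f"failed:{name}")
            # Compensate in reverse order of completion.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undone:{done_name}")
            return False, log
    return True, log


state = {"stock_reserved": False, "card_charged": False}


def reserve_stock():
    state["stock_reserved"] = True


def release_stock():
    state["stock_reserved"] = False  # idempotent: safe to repeat


def charge_card():
    state["card_charged"] = True


def refund_card():
    state["card_charged"] = False  # idempotent: safe to repeat


def ship_order():
    raise RuntimeError("carrier unavailable")


def cancel_shipment():
    pass


ok, log = run_saga([
    ("reserve_stock", reserve_stock, release_stock),
    ("charge_card", charge_card, refund_card),
    ("ship_order", ship_order, cancel_shipment),
])
print(ok, log, state)
```

Shipping fails, so the card is refunded and then the stock is released. Between the charge and the refund, other services could see the intermediate state, which is the eventual consistency you accept in exchange for no global locks.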
Common pitfalls in implementing Sagas:
1. Ignoring idempotency in compensating actions. Without idempotent compensations, retries can corrupt state.
2. Over-reliance on orchestration. The orchestrator is a single point of control: if it fails or falls behind, every saga it drives stalls. Persist its state so it can resume, and consider a hybrid approach.
3. Poor compensation design. Compensations should be capable of reversing partial commits without side effects.
4. Neglecting event schema evolution in choreography. Changing event formats can break listeners. Plan for backward compatibility.
5. Misjudging consistency needs. If strict consistency is paramount, Sagas might not fit the bill. Re-evaluate your consistency requirements.
Sagas are a common default for microservices, often replacing 2PC. They allow for more availability but require careful design of compensations and consistency models. If you need strict atomicity, stick with 2PC. If availability is your priority, go for Sagas. If you need both, rethink how your business flow can be redesigned.
Would you orchestrate with a central brain or let your services dance to their own tune? Save this for when you're architecting your next system.