Ans: false
- instanceof operator returns false if LHS operand is null, regardless of the type on RHS
- It doesn't throw an NPE (null-safe by design) or cause a CTE; it simply evaluates to false because a null ref isn't an instance of any class
- Apart from the reifiability rule (see Reification), instanceof causes a CTE only when the types are inconvertible: it compiles only if a cast between the 2 types is theoretically possible (safe cast check, applied only when the compiler has enough type info)
E.g. "10" instanceof Integer --> CTE
---
*Static analysis tools (IDEs, linters) may flag such checks as redundant, but this is NOT part of the JLS
E.g. if (obj != null && obj instanceof String)
--> != null check is functionally redundant
In modern #Java (16+), we use Pattern Matching to capitalize on this exact logic:
E.g. if (obj instanceof String s)
--> 's' is already null-checked + cast if compatible
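A minimal sketch of the null behaviour described above, assuming Java 16+ for the pattern-matching form. The class and method names (InstanceofNullDemo, isString, lengthOrMinusOne) are hypothetical, chosen only for illustration.

```java
public class InstanceofNullDemo {
    // Plain instanceof: a null reference is never an instance of any type.
    static boolean isString(Object obj) {
        return obj instanceof String;
    }

    // Pattern matching (Java 16+): 's' is bound only when obj is a non-null String,
    // so no separate null check or explicit cast is needed.
    static int lengthOrMinusOne(Object obj) {
        if (obj instanceof String s) {
            return s.length();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(isString(null));           // false, no NPE
        System.out.println(lengthOrMinusOne("Java")); // 4
        System.out.println(lengthOrMinusOne(null));   // -1
    }
}
```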
---
Reification:
*The instanceof operator requires reifiable type on RHS
- Reifiable type = type info is fully available at RT
E.g. String, String[], int[], List<?>, raw List
- Generic type can be used with instanceof ONLY if the type is reifiable OR (Java 16+) the compiler can prove the cast is checked, e.g. a Collection<String> tested against List<String>
--> When the LHS is Object or a raw type, we cannot use instanceof with a specific generic type because of Type Erasure
E.g. Non-Reifiable (CTE when list is declared as Object):
- list instanceof List<String>
- list instanceof List<? super String>
- list instanceof List<? extends String>
--> use list instanceof List<?> (unbounded wildcard)
⚠️ avoid raw types like list instanceof List
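One way to work with the unbounded-wildcard form is to check the container with List<?> and then check each element. This is only a sketch: the helper name isListOfStrings is hypothetical, and it assumes Java 16+ (pattern binding). Note that an empty list passes, since there is no element to contradict the claim.

```java
import java.util.List;

public class WildcardInstanceofDemo {
    // Erasure hides the element type at runtime, so check the container
    // with List<?> and then inspect the elements one by one.
    static boolean isListOfStrings(Object obj) {
        if (!(obj instanceof List<?> list)) {
            return false; // also covers obj == null
        }
        for (Object element : list) {
            if (!(element instanceof String)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isListOfStrings(List.of("a", "b"))); // true
        System.out.println(isListOfStrings(List.of(1, 2)));     // false
    }
}
```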
---
Workaround for Generics (instanceof T):
- Since obj instanceof T is a CTE (JVM "forgets" what T is at RT), use a Class Token
- Store the class: private final Class<T> type;
- Check dynamically: type.isInstance(obj);
Example:
public class Validator<T> {
    private final Class<T> type;

    // capture class at construction
    public Validator(Class<T> type) {
        this.type = type;
    }

    public boolean isValid(Object obj) {
        // dynamic equivalent of instanceof
        return type.isInstance(obj);
    }
}

Validator<String> val = new Validator<>(String.class);
System.out.println(val.isValid("Hello")); // true
--> Functional equivalent of instanceof for generic types
*isInstance(null) also returns false
What will be the output of the following #Java code?
var v1 = Integer.toString(-1, 2);
var v2 = Integer.toBinaryString(-1);
System.out.println(v1.equals(v2));
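To reason about the snippet above, it helps to look at what each call returns on its own. A small sketch (class and method names are hypothetical): Integer.toString(int, radix) produces a signed representation, while Integer.toBinaryString(int) treats the value as unsigned 32-bit two's complement.

```java
public class BinaryStringDemo {
    // Signed form: a minus sign followed by the magnitude in the given radix.
    static String signedBinary(int value) {
        return Integer.toString(value, 2);
    }

    // Unsigned form: the raw 32-bit two's-complement bit pattern.
    static String unsignedBinary(int value) {
        return Integer.toBinaryString(value);
    }

    public static void main(String[] args) {
        System.out.println(signedBinary(-1));            // -1
        System.out.println(unsignedBinary(-1).length()); // 32 (all ones)
        System.out.println(signedBinary(-1).equals(unsignedBinary(-1))); // false
    }
}
```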
Remembered preference = state placed at the correct persistence layer
*Client-side = fast, device-scoped persistence (no NW)
*Server-side = durable, user-scoped persistence (cross-device)
*SSR Boundary = consistency between server render & client state
*Zero-effort/Native = OS-driven default, overridden by app logic
*Choose storage based on lifespan (how long it must survive) + scope (who/where should see it)
---
Client-side (device-scoped persistence):
*Memory (JS state): short-lived (per render/tab)
*sessionStorage: survives reloads, dies with tab
*localStorage: survives browser restarts
--> Fast, no NW, limited to same device
*Cookie (hybrid layer)
--> sent with every matching HTTP request (same domain/path)
--> enables SSR to render correct theme immediately (avoids "flash of wrong theme")
*Multi-tab sync (Storage Events):
window.addEventListener('storage', ...)
--> fires in the OTHER same-origin tabs when localStorage changes, keeping multiple tabs consistent on same device
***
Server-side (user-scoped persistence):
Stored in DB against user profile
--> survives logout, device change
--> enables consistent UX across devices
*optimistic sync = write to localStorage instantly (snappy UI)
+ async sync to server (eventual consistency)
***
SSR/Client Boundary (Consistency layer):
Hydration mismatch:
- Server renders theme (from cookie)
- Client overrides (from localStorage)
--> mismatch (flicker/inconsistent UI)
Fix:
resolve theme before first paint (inline script in <head>) OR
defer override until mount (useEffect)
***
Zero-effort/Native Way:
prefers-color-scheme (OS-level preference)
--> browser reads OS theme, no app storage needed
*override precedence (source of truth):
User Override (localStorage) >
Server Pref (DB) >
System Pref (OS / prefers-color-scheme)
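The precedence chain above is language-agnostic; a minimal server-side sketch in Java could look like this. ThemeResolver and its parameter names are hypothetical, and the three inputs are assumed to have been read already from their layers (client override, DB, OS default).

```java
import java.util.Optional;

public class ThemeResolver {
    // First present value wins: user override > server preference > system default.
    static String resolve(Optional<String> userOverride,
                          Optional<String> serverPref,
                          String systemPref) {
        return userOverride
                .or(() -> serverPref)   // Optional.or is available since Java 9
                .orElse(systemPref);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Optional.of("dark"), Optional.of("light"), "light")); // dark
        System.out.println(resolve(Optional.empty(), Optional.empty(), "light"));        // light
    }
}
```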
It's all about layered, memory-first gatekeeping + smart request handling
***
*Memory-first layers = latency hiding
- L0: Reserved Word Filter = static list/regex, O(1) local check
E.g. "admin", "support"
- L1: Bloom Filter = probabilistic, in-memory, tiny footprint (~1-2GB for billions of usernames)
- L2: Redis = exact hot handles + Negative Cache (taken names cached with TTL), RAM lookup < 1ms
- L3: DB = authoritative truth, sharded + indexed
--> Most requests never reach DB
--> scales to hundreds of millions of users
*Debouncing prevents unnecessary pipeline execution
Each layer adds a tradeoff:
- Reserved Words = limited, static
- Bloom = small false-positive rate, requires Counting/Cuckoo filters for deletions
- Redis = memory cost + cache TTL tuning
- DB = authoritative but slow
***
Handle check = conveyor belt with VIP gates:
- Reserved Word Filter = forbidden? Stop!
- Bloom filter = Definitely free? Go!
- Redis = Hot VIP names, check here
- DB = Final authority
--> By the time "Taken" is displayed, the request has passed through up to 4 optimized layers in ms
---
Username Lookup = Multi-Layer Pipeline
*L0 – Reserved Word Filter (Static Gatekeeper):
- Purely local O(1) string check
- Blocks reserved/profane handles immediately
--> Avoids NW / memory cost
***
*L1 - Bloom Filter (Probabilistic Gatekeeper):
*Answers "Definitely No"
--> stop immediately
*Answers "Maybe Yes"
--> go to next layer
⚡ Avoids DB reads for ~99% of free usernames
⚠️ Deletions/Staleness: Standard Bloom Filters cannot remove items
--> Counting Bloom Filters replace each bit with a small counter, incremented on add & decremented on remove
--> Cuckoo Filters store short fingerprints in bucket slots; inserts may relocate existing fingerprints to make room, while deletes simply remove the matching fingerprint
--> both incur extra memory overhead
***
*L2 - Redis/In-Memory Cache:
Stores exact hot handles + Negative Cache for taken names (short TTL)
--> RAM lookup < 1ms
--> Ensures repeated "taken" queries never hit DB
⚠️ Cache hit ratio matters: keeping hot handles in RAM = massive performance win
***
*Smart Request Handling/Debouncing:
Client waits 300-500ms after last keystroke before sending request
--> Reduces redundant queries/char
--> smooth UX + lower system load
***
*L3 - Distributed DB (Source of Truth):
- Only accessed if cache cannot confirm availability
- Sharded + indexed
--> fast O(log N) lookup
--> Last-resort guarantee; optimized pipeline rarely touches DB!
⚠️ Write-Through Consistency:
- When a handle is claimed, Bloom Filter and Redis must be updated
--> Bloom Filter: set bits to 1; deletions require Counting/Cuckoo filters OR periodic full rebuilds
--> Redis: update/invalidate the cached entry (don't just wait for TTL expiry) to maintain correctness
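The layered lookup above can be sketched in a few lines. Everything here is a stand-in and an assumption: HandleChecker is a hypothetical class, the Bloom filter is modelled as a Predicate, Redis as an in-memory set of taken names (no TTL), and the DB as a plain Set. The dbReads counter exists only to show how rarely L3 is reached.

```java
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

public class HandleChecker {
    private final Set<String> reserved;                 // L0: static reserved words
    private final Predicate<String> bloomMightContain;  // L1: stand-in for a Bloom filter
    private final Set<String> takenCache = ConcurrentHashMap.newKeySet(); // L2: stand-in for Redis
    private final Set<String> database;                 // L3: stand-in for the source of truth
    int dbReads = 0;                                    // illustration only, not thread-safe

    HandleChecker(Set<String> reserved, Predicate<String> bloomMightContain, Set<String> database) {
        this.reserved = reserved;
        this.bloomMightContain = bloomMightContain;
        this.database = database;
    }

    boolean isAvailable(String handle) {
        String h = handle.toLowerCase(Locale.ROOT);
        if (reserved.contains(h)) return false;          // L0: forbidden, stop
        if (!bloomMightContain.test(h)) return true;     // L1: definitely not taken
        if (takenCache.contains(h)) return false;        // L2: known taken, skip DB
        dbReads++;
        boolean taken = database.contains(h);            // L3: authoritative answer
        if (taken) takenCache.add(h);                    // cache taken names for repeat queries
        return !taken;
    }
}
```

With a filter that always answers "maybe", every free name falls through to the DB, which is why the false-positive rate of L1 matters.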
https://t.co/MnO3SI7UHq
Because they solve different bottlenecks:
*Replication = same data, multiple copies
--> scales reads + availability (writes still hit primary)
*Sharding = horizontal partitioning across nodes (rows split across machines)
--> scales writes + data size (load distributed across shards)
---
*All sharding is horizontal partitioning, but NOT all horizontal partitioning is sharding
*Horizontal partitioning (Local single DB) = split rows across partitions (range/hash/list)
--> improves manageability + query performance, but NOT true horizontal scaling
*Sharding (Distributed) = horizontal partitioning across machines
--> enables real scalability (compute + storage)
---
Vertical partitioning = split columns
E.g. user_core vs user_profile
Used when:
- Some columns are rarely accessed (cold data)
- Reduce row size / I/O cost
- Improve cache efficiency
- Isolate sensitive/large fields
E.g. BLOBs
---
⚠️ Replication is simple but limited; sharding scales but introduces system-level complexity:
*Cross-shard joins:
extremely expensive/avoided (data must be co-located/denormalized)
*Distributed transactions:
costly & complex
--> often replaced with eventual consistency (ACID to BASE - Basically Available, Soft state, Eventual consistency)
*Resharding complexity:
redistributing data when shards grow is operationally hard (far more complex than adding replicas)
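A rough sketch of hash-based shard routing, and of why resharding hurts with naive modulo routing. ShardRouter, the "user-N" key format and the shard counts are hypothetical; real systems often use other schemes.

```java
public class ShardRouter {
    // Route a key to a shard; floorMod keeps the result in [0, shardCount)
    // even when hashCode() is negative.
    static int shardFor(String key, int shardCount) {
        return Math.floorMod(key.hashCode(), shardCount);
    }

    // Count how many of the keys "user-0".."user-(keyCount-1)" land on a
    // different shard after the shard count changes.
    static int keysMoved(int keyCount, int oldShards, int newShards) {
        int moved = 0;
        for (int i = 0; i < keyCount; i++) {
            String key = "user-" + i;
            if (shardFor(key, oldShards) != shardFor(key, newShards)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        System.out.println("user-42 -> shard " + shardFor("user-42", 4));
        System.out.println("keys moved going 4 -> 5 shards: " + keysMoved(1000, 4, 5));
    }
}
```

Adding a replica moves no keys at all, which is the asymmetry the warning above points at.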
⚠️ Why even use async if we're going to block immediately?
--> BUT it depends where/who blocks
🤔 Are we blocking at the right layer or too early?
*Async isn't about avoiding blocking - it's about deferring it (where we block + how long + what + how failures propagate)
*get() is fine at the right boundary, but if used inside flow, it collapses async into sync & kills concurrency
***
*Async = non-blocking composition + latency hiding
--> deferring the wait, not eliminating it
*get() = blocking sync point
--> decides where the wait happens
***
*It's not about syntax - it's about execution model + arch
- Async = order food --> keep working --> eat later
- get() at boundary = wait when food arrives 👍
- get() in middle = stand at door immediately 🤦‍♂️
---
CompletableFuture = Non-blocking pipelines
- supplyAsync() = offloads work
- thenApply()/thenCompose() = defines continuations (callbacks)
- get() = introduces blocking barrier
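Loosely similar to streams (the analogy covers only the sync point):
- Streams: terminal op triggers execution (pipeline is lazy until then)
- CompletableFuture: supplyAsync() starts work eagerly; get() only forces the caller to sync with it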
---
*get() must be at the end of the system (boundary)
Execution Models:
*Blocking Model (Servlet/ Thread-per-request):
HTTP response is sync
--> Thread must eventually produce result
--> Blocking is inevitable, BUT defer get() as late as possible (at boundary)
Example:
public String controller() throws Exception {
    return CompletableFuture.supplyAsync(...)
        .thenApply(...)
        .get();
}
--> OK: async/sync boundary (needed to return a value, no further async composition expected)
--> Blocking is intentional & contained
--> Gains: overlapping I/O, better utilization
*Modern blocking frameworks (e.g. Spring MVC) can return CompletableFuture, effectively behaving non-blocking at the boundary
***
*Non-Blocking Model (Reactive/Event-loop):
No thread-per-request (e.g. WebFlux, Netty)
--> Framework handles callbacks/continuation
--> Never call get()
Example:
public CompletableFuture<String> controller() {
    return CompletableFuture.supplyAsync(...)
        .thenApply(...);
}
--> No blocking
--> Framework: Registers callback + Releases thread + Writes response on completion
***
⚠️ *Blocking Too Early (Problem):
Example:
public String service() throws Exception {
    return CompletableFuture.supplyAsync(...)
        .thenApply(...)
        .get();
}
--> BAD: Async/Sync collapse, no latency hiding
--> Breaks composition + introduces hidden blocking upstream
What breaks:
*Concurrency collapse
--> caller thread blocks
*Thread inefficiency: Worker thread does work & caller thread waits
--> 2 threads for 1 task!
*Breaks composability: Cannot chain further async ops
--> Forces sync model upstream
*Thread pool starvation risk: Blocking inside async flows can exhaust thread pools
--> Leads to throughput collapse under load
---
*Blocking systems: defer get() to boundary (or preferably return CompletableFuture)
*Non-blocking systems: never call .get() - return the future & let the framework handle it
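A minimal sketch of "compose inside, block only at the boundary", assuming a thread-per-request style caller. AsyncBoundaryDemo, loadGreeting and handleRequest are hypothetical names; the work inside supplyAsync is a placeholder for real I/O.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncBoundaryDemo {
    // Service layer: returns the future and never blocks, so callers can keep composing.
    static CompletableFuture<String> loadGreeting(String name, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> "Hello, " + name, pool)
                .thenApply(greeting -> greeting + "!");
    }

    // Boundary: both calls are started first (they overlap), then we wait exactly once.
    static String handleRequest(String a, String b, ExecutorService pool) {
        CompletableFuture<String> first = loadGreeting(a, pool);
        CompletableFuture<String> second = loadGreeting(b, pool);
        return first.thenCombine(second, (x, y) -> x + " | " + y).join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2); // custom executor, not the common pool
        try {
            System.out.println(handleRequest("ann", "bob", pool)); // Hello, ann! | Hello, bob!
        } finally {
            pool.shutdown();
        }
    }
}
```

Had loadGreeting called get() itself, the two calls would run one after the other and thenCombine would have nothing left to overlap.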
---
⚠️ Timeouts: Never use indefinite .get()
--> prefer failing the pipeline itself via .orTimeout(...) (Java 9+) over just timing out the blocking thread .get(timeout) - propagates failure early & keeps the async flow consistent
E.g. future.orTimeout(5, TimeUnit.SECONDS) (better) vs future.get(5, TimeUnit.SECONDS)
⚠️ Common Pool Trap:
Default ForkJoinPool.commonPool() is JVM-wide
--> blocking it can starve unrelated tasks (incl. parallel streams)
--> prefer custom executors (requires handling ThreadLocal context propagation explicitly)
E.g. CompletableFuture.supplyAsync(task, customExecutor);
*Exception Handling (Pipeline): Prefer handling errors inside the async pipeline (exceptionally, handle) OR use join() to avoid checked exception noise & keep composition clean
E.g. future.exceptionally(ex -> fallback);
⚠️ Sync Boundary Exception Trap: get() wraps failures in ExecutionException
--> when crossing async/sync boundary, always unwrap via e.getCause() to avoid masking real errors
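The timeout and exception points above, sketched under the assumption of Java 9+ (orTimeout, failedFuture). AsyncFailureDemo and its method names are hypothetical, and the 200 ms / 1 s limits are arbitrary values for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AsyncFailureDemo {
    // Fail the pipeline itself on timeout, then recover inside the pipeline.
    static String withTimeoutAndFallback(CompletableFuture<String> future, String fallback) {
        return future
                .orTimeout(200, TimeUnit.MILLISECONDS) // completes exceptionally if too slow
                .exceptionally(ex -> fallback)         // handles timeout and other failures
                .join();                               // unchecked, unlike get()
    }

    // At a sync boundary, get() wraps the real failure in ExecutionException.
    static Throwable causeAtBoundary(CompletableFuture<?> future) {
        try {
            future.get(1, TimeUnit.SECONDS);           // bounded wait, never indefinite
            return null;
        } catch (ExecutionException e) {
            return e.getCause();                       // unwrap to surface the real error
        } catch (TimeoutException e) {
            return e;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();        // restore the interrupt flag
            return e;
        }
    }
}
```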
401 Unauthorized = Who are you? Prove it (Unauthenticated)
--> Invitation to try again with credentials
403 Forbidden = I know you, but you're not allowed (Authenticated but forbidden)
--> A hard "No" based on policy
404 Not Found = I know you, but I'm not telling you if this exists (High-security masking/BOLA mitigation)
--> Blinding the attacker
---
401 Unauthorized (Unauthenticated/Authn Issue):
- Misnomer: "Unauthorized" in HTTP spec actually means Unauthenticated
- Server doesn't know who the client is OR credentials are invalid/expired
--> Client must authenticate (login, refresh token)
E.g. Accessing /user/profile without a valid token
- Must be accompanied by a WWW-Authenticate header (required by the HTTP spec) to indicate how to authenticate
---
403 Forbidden (Authz Issue):
- Server knows the client, but policy denies access (RBAC/ABAC)
--> Authn won't help; access is explicitly forbidden
- Some systems also use 403 for rate-limiting or IP blocks (WAF - Web App Firewall), although 429 Too Many Requests is the dedicated status for rate limiting
Example:
Token decoded --
User "Bob" authenticated --
But policy says "Bob does not have scope: admin"
---
404 Not Found (High-Security Variant):
- Used to mask existence of resources & mitigate Broken Object Level Authz (BOLA) attacks
- Returning 403 could leak resource existence; returning 404 hides it
--> Prevents attackers from enumerating resource IDs/mapping DB
- Common in APIs handling sensitive objects (docs, user data, payment info)
Example:
User tries GET /api/documents/123 -- User is authenticated but does not own document 123
*403: "You are not allowed to access document 123"
--> leaks existence
*404: "Resource not found"
--> hides existence
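The decision order above can be written as a small function. This is a sketch only: AccessStatus, statusFor and the maskExistence flag are hypothetical, and a real service would derive these booleans from its auth and policy layers.

```java
public class AccessStatus {
    // Order matters: identity first, then existence, then policy.
    static int statusFor(boolean authenticated, boolean exists,
                         boolean allowed, boolean maskExistence) {
        if (!authenticated) return 401;                 // who are you?
        if (!exists) return 404;                        // genuinely missing
        if (!allowed) return maskExistence ? 404 : 403; // hide or admit existence
        return 200;
    }

    public static void main(String[] args) {
        // Authenticated user requesting a document they do not own:
        System.out.println(statusFor(true, true, false, false)); // 403
        System.out.println(statusFor(true, true, false, true));  // 404
    }
}
```

With masking on, "exists but forbidden" and "does not exist" become indistinguishable to the caller, which is the point of the BOLA mitigation.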
IllegalStateException on 2nd forEach()
Streams = single-pass, lazy, memory-efficient pipelines
--> Any intermediate or terminal operation consumes the original stream
***
*Streams = single-pass pipelines
--> A terminal operation consumes the stream
--> Creating a derived stream (e.g. via filter, map, flatMap) also consumes the original stream (becomes upstream for the new pipeline)
--> The linkedOrConsumed flag + spliterator ensures single-use semantics
*Multi-pass traversal is not supported without materializing elements/recreating the stream
*Streams do not store data
--> They operate lazily, processing elements ONLY as they flow through the pipeline
*Terminal operations are mandatory to trigger computation
--> Without them, the stream pipeline does NOTHING!
*Design rationale: ensures memory efficiency + avoids hidden state + prevents accidental reprocessing of non-repeatable sources
*Stream = conveyor belt, Spliterator = motor that moves elements along
--> Once the motor runs or is linked to another belt, the original belt cannot be replayed
---
Streams vs Collections:
- Streams are not containers. They define a sequence of operations on elements
- Elements are processed once, triggered by a terminal operation
***
Lazy evaluation & efficiency:
- Intermediate operations (e.g. map, filter) do not execute immediately; they just define transformations
- Computation happens only when a terminal operation runs
--> Enables efficiency:
- On-demand processing: no unnecessary computation
- Short-circuiting: operations like anyMatch/allMatch/findFirst can stop processing as soon as result is determined
Example:
boolean hasEven = Stream.of(1, 3, 5, 6, 7).anyMatch(x -> x % 2 == 0);
--> stops at 6, does NOT process 7
--> If streams were eager, all elements would have to be processed regardless
***
Multi-pass is difficult:
- Supporting multiple passes would require buffering all elements
--> breaks laziness + increases memory usage
- Streams may come from non-repeatable sources (I/O, NW, DB cursors)
--> replaying may be impossible/ unsafe
***
Terminal operation & source consumption:
- After a terminal operation (e.g. forEach, collect) executes, the stream is marked as consumed
- Using a stream to create a derived stream (any intermediate operation) also consumes the original stream, which now acts as upstream
--> Any further attempt to use it throws IllegalStateException
*Internally, this is tracked via the linkedOrConsumed flag in AbstractPipeline
--> Monitors whether the spliterator (underlying element provider of the stream) has been linked/consumed
--> Once set, the original stream ref is effectively unusable
Example:
Stream<Integer> s = Stream.of(1, 2, 3);
Stream<Integer> s2 = s.filter(n -> n > 1);
s.forEach(System.out::println);
--> Throws IllegalStateException
---
Handling Multiple Passes:
*Materialize into a collection:
List<Integer> list = Stream.of(1,2,3).collect(Collectors.toList());
list.forEach(System.out::println);
list.forEach(System.out::println);
*Recreate the stream from the source:
Stream.of(1,2,3).forEach(System.out::println);
Stream.of(1,2,3).forEach(System.out::println);
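A common way to package "recreate the stream from the source" is a Supplier that builds a fresh pipeline on each call. A small sketch, with hypothetical names (StreamReuseDemo, secondUseFails, countTwice):

```java
import java.util.function.Supplier;
import java.util.stream.Stream;

public class StreamReuseDemo {
    // Reusing the same Stream instance after a terminal operation fails.
    static boolean secondUseFails() {
        Stream<Integer> s = Stream.of(1, 2, 3);
        s.forEach(n -> { });
        try {
            s.forEach(n -> { });
            return false;
        } catch (IllegalStateException e) {
            return true; // stream has already been operated upon or closed
        }
    }

    // Each get() builds a new stream, so two passes are fine.
    static long countTwice(Supplier<Stream<Integer>> source) {
        return source.get().count() + source.get().count();
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails());                    // true
        System.out.println(countTwice(() -> Stream.of(1, 2, 3))); // 6
    }
}
```

This only works when the underlying source itself is repeatable; a Supplier over a one-shot I/O source would still fail or return different data on the second pass.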
---
*Collectors.teeing() (Java 12+):
- Feeds each element of a single-pass stream to two downstream collectors, then merges their results
- Useful for two derived results from one traversal without manual buffering
Example - compute sum & count in one pass:
Stream<Integer> numbers = Stream.of(1, 2, 3, 4);
Map<String, Number> result = numbers.collect(
    Collectors.teeing(
        Collectors.summingInt(n -> n),
        Collectors.counting(),
        (sum, count) -> Map.of("sum", sum, "count", count)
    )
);
System.out.println(result);
--> {sum=10, count=4} (Map.of has no guaranteed iteration order, so the two keys may print in either order)
⚠️ Even with teeing(), the original stream is consumed - it's just that both downstream collectors are applied in one pass
true (both references point to the same cached Integer instance)
Integer Caching:
- Integer objects between -128 and 127 are cached by the JVM by default
- This is a heap-level optimization, implemented in java.lang.Integer.IntegerCache
- Autoboxing and Integer.valueOf() reuse the same cached instance in this range
- new Integer(...) always allocates a fresh object (bypasses the cache, deprecated since Java 9)
Cache Configuration:
- Upper bound can be extended via:
-XX:AutoBoxCacheMax=<value>
- Lower bound (-128) is fixed
- Use .equals() for value comparison
---
Other Wrapper Caches:
Byte, Short, Long, Character:
- Byte: all values cached (-128 to 127); Short & Long: -128 to 127; Character: 0 to 127
- Unlike Integer, their upper bounds are not configurable via JVM flags
Boolean:
- Only two instances exist: Boolean.TRUE and Boolean.FALSE
- Always cached
Float and Double:
- No caching at all
- Floating-point caching is complex due to precision/representation issues
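A short sketch of the Integer cache in action (IntegerCacheDemo and sameInstance are hypothetical names). The result for 128 depends on the configured cache upper bound, so it is printed without a stated expectation.

```java
public class IntegerCacheDemo {
    // Autoboxing goes through Integer.valueOf, which reuses cached instances
    // for values inside the cache range.
    static boolean sameInstance(int value) {
        Integer a = value;
        Integer b = value;
        return a == b; // reference comparison, on purpose
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127)); // true: inside the guaranteed range
        System.out.println(sameInstance(128)); // depends on -XX:AutoBoxCacheMax
        System.out.println(Integer.valueOf(128).equals(Integer.valueOf(128))); // true: value comparison
    }
}
```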
*Big-O = asymptotic growth, ignores constants
*Real systems = constants matter (CPU, RAM, I/O, NW)
*Algorithm choice should balance asymptotics + constants
*A theoretically faster algorithm can actually be slower if the hidden constant is large
---
*Real-world latency dominates small datasets
E.g. Linear scan in RAM O(n) is faster than binary search on disk O(log n) for moderate n because disk I/O constant is huge
*HW effects:
Memory hierarchy, CPU cache, disk I/O & NW latency multiply the cost per operation
E.g. Sequential RAM access vs random disk access
--> same O(n), RT can differ by orders of magnitude (up to ~100_000x)
*Parallelism & CPU efficiency:
Algorithms with lower asymptotic complexity may underperform if they cannot leverage CPU pipelines, vectorization or threading efficiently
*Cost in production:
Larger constants
--> more CPU, memory & NW usage
--> higher cost, even if Big-O is better
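A back-of-the-envelope cost model makes the "constants matter" point concrete. The per-operation latencies below are assumed round numbers (about 1 ns per sequential RAM element, about 10 ms per random disk seek), not measurements; LatencyModel and its methods are hypothetical.

```java
public class LatencyModel {
    static final long RAM_NS_PER_ELEMENT = 1;     // assumption: sequential RAM scan
    static final long DISK_SEEK_NS = 10_000_000;  // assumption: one random disk seek

    // O(n) with a tiny constant.
    static long ramScanNanos(long n) {
        return n * RAM_NS_PER_ELEMENT;
    }

    // O(log n) with a huge constant: one seek per probe, ceil(log2 n) probes (n > 1).
    static long diskBinarySearchNanos(long n) {
        long probes = 64 - Long.numberOfLeadingZeros(n - 1);
        return probes * DISK_SEEK_NS;
    }

    public static void main(String[] args) {
        for (long n : new long[] {1_000_000L, 1_000_000_000L}) {
            System.out.println("n=" + n
                    + " ramScan=" + ramScanNanos(n) / 1_000_000 + " ms"
                    + " diskBinarySearch=" + diskBinarySearchNanos(n) / 1_000_000 + " ms");
        }
    }
}
```

Under these assumptions the linear scan wins at one million elements (1 ms vs 200 ms) and loses at one billion (1000 ms vs 300 ms), which is the crossover that Big-O alone does not show.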
@javarevisited @Override equals():
1. Check this == o first (performance)
2. Be null-safe + avoid blind casts (prevent NPE, ClassCastException)
*Not overriding hashCode()
--> User objs that are equal can end up in different buckets (duplicates in HashSet)
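A minimal sketch of an equals()/hashCode() pair that follows the points above. The User class with a single email field is hypothetical, used only to show the shape.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class UserEqualityDemo {
    static final class User {
        private final String email;

        User(String email) {
            this.email = email;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;              // 1. cheap identity check first
            if (!(o instanceof User)) return false;  // 2. null-safe, no blind cast
            User other = (User) o;
            return Objects.equals(email, other.email);
        }

        @Override
        public int hashCode() {
            return Objects.hash(email);              // equal objects share a hash, hence a bucket
        }
    }

    // Equal users collapse to one entry only because hashCode() agrees with equals().
    static int distinctCount(String... emails) {
        Set<User> users = new HashSet<>();
        for (String email : emails) {
            users.add(new User(email));
        }
        return users.size();
    }
}
```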
https://t.co/QbKUOdGiEr
a == b is true:
"Ja" + "va" is CT constant
--> auto-interned in SCP (same ref)
c == b is false:
part + "va" is RT concatenation
--> creates new heap obj (different ref)
***
*If part were declared final
--> part + "va" becomes CT constant expression
--> auto-interned
--> c == b true
---
*Constant Folding:
Compiler evaluates CT constant expressions & replaces them with result in bytecode
E.g. "Ja" + "va" = "Java" at CT
*String Interning:
CT constants are auto-interned in SCP
--> == can return true for CT constant Strings pointing to the same pool object
*RT Concatenation:
Any concatenation involving vars or non-final values is done at RT
--> Creates new heap obj, not interned
--> == returns false; .equals() still works
---
*CT constants + constant folding
--> interned + ref equality possible
*RT concatenation
--> new object + ref equality fails unless .intern() is used
*Applies Beyond Strings:
Enums: behave like interned objects
--> == works
Primitives (int, boolean, etc.):
--> == compares values, always works
Wrapper classes (Integer, Long, etc): small values (-128 to 127) are cached (similar idea to SCP; only Integer's upper bound is configurable), beyond that == can fail
---
- Use .equals() for content comparison; == for ref
- To make an RT-concatenated String match a pooled literal, call .intern()
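The four cases above, side by side. A sketch with hypothetical names (StringPoolDemo and its methods); the == comparisons are deliberate, to expose reference identity.

```java
public class StringPoolDemo {
    // "Ja" + "va" is a CT constant expression: folded and interned.
    static boolean foldedLiteral() {
        String a = "Ja" + "va";
        String b = "Java";
        return a == b;
    }

    // Non-final variable: concatenation happens at RT and creates a new object.
    static boolean runtimeConcat() {
        String part = "Ja";
        String c = part + "va";
        return c == "Java";
    }

    // A final variable initialised with a constant makes the expression a CT constant again.
    static boolean finalPart() {
        final String part = "Ja";
        String c = part + "va";
        return c == "Java";
    }

    // intern() returns the pooled instance for an RT-created String.
    static boolean internedConcat() {
        String part = "Ja";
        String c = (part + "va").intern();
        return c == "Java";
    }

    public static void main(String[] args) {
        System.out.println(foldedLiteral());  // true
        System.out.println(runtimeConcat());  // false
        System.out.println(finalPart());      // true
        System.out.println(internedConcat()); // true
    }
}
```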
Bloom filter = Probabilistic, memory-efficient data structure (bit array + hash functions) for testing set membership
*Purpose: Quickly identify what is definitely NOT in a set
--> avoids unnecessary DB queries/ expensive searches
*Guarantees:
- No false negatives (an element that exists is never missed)
- Controlled false positives (may say "possibly present" for elements not actually in the set)
*Efficiency: Minimal memory, scales to millions/billions of elements
---
*Structure:
- Bit array (m bits): stores compact representation
- Hash functions (k): map elements to multiple bits
*Name:
- Named after Burton Bloom (1970)
- Filter = filters out elements definitely NOT present
- Bloom = bit array progressively "blooms" as elements are added
***
Adding elements:
Compute k hashes -- set the corresponding bits
--> creates a probabilistic fingerprint
***
Checking membership:
Compute k hashes -- check bits
*Any bit = 0
--> definitely not present
*All bits = 1
--> possibly present, check further
***
Example (username check):
*dave123 -- some bits 0
--> definitely available, skip DB query
*alice -- all bits 1
--> might exist, check DB to confirm
---
Parameter Tuning (n, p, m, k):
n = expected elements
p = desired false positive probability
m = #bits in array
k = #hash functions
*Goal: balance memory usage vs false positive rate (p) for expected # elements (n)
*More elements n
--> larger array needed
*Lower desired false positive rate p --> larger array needed
$ m = -(n * ln(p)) / (ln(2) ^ 2)
$ k = (m / n) * ln(2)
Example:
*Expected usernames: n = 1_000_000
*Desired false positive rate: p = 0.01 (1%)
--> m ≈ 9_585_059 bits ≈ 1.2 MB
--> k ≈ 6.64, rounded to 7 hash functions
--> we need a bit array of ~1.2 MB and 7 hash functions to store 1 million usernames with only 1% false positives & no false negatives
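The two sizing formulas above translate directly to code. BloomSizing and its method names are hypothetical; the rounding choices (ceil for m, round for k) are one reasonable convention, not the only one.

```java
public class BloomSizing {
    // m = -(n * ln(p)) / (ln(2)^2), rounded up to whole bits.
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    // k = (m / n) * ln(2), rounded to the nearest whole hash count (at least 1).
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        double p = 0.01;
        long m = optimalBits(n, p);
        System.out.println("bits = " + m + " (~" + m / 8 / 1_000_000.0 + " MB)");
        System.out.println("hashes = " + optimalHashes(n, m)); // 7
    }
}
```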
---
*Double Hashing: compute two base hashes & combine them to simulate k hashes
--> avoids computing many independent hashes
--> improves performance
*Storage: in RAM for fast access or on disk/distributed memory for very large sets
*Silent False Positives: a "possibly present" answer from a Bloom Filter is NOT guaranteed correct; the structure trades exactness for memory efficiency + speed
--> app must handle false positives gracefully
*Practical Implementation: use battle-tested libraries/frameworks (Guava, Redis, Cassandra) for hashing, concurrency & persistence
*Use Case: pre-filter for expensive DB/NW lookups, NOT a replacement for the DB
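For illustration only, a toy Bloom filter using a BitSet and double hashing. SimpleBloomFilter is hypothetical, and the two base hashes (String.hashCode and a 32-bit FNV-1a) are assumptions picked for brevity; production code should rely on a tested library as noted above.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int m; // number of bits
    private final int k; // number of derived hash functions

    SimpleBloomFilter(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    void add(String item) {
        int h1 = item.hashCode();
        int h2 = fnv1a(item);
        for (int i = 0; i < k; i++) {
            bits.set(indexFor(h1, h2, i));
        }
    }

    // false = definitely not present; true = possibly present (may be a false positive).
    boolean mightContain(String item) {
        int h1 = item.hashCode();
        int h2 = fnv1a(item);
        for (int i = 0; i < k; i++) {
            if (!bits.get(indexFor(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: the i-th index is derived from two base hashes as h1 + i * h2.
    private int indexFor(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, m);
    }

    // 32-bit FNV-1a over the UTF-8 bytes, used here as the second base hash.
    private static int fnv1a(String s) {
        int hash = 0x811C9DC5;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }
}
```

Usage follows the username example: add every taken handle, then treat mightContain(handle) == false as "definitely available" and anything else as "check the next layer".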