Say goodbye to expensive software!
10 websites that can replace hundreds of dollars worth of PC software (for free)
Your wallet will thank you later! 💸
Key Concepts to Understand Database Sharding
Database sharding refers to splitting data across multiple database servers and is commonly used for scaling. However, sharding introduces major operational and infrastructure complexity that should be avoided until absolutely necessary.
Approaches to postpone sharding
Vertical Scaling: Use more powerful single database servers - more CPUs, memory, storage and I/O bandwidth. Much simpler to manage than sharding while allowing sizable expansion.
SQL Optimization: Tune SQL queries and database schema to maximize performance on a single server. Requires proper indexes, efficient SQL, etc.
Caching: Use in-memory caches like Redis to reduce database load by avoiding hitting it for every common query.
Read Replicas + Load Balancer: Add horizontal read scalability without the full complexity of sharding. A load balancer directs reads across the replicas.
These optimization approaches should be exhausted before sharding given the complexity.
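To make the caching approach concrete, here is a minimal cache-aside sketch. It is an illustration under assumptions, not a production setup: `TTLCache` is a tiny in-process stand-in for an external cache such as Redis, and `fetch_user_from_db` is a hypothetical placeholder for a real database query.

```python
import time
from typing import Dict, Optional, Tuple


class TTLCache:
    """Tiny in-process stand-in for an external cache such as Redis."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._store: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._store[key] = (time.monotonic() + self.ttl_seconds, value)


# Counts how often the (hypothetical) database is actually hit.
db_calls = {"count": 0}


def fetch_user_from_db(user_id: int) -> dict:
    """Hypothetical placeholder for a real database lookup."""
    db_calls["count"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache: TTLCache, user_id: int) -> dict:
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    user = fetch_user_from_db(user_id)
    cache.set(key, user)  # populate the cache for later reads
    return user


cache = TTLCache(ttl_seconds=60)
get_user(cache, 1)
get_user(cache, 1)  # served from the cache, no second database call
print(db_calls["count"])
```

The TTL bounds how stale a cached row can get. Picking it is a trade-off between database load and freshness.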
Horizontal vs Vertical Sharding
There are two high-level approaches:
Vertical Sharding: Split the database by columns or tables rather than by rows. For example, keep names in one table and emails in another, so each can live on a different server.
Horizontal Sharding: Split database into row partitions distributed evenly across multiple servers.
Some horizontal sharding methods:
1. Range Based: Segment rows based on range values like age groups. Can cause uneven data distribution and hot spots.
2. Directory Based: Use a lookup directory to locate rows. Allows flexibility but single point of failure risk.
3. Hash Based: Apply hash functions to spread rows uniformly across shards. Harder to rebalance.
When sharding, use the simplest approach that meets your requirements to minimize complexity, and avoid sharding altogether until it is necessary.
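The three horizontal sharding methods can be sketched as small routing functions that map a row to a shard number. Everything here is assumed for illustration: the shard count, the age boundaries, and the directory contents are made up, and a real system would keep the directory in a replicated store.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # assumed shard count for illustration

# Range based: upper bounds (exclusive) of each age range.
# shard 0: under 18, shard 1: 18-34, shard 2: 35-59, shard 3: 60 and over
AGE_BOUNDS = [18, 35, 60]


def range_shard(age: int) -> int:
    """Pick the shard whose age range contains this value."""
    return bisect.bisect_right(AGE_BOUNDS, age)


# Directory based: an explicit lookup table from key to shard.
# Flexible, but every request depends on this one structure.
directory = {"alice": 2, "bob": 0}


def directory_shard(key: str) -> int:
    return directory[key]


def hash_shard(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash based: a stable hash spreads keys roughly uniformly."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Why plain hash-modulo sharding is hard to rebalance: adding one shard
# changes the target shard for most keys, so most rows would have to move.
keys = [f"user-{i}" for i in range(1000)]
moved = sum(1 for k in keys if hash_shard(k, 4) != hash_shard(k, 5))
print(f"{moved} of {len(keys)} keys change shard when going from 4 to 5 shards")
```

`hashlib` is used instead of Python's built-in `hash()` because the built-in is randomized per process for strings, which would route the same key differently after a restart.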
Top 20 SQL query optimization techniques
Here is the list of the top 20 SQL query optimization techniques I found noteworthy:
1. Create indexes on huge tables (more than 1,000,000 rows)
2. Use EXISTS instead of COUNT() to check whether an element is in the table
3. SELECT specific fields instead of using SELECT *
4. Avoid subqueries in the WHERE clause
5. Avoid SELECT DISTINCT where possible
6. Use the WHERE clause instead of HAVING
7. Create joins with INNER JOIN (not WHERE)
8. Use LIMIT to sample query results
9. Use UNION ALL instead of UNION wherever possible
10. Use UNION instead of a WHERE ... OR ... query
11. Run your query during off-peak hours
12. Avoid using OR in join queries
13. Choose GROUP BY over window functions
14. Use derived and temporary tables
15. Drop the index before loading bulk data
16. Use materialized views instead of views
17. Avoid the != or <> (not equal) operator
18. Minimize the number of subqueries
19. Use INNER JOIN as little as possible when you can get the same output using LEFT/RIGHT JOIN
20. When retrieving the same dataset frequently, try to use temporary sources
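A few of these techniques (items 2, 6 and 9) are easy to try with Python's built-in `sqlite3` module. This is a small sketch on a made-up `orders` table. It only shows what the queries look like and that they return what you expect, not how much faster they are, which depends on the database and the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    INSERT INTO orders (customer, amount) VALUES
        ('alice', 10), ('alice', 25), ('bob', 40), ('carol', 5);
""")

# Item 2: EXISTS only needs one matching row, COUNT(*) has to count them all.
has_alice = conn.execute(
    "SELECT EXISTS(SELECT 1 FROM orders WHERE customer = ?)", ("alice",)
).fetchone()[0]

# Item 6: WHERE filters rows before grouping, so less data gets aggregated.
# HAVING would filter only after every group has been built.
totals = conn.execute("""
    SELECT customer, SUM(amount)
    FROM orders
    WHERE amount >= 10
    GROUP BY customer
    ORDER BY customer
""").fetchall()

# Item 9: UNION ALL just appends the results, UNION also removes
# duplicates, which costs extra work.
union_all_rows = conn.execute("""
    SELECT customer FROM orders WHERE amount >= 10
    UNION ALL
    SELECT customer FROM orders WHERE amount >= 25
""").fetchall()
union_rows = conn.execute("""
    SELECT customer FROM orders WHERE amount >= 10
    UNION
    SELECT customer FROM orders WHERE amount >= 25
""").fetchall()

print(has_alice, totals, len(union_all_rows), len(union_rows))
```

Note that UNION ALL and UNION return different row counts here, so swap one for the other only when duplicates are impossible or acceptable.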
Do you know what a Query Optimizer is? Its primary function is to determine the most efficient way to execute a given SQL query by finding the best execution plan. The query optimizer takes the SQL query as input and analyzes it. The first step is to parse the SQL query and create a syntax tree. The optimizer then analyzes the syntax tree to determine how to run the query.
Next, the optimizer generates alternative execution plans, which are different ways of executing the same query. Each execution plan specifies the order in which the tables should be accessed, the join methods, and any filtering or sorting operations. The optimizer then assigns a cost to each execution plan based on the number of disk reads and the CPU time required to execute the query.
Finally, the optimizer chooses the execution plan with the lowest cost as the optimal execution plan for the query. This plan is then used to execute the query.
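You can watch an optimizer pick a plan with SQLite's `EXPLAIN QUERY PLAN`, available through Python's built-in `sqlite3` module. This sketch uses a made-up `users` table and asks for the chosen plan before and after adding an index. SQLite is just one example of a query planner, and the exact plan text differs between SQLite versions, so treat the printed strings as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user-{i}", f"user-{i}@example.com") for i in range(1000)],
)

QUERY = "SELECT name FROM users WHERE email = ?"


def plan_for(sql: str, params: tuple) -> str:
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each row.
    return " | ".join(row[-1] for row in rows)


# Without an index the only available plan is a full table scan.
plan_before = plan_for(QUERY, ("user-42@example.com",))

# With an index there is a cheaper alternative, and the planner picks it.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan_for(QUERY, ("user-42@example.com",))

print(plan_before)
print(plan_after)
```

The query text is identical in both cases. Only the set of available plans changed, and the planner picked the cheaper one on its own.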
Check the image for the order in which SQL queries run.
#technology #softwareengineering #programming #techworldwithmilan #sql
NEW PROJECT!
After 2 months of "silent" work, we present Filament Examples
https://t.co/Bfwdp4iz7u
We've created 22 Filament projects (many more to come), addressing its features and questions from Twitter/Discord/YouTube.
For 4 days only, 40% OFF!
Coupon FILAMENTFOREVER
This Google drive is a goldmine of resources on:
➡️ Project Manag
➡️ Product Manag
➡️ Software Eng
➡️ Scrum
➡️ PMP certification
➡️ SQL
➡️ UI/UX Design
➡️ Interview Docs
➡️ Data Analytics
➡️ Business Analysis
➡️ Data Visualization
➡️ Big Data
https://t.co/fbWXUUd9lb
SAVE & SHARE