But the query was fast in development and staging!?!
What signs tell you that a query is fast locally but will need performance work in production?
There are two inflection points where a simple query gets meaningfully slower:
1) When the sort spills out of memory
2) When the table scan moves to disk
The examples below were run in resource-constrained environments, but the pattern of behavior and output is similar to what you'd see in larger environments, only with far more rows.
A simple query
```sql
SELECT user_id, event_type, created_at
FROM events
WHERE user_id = 1
ORDER BY created_at DESC
```
Without an index, Postgres must scan the entire table to find the user's events even if there are only a few rows (Narrator: if there are zero rows for a user, Postgres may choose not to scan any rows due to table and column statistics, which we talked about a while back).
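The plans in the phases below have the shape that `EXPLAIN (ANALYZE, BUFFERS)` prints. As a minimal sketch, assuming the `events` table from the query above, a plan like them could be captured this way (note that `ANALYZE` actually executes the query):

```sql
-- ANALYZE runs the query and reports actual timings and row counts;
-- BUFFERS adds the shared/temp page counters that the phases below discuss.
EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, event_type, created_at
FROM events
WHERE user_id = 1
ORDER BY created_at DESC;
```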
Phase 1) everything fits in memory
With 10,000 rows, of which user_id=1 has 5,000 events, and `work_mem = 1MB`:
```
Sort (cost=506.19..518.69 rows=5000 width=20) (actual time=1.422..1.587 rows=5000 loops=1)
  Sort Key: created_at DESC
  Sort Method: quicksort Memory: 427kB
  Buffers: shared hit=74
  -> Seq Scan on events (cost=0.00..199.00 rows=5000 width=20) (actual time=0.006..0.599 rows=5000 loops=1)
        Filter: (user_id = 1)
        Rows Removed by Filter: 5000
        Buffers: shared hit=74
```
• `Sort Method: quicksort Memory: 427kB`: the 5,000 rows are sorted in RAM
• `Rows Removed by Filter: 5000`: the other 5,000 rows were scanned and discarded
• `Buffers: shared hit=74`: all 74 table pages were in shared_buffers
Phase 2) sort spills to disk
With 200,000 rows, of which user_id=1 now has 100,000 events, and `work_mem = 1MB`, the 100k rows to be sorted no longer fit in memory:
```
Sort (cost=14314.82..14564.48 rows=99867 width=20) (actual time=25.069..29.488 rows=100000 loops=1)
  Sort Key: created_at DESC
  Sort Method: external merge Disk: 3352kB
  Buffers: shared hit=1471, temp read=836 written=843
  -> Seq Scan on events (cost=0.00..3971.00 rows=99867 width=20) (actual time=0.005..8.135 rows=100000 loops=1)
        Filter: (user_id = 1)
        Rows Removed by Filter: 100000
        Buffers: shared hit=1471
```
• `Sort Method: external merge Disk: 3352kB`: 3.3MB spilled because 100k rows × ~20 bytes > 1MB `work_mem`
• `temp read=836 written=843`: 843 temp pages were written to disk and 836 were read back during the merge
• `Rows Removed by Filter: 100000`: the other 100,000 rows were still scanned, then discarded before the sort
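One way to confirm that `work_mem` is what pushes the sort to disk is to raise it for the current session only and rerun the plan. This is a sketch: the `8MB` value is an assumption picked for illustration, and an in-memory sort generally needs more memory than the on-disk size reported for the spill.

```sql
-- Show the current per-operation memory budget for sorts and hashes.
SHOW work_mem;

-- Raise it for this session only (8MB is an illustrative guess, not a recommendation).
SET work_mem = '8MB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, event_type, created_at
FROM events
WHERE user_id = 1
ORDER BY created_at DESC;

-- Return to the configured default.
RESET work_mem;
```

If the sort now fits, `Sort Method` would report `quicksort` with a `Memory:` figure and the `temp read`/`written` counters would disappear. Raising `work_mem` globally is a separate decision, since every sort or hash operation in every connection can use up to that amount.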
Phase 3) table scan moves to disk
600,000 rows total. user_id=1 still has 100,000 events. The other 500,000 rows belong to other users and Postgres scans all of them anyway. The table now overflows shared_buffers (16MB = 2,048 pages; table is ~4,412 pages).
```
Gather Merge (cost=12663.14..22545.49 rows=84700 width=19) (actual time=13.577..23.650 rows=100000 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=1477 read=3049 written=109, temp read=422 written=429
  -> Sort (cost=11663.11..11768.99 rows=42350 width=19) (actual time=12.298..14.186 rows=33333 loops=3)
        Sort Key: created_at DESC
        Sort Method: external merge Disk: 1416kB
        Buffers: shared hit=1477 read=3049 written=109, temp read=422 written=429
        Worker 0: Sort Method: external merge Disk: 968kB
        Worker 1: Sort Method: external merge Disk: 992kB
        -> Parallel Seq Scan on events (cost=0.00..7537.00 rows=42350 width=19) (actual time=0.008..7.273 rows=33333 loops=3)
              Filter: (user_id = 1)
              Rows Removed by Filter: 166667
              Buffers: shared hit=1373 read=3039 written=99
```
The planner launched 2 parallel workers, so the seq scan and sort were split across 3 processes (the leader plus 2 workers), which is why it ran faster. Each process sorted ~33k rows instead of 100k, so each spill was smaller.
The things to be concerned about here are:
• `shared read=3049`: the table overflowed shared_buffers; 3,049 pages were not found there and had to be read from the OS page cache or disk
• `Rows Removed by Filter: 166667`: this is a per-process average over `loops=3`, so 166,667 × 3 ≈ 500,000 rows were scanned and discarded
• All three sort operations still spilled: parallelism didn't eliminate the sort, it just distributed it
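To see how close a table is to outgrowing the cache, its size in pages can be compared with `shared_buffers`. A minimal sketch using built-in size functions, assuming the table is named `events` as in the query above:

```sql
-- Configured size of the shared buffer cache.
SHOW shared_buffers;

-- Heap size of the table, human-readable and as a page count
-- (block_size is normally 8192 bytes).
SELECT pg_size_pretty(pg_relation_size('events')) AS table_size,
       pg_relation_size('events') / current_setting('block_size')::int AS table_pages;
```

Keep in mind that shared_buffers is shared with every other table and index, so a table can start missing the cache well before it is larger than the whole setting.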
Solution
The solution to this problem should be driven by the needs of the application it serves. In the simplest terms, an index on `events (user_id, created_at DESC)` would reduce both the scan and the sort load. However, solutions may also include application-level caching, a materialized view, or table partitioning.
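As a sketch of the index option: the index name `events_user_id_created_at_idx` is hypothetical, and `CONCURRENTLY` is included on the assumption that the table is taking writes while the index builds.

```sql
-- CONCURRENTLY avoids blocking writes during the build,
-- but it cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY events_user_id_created_at_idx
    ON events (user_id, created_at DESC);

-- Rerun the plan to see whether the planner picks the index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, event_type, created_at
FROM events
WHERE user_id = 1
ORDER BY created_at DESC;
```

Whether the planner uses the index depends on its cost estimates. When it does, the rows arrive already ordered by `created_at DESC` for that user, so the `Sort` node should drop out of the plan.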