unpopular dockerfile takes (that actually work)
1 - stop using alpine. yes, it's tiny. but musl libc ≠ glibc. your python/node app will often rebuild native deps from source (prebuilt glibc binaries don't apply on musl) or just... silently be slower. use -slim (debian-slim) instead. a bit bigger than alpine, still way smaller than the full image, far less grief.
2 - layer order is your cache strategy. COPY your lockfile first, run install, then copy source. invalidating the install layer on every code change is a skill issue ngl
3 - multi-stage builds aren't just "best practice" — they're the actual reason your prod image doesn't ship gcc and 400mb of build tools. builder stage = bloat zone. final stage = lean mean container.
4 - COPY . . is fine actually — if your .dockerignore is correct. most pain here is from forgetting to ignore node_modules/, .git, *.log. fix the ignore file, not the COPY.
5 - one process per container is a vibe, not a law. if your app needs nginx + app server and you're not at k8s scale — just use supervisord. the "one process" dogma costs more complexity than it saves sometimes.
6 - pin your base image by digest, not tag. node:20 today ≠ node:20 in 6 months. prod broke because of a tag? that's a you problem tbh.
7 - BuildKit cache mounts (--mount=type=cache) will change your life. pip/apt/cargo cache between builds without it ending up in the final layer. nobody talks about this enough fr
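rough sketch of takes 2, 3, 4, 6 and 7 in one Dockerfile, assuming a Python app. requirements.txt (standing in for a lockfile), main.py and the 3.12 tag are made-up placeholders, not from any real project. needs BuildKit for the cache mount. treat it as a starting point, not the one true layout:

```dockerfile
# syntax=docker/dockerfile:1

# take 6: for prod, replace the tag with a digest-pinned reference
# (python:3.12-slim@sha256:<digest>). the plain tag is kept here only so the
# sketch builds as-is. the tag and version are assumptions for illustration.
ARG BASE_IMAGE=python:3.12-slim

# ---- builder stage: compilers and build tools stay here (take 3) ----
FROM ${BASE_IMAGE} AS builder
WORKDIR /app

# build tools for native deps; they never reach the final image
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

# take 2: copy only the dependency file first, so editing source code
# does not invalidate the install layer below
COPY requirements.txt .

# take 7: the cache mount keeps pip's download cache between builds
# without writing it into any image layer
RUN --mount=type=cache,target=/root/.cache/pip \
    python -m venv /opt/venv \
    && /opt/venv/bin/pip install -r requirements.txt

# ---- final stage: runtime only ----
FROM ${BASE_IMAGE}
WORKDIR /app

# bring over the installed packages, not the toolchain that built them
COPY --from=builder /opt/venv /opt/venv

# take 4: COPY . . is only safe if .dockerignore excludes .git, *.log, etc.
COPY . .

ENV PATH="/opt/venv/bin:$PATH"
CMD ["python", "main.py"]
```

both stages share the same base image, so the virtualenv copied from the builder still points at a valid interpreter in the final stage.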
there's no "best practice" in a vacuum. alpine is great for Go binaries. slim is great for Python. scratch is great for static bins. know your workload, then choose.
btw if you want something to catch all this stuff automatically -
check out dockerfile-roast — a linter written in Rust that literally roasts your Dockerfile. 63 rules, brutally honest output (but it can also provide just dry facts, no roast), runs on any OS or as a docker container
https://t.co/NVYpe8iD65
#docker #devops #kubernetes #backend #linux #rust #sre #containers
@karpathy What's your take on GraphRAG? I'm building a very similar system to what you described using FalkorDB and its GraphRAG SDK, where the LLM generates the Cypher query for context fetching across a big graph database.
@larsencc @browser_use @HelloFresh Our CLI was responsible for spinning up the namespace, deploying the changed services, rewiring the dozens of service addresses, and managing DNS and the API gateway to give the namespace a URL to be accessed. It was beautiful! Or maybe it still is; they may be using it to this day.
@larsencc @browser_use @HelloFresh But back to your idea: our mechanism was exactly that. A stable staging that nobody could touch, as close as possible to a prod mirror, and the ephemeral environments (Kubernetes namespaces) hosted only the services that changed. The black magic was in the service discovery layer.
@larsencc @browser_use @HelloFresh No, I wouldn't. It would be absurdly expensive and inefficient. I don't think this is the best way to test AI-made changes at this scale. Besides that, only production is production. It's better to build trust by applying other testing and quality techniques first.