for folks working in large orgs:
how are you managing https://t.co/qIolRIrB33 (or equivalent) at scale?
specifically curious how teams:
• version common instructions
• prevent drift across tools (Claude, Cursor, Copilot, etc.)
• allow personal overrides without polluting git
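one way to frame the three bullets above, as a rough sketch: keep a single canonical instructions text in git, render each tool's file from it, and flag any tool file that no longer matches. personal overrides get appended at render time from a git-ignored local file. everything here (function names, the override header, the tool labels) is an assumption for illustration, not any tool's official mechanism.

```python
import hashlib
from typing import Dict, List, Optional


def digest(text: str) -> str:
    """Hash instruction text, ignoring leading/trailing whitespace."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()


def find_drift(canonical: str, tool_contents: Dict[str, str]) -> List[str]:
    """Return the tools whose instruction text no longer matches the canonical source."""
    want = digest(canonical)
    return sorted(tool for tool, text in tool_contents.items() if digest(text) != want)


def render(canonical: str, personal_override: Optional[str] = None) -> str:
    """Shared instructions first, then an optional override read from a git-ignored file."""
    body = canonical.strip() + "\n"
    if personal_override:
        # hypothetical header; the override text never gets committed
        body += "\n# personal overrides (local only)\n" + personal_override.strip() + "\n"
    return body
```

running `find_drift` in CI against the committed per-tool files would catch hand edits that never made it back to the canonical source.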
what are the best resources for learning distributed systems and AI infrastructure?
I've read Kleppmann's DDIA: https://t.co/GLc3zgDJvD and implemented some of the systems it covers in Go, like LSM-trees and a replica set with consensus
interested in AI infra/inference
using AI as a tool to do the job more efficiently, but not dying on that hill
like anything in software engineering, AI is a tool with its use-cases. knowing when and how to use it is a clear edge
I have a few friends applying for new jobs and they all ask me the same question.
What exactly does "AI-native" mean?
And as much as I have an answer for them, I really don't.
How would you explain it to them?
shipping large-scale, critical software without reading the code is not maintainable imo,
but I can see the argument for startups who are trying to move fast. deploy and fix. can’t say I fully agree but there’s more of an argument there
curious what folks' thoughts are here
I genuinely do not understand how people ship code without reading it
I use SOTA models for basically everything, max settings, best tools available
They still make hilariously dumb mistakes that would absolutely nuke my infra at any meaningful scale if I just blindly deployed them
Am I missing something?
@YessicaAguero agreed, we also had an extensive rubric for judging things like tone, verbosity, etc
and these rubrics can grow and change, but defining them upfront is crucial
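a minimal sketch of what "defining the rubric upfront" could look like in code: each criterion is a named question with a weight, and the rubric is a versioned constant so it can grow and change without silently shifting old scores. tone and verbosity come from the post above; the `correctness` criterion, the weights, and the 1-5 scale are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Criterion:
    name: str
    question: str  # what the judge (human or LLM) is asked to rate
    weight: float


# hypothetical rubric; bump the version suffix when criteria or weights change
RUBRIC_V1: Tuple[Criterion, ...] = (
    Criterion("tone", "is the tone appropriate for the user?", 0.4),
    Criterion("verbosity", "is the answer free of padding?", 0.2),
    Criterion("correctness", "is the answer factually right?", 0.4),
)


def weighted_score(scores: Dict[str, float], rubric: Tuple[Criterion, ...] = RUBRIC_V1) -> float:
    """Combine per-criterion scores (assumed 1-5) into one weighted average."""
    missing = [c.name for c in rubric if c.name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    total_weight = sum(c.weight for c in rubric)
    return sum(c.weight * scores[c.name] for c in rubric) / total_weight
```

failing loudly on a missing criterion keeps a judge that skipped a question from quietly inflating or deflating the average.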
highly recommend reading this, great example of distributed systems in practice and how to build highly fault-tolerant systems
heavy emphasis on automatic failover. I'm curious to learn more about semi-sync replication, which seems to be crucial for high-availability setups
PlanetScale's fault tolerance is built on straightforward principles and architectures. The challenge is in the execution.
Here are the principles we follow to keep our systems reliable:
https://t.co/SRC3MiyliM
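for anyone else fuzzy on semi-sync: the rough idea is that the primary doesn't report a write as committed until at least one replica confirms it received it, so at least one replica holds every acknowledged write when failover happens. a toy in-memory sketch of that rule follows. the class names are made up, and this is not PlanetScale's or MySQL's actual implementation; real systems also deal with timeouts, networking, and falling back to async.

```python
from typing import List


class Replica:
    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.log: List[str] = []

    def receive(self, entry: str) -> bool:
        """Store the entry and ack, or return False if the replica is unreachable."""
        if not self.healthy:
            return False
        self.log.append(entry)
        return True


class Primary:
    def __init__(self, replicas: List[Replica], required_acks: int = 1) -> None:
        self.replicas = replicas
        self.required_acks = required_acks
        self.log: List[str] = []

    def commit(self, entry: str) -> bool:
        """Write locally, ship to replicas, and report success only with enough acks."""
        self.log.append(entry)
        acks = sum(1 for replica in self.replicas if replica.receive(entry))
        return acks >= self.required_acks
```

with fully async replication, `commit` would return success right after the local append, which is exactly the window where a primary crash can lose an acknowledged write.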
@fardeentwt cursor agent is amazing, and the IDE fallback is necessary for reviewing code easily (ik terminal diff projects exist but not comparable imo)
evals are crucial for deploying any LLM or agentic system in production
at a previous job we used MLflow and LLM-as-a-judge for our offline agent evals. for anyone just starting with agent evals, these are some great resources
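the basic shape of an offline eval with a judge, as a hedged sketch: run the agent over a fixed set of cases, have a judge score each output, and track the pass rate over time. the `agent` and `judge` callables are placeholders you would back with real model calls, and the 1-5 scale and threshold are assumptions. this deliberately does not use MLflow's API; logging the report to a tracker is left out.

```python
from typing import Any, Callable, Dict, List

Agent = Callable[[str], str]       # task -> agent output
Judge = Callable[[str, str], int]  # (task, agent output) -> score, assumed 1-5


def run_offline_eval(
    cases: List[str], agent: Agent, judge: Judge, pass_threshold: int = 4
) -> Dict[str, Any]:
    """Score every case with the judge and summarize the pass rate."""
    results = []
    for task in cases:
        output = agent(task)
        score = judge(task, output)
        results.append(
            {"task": task, "output": output, "score": score, "passed": score >= pass_threshold}
        )
    pass_rate = sum(r["passed"] for r in results) / len(results) if results else 0.0
    return {"pass_rate": pass_rate, "results": results}
```

keeping the judge as a plain callable makes it easy to swap in a deterministic stub for tests and an LLM-backed judge for real runs.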
@forgebitz not looking at the code doesn't scale for established, large-scale companies with a high bar for latency, reliability and availability
but for startups and early-stage companies who are trying to move fast I can see the argument for it, still not a great idea imo
@ibocodes ecosystem support and popularity
easier to build, debug and customize when the ecosystem is so much richer for nextjs, although I agree with the premise
@rozziitt yea ofc it’s a trade-off between token use and quality. depending on whether you’re on a plan or pay per use, you may just default to sonnet for everything, but opus on medium reasoning for planning/research is a reasonable middle ground imo
@rozziitt opus for coding and sonnet for research? or did you mean the opposite
once the plan is in place, sonnet performs nearly identically for implementation. the bulk of the work is the plan/research phase
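the split described in this thread boils down to a tiny routing rule. a sketch, assuming phase names like `plan` and `research` and treating `opus` / `sonnet` as plain labels rather than real model IDs or API parameters:

```python
from typing import Dict


def pick_model(phase: str, budget_constrained: bool = False) -> Dict[str, str]:
    """Route planning/research to the bigger model and everything else to the cheaper one."""
    if budget_constrained:
        # e.g. pay per use: default to the cheaper model for every phase
        return {"model": "sonnet"}
    if phase in ("plan", "research"):
        return {"model": "opus", "reasoning": "medium"}
    return {"model": "sonnet"}
```

for example, `pick_model("plan")` selects opus on medium reasoning, while `pick_model("implement")` falls through to sonnet.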