Folks: when you write skills, ask your agent to be token efficient and relax the grammar. I see too many skills that write books in the skill description, and all that crap is loaded into every context.
I wrote a skill that finds the worst offenders. https://t.co/kfaaJpxMXE
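A minimal sketch of the idea, not the linked skill itself: rank skill descriptions by size and flag the ones over a budget. The function name `worst_offenders`, the 200-character budget, and the use of character count as a rough stand-in for tokens are all assumptions made for illustration.

```python
def worst_offenders(descriptions, max_chars=200):
    """Return (name, length) pairs for descriptions over max_chars, longest first.

    `descriptions` maps a skill name to its description text.
    Character count is only a rough proxy for token count (assumption).
    """
    over = [
        (name, len(text))
        for name, text in descriptions.items()
        if len(text) > max_chars
    ]
    # Longest description first, so the biggest context hogs surface at the top.
    return sorted(over, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical skill names and descriptions, for demonstration only.
    skills = {
        "pdf-tools": "Extract text from PDFs.",
        "release-notes": "x" * 900,
        "db-migrations": "y" * 450,
    }
    for name, size in worst_offenders(skills):
        print(f"{name}: {size} chars")
```

How the descriptions get loaded (for example from each skill's metadata file) is left out on purpose, since that depends on how your agent stores skills.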
You might believe you should spend less time thinking about code because of AI.
I strongly disagree! We’re watching this play out live where tons of AI generated code becomes a liability.
At the end of the day, an engineer needs to be responsible / on call for code that gets shipped to production. If you don’t understand the system you’re trying to debug, you’re probably going to have a bad time.
Yes, AI can help with all of this, if you set up the proper systems. You can have agents triage prod logs, look at errors, etc. You can speed up parts of the investigation, but an engineer needs to make the call. There might be serious customer or financial implications from that change.
I expect the trend to continue toward trimming dependencies, vendoring code so you can modify it directly, preferring simpler systems with fewer abstractions, and spending waaaay more time thinking about system design and code maintenance.
I’ve said this before, but it’s a great time to get familiar with CS fundamentals and some of the history behind what great software looks like. Many parts will be different in the coming years as AI progresses, but also a lot more than people realize will stay the same.
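Google has made its internal engineering code review guidelines public
This is arguably the highest standard in the industry right now
Many programmers can write code but don't know how to review it; take a look at how Google does it
1. Two-way guide: it teaches reviewers how to spot problems, and also teaches authors how to write code that passes review easily
2. Terminology explained: what LGTM (looks good to me) and CL (changelist), both common inside Google, actually mean
3. Practical value: these guidelines are not theory; they are the working rules every Google engineer actually follows
If you want to raise your team's code quality, or want to know the engineering bar at a top tech company, this document is a must-read!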
https://t.co/OdaozRkMYn
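When you ask AI to write code, the feature works, but complexity creeps up bit by bit until you land in refactoring hell. For that, the /thermo-nuclear-code-quality-review skill, which the Cursor dev team reportedly uses most internally, looks like just the remedy. It outright blocks files over 1k lines, flags hollow wrappers with no substance and leaked logic, and mercilessly rejects PRs that work but dirty the code, lol. The prompt structure is perfect to lift as-is and drop into our team's AI agent ruleset or Cursor Composer.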
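The 1k-line rule is the one part of that review that plain code can enforce without a model. A rough sketch, assuming you already have file contents in memory; `oversized_files` and `MAX_LINES` are hypothetical names, and this is not the Cursor skill itself, which is a prompt:

```python
MAX_LINES = 1000  # the "1k lines" threshold from the post


def oversized_files(sources, max_lines=MAX_LINES):
    """Return {path: line_count} for every file longer than max_lines.

    `sources` maps a file path to that file's full text.
    """
    flagged = {}
    for path, text in sources.items():
        line_count = len(text.splitlines())
        if line_count > max_lines:
            flagged[path] = line_count
    return flagged


if __name__ == "__main__":
    # Hypothetical files: one just over the limit, one exactly at it.
    sources = {
        "big_module.py": "pass\n" * 1001,
        "ok_module.py": "pass\n" * 1000,
    }
    for path, count in oversized_files(sources).items():
        print(f"BLOCK {path}: {count} lines")
```

The wrapper and logic-leak checks in the post are judgment calls, so they stay with the reviewer or the agent prompt.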
ANTHROPIC JUST KILLED THE DEMO AGENT ERA.
Their Agents team showed exactly what production grade looks like.
Not theory. Not a tutorial. A four layer framework for multi agent systems built to actually work in the real world.
30 minutes.
This is the video I wish existed 6 months ago.
Anthropic dropped a 33-page guide on building Claude Skills
Everything you could ever need is in here. Bookmark this and come back to it
https://t.co/jEuH95NGn3
The creator of Claude Code just dropped a 6-min workshop on a new Claude feature during a live session in London.
Boris Cherny: “A lot of my code these days is written by "routines". I’m not doing the prompting - I create the routines that do the prompting.”
6 minutes. Free. From a live session.
Watch this now. This will change the way you vibe-code forever.
I am fortunate that my current manager guides me through these nuances of leadership.
He also introduced me to the book "The Manager's Path" by Camille Fournier, which covers progression through the engineering leadership ladder (I highly recommend it).
While going through it, I zeroed in on one idea from every chapter, which I always keep handy for quick reference.
Here it is (from my Obsidian vault):
Andrej Karpathy: "90% of Claude's mistakes come from missing context, not a weak model."
41% mistake rate without a CLAUDE.md. 11% with the 4-rule baseline. 3% with the 12-rule version below
here are the 12 rules senior engineers settled on:
1. think before coding: state assumptions, don't guess. the model can't read your mind, stop hoping it will
2. simplicity first: minimum code, no speculative abstractions. the moment you let Claude add "for future flexibility," you've added 200 lines you'll delete next quarter
3. surgical changes: touch only what you must. don't let it improve adjacent code, that's how PRs blow up
4. goal-driven execution: define success criteria upfront, loop until verified. without them Claude either loops forever or stops too early
5. use the model only for judgment calls: classification, drafting, summarization, extraction. NOT routing, retries, status-code handling, deterministic transforms. if code can answer, code answers
6. token budgets are not advisory: per-task 4000, per-session 30000. by message 40 of a long debug, Claude is re-suggesting fixes you rejected at message 5
7. surface conflicts, don't average them: two patterns in the codebase? pick one. Claude blending them is how errors get swallowed twice
8. read before you write: read exports, callers, shared utilities. Claude will happily add a duplicate function next to an identical one it never read
9. tests verify intent, not just behavior: a test that can't fail when business logic changes is wrong. all 12 of Claude's tests can pass while the function returns a constant
10. checkpoint every significant step: Claude finished steps 5 and 6 on top of a broken state from step 4. nobody noticed for an hour
11. match the codebase conventions: class components? don't fork to hooks silently. testing patterns assumed componentDidMount, hooks broke them without surfacing
12. fail loud: "completed successfully" with 14% of records silently skipped is the worst class of bug. surface uncertainty, don't hide it
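rule 12 is the easiest one to show in code. a minimal sketch, assuming a batch job where bad records raise `ValueError`; `process_all`, `SkippedRecordsError` and `max_skip_ratio` are made-up names for illustration, not from any library:

```python
class SkippedRecordsError(RuntimeError):
    """Raised when more records were skipped than the caller allows."""


def process_all(records, handler, max_skip_ratio=0.0):
    """Apply handler to each record; refuse to report success if too many fail.

    Records whose handler raises ValueError are counted as skipped.
    If the skipped share exceeds max_skip_ratio, raise instead of returning.
    """
    done, skipped = [], []
    for record in records:
        try:
            done.append(handler(record))
        except ValueError as exc:
            skipped.append((record, str(exc)))

    total = len(done) + len(skipped)
    if total and len(skipped) / total > max_skip_ratio:
        # Fail loud: the skip count goes into the error, not a debug log.
        raise SkippedRecordsError(f"{len(skipped)} of {total} records skipped")
    return done


if __name__ == "__main__":
    print(process_all(["1", "2", "3"], int))
    try:
        process_all(["1", "oops", "3"], int)
    except SkippedRecordsError as err:
        print(f"job failed: {err}")
```

the default of zero tolerated skips is a design choice: a caller who accepts partial results has to say so explicitly.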
what actually compounds instead of the next framework:
- the CLAUDE.md file as institutional memory across sessions
- eval-driven changes, not vibe-driven
- checkpoints over speed
- explicit conflicts over silent blending
- discipline over framework, every time
- one repo, one rules file, no exceptions
be a few rules ahead of AI twitter before this becomes mass-opinion
study this
Very good article from OpenAI on Harness Engineering and Codex.
They explain how they used agents to build an internal product with ~1M lines of code and what problems came up along the way.
Some interesting ideas:
• keeping agent-generated code from degrading over time
• using tests and CI as constraints that are more reliable than prompts
• keeping code and documentation legible to agents
• how engineers' work changes when agents start writing code
Much of the challenge lies in the system around the model.
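To make the tests-and-CI point concrete, here is a small sketch of a constraint that a prompt can only ask for but a test can enforce: a module must not import a banned package. This is an illustration only, not something taken from the OpenAI article; `forbidden_imports` and the banned package name are hypothetical.

```python
import ast


def forbidden_imports(source, banned):
    """Return the set of banned top-level packages that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports like "from . import x".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in banned:
                found.add(root)
    return found


if __name__ == "__main__":
    # Hypothetical rule: the domain layer must not talk to HTTP directly.
    domain_code = "import json\nfrom requests.adapters import HTTPAdapter\n"
    violations = forbidden_imports(domain_code, {"requests"})
    print(sorted(violations))
```

Run as an assertion in CI, a check like this fails the build no matter what the agent was told, which is the sense in which it is more reliable than a prompt.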
He said that the older generation has a debt to pay: leaving the country better for the younger generation.
“We want those studying engineering to be able to work as engineers, not just working for Grab.”
github just created an official certification for "agentic AI developer."
exam: GH-600.
skills tested: multi-agent orchestration, state management, system design.
GA: july 2026.
first 100 beta takers: 80% off. deadline may 31.
this is the first time "AI agent engineer" has a credential behind it.
not a linkedin skill tag.
not a course completion badge.
a formal certification. backed by github and microsoft.
the role is real. the credential is real.
the free roadmap to get there is 14 weeks and $0.
like + bookmark to save.
GITHUB JUST LAUNCHED THE OFFICIAL CERTIFICATION FOR ONE OF THE MOST IMPORTANT TECH ROLES OF 2026
→ Agentic AI Developer (GH-600)
And it is the first time that working with AI agents officially becomes a recognized engineering discipline.
We are no longer talking about:
• prompt engineering
• vibe coding
• simple automations
We are talking about a new technical profile:
→ Agentic AI Developer
The person who:
• coordinates AI agents
• builds autonomous workflows
• integrates agents into real environments
• monitors failures in production
• prevents critical errors in CI/CD pipelines
• knows when an agent is not reliable
Before:
→ "I work with AI agents" was hard to validate.
Now:
→ GitHub officially certifies that skillset.
And that changes the market.
Companies are going to need this profile.
But there are still very few developers who specialize in it.
If you already work with:
• Copilot
• Codex
• Claude Code
• agentic workflows
• AI automations
You are probably already doing this job.
GH-600 is the way to prove it.
Save this 🔖
Once you understand harness engineering, you start to see that much of an agent's "intelligence" does not live in the model alone.
It also lives in how you organize context, memory, tools, permissions, and the execution loop.
A very good resource for understanding how agents are built.
Karpathy just described what hiring looks like in 2026:
"Build a large project with Claude Code — like a Twitter clone. Make it secure. Have real agents using the platform doing stuff. The interviewer uses parallel agents trying to break in to verify security."
One person. Multiple agents. Shipping and defending production code simultaneously.
This is not a future job description.
This is happening right now.
The founders who get there first are not the smartest ones in the room. They are the ones who stopped doing everything themselves and built agents to do it for them.
Here is the complete playbook — 13 agents, exact prompts, 90-day build plan ↓
Read this before your competition does.