Prompt Engineering:
> How do I write better instructions for the model?
→ Output format, role, examples, goals.
Context Engineering:
> What information does it need to do its job well?
→ RAG, documentation, history, memory.
Harness Engineering:
> How do I design the environment the agent works in?
→ Tools, subagents, workflows, validations, permissions.
The interesting part is that many things we used to attribute to the model turn out to be problems of context or system design.
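To make the harness idea concrete, here is a minimal sketch of an environment where the agent can only act through registered tools, with a permission check and argument validation in front of every call. The names `Tool`, `Harness`, `read_file`, and `delete_file` are hypothetical and chosen only for illustration; a real harness would add logging, sandboxing, and much more.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class Tool:
    name: str
    run: Callable[[dict], str]
    required_args: Set[str]


class Harness:
    """Hypothetical harness: the agent acts only through registered tools."""

    def __init__(self, allowed: Set[str]):
        self.allowed = allowed          # permission boundary
        self.tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def call(self, name: str, args: dict) -> str:
        # Reject tool names the model made up.
        if name not in self.tools:
            return f"error: unknown tool {name!r}"
        # Permission check happens in the harness, not in the prompt.
        if name not in self.allowed:
            return f"error: tool {name!r} is not permitted"
        # Validate arguments before running anything.
        missing = self.tools[name].required_args - set(args)
        if missing:
            return f"error: missing arguments {sorted(missing)}"
        return self.tools[name].run(args)


harness = Harness(allowed={"read_file"})
harness.register(Tool("read_file", lambda a: f"contents of {a['path']}", {"path"}))
harness.register(Tool("delete_file", lambda a: "deleted", {"path"}))

print(harness.call("read_file", {"path": "notes.txt"}))
print(harness.call("delete_file", {"path": "notes.txt"}))
```

Errors are returned as strings rather than raised so that, in this sketch, they could be fed back to the model as tool results.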
New Anthropic Science Blog: Making Claude a chemist.
To manipulate a molecule, chemists first need to understand its structure. Their main tool is NMR spectroscopy.
We found Opus 4.7 matches—and on some tasks beats—dedicated NMR software. Read more: https://t.co/1jUvz7wdhV
@celulacecedista @ATinyGreenCell Turns out designed proteins do contain many more Trp residues than natural proteins, and they're used precisely to stabilize such designs
One early finding: evolutionary links between gene-editing enzymes across completely different branches of life — connections nobody had made before. This is what becomes possible when you can question protein space at scale, not just search it.
Explore ESM Atlas: https://t.co/n6OWfcVFRG
Anthropic engineer:
"You can build 5 assistants in one afternoon. Each one handles a task you've been doing manually every single day."
this is one of the best workflows I've seen in a long time
in this video he breaks down exactly how most people are using Claude:
- the 14% you lose to CLAUDE.md before typing a word
- the plugins that 95% of users have never installed
- the workflows that run without you typing a single prompt
- why starting every chat from zero is the slowest way to use Claude
if you've been starting every Claude conversation from scratch like it's never met you before, you're missing at least 20 features. probably 24
instead of another show tonight, watch this
make sure to bookmark it before it gets lost in your feed
the guide is in the article below
The creator of Linux just publicly called out the AI hype. Word for word.
Linus Torvalds took the stage at Open Source Summit 2026 and said this:
"When I see people saying 99% of our code is written by AI, I literally get angry. Because those same people — I can pretty much guarantee — 100% of their code is written by compilers. But they never say that."
He is not anti-AI. The Linux kernel saw a 20% jump in submissions this release because of AI tools. He uses it. He gets it.
His point is something most people are too afraid to say.
AI is a productivity tool exactly like compilers were. Compilers boosted programming by 1000x. AI adds another 10x on top. Enormous. But nobody says "the compiler wrote my code." So why are we saying AI wrote it?
He also flagged something nobody is talking about.
AI is flooding small open source projects with drive-by bug reports. Someone runs a prompt, files a report and disappears when asked for a patch. Maintainers with one or two people are drowning trying to keep up.
"Sometimes AI reports a bug and when you ask for more information the person has done that drive-by and does not even answer your question. That is the real burnout issue."
And his final warning was the sharpest of all.
"People who do not understand the complexity of systems will prompt systems and write processes that will fail."
The AI hype crowd is very loud right now.
Linus has been building real systems for 35 years. When he talks, engineers listen.
Full interview here:
https://t.co/LmXJtvKc4O
Read the actual Anthropic piece. It's wilder than the hype precisely because it's so measured.
As of last month, Claude writes more than 80% of the code Anthropic merges into its own codebase. Their engineers now ship roughly 8x the code per day they did in 2024.
On a fixed test they run every release (take some training code, make it run faster), the model went from ~3x a year ago to ~52x. A skilled human gets about 4x in a few hours.
And when a researcher hits a wrong turn mid-investigation, the newest model now picks the better next step 64% of the time.
The honest part the hype skips is that Anthropic says we are NOT at self-improvement yet. Claude still can't decide which problems are worth solving. That research taste is the last thing keeping a human in the loop.
But they openly map the moment it closes, when agents can build and train models themselves and Claude starts improving Claude.
And the loudest signal in the whole thing sits at the end: the company building this is the one asking for a working pause button before the loop closes.
As an AI engineer, please learn:
Harness engineering, not just prompt engineering
Context engineering, not just long prompts
Prompt caching vs. semantic caching tradeoffs
KV cache management, eviction, reuse, and memory pressure at scale
Prefill vs. decode latency and why they optimize differently
Continuous batching, paged attention, and throughput optimization
Speculative decoding vs. quantization vs. distillation tradeoffs
INT8, INT4, FP8, AWQ, GPTQ, and when quantization hurts quality
Structured output failures, schema validation, repair loops, and fallback chains
Function calling reliability, tool contracts, argument validation, and idempotency
Agent guardrails, loop budgets, tool budgets, and termination conditions
Model routing, graceful fallback logic, and degraded-mode UX
RAG architecture: chunking, embeddings, hybrid search, reranking, and freshness
Retrieval evals: recall, precision, grounding, attribution, and citation quality
Evals: golden sets, regression tests, adversarial tests, LLM-as-judge, and human evals
LLM observability as a first-class discipline: traces, spans, tokens, latency, errors, and drift
Cost attribution per feature, workflow, tenant, and user journey, not just per model
Safety engineering: prompt injection defense, data leakage prevention, and permission boundaries
Multi-tenant isolation, cache safety, and cross-user context contamination prevention
Fine-tuning vs. in-context learning vs. RAG vs. distillation and when each is the wrong tool
Latency, quality, cost, and reliability tradeoffs across the full inference stack
Production failure modes: hallucinated tool calls, malformed JSON, stale retrieval, runaway agents, and silent eval regressions
Shipping LLM systems as reliable infrastructure, not demos wrapped around prompts
https://t.co/OhK9MK04ld
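As one illustration of the list above, here is a minimal sketch of the "schema validation, repair loops, and fallback chains" item combined with a fixed attempt budget. Everything here is an assumption made for the example: the two-key schema, the `call_model` callable (a stand-in for whatever client you use), and the stubbed flaky model that replaces a real API call.

```python
import json
from typing import Callable, Optional

REQUIRED_KEYS = {"title", "priority"}  # hypothetical schema for illustration


def validate(raw: str) -> Optional[dict]:
    """Return the parsed object if it is a JSON object with the required keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    return data


def generate_structured(
    call_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
    fallback: Optional[dict] = None,
) -> dict:
    """Ask for JSON, re-prompt with the bad output on failure, stop at a budget."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        data = validate(raw)
        if data is not None:
            return data
        # Repair step: show the model its invalid reply and restate the contract.
        current_prompt = (
            f"{prompt}\n\nYour previous reply was invalid:\n{raw}\n"
            f"Reply with a JSON object containing the keys {sorted(REQUIRED_KEYS)}."
        )
    # Budget exhausted: degrade gracefully instead of looping forever.
    return fallback if fallback is not None else {"title": "", "priority": "unknown"}


def make_flaky_model() -> Callable[[str], str]:
    """Stub model: chatty text, then a missing key, then a valid reply."""
    replies = iter([
        "Sure! Here is the JSON:",
        '{"title": "Fix login"}',
        '{"title": "Fix login", "priority": "high"}',
    ])
    return lambda prompt: next(replies)


result = generate_structured(make_flaky_model(), "Summarize the ticket as JSON.")
print(result)
```

The attempt budget is the termination condition: with `max_attempts=2` the same stub never produces a valid reply in time, and the caller gets the fallback object instead of an exception or an endless loop.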
A simple fix for this is:
1) Write everything in AGENTS.md
2) In CLAUDE.md, add something like: “Read and follow the instructions in AGENTS.md”.
That way you have a single place to maintain the instructions and you avoid duplicating them.
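If you want to script the two steps above, a small sketch could look like this. The function name `link_claude_to_agents` and the exact English wording of the pointer line are assumptions for illustration; the only idea taken from the post is that CLAUDE.md holds a single line sending the agent to AGENTS.md.

```python
from pathlib import Path

# Hypothetical wording; any sentence that points at AGENTS.md would do.
POINTER = "Read and follow the instructions in AGENTS.md.\n"


def link_claude_to_agents(repo_root: Path) -> bool:
    """Create CLAUDE.md as a one-line pointer to AGENTS.md.

    Returns True if the file was written, False if CLAUDE.md already exists
    (an existing file is left untouched so nothing is overwritten).
    """
    agents = repo_root / "AGENTS.md"
    claude = repo_root / "CLAUDE.md"
    if not agents.exists():
        raise FileNotFoundError(f"{agents} not found; write your instructions there first")
    if claude.exists():
        return False
    claude.write_text(POINTER, encoding="utf-8")
    return True
```

Calling `link_claude_to_agents(Path("."))` from a repository root would create the pointer file once and do nothing on later runs.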