As an AI Engineer. Please learn:
Harness engineering, not just prompt engineering
Context engineering, not just long prompts
Prompt caching vs. semantic caching tradeoffs
KV cache management, eviction, reuse, and memory pressure at scale
Prefill vs. decode latency and why they optimize differently
Continuous batching, paged attention, and throughput optimization
Speculative decoding vs. quantization vs. distillation tradeoffs
INT8, INT4, FP8, AWQ, GPTQ, and when quantization hurts quality
Structured output failures, schema validation, repair loops, and fallback chains
Function calling reliability, tool contracts, argument validation, and idempotency
Agent guardrails, loop budgets, tool budgets, and termination conditions
Model routing, graceful fallback logic, and degraded-mode UX
RAG architecture: chunking, embeddings, hybrid search, reranking, and freshness
Retrieval evals: recall, precision, grounding, attribution, and citation quality
Evals: golden sets, regression tests, adversarial tests, LLM-as-judge, and human evals
LLM observability as a first-class discipline: traces, spans, tokens, latency, errors, and drift
Cost attribution per feature, workflow, tenant, and user journey not just per model
Safety engineering: prompt injection defense, data leakage prevention, and permission boundaries
Multi-tenant isolation, cache safety, and cross-user context contamination prevention
Fine-tuning vs. in-context learning vs. RAG vs. distillation and when each is the wrong tool
Latency, quality, cost, and reliability tradeoffs across the full inference stack
Production failure modes: hallucinated tool calls, malformed JSON, stale retrieval, runaway agents, and silent eval regressions
Shipping LLM systems as reliable infrastructure, not demos wrapped around prompts
https://t.co/OhK9MK04ld
.@ether_fi is a pioneer & one of the biggest neobanks in the world w/ $6B+ in deposits
@plumenetwork is licensed, regulated, compliant, and purpose built infrastructure for open finance
Neobanks are one of the most interesting & fastest growing markets in the world and we are incredibly excited to work alongside pioneers in this market to bring Tier 1 high quality institutional yields to users
This RWA vault makes it easy for the global https://t.co/gRaEbF4y8g community to access high quality institutional yields from multiple trillion dollar assets managers in a single click
This is the future of capital markets. This is open finance. Open, composable, and compliant.
Grateful to build in one of the biggest markets in the world where individual market segment are all hundreds of trillions of dollars. This is just the start and much more to come!!
How OpenAI Built Its Data Agent
Most teams building data agents stack routers, fine-tunes, and complex retrieval pipelines on top of multiple LLMs. OpenAI didn't.
Their data agent runs on a single model and only 13 tools, across 1.5 exabytes and 90,000 tables. It's "pretty vanilla" by design.
We spoke with Emma Tang, Head of Data Platform Engineering at OpenAI, to better understand the architecture and the engineering decisions behind it.
The article covers:
- The architecture behind the data agent
- The six layers of context that make a single LLM reliable across 90,000 tables
- How OpenAI Uses Codex Internally: 3 Use Cases
- Five practical lessons for any team building a domain agent
- Where OpenAI's data platform is headed next
Today a crazy quantum story just got wilder.
On March 31, the Google Quantum AI team published a landmark result on Shor's algorithm for elliptic curve cryptography. Technically, the paper was a bombshell: a dramatic 10x improvement over the state-of-the-art. As a stunt and wakeup call to the blockchain space, those optimisations were illustrated on secp256k1, the elliptic curve underlying Bitcoin and Ethereum signatures.
But perhaps the most striking part of the paper was sociological, not technical. Instead of following standard academic process, the optimisations were kept secret, hidden behind a zero-knowledge (ZK) proof. Google's accompanying blog post mentions they "engaged with the U.S. government". The ZK proof demonstrates the existence of algorithmic improvements without leaking details. Academic censorship with ZK, a historic first!
As a co-author of the Google paper I witnessed some of the context surrounding this censorship. To be honest, multiple aspects of that context don't sit well with me. As much as I believe the general public ought to know more, I am limited in my ability to whistleblow. Though let me be clear about one thing: the Google team's professionalism has been absolutely exemplary, and they deserve nothing but praise.
Censorship has a way of backfiring. The Streisand effect, where an attempt to bury something only draws more attention to it, is exactly what's unfolding today. First, Google's key optimisation has been rediscovered by the French. And in a thrilling turn of events, a collaborative Shor-at-home challenge just launched. The initiative, available at ecdsa[.]fail, breached a new Shor world record in a matter of hours.
Let's start with the rediscovery. Just two months after Google's paper, French quantum expert André Schrottenloher cracks the main secret optimisation. His paper, titled "Optimized Point Addition Circuits for Elliptic Curve Discrete Logarithms", landed on the arXiv today. Big congrats to André, who beat several other nerdsnipped experts to it. In a blog post also published today, Craig Gidney, the world expert on Shor optimisations, revealed that he'd been sitting on this very optimisation for a whole year under censorship pressure.
Interestingly, André missed a handful of minor optimisations, both from Google's original publication and from improvements found since. It's plausible there's still plenty of juice left to squeeze out of Shor, and this is exactly what the ecdsa[.]fail challenge is about. The verifier program developed for the ZK proof does double duty, automatically filtering for valid submissions. Dozens of compounding small and micro improvements are rolling in. As of the time of writing there's an 8.4% improvement to Google's circuit, as measured by the product of logical qubit count and Toffoli gate count. Nice!
The nerdsnipping ran deeper than anyone expected. Over the last few weeks it became clear it extended well beyond André and other quantum experts. Behind the scenes, a small army of amateurs quietly got to work. Inspired by Karpathy-style autoresearch, they turned AI on Shor. Ironically, the verifier program for the ZK proof makes an ideal reward function for AIs. The barrier to entry for this modern style of research is refreshingly low, with several non-experts, even a teenager, finding nice optimisations. Get in touch if you'd like to join a Telegram group with fellow autoresearchers :)
Part 2: neutral atoms and qday
The story doesn't end with Google. On the same day Google went public, a stealthy startup called Oratomic published its own Shor paper in a coordinated release. It made a splash, ultimately becoming the most upvoted paper on scirate[.]com, a website ranking arXiv papers.
Oratomic's claim was wild. By building on Google's logical optimisations and applying custom physical optimisations for neutral atoms, they claimed just 10K physical qubits were sufficient to run Shor's algorithm on secp256k1. That number is mind-bogglingly low.
Knowing essentially nothing about neutral atoms when Oratomic's paper landed, I was intrigued and decided to learn more about the tech. I fell straight down the rabbit hole and spent a couple hundred hours on the topic. I got a little obsessed and watched every YouTube video I could find and spoke to a bunch of experts.
My conclusion? The tech is real, very real. Even Google recently decided to start a neutral atom lab, a notable pivot from their sole focus on superconducting qubits. If you care about qday, i.e. the day a quantum computer will break the first piece of cryptography in production, neutral atoms demand your attention. I shared some of my learnings on Shor and neutral atoms in a 30min talk at the ZKProof cryptography conference. You can find it on YouTube by searching "zkproof neutral atom".
Here's an interesting observation about this duo of breakthrough papers: neither Google nor Oratomic say a word about what their results mean for qday. No timelines. Zero. Nada. That is especially baffling given that the whole point of whitehat quantum cryptanalysis is to inform qday estimations and help the general public make good decisions.
So let me attempt to partially fill the silence, similarly to what Scott Aaronson did in his April 29 post. Given everything I know, including scary non-public information, I now put the odds of qday by 2032 at 50%. 10% by 2030.
Anecdotally, the US government has its own date: 2035. Originating at the NSA and later adopted by NIST, it's when branches of the US government will be disallowed from using quantum-vulnerable cryptography. In plain language: with hindsight, that date is a joke and should be discounted entirely. I don't see how NIST avoids being forced to pull it forward by years.
Part 3: post-quantum cryptography
There are good reasons to sound the alarm today, but please do not panic. Rushing carelessly towards immature post-quantum cryptography is a recipe for disaster. IMO a good target date for migration is 2029, roughly 3.5 years out. 2029 happens to be the date selected by Google, Cloudflare, and the Ethereum Foundation.
These days most of my time goes to safely migrating Ethereum towards post-quantum cryptography as part of the broader lean Ethereum effort. There's a lot to do. We need to rip out and replace BLS signatures at the consensus layer, KZG commitments at the data layer, and ECDSA signatures at the execution layer.
The plan to get there is compelling, and is based on hash-based cryptography. Within the Ethereum Foundation we've developed a Swiss army knife called leanVM (github[.]com/leanEthereum/leanVM) powered by the magic of hash-based SNARKs. Thanks to truly exceptional work by Emile, Thomas, and others, its performance is derisked. Regarding security, leanVM is a jewel, a minimal zkVM crafted for end-to-end formal verification and maximum security.
Want to help? There are two $1M initiatives. First, the Proximity Prize (proximityprize[.]org). Solve a long-standing mathematical conjecture in coding theory, improve hash-based SNARKs, and go home a millionaire. Second, the Poseidon Initiative (poseidon-initiative[.]info), offers $1M for breaking Poseidon, the SNARK-friendly hash function.
Anthropic pays $750,000+ a year for engineers who can build LLM architectures from scratch. Stanford taught the entire thing in 1 hour lecture & released it for free.
Bookmark & watch this today before someone takes it down and read this article below
🔗 GitHub: https://t.co/gmMc7QE3Fs
---
✉️ If you’re into AI, ML, agents, and building real systems, join my newsletter (it’s free): https://t.co/zJ9uwd6qSd
A must-read survey to refresh math and gen AI basics → The Little Book of Generative AI Foundations: An Intuitive Mathematical Primer
It shows a clear walkthrough of how gen AI learns to understand, model, and create complex data, covering:
- Latent algebra foundations: PCA, SVD, autoencoders
- Latent models: PPCA and VAEs
- VAEs: ELBO, inference, reparameterization
- Diffusion: the way from noise → denoising
- Score-based and continuous-time generative modelling
- Density models: flows, autoregression
- GANs and energy-based models beyond likelihoods
Andrej Karpathy just explained the future of software engineering without directly saying it.
The best AI engineers are no longer “prompting.”
They’re building systems around the agents.
Karpathy’s biggest insight wasn’t:
“Claude can code.”
It was:
LLMs become dramatically better when you force them into disciplined workflows.
That’s why "CLAUDE.md" files are suddenly everywhere.
Not because they’re prompts.
Because they behave like an operating system for the agent.
Karpathy called out the exact problems with AI coding:
- models assume instead of asking
- they overengineer simple tasks
- they hide confusion
- they rewrite unrelated code
- they optimize for completion, not correctness
So developers started encoding rules directly into the workflow:
→ Think before coding
→ Simplicity first
→ Surgical edits only
→ Goal-driven execution
And the results are wild.
People are now running multiple Claude Code agents in parallel like engineering teams:
• one agent researching
• one debugging
• one writing tests
• one optimizing code
• one validating outputs
Not “AI assistance.”
Actual orchestration.
And this part from Karpathy changes everything:
“Don’t tell the model what to do. Give it success criteria and let it loop.”
That is the shift.
From:
“write this function”
To:
“here’s the goal, constraints, tests, and verification system — now iterate until correct.”
The craziest part?
This already feels like a phase shift in engineering.
A lot of developers quietly went from:
80% manual coding → to 80% agent-driven coding in just months.
Not because AI became perfect.
Because the leverage became impossible to ignore.
We’re entering an era where the highest leverage engineers won’t necessarily be the best coders.
They’ll be the people who build the best systems around AI agents.
Mejores cuentas de cada lab de IA para mantenerte siempre informado:
Anthropic
→ @karpathy → cuenta imprescindible en IA, acaba de unirse a Anthropic
→ @bcherny → creador de Claude Code, siempre comparte tips muy útiles
→ @trq212 → también desarrollador de Claude Code, escribe artículos increíbles sobre CC
OpenAI
→ @polynoamial → trabaja en investigación de razonamiento, comparte muchos detalles técnicos
→ @gabriel1 → desarrollador de Sora, trayectoria profesional muy interesante
→ @jxnlco → enfocado en experiencia de desarrollador, comparte mucho sobre Codex
Google AI
→ @OfficialLoganK → todas las actualizaciones importantes de Gemini y AI Studio
→ @ammaar → producto y diseño, comparte cosas geniales sobre vibe-coding en AI Studio
→ @fofrAI → casos de uso muy creativos con modelos generativos
Cursor
→ @leerob → la voz más activa detrás de las novedades de Cursor
→ @ericzakariasson → comparte muy buenos insights sobre cómo usar Cursor
→ @mntruell → CEO de Cursor, publica los lanzamientos y datos de uso más importantes
xAI
→ @milichab → se unió hace poco a xAI, comparte novedades sobre Grok
→ @skcd42 → también cubre los lanzamientos grandes de Grok
→ @elonmusk → Elon hace muy buen trabajo reposteando y dando visibilidad a todos los productos de xAI
Me he dejado a alguien?