Token costs are becoming one of the hottest topics for any enterprise I talk with right now. It’s very bullish for AI in general because it means these systems are being used at a scale that wasn’t contemplated before.
It also gives rise to another form of differentiation that will emerge for the applied AI layer, which is model routing.
As tokens take on a significant share of the cost of any given workflow, companies will inevitably want to ensure that their dollars go into the most efficient use of tokens for the particular job at hand.
Frontier intelligence will always be relevant at the high end of tasks, like coding, legal and financial analysis, healthcare, and more. And dollars spent here will only go up over time. But, equally, you can peel off individual tasks to lower-cost models (whether they’re from open-weights vendors or the major labs) and deliver a more efficient end outcome.
To do this effectively, companies in the applied AI layer need to understand the workflows in their domain better than anyone else, and be able to mix and match models to different jobs. If you’re doing document extraction, you need to know which models perform better or worse for any given document type. If you’re doing legal analysis, you want to know which models perform various types of tasks best. And so on.
This will become one of the bigger differentiation points over time. The companies with the best evals, the best ability to route workloads, and business models directly aligned with customers’ financial goals will be in a great position.
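A rough sketch of what that kind of routing could look like, assuming you already have per-task eval scores for each model. Everything here is hypothetical and for illustration only: the model names, prices, scores, and the `route` helper are made up, not taken from any vendor. The idea is simply to pick the cheapest model that clears a quality bar for the task, and fall back to the strongest model when none does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                       # hypothetical label, not a real model ID
    cost_per_million_tokens: float  # assumed blended price in USD
    eval_scores: dict[str, float]   # task type -> score from your own evals (0..1)


def route(task_type: str, min_score: float, models: list[ModelOption]) -> ModelOption:
    """Return the cheapest model whose eval score for task_type meets min_score.

    Falls back to the highest-scoring model when nothing clears the bar.
    """
    qualified = [m for m in models if m.eval_scores.get(task_type, 0.0) >= min_score]
    if qualified:
        return min(qualified, key=lambda m: m.cost_per_million_tokens)
    return max(models, key=lambda m: m.eval_scores.get(task_type, 0.0))


# Made-up catalog: prices and scores are placeholders, not measurements.
MODELS = [
    ModelOption("frontier-large", 15.0, {"invoice_extraction": 0.97, "contract_review": 0.93}),
    ModelOption("mid-tier", 3.0, {"invoice_extraction": 0.95, "contract_review": 0.81}),
    ModelOption("small-open-weights", 0.4, {"invoice_extraction": 0.88, "contract_review": 0.62}),
]

invoice_choice = route("invoice_extraction", 0.94, MODELS)
contract_choice = route("contract_review", 0.90, MODELS)
print(invoice_choice.name, contract_choice.name)
```

With these placeholder numbers, the extraction job goes to the cheaper mid-tier model while the contract review stays on the frontier model. In practice the hard part is the eval table itself, which is where domain knowledge comes in.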
A few days ago everyone echoed the lie about Microsoft cutting off access to Claude, when in reality it was cutting internal access to Claude Code so that its devs would use GitHub Copilot... today they launched MAI-Code-1
At the time I said on my blog that this strategy:
"- Avoids disintermediation: If engineers use Claude's native interface, Anthropic captures the usage patterns, the continuous feedback, the user experience (UX), and the workflow. In short, Anthropic becomes the platform and relegates the software vendor to a secondary role.
- Consolidates its own ecosystem: By integrating Anthropic's model into the GitHub Copilot harness, Microsoft accelerates the dogfooding of its own tool. They need scale and real internal friction to polish the corporate layer they later sell to the market."
Today Microsoft says:
"Coding models are most useful when they perform well in the same environment developers use every day. That is why we built MAI-Code-1-Flash with production workflows at the center, rather than optimizing only for benchmarks. The model was trained directly with GitHub Copilot harnesses used in production. This allows it to learn how to interact with surrounding tools and systems in agentic coding tasks, making it uniquely well suited to real-world Copilot workflows compared to other available models."
And on top of that: "It’s leaner, solving harder problems with up to 60% fewer tokens on SWE-Bench Verified, proving that higher accuracy and greater efficiency are no longer a trade-off."
Again, if you think of Anthropic or OpenAI as models and not platforms, you are looking at a tenth of the picture.
🎙️@rauchg WITH NAVAL: 15 POWERFUL QUOTES
“The engineer's job is turning into something else”
“I'm completely obsessed with this idea of software factories”
“I no longer evaluate whether someone completes a task. I evaluate whether they build the factory that multiplies results”
“People used to talk about 10x engineers. Now there are clearly 100x or 1000x engineers”
“The quality of re-prompting is extremely important”
“Models are no longer junior engineers. Now they're principal engineers”
“I respect models much more as intellectual peers”
“The human still completes the model. The question is when it will go the other way around”
“The web is becoming agent-native”
“The category of infrastructure software and reusable building blocks for agents is going to be extremely valuable”
“You don't want to spend a trillion tokens recreating something that already exists”
“Anyone who understands APIs, data flows, and inputs and outputs already has a huge advantage”
“A good engineering leader was already vibe coding with people. Now we do the same, but with agents”
“Models now come back with trade-offs and architectural decisions”
“What's really impressive is when the model tells you: ‘No, don't use that. There's a better architecture for this problem’”
"We're smart enough to invent #ArtificialIntelligence, dumb enough to need it, and so stupid that we can't figure out if we did the right thing" #JerrySeinfeld #FallonTonight
The CEO of Take-Two, the company behind GTA, just said something the entire AI industry doesn't want to hear.
And he said it without being anti-AI.
Strauss Zelnick's argument is precise. AI is built on datasets. Datasets are backward-looking. Creativity is forward-looking. A model trained on everything that already exists cannot, by definition, produce something genuinely unexpected. And all hits, by their very nature, are unexpected.
Asset creation and hit creation are not the same thing. AI is getting very good at the first one. The second one is what actually makes money, builds franchises, and changes culture. Nobody has shown AI can do that yet.
The derivative property problem is real. You can clone GTA with existing technology. You could do it before AI. It would take 3 years and look identical. It still wouldn't sell. Because it isn't GTA. It's a clone of GTA.
And consumers, despite what the industry occasionally pretends, can feel the difference between something genuinely new and something assembled from the residue of things that already worked.
Thousands of mobile games ship every year. 0 to 5 hits get made. The same studios make them every time. The technology to make more games has been commoditized for years. It didn't democratize hit creation. It just flooded the market with more forgettable product.
The Silicon Valley thesis that AI unlocks game creation for everyone is true in the same way that cheap cameras unlocked filmmaking for everyone. They did. And the same 5 studios still make the movies everyone watches.
What Zelnick is saying, without quite saying it, is that the thing AI cannot replicate is taste. The instinct for what hasn't been done yet. The cultural antenna that detects the gap in the market before the data can see it.
Data tells you what people wanted. Hits tell people what they want next.
Those are different jobs.
What do you think of FDEs, Don @alangosiker?
An official admission that 'human in the loop' will be around for a while? Brand polishing by the hyperscalers so they don't get cursed out so much?
Today we’re launching the OpenAI Deployment Company to help businesses build and deploy AI.
It's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production for business impact. https://t.co/GnyjGFaLLA