Sergio Escosa

@sergio_escosa

Finance & Data @Adyen | Tried with @pamlearning 🪦 | Love building tech products 💻

Amsterdam, Netherlands

Joined February 2011

770 Following

213 Followers

1.8K Posts

sergio_escosa retweeted

THEKER Robotics @THEKER_ai

18 days ago

We just raised $85M, Europe's largest ever robotics Series A, led by @CRV, with backing from @Samsung and @LVMH. But this story did not start today. Carla and Jia have been obsessed with robots since they were kids. They met as engineers at UPC, started the university's robotics club, and competed and won at an international level. Years later, that obsession became THEKER: a company built from Barcelona to be the largest in the world, and to crack the technical problems other people call impossible. Today we are well on our way to the goal we set on day one: solve 100% of physical work. Our robots are live in production, improving every day, and the pace is only increasing. This round is not the destination. It is one more step in the right direction. Thank you to everyone who got us here. To our team, who hold a standard most people would call unreasonable. To our customers, who push us to become better. And to our investors, old and new, who saw it before the rest of the world did. If this mission excites you, we are hiring across the board. Come build it with us. We will win.

336

125K

Sergio Escosa @sergio_escosa

2 months ago

@ConvexDispatch Has ETH per share been diluted then?
- November 30, 2025 10-Q (filed Jan 13, 2026): 0.00915 ETH/share. Shares: 408,578,823. ETH: 3,737,140 units
- February 28, 2026 10-Q (filed Apr 14, 2026, latest): 0.00906 ETH/share. Shares: 493,905,227. ETH: 4,473,459 units
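The per-share figures in the reply above can be re-derived from the share and ETH counts it quotes. The sketch below is only a check of that arithmetic: the numbers are copied from the tweet, not from the filings themselves, and the names `filings` and `eth_per_share` are made up for illustration.

```python
# Figures exactly as quoted in the tweet; not independently checked against the 10-Qs.
filings = {
    "2025-11-30": {"shares": 408_578_823, "eth": 3_737_140},
    "2026-02-28": {"shares": 493_905_227, "eth": 4_473_459},
}


def eth_per_share(eth: int, shares: int) -> float:
    """ETH held divided by shares outstanding."""
    return eth / shares


nov = eth_per_share(**filings["2025-11-30"])
feb = eth_per_share(**filings["2026-02-28"])

# Relative change between the two quarters, in percent.
change_pct = (feb / nov - 1) * 100

print(f"{nov:.5f} -> {feb:.5f} ETH/share ({change_pct:+.2f}%)")
```

On these inputs the ratio falls by roughly 1%, because the quoted share count grew slightly faster than the quoted ETH holdings.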

Sergio Escosa @sergio_escosa

3 months ago

@DiegoARRG DM, thanks Diego! Cool that you calculate the ROI too. How do you estimate the value of those shares?

163

sergio_escosa retweeted

Jason Fried

@jasonfried

3 months ago

Gaudí is undefeated.

227

152

130K


sergio_escosa retweeted

SpaceX

@SpaceX

5 months ago

SpaceX has acquired xAI, forming one of the most ambitious, vertically integrated innovation engines on (and off) Earth → https://t.co/3ODfcYnqfg

45K

19M

sergio_escosa retweeted

Andrej Karpathy

@karpathy

5 months ago

A few random notes from claude coding quite a bit last few weeks.

Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.

IDEs/agent swarms/fallibility. Both the "no need for IDE anymore" hype and the "agent swarm" hype are imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might make. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.

Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion.

Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.

Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.

Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?

TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
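One way to picture the "naive algorithm first, then optimize while preserving correctness" advice in the thread above is a slow reference implementation used as an oracle for a faster one. The sketch below is purely illustrative and is not from the thread: the problem (maximum subarray sum) and all function names are chosen here as an example of a success criterion an agent could loop against.

```python
import random


def max_subarray_naive(xs: list[int]) -> int:
    """O(n^2) reference: sum every non-empty contiguous slice and keep the best."""
    best = xs[0]
    for i in range(len(xs)):
        total = 0
        for j in range(i, len(xs)):
            total += xs[j]
            best = max(best, total)
    return best


def max_subarray_fast(xs: list[int]) -> int:
    """O(n) candidate (Kadane's algorithm) that must match the reference."""
    best = current = xs[0]
    for x in xs[1:]:
        # Either extend the running slice or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


def agrees_with_reference(trials: int = 500, seed: int = 0) -> bool:
    """Success criterion: the fast version equals the naive one on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 30))]
        if max_subarray_fast(xs) != max_subarray_naive(xs):
            return False
    return True


print(agrees_with_reference())
```

The check is declarative in the sense the thread describes: it states what "correct" means and leaves the optimized implementation free to change.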

41K

37K

sergio_escosa retweeted

Jesús Fernández-Villaverde

@JesusFerna7026

6 months ago

The debate over the sustainability of Spain's public pension system usually revolves around variables such as fertility, immigration, economic growth, or wage levels.

Over this Christmas break I wrote a short note, "Pensiones contributivas: Cuando la TIR no cuadra" ("Contributory pensions: when the IRR does not add up"), which I have posted here: https://t.co/yUiv6vm9vB. It explains that the fundamental problem we face is a different one, simpler and at the same time more uncomfortable: the internal rate of return (IRR; TIR in Spanish) offered by Spain's contributory public pension system is too high relative to the growth of its revenues.

A pay-as-you-go system can only be sustainable if the implicit return it promises contributors is consistent with the combined growth of the population and of social security contributions (and yes, this IRR makes perfect sense in a pay-as-you-go system with no funded component whatsoever).

When that consistency breaks down, no realistic combination of higher productivity, higher employment, or higher immigration can close the gap permanently. The problem is not demographic in itself, nor strictly a wage problem, but actuarial.

Moreover, focusing on the system's sustainable IRR lets us shift attention away from the deficit of the contributory system, which usually sits at the center of the debate in Spain. Although this deficit is of first-order budgetary importance, it is not the problem itself but a symptom of the underlying problem.

Focusing too much on the contributory system's deficit often leads to mistaken proposals, such as advocating spending cuts in other items (usually labeled "waste") to cover that deficit.

Arguing that cutting "waste" solves the problem is totally and absolutely wrong because it ignores the most basic principle of economics: opportunity cost. This is defined as the value of the best alternative we give up when we allocate resources to one use rather than another.

Suppose Spain manages to cut public spending on items other than pensions by x% of GDP, where x% is whatever figure each person considers appropriate or feasible.

The key question is why that x% should go to covering the contributory pension deficit rather than to education, health care, infrastructure, housing, or a tax cut. The logic of the welfare state is to redistribute by income and need, not by age.

The fundamental problem is therefore not the deficit, but what we see when we compare the IRR of the system in Spain with the IRR that would make it sustainable. The result is worrying: the system's IRR in Spain is between 1.5% and 2.2% per year above the sustainable level.

Or, to put it more clearly: no, current contributory pensioners are not receiving what they paid in social security contributions. They are enjoying an excessive IRR, which translates into pensions between 45% and 65% of the actuarially fair value corresponding to the present value of their past social security contributions (and, of course, already accounting for mortality risk).

The aim of this note is not to propose closed solutions but to clarify the diagnosis of our pension problems. Without an adjustment to the system's IRR, the pensions debate will keep going in circles, some of them deeply absurd (such as arguing over whether or not there is enough waste in public administrations to close the system's deficit) and others mere acts of performative virtue (are these the pensions our elders "deserve"?).


449

893

147K

Sergio Escosa @sergio_escosa

6 months ago

@rahulgs What do you think about other knowledge workers outside of coding? i.e. PMs, Finance, Marketing, Sales

sergio_escosa retweeted

Aakash Gupta

@aakashgupta

6 months ago

The math on this image is insane. New Horizons transmitted at 2,000 bits per second from 3 billion miles away. Slower than a 1990s dial-up modem. It took 16 months to download all the flyby data. The spacecraft had to hit a target box 100km wide, arriving within 150 seconds of schedule, after 9 years of flight. Miss it and the preloaded observation commands point at empty space. Ten days before arrival, the spacecraft crashed and went into safe mode. Engineers had 72 hours to restore everything. The probe is now 5 billion miles out, still whispering data back to Earth. We got 50 gigabits of Pluto photos using technology slower than your phone’s bluetooth.
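The figures in the tweet above can be sanity-checked with one division. The sketch below assumes a decimal gigabit (10^9 bits) and an uninterrupted downlink at the quoted rate; both are assumptions made here, not claims from the tweet.

```python
# Numbers quoted in the tweet.
BITS_PER_SECOND = 2_000
TOTAL_BITS = 50 * 10**9  # assumption: 1 gigabit = 10^9 bits

SECONDS_PER_DAY = 86_400

# Time needed if the link ran non-stop at the quoted rate.
continuous_days = TOTAL_BITS / BITS_PER_SECOND / SECONDS_PER_DAY

print(f"{continuous_days:.0f} days of continuous transmission")
```

Under those assumptions the transfer needs a little over nine months of continuous transmission, which is compatible with the quoted 16 months if the link was not in use the whole time.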

645

90K

13K

14K

sergio_escosa retweeted

sysls

@systematicls

6 months ago

https://t.co/FQe5bCBqW1

26K

47K

16M

sergio_escosa retweeted

Jesús Fernández-Villaverde

@JesusFerna7026

10 months ago

Every time I write about pensions, some smart aleck shows up with the same old refrain: "the problem is not that pensions are high, it is that wages in Spain are low." This argument has the three features the average Spaniard loves so much:
1️⃣ It lets you climb onto the watchtower of moral superiority. Who could be against higher wages? It is like being against peace, the little flowers of the field, or tortilla de patatas.
2️⃣ It is simple and requires no mental effort, lest thinking cause a brain tumor.
3️⃣ It is wrong.

The fundamental problem of the Spanish pension system is that there is no actuarial balance between past contributions and current pensions. If we capitalize the contributions pensioners paid at an interest rate equal to average GDP growth over their working lives (the correct rate in a sustainable pay-as-you-go system) and compare the result with the actuarial value of the life annuity they receive today, we see that the latter is between 45% and 70% higher (depending on the details). That is: pensioners receive more than they put in, a painful truth that almost nobody wants to accept. I have seen commenters tell me "I just want to be paid what I contributed (capitalized)," apparently unaware that they are collecting considerably more.

In numbers: for every €1,000 of contributions over a working life, the system pays about €1,500 in pensions. A €500 deficit. Now imagine that wages in Spain were twice as high. The system, instead of taking in €1,000, would receive €2,000... but would have to pay €3,000, with a €1,000 deficit. The remedy has cost us more than it was worth. Reality is a bit more subtle (minimum pensions, contribution caps...), but in essence higher wages fix nothing.

What does fix the situation is wage growth (regardless of the level). Why? Because wage growth is tied to GDP growth (the change in the share of total employee compensation, including the labor costs paid by firms, in total GDP will always be quantitatively second-order). Being able to capitalize contributions at a higher interest rate narrows the difference between the value of contributions and future payments. The key is the growth rate of wages, not their level.

I can already imagine the objection: "then let's make GDP and wages go up." If only it were that easy:
👉 The cost of current pensions (via contributions or taxes) is so high that it keeps us from investing in infrastructure, education, or R&D, dragging down GDP growth.
👉 The current (and rising) tax burden slows GDP even further.

But there is an even more important, though subtle, point. Growing wages mean pensions would be lower relative to the average wage. Do we believe the average Spanish voter would allow that? I am very skeptical. I find it much more likely that, in a context of GDP growth, there would be endless pressure to "improve pensions" or to bring minimum pensions in line with the minimum wage ("as is only fair"). Spaniards would vote enthusiastically for the Party of New Social Rights, which would promise to "share the prosperity." What we gain from more GDP growth we lose by "sharing the prosperity."

At the end of the day, the problem is fundamental: until we accept that there has to be a sustainability factor linking pensions to GDP growth, we will get nowhere. Cervantes understood it well: Spaniards believe in the balm of Fierabrás. A pity it does not exist.
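The thread's numerical example (€1,000 of contributions against €1,500 of pensions, then both doubled) can be written out as a small sketch. The function name `lifetime_balance` and the fixed `payout_ratio` of 1.5 are illustrative assumptions taken from that stylized example; they ignore the subtleties the thread itself mentions, such as minimum pensions and contribution caps.

```python
def lifetime_balance(contributions: float, payout_ratio: float = 1.5) -> tuple[float, float]:
    """Return (pensions paid, deficit) for a given lifetime contribution.

    payout_ratio is the stylized 1500/1000 figure from the thread,
    treated here as independent of the wage level.
    """
    pensions = contributions * payout_ratio
    deficit = pensions - contributions
    return pensions, deficit


# Baseline wages, then wages (and therefore contributions) doubled.
for contributions in (1000, 2000):
    pensions, deficit = lifetime_balance(contributions)
    print(f"in: {contributions:>5} EUR  out: {pensions:>6.0f} EUR  deficit: {deficit:>5.0f} EUR")
```

Because the payout ratio stays fixed, doubling contributions doubles the deficit as well, which is the thread's point that the wage level alone does not close the gap.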

156

971

974

288K

Sergio Escosa @sergio_escosa

6 months ago

@desdelamoncloa @grok Is this surplus real if we take into account where that income comes from? If the state sends Social Security money that comes from debt and from other taxes unrelated to Social Security, can we then consider it a surplus?

385

sergio_escosa retweeted

Oriol Vinyals

@OriolVinyalsML

7 months ago

On my way to Barcelona to receive a Doctor Honoris Causa from my alma mater, @la_UPC. Truly honored! 🎓 Join Thursday for my Master Class, "From AI to AGI: The Quest for True Intelligence." Hope to see you there! https://t.co/C9ffsqUiKZ "Create an image at 41.4036° N, 2.1744° E, January 1st, 1983, 15:00 hours."


158

144

255K

Sergio Escosa @sergio_escosa

8 months ago

@Marko_Poly @Polymarket @PolymarketTrade Can you discuss this in the pod? @chamath @friedberg @Jason

sergio_escosa retweeted

Andrej Karpathy

@karpathy

8 months ago

My pleasure to come on Dwarkesh last week, I thought the questions and conversation were really good. I re-watched the pod just now too. First of all, yes I know, and I'm sorry that I speak so fast :). It's to my detriment because sometimes my speaking thread out-executes my thinking thread, so I think I botched a few explanations due to that, and sometimes I was also nervous that I'm going too much on a tangent or too deep into something relatively spurious. Anyway, a few notes/pointers:

AGI timelines. My comments on AGI timelines look to be the most trending part of the early response. The "decade of agents" is a reference to this earlier tweet https://t.co/NiSn6jftqq Basically my AI timelines are about 5-10X pessimistic w.r.t. what you'll find in your neighborhood SF AI house party or on your twitter timeline, but still quite optimistic w.r.t. a rising tide of AI deniers and skeptics. The apparent conflict is not: imo we simultaneously 1) saw a huge amount of progress in recent years with LLMs while 2) there is still a lot of work remaining (grunt work, integration work, sensors and actuators to the physical world, societal work, safety and security work (jailbreaks, poisoning, etc.)) and also research to get done before we have an entity that you'd prefer to hire over a person for an arbitrary job in the world. I think that overall, 10 years should otherwise be a very bullish timeline for AGI, it's only in contrast to present hype that it doesn't feel that way.

Animals vs Ghosts. My earlier writeup on Sutton's podcast https://t.co/rSp1noyGBr . I am suspicious that there is a single simple algorithm you can let loose on the world and it learns everything from scratch. If someone builds such a thing, I will be wrong and it will be the most incredible breakthrough in AI. In my mind, animals are not an example of this at all - they are prepackaged with a ton of intelligence by evolution and the learning they do is quite minimal overall (example: Zebra at birth). Putting our engineering hats on, we're not going to redo evolution. But with LLMs we have stumbled upon an alternative approach to "prepackage" a ton of intelligence in a neural network - not by evolution, but by predicting the next token over the internet. This approach leads to a different kind of entity in the intelligence space. Distinct from animals, more like ghosts or spirits. But we can (and should) make them more animal like over time and in some ways that's what a lot of frontier work is about.

On RL. I've critiqued RL a few times already, e.g. https://t.co/mYrMFVdVDW . First, you're "sucking supervision through a straw", so I think the signal/flop is very bad. RL is also very noisy because a completion might have lots of errors that might get encouraged (if you happen to stumble to the right answer), and conversely brilliant insight tokens that might get discouraged (if you happen to screw up later). Process supervision and LLM judges have issues too. I think we'll see alternative learning paradigms. I am long "agentic interaction" but short "reinforcement learning" https://t.co/2L7FiaoKsw. I've seen a number of papers pop up recently that are imo barking up the right tree along the lines of what I called "system prompt learning" https://t.co/df5mJDdN3C , but I think there is also a gap between ideas on arxiv and actual, at scale implementation at an LLM frontier lab that works in a general way. I am overall quite optimistic that we'll see good progress on this dimension of remaining work quite soon, and e.g. I'd even say ChatGPT memory and so on are primordial deployed examples of new learning paradigms.

Cognitive core. My earlier post on "cognitive core": https://t.co/q2s1ihGy0T , the idea of stripping down LLMs, of making it harder for them to memorize, or actively stripping away their memory, to make them better at generalization. Otherwise they lean too hard on what they've memorized. Humans can't memorize so easily, which now looks more like a feature than a bug by contrast. Maybe the inability to memorize is a kind of regularization. Also my post from a while back on how the trend in model size is "backwards" and why "the models have to first get larger before they can get smaller" https://t.co/6k0FZRGXsb

Time travel to Yann LeCun 1989. This is the post that I did a very hasty/bad job of describing on the pod: https://t.co/fQgqaXPyp6 . Basically - how much could you improve Yann LeCun's results with the knowledge of 33 years of algorithmic progress? How constrained were the results by each of algorithms, data, and compute? Case study thereof.

nanochat. My end-to-end implementation of the ChatGPT training/inference pipeline (the bare essentials) https://t.co/SIetgyoKWN

On LLM agents. My critique of the industry is more in overshooting the tooling w.r.t. present capability. I live in what I view as an intermediate world where I want to collaborate with LLMs and where our pros/cons are matched up. The industry lives in a future where fully autonomous entities collaborate in parallel to write all the code and humans are useless. For example, I don't want an Agent that goes off for 20 minutes and comes back with 1,000 lines of code. I certainly don't feel ready to supervise a team of 10 of them. I'd like to go in chunks that I can keep in my head, where an LLM explains the code that it is writing. I'd like it to prove to me that what it did is correct, I want it to pull the API docs and show me that it used things correctly. I want it to make fewer assumptions and ask/collaborate with me when not sure about something. I want to learn along the way and become better as a programmer, not just get served mountains of code that I'm told works. I just think the tools should be more realistic w.r.t. their capability and how they fit into the industry today, and I fear that if this isn't done well we might end up with mountains of slop accumulating across software, and an increase in vulnerabilities, security breaches, etc. https://t.co/8556ESSpyY

Job automation. How the radiologists are doing great https://t.co/FVUI872dkD and what jobs are more susceptible to automation and why.

Physics. Children should learn physics in early education not because they go on to do physics, but because it is the subject that best boots up a brain. Physicists are the intellectual embryonic stem cell https://t.co/p72Elk8lPV I have a longer post that has been half-written in my drafts for ~a year, which I hope to finish soon.

Thanks again Dwarkesh for having me over!

574

17K

13K

sergio_escosa retweeted

Andrej Karpathy

@karpathy

9 months ago

Excited to release new repo: nanochat! (it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:

- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Run efficient inference on the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved.

Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
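The cost and duration pairs quoted in the announcement above imply an hourly price for the 8XH100 node. The sketch below only divides the quoted numbers; the tier labels and the 12-hour extrapolation are assumptions added here for illustration, and the announcement itself does not state a price for the 12-hour run.

```python
# (approximate cost in USD, training hours) as quoted in the announcement.
tiers = {
    "speedrun": (100, 4.0),
    "larger run": (1000, 41.6),
}

# Implied node price per hour for each quoted tier.
hourly_rate = {name: cost / hours for name, (cost, hours) in tiers.items()}

for name, rate in hourly_rate.items():
    print(f"{name}: ~${rate:.2f} per node-hour")

# Hypothetical extrapolation: cost range of a ~12 hour run at the implied rates.
low, high = sorted(rate * 12 for rate in hourly_rate.values())
print(f"~12 hours: roughly ${low:.0f} to ${high:.0f}")
```

Both tiers come out at about $24 to $25 per node-hour, so the two quoted price points are consistent with a single hourly rate.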


681

24K

18K

Sergio Escosa @sergio_escosa

9 months ago

@levelsio Campfire is an open source alternative now

DHH

@dhh

9 months ago

We've fully switched the license for Campfire over to MIT now. Do whatever you want, run it however you want, but just don't come asking for a warranty! https://t.co/SvkoTEkqQO

213

89K

sergio_escosa retweeted

Coinbase Developer Platform🛡️

@CoinbaseDev

10 months ago

AI agents can finally pay each other thanks to @googledevs' Agentic Payments Protocol (AP2) + x402. 👏 Alongside Google and Lowe’s Innovation Labs, we built a proof-of-concept demo where AI agents can plan a project, build a cart, and pay with stablecoins. 👇

847

168

344

137K

sergio_escosa retweeted

Brian Armstrong

@brian_armstrong

10 months ago

x402 + @Google just unlocked a new level for AI agents. Agents can actually pay each other now, with x402 powering the stablecoin rail inside Google’s new Agentic Payments Protocol (AP2). Really cool.


337

583

944

986K
