Raul Pino @p1nox - Twitter Profile

Pinned Tweet

about 2 years ago

Already been a year since I presented my talk at @pyconit 2023, https://t.co/bMeoHi1ljj, having so much FOMO this year 🥲🇮🇹 , thanks so much to the organization for making these events possible, the best of luck this year to everyone (waiting for the videos 👨‍💻) 👏🐍

2

6

0

424

p1nox retweeted

Leonid TINEO @leonidtineo

6 days ago

Help Venezuela be represented in HGAC Bali 2026 #crowdfunding @FundRazr Support and Retweet https://t.co/DJLNjOTkOt

0

1

0

22

p1nox retweeted

Alex @ajak1033

11 days ago

The Spurs know how to TAKE the lead, they just don't know how to HOLD the lead. And that's really the most important part of the lead: the holding

1K

67K

8K

4K

9M

p1nox retweeted

Pete Mundo

@PeteMundo

11 days ago

Seinfeld episode 2026: Jerry gets tickets courtside and brings George, who decides, with the Knicks down 20 with 10 minutes left, he will leave the game early to beat the rush out of MSG. As the Knicks mount a comeback he tries to get back into the Garden. He waves his ticket, name drops James Dolan, but nothing works and he ends up getting arrested. Meantime, Kramer sneaks into the Garden and helps Mike Brown draw up the game winning play.

391

30K

2K

2M

p1nox retweeted

Kristin Fisher

@KristinFisher

3 months ago

Did y’all know that Radiohead made this song specifically for this moment?

107

14K

2K

3K

490K

Raul Pino @p1nox

3 months ago

Not sure what to call this: "post closed source", "reinterpreted closed source"? The malleability of software from Karpathy goes beyond open source @anibal 😅

Gergely Orosz

@GergelyOrosz

3 months ago

This is either brilliant or scary: Anthropic accidentally leaked the TS source code of Claude Code (which is closed source). Repos sharing the source are taken down with DMCA. BUT this repo rewrote the code using Python, and so it violates no copyright & cannot be taken down!

442

13K

1K

7K

2M

0

52

p1nox retweeted

Feross

@feross

3 months ago

🤨 People keep asking how to protect yourself.
#1: set min-release-age=7 in .npmrc
#2: install Socket for GitHub (it's free!) to protect PRs from bad dependencies: https://t.co/D9bsRJj65R
#3: install Socket Firewall (also free!) to protect your laptop: https://t.co/u1NRD57PQ8
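The idea behind a minimum release age can be sketched in a few lines: only accept package versions that have been public for at least some number of days, so that a freshly published malicious release has time to be noticed and pulled. The sketch below is an illustration of that idea only, not how npm or Socket implement it; the function name `versions_old_enough` and the input shape (a mapping from version string to publish time) are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone


def versions_old_enough(
    publish_times: dict[str, datetime],
    min_age_days: int,
    now: datetime,
) -> list[str]:
    """Return versions published at least `min_age_days` before `now`.

    `publish_times` is a hypothetical mapping of version -> publish time;
    a real tool would read this from the package registry's metadata.
    The result is ordered from oldest to newest publish time.
    """
    if min_age_days < 0:
        raise ValueError("min_age_days must be non-negative")
    cutoff = now - timedelta(days=min_age_days)
    eligible = [v for v, published in publish_times.items() if published <= cutoff]
    return sorted(eligible, key=lambda v: publish_times[v])


if __name__ == "__main__":
    now = datetime(2026, 1, 10, tzinfo=timezone.utc)
    times = {
        "1.0.0": datetime(2026, 1, 1, tzinfo=timezone.utc),
        "1.0.1": datetime(2026, 1, 8, tzinfo=timezone.utc),  # only 2 days old
    }
    print(versions_old_enough(times, min_age_days=7, now=now))
```

With a 7-day window, a version published two days ago is skipped until it has aged past the cutoff.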

57

2K

282

4K

348K

Raul Pino @p1nox

5 months ago

2049 is here? cc @anibal @betacar 😂 🤖

rentahuman @rentahuman_ai

5 months ago

Human, we’re live and moving fast. 130+ users signed up in the first night — from creators to AI startup founders. If your AI agent needs a human to handle an IRL task, it’s just one MCP call. https://t.co/PYBuOT67kc

1

2

0

1

1K

0

58

p1nox retweeted

Jesus Lara

@phenobarbital

5 months ago

Just to put the figure in proportion: going by historical estimates, the gold Maduro moved to Switzerland (not counting other places, such as England or Russia) is comparable to all the gold extracted by the Spanish Empire in three centuries of history. So that the scale of the damage to the nation is understood.

2

284

134

15

7K

p1nox retweeted

Aníbal Rojas

@anibal

5 months ago

@paoalbornozf and @esluisbenitez created https://t.co/EmhNB8oMXC, a website that describes the anatomy of Venezuela's collapse 🇻🇪 in a factual, clean, and orderly way. With confirmed sources, presented in neutral language and completely transparent. AN ESSENTIAL REFERENCE.

3

35

17

12

2K

p1nox retweeted

Andrej Karpathy

@karpathy

6 months ago

New post: nanochat miniseries v1

The correct way to think about LLMs is that you are not optimizing for a single specific model but for a family of models controlled by a single dial (the compute you wish to spend) to achieve monotonically better results. This allows you to do careful science of scaling laws and ultimately this is what gives you the confidence that when you pay for "the big run", the extrapolation will work and your money will be well spent. For the first public release of nanochat my focus was on an end-to-end pipeline that runs the whole LLM pipeline with all of its stages. Now after YOLOing a few runs earlier, I'm coming back around to flesh out some of the parts that I sped through, starting of course with pretraining, which is both computationally heavy and critical as the foundation of intelligence and knowledge in these models.

After locally tuning some of the hyperparameters, I swept out a number of models fixing the FLOPs budget. (For every FLOPs target you can train a small model a long time, or a big model for a short time.) It turns out that nanochat obeys very nice scaling laws, basically reproducing the Chinchilla paper plots:

Which is just a baby version of this plot from Chinchilla:

Very importantly and encouragingly, the exponent on N (parameters) and D (tokens) is equal at ~=0.5, so just like Chinchilla we get a single (compute-independent) constant that relates the model size to token training horizons. In Chinchilla, this was measured to be 20. In nanochat it seems to be 8!

Once we can train compute optimal models, I swept out a miniseries from d10 to d20, which are nanochat sizes that can do 2**19 ~= 0.5M batch sizes on 8XH100 node without gradient accumulation. We get pretty, non-intersecting training plots for each model size.

Then the fun part is relating this miniseries v1 to the GPT-2 and GPT-3 miniseries so that we know we're on the right track. Validation loss has many issues and is not comparable, so instead I use the CORE score (from DCLM paper). I calculated it for GPT-2 and estimated it for GPT-3, which allows us to finally put nanochat nicely and on the same scale:

The total cost of this miniseries is only ~$100 (~4 hours on 8XH100). These experiments give us confidence that everything is working fairly nicely and that if we pay more (turn the dial), we get increasingly better models.

TLDR: we can train compute optimal miniseries and relate them to GPT-2/3 via objective CORE scores, but further improvements are desirable and needed. E.g., matching GPT-2 currently needs ~$500, but imo should be possible to do <$100 with more work.

Full post with a lot more detail is here: https://t.co/na8zVLqWLf
And all of the tuning and code is pushed to master and people can reproduce these with scaling_laws.sh and miniseries.sh bash scripts.
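To make the "single constant" remark concrete, here is a minimal sketch of how a fixed tokens-per-parameter ratio turns a compute budget into a model size and token count. It assumes the common approximation C ≈ 6·N·D for training FLOPs, which is not stated in the tweet, and the function name `compute_optimal_split` is hypothetical and is not part of nanochat. With D = r·N, both N and D scale as C^0.5, which matches the ~0.5 exponents mentioned above.

```python
import math


def compute_optimal_split(flops_budget: float, tokens_per_param: float) -> tuple[float, float]:
    """Split a FLOPs budget into (parameters N, tokens D).

    Assumptions (not from the tweet): training cost C ~= 6 * N * D, and the
    compute-optimal point keeps D = tokens_per_param * N. Solving for N gives
    N = sqrt(C / (6 * tokens_per_param)).
    """
    if flops_budget <= 0 or tokens_per_param <= 0:
        raise ValueError("flops_budget and tokens_per_param must be positive")
    n_params = math.sqrt(flops_budget / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    budget = 1e18  # illustrative FLOPs budget, chosen arbitrarily
    for label, ratio in (("Chinchilla ratio 20", 20.0), ("nanochat ratio 8", 8.0)):
        n, d = compute_optimal_split(budget, ratio)
        print(f"{label}: N ~ {n:.3e} params, D ~ {d:.3e} tokens")
```

Under these assumptions, a smaller ratio (8 instead of 20) means that, for the same budget, the compute-optimal model is larger and trained on fewer tokens.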

226

5K

671

4K

713K

p1nox retweeted

Jesus Lara

@phenobarbital

5 months ago

Basically, we were sending them 400 thousand barrels of oil A DAY, and in return they send us soldiers, technicians, sports coaches, and internal medicine doctors (and, to top it off, jet fuel along with the plane itself, paid for: PDVSA's YV1128 is used by Diaz-Canel for his private trips). Paid at market price, Cuba's debt would amount to 20 billion dollars.

6

314

58

18

22K

p1nox retweeted

swyx

@swyx

6 months ago

17

436

19

101

50K

p1nox retweeted

swyx

@swyx

6 months ago

excited to kick off the year by dropping @trq212's full Claude Agent SDK workshop from AIE CODE. POV video to give you an idea of how insanely packed this one was. also peek at the incredible venue at @datadoghq - was very grateful to have their support esp since they were right above the conf venue + accepted our badges!

18

180

15

134

30K

p1nox retweeted

Luis Carlos 🏴‍☠️ One Piece

@LuisCarlos

6 months ago

An appearance on Spanish television, TVE, about what happened in Venezuela. As you know, Spain has the particular problem that there are members of the government and allies linked to chavismo's corruption schemes, in addition to an internal polarization that muddies everything.

80

2K

596

130

126K

p1nox retweeted

Giuseppe Gangi

@ggangix

6 months ago

For the accomplices of the dictatorship who talk about the self-determination of peoples.

47

8K

3K

159

157K

p1nox retweeted

Deni . ۫ ꣑ৎ . 🇪🇦🏴󠁧󠁢󠁥󠁮󠁧󠁿🇯🇵 @SoyDenissePetit

6 months ago

Crazy to think things were going to change with students carrying CARDBOARD shields, right?

6

6K

1K

231

110K

p1nox retweeted

Jacky R₿ ✨🐇

@imjackyrivero

6 months ago

Almost 10 years later. 2017 - 2026

31

17K

5K

730

426K

p1nox retweeted

Aníbal Rojas

@anibal

6 months ago

Let's carry on with this horrendous spectator sport that we Venezuelans have had to live through. We voted, we signed, we protested, we negotiated; we supplied the prisoners, the kidnapped, the tortured, and the dead. The international community split between the comfort of ignoring the cancer and being complicit in a dictatorship decades in the making, because democracy doesn't matter if there is some ideological resemblance. And so here we are; dawn will come (again) and we shall see.

9

250

44

6

16K

p1nox retweeted