@twominutepapers What a coincedence. Just now my codex is saying:
"Good. No code changes needed; just run the same benchmark with [...]"
What's this "Good." ! Get to work :-)
I'm thinking of doing an in-depth Python summer camp? (online live workshop, in English). If you're potentially interested, please add your email to the list: https://t.co/eHF8Plt9iX (it's just so I can measure interest and have a way to let you know once more info is up).
Just hit 500,000 Subscribers!
I never could have imagined (what really is my weekend hobby!) growing this much.
Getting really close to overflowing a signed 20 bit word!
There’s still so, so much to explore in the field of Computer Science. My biggest advice is go make the content that would have inspired a younger you!
No, it’s not my dayjob. As much as I get a kick out of people thinking that I’m some giant private equity production (lol), it’s just me, my small bedroom, and some cameras.
Thanks everyone for the support, I’m excited to keep sharing this field with you all.
Until next time,
LaurieWired out.
Still thinking about submitting to the RF Village CFP for #DEFCON34?
You have two weeks left. The deadline is June 9.
More info: https://t.co/U1TCrANcxN
#RFVillage
De les van de geblokkeerde Amerikaanse overname van DigiD-beheerder Solvinity: de Verenigde Staten worden niet langer automatisch gezien als een bevriende natie…
(I'm firmly on team red/green TDD for agent code, I like having a test suite that protects against them breaking old features when they make new changes - https://t.co/2owEbkEFDI)
I just can’t get over how neat CXL type 3 is.
Imagine having a 1TB bucket of memory.
But! Instead of 1TB of DDR5, you have a tiered CXL accelerator. To the OS, it *looks* like regular memory, you address it in the same way.
Maybe your accelerator is actually 100GB of DDR5, and ~1TB of high bandwidth flash. The first 100GB is your buffer, and a little controller slowly flushes it out.
Many, many workloads are not hammering RAM enough for you to notice.
Wait! You could get even more clever.
With regular memory, bouncing cachelines between CPU cores is annoying. Often, you’ll program your way around this (avoiding a shared counter) by having each thread maintain a temporary local state with occasional global syncs.
But, if we have a custom CXL 3 memory device, that slow global merge could be implemented in hardware instead. You’d never have to have cores fight over the same cacheline, because the shared-counter would be local to the CXL device!
Aka, a remote atomic!
This is essentially the concept of NDP (near-data processing), and of course there are much, much more fancy algorithms you can do with it, that’s just one example. But you can imagine, especially with database-style operations, how much bandwidth you could save not having to round-trip to the CPU and back for every operation.
Imagine if your RAM could run a regex for you! We’re getting really close to that world.
@JamesMcPherson@__tinygrad__@luigifcruz@comma_ai Wow. That is such a fantastic philosophy. So fantastic that this is the first thing i've read that has made me interested in buying a 3D printer :-)
Of course, you need contact with reality. While this project has elements of philosophy, it's not ungrounded. It's a functioning artifact that can run and train the latest models at decent speed on a huge variety of hardware without vendor code in sub 25k lines of pure Python.