Israel is committing some of the worst atrocities of our age, live-streamed on a daily basis.
The politicians and media outlets who facilitated this obscenity must be held to account.
Israel does this because it believes it will be allowed to. It is correct.
Here is the most precise image of the brain in history, obtained with the CEA's MRI scanner, the most powerful in the world.
This is a major breakthrough and an immense source of hope for the study of our health. Congratulations to the Iseult project team.
French pride!
"The secret of happiness is: Find something more important than you are and dedicate your life to it."
Daniel Dennett (March 28, 1942 - April 19, 2024)
very saddened to hear of Dan Dennett's passing. he was a transformative influence on my thinking about a great many topics, and a kind and generous man. his presence will be missed dearly by a great many people.
A new experiment for English speakers with a personal computer and 15 minutes to spare: help us understand how shapes are encoded! Click here
https://t.co/oq06iOL1w1
# explaining llm.c in layman terms
Training Large Language Models (LLMs), like ChatGPT, involves a large amount of code and complexity.
For example, a typical LLM training project might use the PyTorch deep learning library. PyTorch is quite complex because it implements a very general Tensor abstraction (a way to arrange and manipulate arrays of numbers that hold the parameters and activations of the neural network), a very general Autograd engine for backpropagation (the algorithm that trains the neural network parameters), and a large collection of deep learning layers you may wish to use in your neural network. The PyTorch project is 3,327,184 lines of code in 11,449 files.
On top of that, PyTorch is written in Python, which is itself a very high-level language. You have to run the Python interpreter to translate your training code into low-level computer instructions. For example, the CPython project that does this translation is 2,437,955 lines of code across 4,306 files.
I am deleting all of this complexity and boiling the LLM training down to its bare essentials, speaking directly to the computer in a very low-level language (C), and with no other library dependencies. The only abstraction below this is the assembly code itself. I think people find it surprising that, by comparison to the above, training an LLM like GPT-2 is actually only ~1,000 lines of code in C in a single file. I am achieving this compression by implementing the neural network training algorithm for GPT-2 directly in C. This is difficult because you have to understand the training algorithm in detail, be able to derive the forward and backward passes of backpropagation for all the layers, and implement all the array indexing calculations very carefully because you don’t have the PyTorch tensor abstraction available. So it’s a very brittle thing to arrange, but once you do, and you verify the correctness by checking against PyTorch, you’re left with something very simple, small and imo quite beautiful.
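To give a feel for what "array indexing by hand" means, here is a minimal sketch of one layer's forward pass written against flat float arrays. It is an illustration only, not code taken from llm.c: the function name `linear_forward`, the argument order, and the row-major (B, T, C) layout are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper (not from llm.c): forward pass of a linear layer.
// Assumed shapes, all stored as flat row-major float buffers:
//   inp    is (B, T, C)   batch, time, input channels
//   weight is (OC, C)     one row of C weights per output channel
//   bias   is (OC)        may be NULL
//   out    is (B, T, OC)
void linear_forward(float *out, const float *inp, const float *weight,
                    const float *bias, int B, int T, int C, int OC) {
    assert(B > 0 && T > 0 && C > 0 && OC > 0);
    for (int b = 0; b < B; b++) {
        for (int t = 0; t < T; t++) {
            // Without a tensor type, the offset of position (b, t)
            // has to be computed manually.
            const float *x = inp + ((size_t)b * T + t) * C;
            float *y = out + ((size_t)b * T + t) * OC;
            for (int o = 0; o < OC; o++) {
                float acc = bias ? bias[o] : 0.0f;
                const float *w = weight + (size_t)o * C;
                for (int i = 0; i < C; i++) {
                    acc += x[i] * w[i];
                }
                y[o] = acc;
            }
        }
    }
}
```

In PyTorch the same computation is a single line on tensors. Here, a wrong stride in any one of the offset calculations silently reads the wrong numbers, which is why checking the outputs against a reference implementation matters.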
Okay so why don’t people do this all the time?
Number 1: you are giving up a large amount of flexibility. If you want to change your neural network around, in PyTorch you’d be changing maybe one line of code. In llm.c, the change would most likely touch a lot more code, may be a lot more difficult, and require more expertise. E.g. if it’s a new operation, you may have to do some calculus, and write both its forward pass and backward pass for backpropagation, and make sure it is mathematically correct.
Number 2: you are giving up speed, at least initially. There is no fully free lunch - you shouldn’t expect state of the art speed in just 1,000 lines. PyTorch does a lot of work in the background to make sure that the neural network is very efficient. Not only do all the Tensor operations very carefully call the most efficient CUDA kernels, but also there is for example torch.compile, which further analyzes and optimizes your neural network and how it could run on your computer most efficiently. Now, in principle, llm.c should be able to call all the same kernels and do it directly. But this requires some more work and attention, and just like in (1), if you change anything about your neural network or the computer you’re running on, you may have to call different kernels, with different parameters, and you may have to make more changes manually.
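To make the first cost concrete, here is a sketch of what adding a new operation by hand could look like. The operation (squared ReLU, y = max(0, x)^2), the function names, and the choice to accumulate gradients with `+=` are assumptions made for illustration, not something llm.c is claimed to contain. The derivative dy/dx = 2 * max(0, x) has to be worked out on paper, and a finite-difference estimate is one way to sanity-check it.

```c
#include <assert.h>

// Hypothetical new operation: y = max(0, x)^2, applied element-wise.
void relu2_forward(float *out, const float *inp, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        float x = inp[i] > 0.0f ? inp[i] : 0.0f;
        out[i] = x * x;
    }
}

// Backward pass derived by hand: dL/dx = dL/dy * 2 * max(0, x).
// Gradients are accumulated into dinp (an assumption of this sketch),
// so the caller is expected to zero dinp beforehand.
void relu2_backward(float *dinp, const float *inp, const float *dout, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        float x = inp[i] > 0.0f ? inp[i] : 0.0f;
        dinp[i] += dout[i] * 2.0f * x;
    }
}

// Central finite-difference estimate of dy/dx at one point,
// used only to check the hand-derived backward pass.
float relu2_numeric_grad(float x, float eps) {
    float xp = x + eps, xm = x - eps;
    float yp, ym;
    relu2_forward(&yp, &xp, 1);
    relu2_forward(&ym, &xm, 1);
    return (yp - ym) / (2.0f * eps);
}
```

A framework with automatic differentiation would produce the backward pass for you. Written by hand, both functions have to be kept in sync, and the numeric check is the safety net.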
So TLDR: llm.c is a direct implementation of training GPT-2. This implementation turns out to be surprisingly short. No other neural network is supported, only GPT-2, and if you want to change anything about the network, it requires expertise. Luckily, all state of the art LLMs are actually not a very large departure from GPT-2 at all, so this is not as strong of a constraint as you might think. And llm.c has to be additionally tuned and refined, but in principle I think it should be able to almost match (or even outperform, because we get rid of all the overhead?) PyTorch, with not too much more code than where it is today, for most modern LLMs.
And why am I working on it? Because it’s fun. It’s also educational, because those 1,000 lines of very simple C are all that is needed, nothing else. It's just a few arrays of numbers and some simple math operations over their elements like + and *. And it might even turn out to be practically useful with some more work that is ongoing.
Kinda interesting 🧐 that most of the DeSci OGs were neuroscientists before, there must be something going on with that 🤯🧠
@Shamburgularara @hebbianloop @oraclide
-> maybe because once you study the brain, everything else seems easier? 😂
I applied yesterday as a TA!🙂
I encourage you to NOT MISS this opportunity w/ @neuromatch!🤗
I had a great time last summer being a Computational Neuroscience 🧠TA for PhD & MSc students from EMEA 🌍
Looking forward to being a Deep Learning TA this year! 🤖
New #opensource tracks: Climate Science 🌱 & NeuroAI⚡!
📣 Good news!
As negotiations between the 🇪🇺 EU and 🇨🇭 Switzerland have been officially launched today, researchers based in Switzerland can now once again apply for ERC grants.
Read more 👉 https://t.co/8mi6DAx9D0
LARPing sick 🤧🤒 from my bed
If you wanna cheer me up #Anons do you know any 📣
🎨 Artists who use #GenerativeAI
🤖 Researchers who use or do research on #LLM
👾 Entrepreneurs who use LLMs for their company
⚖️ Attorneys specialized in IP, AI, Web3
LFG 🥷 pls share 🔁 and DM me
The Brain Institute is hiring a Research Coordinator. Please share this to spread the word! We will fully consider all applications received by March 31. Apply here: https://t.co/2df01Hqzv7
Very happy to be starting the Brain, Body and Technology Lab (BBT Lab) at the @DondersInst Sensorimotor Neuroscience Department.
New lab website is now online! https://t.co/nhJa6MDuTr
And I will be posting a PhD position for my ERC project soon. Watch this space!
We have 2 open calls for a fully-funded PhD in #NeuroAI at Ecole Normale Supérieure:
1. https://t.co/NBeWtDc0Xg w/ Pierre Bourdillon
2. https://t.co/ZWBzIXcSJE w/ Yair Lakretz.
Open call for 2023 internships (potentially followed by PhD) in our Brain & AI team at Meta:
https://t.co/Hu4nXkEQwt
Strong experience in {deep learning, signal processing, neuroscience} (2 out of 3) required.
Applications from under-represented groups encouraged.
A PhD position in Artificial Intelligence and Cognitive NeuroScience
“Linking Linguistics and Brain Dynamics with Deep Language Models” at @ENS_ULM
Co-supervised with @JeanRemiKing