These 94 lines of code are everything that is needed to train a neural network. Everything else is just efficiency.
This is my earlier project Micrograd. It implements a scalar-valued autograd engine. You start with some numbers at the leaves (usually the input data and the neural network parameters), build up a computational graph with operations like + and * that mix them, and the graph ends with a single value at the very end (the loss). You then go backwards through the graph applying the chain rule at each node to calculate the gradients. The gradients tell you how to nudge your parameters to decrease the loss (and hence improve your network).
Sometimes when things get too complicated, I come back to this code and just breathe a little. But ok ok you also do have to know what the computational graph should be (e.g. MLP -> Transformer), what the loss function should be (e.g. autoregressive/diffusion), how to best use the gradients for a parameter update (e.g. SGD -> AdamW) etc etc. But it is the core of what is mostly happening.
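To make the description above concrete, here is a minimal sketch of a scalar-valued autograd engine. It is not the Micrograd source: the `Value` class, its attribute names, and the toy expression are simplified assumptions for illustration, and only + and * are supported.

```python
class Value:
    """A scalar that remembers how it was computed (illustrative, not Micrograd's code)."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0                 # d(loss)/d(this value), filled in by backward()
        self._children = children       # the nodes this value was computed from
        self._backward = lambda: None   # leaves have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph so each node runs after everything that uses it.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for child in node._children:
                    visit(child)
                order.append(node)

        visit(self)
        self.grad = 1.0                 # d(loss)/d(loss) = 1
        for node in reversed(order):
            node._backward()            # apply the chain rule at each node


# Leaves: an input x and two parameters w and b (made-up numbers).
x, w, b = Value(3.0), Value(-2.0), Value(1.0)
loss = x * w + b    # the graph ends in a single value
loss.backward()     # walk backwards to get the gradients
```

After `loss.backward()`, `w.grad` holds the derivative of the loss with respect to `w` (here, the value of `x`), and a step like `w.data -= 0.01 * w.grad` is the "nudge" that decreases the loss.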
The 1986 paper from Rumelhart, Hinton, Williams that popularized and used this algorithm (backpropagation) for training neural nets:
https://t.co/f52IcDNitR
micrograd on GitHub: https://t.co/GaTd16jRnB
and my (now somewhat old) YouTube video where I very slowly build and explain:
https://t.co/EPGG6kd5Yz
# on shortification of "learning"
There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are learning (but actually they are just having fun). The people creating this content also enjoy it because fun brings a much larger audience, fame and revenue. But as far as learning goes, this is a trap. This content is an epsilon away from watching the Bachelorette. It's like snacking on those "Garden Veggie Straws", which feel like you're eating healthy vegetables until you look at the ingredients.
Learning is not supposed to be fun. It doesn't have to be actively not fun either, but the primary feeling should be that of effort. It should look a lot less like that "10 minute full body" workout from your local digital media creator and a lot more like a serious session at the gym. You want the mental equivalent of sweating. It's not that the quickie doesn't do anything, it's just that it is wildly suboptimal if you actually care to learn.
I find it helpful to explicitly declare your intent up front as a sharp, binary variable in your mind. If you are consuming content: are you trying to be entertained or are you trying to learn? And if you are creating content: are you trying to entertain or are you trying to teach? You'll go down a different path in each case. Attempts to seek the stuff in between actually clamp to zero.
So, for those who actually want to learn: unless you are trying to learn something narrow and specific, close those tabs with quick blog posts. Close those tabs of "Learn XYZ in 10 minutes". Consider the opportunity cost of snacking and seek the meal - the textbooks, docs, papers, manuals, longform. Allocate a 4-hour window. Don't just read: take notes, re-read, re-phrase, process, manipulate, learn.
And for those actually trying to educate, please consider writing/recording longform, designed for someone to get "sweaty", especially in today's era of quantity over quality. Give someone a real workout. This is what I aspire to in my own educational work too. My audience will decrease. The ones that remain might not even like it. But at least we'll learn something.
15 years ago my PhD advisor taught me One Weird Trick for editing your own writing. Edit **back to front**, paragraph by paragraph. I still use it and it still surprises me how well it works. When I get my students to do it, it often blows their minds. Try it!
🐍🔖 Advanced Python Tutorials: Here you'll find Python tutorials that teach you advanced concepts so you can be on your way to becoming a master of the Python programming language.
https://t.co/0PjjvAK0ra
"NLP with Deep Learning" by @stanfordnlp is free and open!
Find links on:
- lecture slides
- lecture videos
- notes
- code
- suggested readings
All is here⬇️
https://t.co/9juSBrgNWf
You can learn Computer Science at Harvard University for free.
The course covers:
1. Algorithms
2. Data Structures
3. Resource management
4. Security
5. Software engineering
6. Web development
7. Artificial Intelligence
Link: https://t.co/W5D9CqSRTT
Yann LeCun’s @ylecun Deep Learning Course is now free & fully online at @NYUDataScience
Videos, slides, notes, and notebooks!
https://t.co/yKPIwp7vuT
🎓 ML YouTube Courses 🎓
In case you missed it, I maintain a highly-curated collection of some of the best and latest machine learning courses available on YouTube. So much good free content to get started with or to catch up on.
https://t.co/C1Aw41PMEm
This repository is really on fire!
123k ⭐️
All Algorithms implemented in Python
I sometimes like to browse through it to refresh my knowledge about Algorithms & Data Structures. But it also contains nice algorithms for ML, Linear Algebra, and much more!
https://t.co/S7PRUCuWgX
Principal Component Analysis (PCA) implemented from scratch in 30 lines of Python code:
It's an important technique in Machine Learning.
It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components:
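The original 30-line snippet is not reproduced here. What follows is a minimal sketch of the same idea, assuming NumPy and an eigendecomposition of the covariance matrix; the function name `pca`, its return values, and the random test data are illustrative choices, not taken from the original post.

```python
import numpy as np


def pca(X, n_components):
    """Project the rows of X onto the top `n_components` principal components."""
    # Center each feature so the covariance is measured around the mean.
    X_centered = X - X.mean(axis=0)

    # Covariance matrix of the features (columns are the variables).
    cov = np.cov(X_centered, rowvar=False)

    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Reorder so the directions with the largest variance come first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order[:n_components]]

    # Project every data point onto the first few principal components.
    projected = X_centered @ components
    return projected, components, eigenvalues


# Made-up data: 200 samples with 5 features, reduced to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
projected, components, eigenvalues = pca(X, n_components=2)
```

Here `projected` has one row per data point and one column per kept component, and `eigenvalues` gives the variance captured along each principal direction, largest first.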