These GitHub repos have leveled up my AI skills more than any bootcamp, course, or conference.
If you're serious about AI, these are the ones worth bookmarking 👇
We've just published the Smol Training Playbook: a distillation of hard-earned knowledge to share exactly what it takes to train SOTA LLMs ⚡️
Featuring our protagonist SmolLM3, we cover:
🧭 Strategy on whether to train your own LLM and burn all your VC money
🪨 Pretraining, aka turning a mountain of text into a fancy auto-completer
🗿 How to sculpt base models with post-training alchemy
🛠️ The underlying infra and how to debug your way out of NCCL purgatory
Highlights from the post-training chapter in the thread 👇
I think everyone should watch @Karpathy's latest video on how he uses LLMs, even those who think AI is already a big part of their lives because:
1. One of the best minds in AI is spending time showing how he personally uses AI when he could have spent that time building AGI. That alone is one of the reasons to watch & learn
2. @karpathy clearly explains a first-principles approach to leveraging these tools & the visual explanation really helps you build a mental model for specific use cases.
below is a short high-level summary & some notes that I took from the video
- karpathy takes us through the practical applications of various tools with lots of examples & different settings one can play around with while using these tools
- he starts off with how to get a general idea of which models are best at any given time: one can always look that up on https://t.co/WuMWeCp9u7 & the @scale_AI SEAL leaderboard (https://t.co/HOSlEwPr0d)
- he then moves to @OpenAI's ChatGPT (the OG, the most feature-rich AI tool currently available & the one that has been around the longest), which started the era where one can give a text input and get a text output via an interface.
- my fav part was him explaining from first principles what exactly is happening under the hood when we interact with ChatGPT - it's amazing how he visually thinks about this & presents it 😅
- the difference between the pre-training stage & the post-training stage (🙂): post-training produces the final model that we get to interact with, basically a fine-tuned version of the base model via SFT, RLHF or RL -> all of this compressed into a single zip file
- one cannot directly use the base (pre-trained) model as an assistant, since in that stage the model is only optimised to predict the next token in the sequence. after post-training we can actually use it for real-world applications, as the model can now act as an assistant, basically combining loads of knowledge with some style, form & personality. yet this knowledge has a cut-off date.
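To make "predict the next token in the sequence" concrete, here is a toy sketch. It is not from the video and it is not how an LLM works internally: it just counts which word follows which in a tiny made-up corpus, where a real base model uses a neural network trained on a huge dataset. The names `train_bigram` and `predict_next` and the corpus are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for every token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]


# Hypothetical toy corpus standing in for "a mountain of text".
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

print(predict_next(model, "the"))    # "cat" follows "the" twice, "mat" once
print(predict_next(model, "zebra"))  # never seen in training, so None
```

Note that the toy model only knows what was in its corpus, which is a rough analogy for the knowledge cut-off mentioned above.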
Why Neuralink is the most important company this decade!
@neuralink is delivering life-changing value to humanity and is at an inflection point as it goes from being a technology in the lab to people’s homes this year.
As investors, we look for transformative innovations, where technology redefines our understanding of what’s possible. Neuralink is not just that life-changing innovation but will deliver uncapped upside potential from here.
Here’s Why 👇
#1 Where Are We Today?
Over the past century, the Brain Machine Interface (BMI) space has evolved from early breakthroughs such as the introduction of electroencephalography (EEG) in 1924 to groundbreaking animal trials like monkeys playing Pong in 2021. This year Neuralink’s N1 implant was successfully implanted in multiple humans residing in their homes.
This is a breakthrough for people with quadriplegia, restoring autonomy by converting neural signals from the motor cortex into digital commands. This allows users to engage with the digital world (move cursors and type through thought alone), communicate via text, email, speech and engage with creative outlets like art and music - ultimately delivering life changing human value.
Noland Arbaugh, the first participant in Neuralink’s clinical trial, now plays video games, live streams, and navigates his laptop—all through the power of his mind. Similarly, Alex, another participant, is designing complex 3D objects using CAD software, unlocking creative potential once thought inaccessible. These stories are not isolated; they are the first chapters in a revolution to restore independence to millions globally.
Reviews from Noland Arbaugh @ModdedQuad, Neuralink’s first participant:
- “(Neuralink) has helped me reconnect with the world, my friends, and my family. It's given me the ability to do things on my own again without needing my family at all hours of the day and night.”
- “The biggest thing with comfort is that I can lie in my bed and use (Neuralink). Any other assistive technology had to have someone else help or have me sit up. Sitting causes stress mentally and on my body which would give me pressure sores or spasms. It lets me live on my own time, not needing to have someone adjust me, etc. throughout the day.”
- “I think it should give a lot of people a lot of hope for what this thing can do for them”
#2 How does it work?
The N1 is the size of a coin, containing a microprocessor, Bluetooth transmitter, rechargeable battery, and ultra-thin wires with 1,024 electrodes distributed across 64 threads. These highly flexible, ultra-thin threads are key to minimizing damage during implantation. The threads are so fine that they can't be inserted by the human hand. A surgical robot has been designed to reliably and efficiently insert these threads exactly where they need to be.
Once inserted, the electrodes detect neural signals and translate them into digital commands, transmitted via Bluetooth to external devices. Thinking about moving a cursor triggers corresponding neurons in the motor cortex, enabling precise cursor movement on a screen.
#3 Why Now?
This year Neuralink’s patients achieved bandwidth that’s ~50% of a normal human via the N1 implant. It’s poised to meet human level bandwidth next year and the team aims to exceed it by the end of the decade. A quick breakdown on why bandwidth matters. Today’s intelligence operates on three levels:
1⃣ Limbic Intelligence (“thinking fast” – primal needs like safety and survival)
2⃣ Higher-Order Reasoning (“thinking slow” – complex thought and planning)
3⃣ Digital Intelligence (enabled by phones, internet, software)
Our reliance on digital intelligence is undeniable. Forgetting your phone feels like losing a limb. Yet, the bottleneck lies in the bandwidth—the rate at which information flows between biology and technology. Bandwidth is quantified in Bits Per Second (BPS), as popularized by the late Professor Krishna Shenoy:
1⃣ Average human output (no devices): 1 BPS
2⃣ Typing/speaking: ~20 BPS
3⃣ Peak human performance: ~40 BPS
With BMIs (Brain-Machine Interfaces):
- 2017: Achieved ~2-4 BPS
- 2021: Reached 6.56 BPS
- 2024: Neuralink’s N1 reached 9.5 BPS
- 2025: Neuralink targets 40 BPS
- By 2030: Neuralink targets 100 BPS
The advances next year will likely come from two main levers: 1) more electrodes (going from 1,024 to 3,000) and 2) increased utilization of electrodes (N1 currently utilizes ~10% of its 1,024 electrodes, which is expected to go as high as 40% next year).
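A quick back-of-the-envelope check on those two levers, as a rough sketch. It assumes "utilization" means the fraction of electrodes actively contributing signal, and it makes no claim that bandwidth scales linearly with electrode count. The helper name `active_electrodes` is hypothetical.

```python
def active_electrodes(total, utilization):
    """Approximate number of electrodes in use, given a utilization fraction."""
    return round(total * utilization)


# Figures quoted in the thread above.
current = active_electrodes(1024, 0.10)    # today's N1
projected = active_electrodes(3000, 0.40)  # next year's target

# 2024 N1 bandwidth relative to the typing/speaking benchmark (9.5 vs ~20 BPS).
share_of_typing = 9.5 / 20

print(f"active electrodes today: ~{current}")
print(f"active electrodes projected: ~{projected}")
print(f"increase in active electrodes: ~{projected / current:.1f}x")
print(f"N1 vs typing/speaking: {share_of_typing:.1%}")
```

Under these assumptions the number of electrodes in use would grow by roughly an order of magnitude, and 9.5 BPS is a little under half of the ~20 BPS typing/speaking figure, consistent with the "~50%" quoted earlier.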
Achieving near-human bandwidth will help dramatically improve the quality of life for people with special abilities.
#4 What's Next?
1⃣ Scaling N1: Neuralink aims for 27 procedures in 2025 and 79 in 2026, expanding the number of users.
2⃣ Vision Restoration: Blindsight, a BMI for vision restoration that has received FDA Breakthrough Device designation, interacts with the visual cortex to restore basic light and shape perception.
3⃣ Speech Restoration: Expanding to auditory cortex applications to restore speech capabilities.
Beyond Restoration:
Replace smartphones: BMIs could surpass traditional devices, becoming high-bandwidth interfaces for AI-human interaction.
Augmentation: Enhancing human cognition, memory, and productivity for 100M+ users.
From restoring autonomy for quadriplegics to enhancing human potential, Neuralink will revolutionize how we interact with machines, each other, and the world. It’s not just catching up to technology but defining the next frontier for humanity.
CC: @altcap @elonmusk @djseo_ @chapman_bliss
Red teaming with people and AI helps to identify potential risks and issues with AI systems. It involves using people and AI to test AI systems for potential risks. https://t.co/WPE2fvMyXX #AI #Risk
Penno just released Penno Draft, a no-sign-up note-taking tool that lets you start writing & collaborating instantly. No accounts, no fuss. Feels like Excalidraw, but for words! #productivity #Notebook #AI
https://t.co/FD8rLOGS1n
I've seen how people use ChatGPT and other LLMs. Some of the products out there aim to completely ignore human taste and creativity in writing. We need to change the philosophy. AI needs to augment human capability, not replace it!
We are excited to launch Penno, an AI-powered writing assistant that makes everyone significantly faster with ease. Become more productive by joining Penno today! https://t.co/7mzipZVOKG
#WritingCommunity #writing #WritingPrompts
These 94 lines of code are everything that is needed to train a neural network. Everything else is just efficiency.
This is my earlier project Micrograd. It implements a scalar-valued auto-grad engine. You start with some numbers at the leaves (usually the input data and the neural network parameters), build up a computational graph with operations like + and * that mix them, and the graph ends with a single value at the very end (the loss). You then go backwards through the graph applying the chain rule at each node to calculate the gradients. The gradients tell you how to nudge your parameters to decrease the loss (and hence improve your network).
Sometimes when things get too complicated, I come back to this code and just breathe a little. But ok ok you also do have to know what the computational graph should be (e.g. MLP -> Transformer), what the loss function should be (e.g. autoregressive/diffusion), how to best use the gradients for a parameter update (e.g. SGD -> AdamW) etc etc. But it is the core of what is mostly happening.
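For readers who want to see the idea in code, here is a stripped-down sketch in the spirit of a scalar autograd engine. It is not the 94 lines from the post: it supports only + and *, and the names (`Value`, `_parents`, the toy loss, the 0.01 learning rate) are assumptions made for illustration.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaves have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so each node comes after all of its inputs.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # d(loss)/d(loss)
        for node in reversed(order):
            node._backward()  # chain rule at each node


# Leaves: one input, two parameters, and a target of 10.
w, x, b = Value(2.0), Value(3.0), Value(1.0)
err = w * x + b + (-10.0)  # prediction minus target
loss = err * err           # squared error: the single value at the end
loss.backward()

print(loss.data, w.grad, b.grad)

# One plain SGD step: nudge the parameters against their gradients.
for p in (w, b):
    p.data -= 0.01 * p.grad
```

The gradients can be checked by hand: with err = -3, d(loss)/dw = 2 * err * x = -18 and d(loss)/db = 2 * err = -6, so the SGD step increases both w and b, which moves the prediction toward the target.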
The 1986 paper from Rumelhart, Hinton, Williams that popularized and used this algorithm (backpropagation) for training neural nets:
https://t.co/f52IcDNitR
micrograd on Github: https://t.co/GaTd16jRnB
and my (now somewhat old) YouTube video where I very slowly build and explain:
https://t.co/EPGG6kd5Yz
It's a mistake to try to raise money if you're not quite attractive enough to investors. They don't say no immediately. They suck up a lot of your time and hope, and then say no. It's a huge distraction and crushingly bad for morale.
I've always wondered what a career in venture capital would look like for me. Here are some very good resources published by @jessywu95
https://t.co/QwpfWAjudD