I'm joining @OpenAI to bring agents to everyone. @OpenClaw is becoming a foundation: open, independent, and just getting started.🦞
https://t.co/XOc7X4jOxq
Agent autonomy isn't binary; it's a spectrum from L1 copilots to L5 self-evolving systems. Unlock enterprise AI maturity: https://t.co/LLWJ3XuTvK #AgenticAI #AI
2026 career advice
Warren Buffett:
"Should you find yourself in a chronically leaking boat, energy devoted to changing vessels is likely to be more productive than energy devoted to patching leaks".
As you approach 2026, realistically evaluate the boat you're on and ensure it's worth salvaging.
🚀 Building agentic AI apps? (You know, the ones that actually do stuff autonomously?)
Fellow devs: how are you tackling evals? Benchmarks? Real-world simulations? Custom success metrics? Spill the beans: what's working (or hilariously failing) for you? #AgenticAI
Here’s why context engineering is such a big deal.
We just spent 2 hours debating when an agent should rely on its internal knowledge vs. trying to find relevant context within data for just one type of question. We got through 2 test cases of hundreds.
Even the people involved in the brainstorm couldn’t all agree on what they would expect humans to do in this situation. There truly was no right answer, and it’s always context-specific, customer by customer.
Everything in context engineering is a tradeoff between a variety of factors: how fast do you want the agent to answer a question, how much back and forth interaction do you want to require for the user, how much work should it do before trying to answer a question, how does it know it has the exhaustive source material to answer the question, what’s the risk level of the wrong answer, and so on.
Every decision you make on one of these dimensions has a consequence on the other end. There’s no free lunch. This is why building AI agents is so wild.
It also highlights how much value there is above the LLM layer. Getting these decisions right directly relates to the quality of the value proposition.
Last week the Nobel Prize in Economics went to Joel Mokyr, Philippe Aghion, and Peter Howitt for showing that real growth comes from creative destruction, not incremental innovation.
AI will test the same truth: the winners won’t automate work. They’ll reinvent it.
- you:
- want to actually learn how LLMs work
- sick of “just start with linear algebra and come back in 5 years”
- decide to build my own roadmap
- no fluff. no detours. no 200-hour generic ML playlists
- just the stuff that actually gets you from “what’s a token?” to “I trained a mini-GPT with LoRA adapters and FlashAttention”
- goal: build, fine-tune, and ship LLMs
- not vibe with them. not "learn the theory" forever
- build them
- you will:
- build an autograd engine from scratch
- write a mini-GPT from scratch
- implement LoRA and fine-tune a model on real data
- hate CUDA at least once
- cry
- keep going
- 5 phases
- if you already know something? skip
- if you're lost? rewatch
- if you’re stuck? use DeepResearch
- this is a roadmap, not a leash
- by the end: you either built the thing or you didn’t
- phase 0: foundations
- if matrix multiplication is scary, you’re not ready yet
- watch 3Blue1Brown’s linear algebra series
- MIT 18.06 with Strang, yes, he’s still the GOAT
- code Micrograd from scratch (Karpathy)
- train a mini-MLP on MNIST
- no frameworks, no shortcuts, no mercy
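A minimal sketch of what "code Micrograd from scratch" boils down to, assuming a scalar-only engine with just add, multiply and tanh. This is not Karpathy's code; the `Value` class and its field names (`_parents`, `_backward`) are illustrative assumptions for the same idea: every number remembers how it was made, so the chain rule can run backwards.

```python
import math

class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1.0 - t * t) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# One "neuron": y = tanh(x * w + b)
x, w, b = Value(2.0), Value(-3.0), Value(1.0)
y = (x * w + b).tanh()
y.backward()
```

After `y.backward()`, `x.grad`, `w.grad` and `b.grad` hold dy/dx, dy/dw and dy/db. An MLP on MNIST is the same trick repeated many times, which is why a real version moves to tensors.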
- phase 1: transformers
- the name is scary
- it’s just stacked matrix multiplies and attention blocks
- Jay Alammar + 3Blue1Brown for the “aha”
- Stanford CS224N for the theory
- read "Attention Is All You Need" only AFTER building mental models
- Karpathy's "Let's Build GPT" will break your brain in a good way
- project: build a decoder-only GPT from scratch
- bonus: swap tokenizers, try BPE/SentencePiece
- phase 2: scaling
- LLMs got good by scaling, not magic
- Kaplan paper -- Chinchilla paper
- learn Data, Tensor, Pipeline parallelism
- spin up multi-GPU jobs using HuggingFace Accelerate
- run into VRAM issues
- fix them
- welcome to real training hell
- phase 3: alignment & fine-tuning
- RLHF: OpenAI blog -- Ouyang paper
- SFT -- reward model -- PPO (don’t get lost here)
- Anthropic's Constitutional AI = smart constraints
- LoRA/QLoRA: read, implement, inject into HuggingFace models
- fine-tune on real data
- project: fine-tune gpt2 or distilbert with your own adapters
- not toy examples. real use cases or bust
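A rough sketch of the LoRA idea in plain Python, assuming a single linear layer and list-of-lists matrices. `LoRALinear`, `matvec` and the default `r` and `alpha` values are hypothetical names and settings for illustration, not the HuggingFace API: the frozen weight `W` stays untouched, and only two small matrices `A` and `B` would be trained.

```python
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

class LoRALinear:
    """y = W x + (alpha / r) * B (A x), with W frozen and A, B trainable."""

    def __init__(self, W, r=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen pretrained weight
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until training moves B away from zero.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

# Toy 8x8 identity weight standing in for a pretrained layer.
W = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
layer = LoRALinear(W, r=2, alpha=4.0)
x = [float(i) for i in range(1, 9)]
y = layer.forward(x)
```

At initialization the layer behaves exactly like the frozen one, and here the adapter has 32 trainable numbers against 64 frozen ones. The gap gets much larger at real model sizes, where `r` stays small while the weight matrices grow.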
- phase 4: production
- this is the part people skip to, but you earned it
- inference optimization: FlashAttention, quantization, sub-second latency
- read the paper, test with quantized models
- resources:
- math/coding:
- 3Blue1Brown, MIT 18.06, Goodfellow’s book
- PyTorch:
- Karpathy, Zero to Mastery
- transformers:
- Alammar, Karpathy, CS224N, Vaswani et al
- scaling:
- Kaplan, Chinchilla, HuggingFace Accelerate
- alignment:
- OpenAI, Anthropic, LoRA, QLoRA
- inference:
- FlashAttention
- the endgame:
- understand how these models actually work
- see through hype
- ignore LinkedIn noise
- build tooling
- train real stuff
- ship your own stack
- look at a paper and think “yeah I get it”
- build your own AI assistant, infra, whatever
- make it all the way through?
- ship something real?
- DM me.
- I wanna see what you built.
- happy hacking.
Coding was never really about writing code.
It was about breaking a problem into steps, imagining the edge cases, and knowing where trade-offs live.
AI just took away the boring part.
What’s left is the real work: thinking clearly.
- you are
- a random CS grad with 0 clue how LLMs work
- get tired of people gatekeeping with big words and tiny GPUs
- decide to go full monk mode
- 2 years later i can explain attention mechanisms at parties and ruin them
- here’s the forbidden knowledge map
- top to bottom, how LLMs *actually* work
- start at the beginning
- text → tokens
- tokens → embeddings
- you are now a floating point number in 4D space
- vibe accordingly
- positional embeddings:
- absolute: “i am position 5”
- rotary (RoPE): “i am a sine wave”
- alibi: “i scale attention by distance like a hater”
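A small sketch of the rotary idea only (absolute and ALiBi are not covered here), assuming adjacent dimensions are paired and the common base of 10000. The function name `rope` and the toy vectors are illustrative assumptions; real implementations differ in how they pair dimensions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each (even, odd) pair of dims by an angle that grows with position."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # lower dims spin faster than higher ones
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# Same distance between query and key (2 positions), different absolute positions.
near = dot(rope(q, 3), rope(k, 1))
far = dot(rope(q, 12), rope(k, 10))
```

`near` and `far` agree up to float error: after rotation the attention score depends on the distance between the two positions, not on where they sit in the sequence.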
- attention is all you need
- self-attention: “who am i allowed to pay attention to?”
- multihead: “what if i do that 8 times in parallel?”
- QKV: query, key, value
- sounds like a crypto scam
- actually the core of intelligence
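The QKV story can be sketched in a few lines, assuming a single head and plain Python lists. In a real model Q, K and V come from learned projections of the token embeddings; here they are passed in directly, and `causal_attention` is an illustrative name.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(Q, K, V):
    """Single-head scaled dot-product attention; token t only sees tokens 0..t."""
    d = len(Q[0])
    out = []
    for t, q in enumerate(Q):
        # "who am i allowed to pay attention to?" -> only positions up to t
        scores = [sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d)
                  for j in range(t + 1)]
        weights = softmax(scores)
        out.append([sum(w * V[j][c] for j, w in enumerate(weights))
                    for c in range(len(V[0]))])
    return out

Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
out = causal_attention(Q, K, V)
```

The first token can only look at itself, so its output is just its own value vector. The second token blends both values, leaning toward the key that matches its query best. Multihead is this same function run in parallel on separate slices of the projections.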
- transformers:
- take your inputs
- smash them through attention layers
- normalize, activate, repeat
- dump the logits
- congratulations, you just inferred a token
- sampling tricks for the final output:
- temperature: how chaotic you want to be
- top-k: only sample from the top K options
- top-p: sample from the smallest group of tokens whose probabilities sum to p
- beam search? never ask about beam search
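The three knobs above fit in one small function. This is a sketch under assumptions: the names `candidates` and `sample` are made up for illustration, and real inference libraries apply these filters on tensors and in slightly different orders.

```python
import math
import random

def candidates(logits, temperature=1.0, top_k=None, top_p=None):
    """Return (token_ids, probs) that survive temperature, top-k and top-p."""
    scaled = [l / temperature for l in logits]  # low temperature sharpens the distribution
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:  # smallest set whose mass reaches p
                break
        ranked = kept
    return ranked, [probs[i] for i in ranked]

def sample(logits, rng, **kwargs):
    ids, weights = candidates(logits, **kwargs)
    # random.choices renormalizes the surviving weights for us.
    return rng.choices(ids, weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.1]
rng = random.Random(0)
greedy = sample(logits, rng, top_k=1)
nucleus_ids, _ = candidates(logits, top_p=0.8)
```

With `top_k=1` the sampler is greedy, and with `top_p=0.8` the least likely of the three tokens is dropped before sampling.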
- kv cache = cheat code
- saves past keys & values
- lets you skip reprocessing old tokens
- turns a 90B model from “help me I’m melting” to “real-time genius”
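A toy sketch of why the cache works, assuming one attention head and hand-written q/k/v vectors. `KVCache` and `attend` are illustrative names; in a real model the new token's key and value come from projecting its hidden state, and there is one cache per layer and head.

```python
import math

def attend(q, keys, values):
    """Attention output for one query over all the keys/values seen so far."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [sum(e / total * v[c] for e, v in zip(exps, values))
            for c in range(len(values[0]))]

class KVCache:
    """Keep past keys and values so each decoding step only adds one new pair."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)

# (query, key, value) for three tokens generated one after another.
tokens = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0], [3.0, 4.0]),
    ([1.0, 1.0], [1.0, 1.0], [5.0, 6.0]),
]

cache = KVCache()
cached = [cache.step(q, k, v) for q, k, v in tokens]

# Without a cache: rebuild every past key and value at each step.
recomputed = [
    attend(tokens[t][0],
           [k for _, k, _ in tokens[: t + 1]],
           [v for _, _, v in tokens[: t + 1]])
    for t in range(len(tokens))
]
```

Both paths give identical outputs. The cached path just never recomputes the old keys and values, which is where the savings come from once those projections are expensive.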
- long context hacks:
- sliding window: move the attention like a scanner
- infini attention: squeeze old context into a compressive memory, keep local attention sharp
- memory layers: store thoughts like a diary with read access
- mixture of experts (MoE):
- not all weights matter
- route tokens to different sub-networks
- only activate ~3B params out of 80B
- “only the experts reply” energy
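A minimal routing sketch, assuming top-2 gating and experts that are just scaling functions. The router logits are hard-coded here; in a real MoE layer they come from a learned linear layer applied to the token, and `moe_forward` is an illustrative name.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, k=2):
    """Run only the top-k experts and mix their outputs by renormalized gates."""
    ranked = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)[:k]
    gates = softmax([router_logits[i] for i in ranked])
    out = [0.0] * len(x)
    for gate, i in zip(gates, ranked):
        y = experts[i](x)  # experts outside `ranked` are never called
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, ranked

# Four toy experts: each one just scales its input by a different constant.
experts = [lambda x, s=s: [s * xi for xi in x] for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, -1.0, 2.0]  # assumed router output for this token

out, used = moe_forward([1.0, 2.0], router_logits, experts, k=2)
```

Only experts 1 and 3 run for this token, each with gate 0.5, so the output is the average of their results. The other experts' parameters exist but cost nothing at inference for this token.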
- grouped query attention (GQA):
- fewer keys/values than queries
- improves inference speed
- “i want to be fast without being dumb”
- normalization & activations:
- layernorm, RMSnorm
- gelu, silu, relu
- they all sound like failed Pokémon
- but they make the network stable and smooth
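For reference, the formulas behind those names, sketched without the learned gain and bias that real layers add (that omission is an assumption to keep it short). GELU is written in its exact erf form; many codebases use a tanh approximation instead.

```python
import math

def layernorm(x, eps=1e-5):
    """Subtract the mean, divide by the standard deviation."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def rmsnorm(x, eps=1e-5):
    """Skip the mean; just divide by the root mean square."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

def relu(v):
    return max(0.0, v)

def silu(v):
    return v / (1.0 + math.exp(-v))  # v * sigmoid(v)

def gelu(v):
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

x = [1.0, 2.0, 3.0, 4.0]
ln = layernorm(x)
rn = rmsnorm(x)
```

`ln` ends up with mean roughly 0, while `rn` keeps its mean but has mean square roughly 1. RMSNorm does less work per call, which is one reason it shows up in many recent models.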
- training goals:
- causal LM: guess the next word
- masked LM: guess the missing word
- span prediction, fill-in-the-middle, etc
- LLMs trained on the art of guessing and got good at it
- tuning flavors:
- finetuning: new weights
- instruction tuning: “please act helpful”
- rlhf: reinforcement from vibes and clickbait prompts
- dpo: direct preference optimization — basically “do what humans upvote”
- scaling laws:
- more data, more parameters, more compute
- loss goes down predictably
- intelligence is now a budget line item
- bonus round:
- quantization:
- post-training quantization (PTQ)
- quant-aware training (QAT)
- models shrink, inference gets cheaper
- gguf, awq, gptq — all just zip files with extra spice
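The simplest flavor of post-training quantization can be sketched as absmax int8 rounding with one scale per tensor. This is an assumption-heavy toy: GGUF, AWQ and GPTQ use more careful schemes (per-group scales, calibration data), and the function names here are illustrative.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    largest = max(abs(w) for w in weights)
    scale = largest / 127.0 if largest > 0 else 1.0  # avoid dividing by zero
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.635, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now needs 8 bits plus a share of the scale instead of 16 or 32 bits, and the rounding error per weight is at most about half of `scale`. Whether that error matters for model quality is what the fancier methods fight over.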
- training vs inference stacks:
- deepspeed, megatron, fschat — for pain
- vllm, tgi, tensorRT-LLM — for speed
- everyone has a repo
- nobody reads the docs
- synthetic data:
- generate your own training set
- model teaches itself
- feedback loop of knowledge and hallucination
- welcome to the ouroboros era
- final boss secret:
- you can learn *all of this* in ~2 years
- no PhD
- no 10x compute
- just relentless curiosity, good bookmarks, and late nights
- the elite don’t want you to know this
- but now that you do
- choose to act
- start now
- build the models
There’s going to be a split between two types of teams or companies for the foreseeable future.
Those that re-engineer their processes to take full advantage of AI agents with their given limitations, and those that wait until they’re good enough to not re-engineer anything.
To take full advantage of AI agents today, your workflows must be designed around the idea that AI agents need a lot of context to be effective. By default you have a super-intelligent worker, but they have no idea who they work for, what their job is, what the best practices are, what the guidelines are, how to work with the right data, and so on.
Most AI agent failures come from wishing this weren’t true and imagining that AI will just figure all of these things out on its own. This won’t happen anytime soon for a variety of reasons.
The companies and teams that retool their workflows to get agents the right context will be the ones that can actually get the most gains from agents right now.
But this will look very different from how most teams work right now. It will mean having well-documented processes, data that is set up to actually get to an agent easily, hyper-precise goals and prompts, and ultimately a mindset that the new human-in-the-loop element is not being involved in every single step of an agent, but editing and reviewing its final output.
The companies and teams that start thinking this way will be able to take advantage of agents right away, and they’ll blow past the ones that don’t.
Bengaluru’s roads are no longer just an inconvenience; they are a full-blown danger to life. The crater-filled, slushy Panathur–Balagere stretch is a perfect example of the government’s negligence. This morning, a school bus carrying around 20 children nearly toppled over. The terrified kids had to be rescued through the back door.
What makes it worse? Just a few months ago, the Chief Minister and Deputy Chief Minister personally visited and “inspected” this very road. Photo-ops were held, promises were made and yet nothing was fixed. Taxpayers continue to fund a system that delivers press conferences instead of safe infrastructure.
How many close calls like this do we need before someone is held accountable? How many times will parents have to pray their children reach school alive? When crores are collected through taxes and cesses, but basic road safety is ignored, it’s not just bad governance, it’s betrayal. Bengaluru’s citizens are entitled to more than excuses. They deserve safe, motorable roads and a government that treats public safety as a duty, not an afterthought.
#bengaluru #bangalore #panathur #balagere @DKShivakumar
@bbmpcac @thisisGBA @GBAChiefComm @BlrCityPolice @blrcitytraffic @CPBlr @Jointcptraffic @alokkumar6994 @DgpKarnataka @KarnatakaCops @Lolita_TNIE @ChristinMP_
A friend of mine who lives in Carmelaram sends his son to the same school and was in one of the buses right behind the one that toppled in the video. Thankfully, the kids had a lucky escape.
As always, politicians will only show up in Panathur and Balagere next year during the floods to make empty promises. Until then, it’s the public who continue to suffer because of the terrible roads.
In Bengaluru, school children had to be rescued after their bus almost overturned on the broken, crater-filled Balagere–Panathur Road.
Netas & babus sit in AC offices while our kids gamble with death on craters they call “roads.”
How many lives before they wake up?