In nearly 5 years of modern generative AI, this is the first book I've seen with this level of coverage and depth.
> language modelling
> inference optimisation
> RL and its methods
> system scaling
> applied concepts like agentic AI, RAG, memory
> environments and benchmarking
These fields are separated by subtle boundaries, but they overlap in modern applications. Agents require system scaling, memory needs inference optimisation, and RL requires an understanding of environments and benchmarks.
For the first time in my experience, it's all in one place. Found this on paperswithcode[.]co
Every student needs to read "You Are NOT Dumb, You Just Lack the Prerequisites" by @lelouchdaily.
"It’s like walking into a movie halfway through—you can’t understand the plot because you missed the beginning."
Unfortunately, those who need to hear it most seldom do.
this is f*cking dangerous
someone just open sourced the entire "LOOP ENGINEERING" framework for free
build a hedge fund printing alpha 24/7 by feeding it into claude code with my article below
bookmark before someone takes it down
A senior Anthropic engineer just dropped an 11-page PDF on "Loop Engineering" for agentic systems.
The shift: you stop prompting the agent. You build the system that prompts it instead.
Schedule → Discover → Build → Verify → Repeat
Every loop runs one turn, five moves:
• Discovery: it finds its own work - failing CI, open issues, recent commits - instead of being handed a list.
• Handoff: each task gets an isolated git worktree so parallel agents don't collide.
• Verification: a second agent, told to assume the code is broken, reviews the first. The "thing that can say no."
• Persistence: results get written to disk, never left in a context window that gets flushed.
• Scheduling: an automation wakes it on a timer. That's what makes it a loop.
The key insight: an agent grading its own work always praises it.
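A minimal sketch of one turn of the loop described above, under stated assumptions: the `discover`, `build`, and `verify` callables are hypothetical stand-ins for real agents, a temporary directory stands in for an isolated git worktree, and scheduling is left to an external timer such as cron. None of these names come from the PDF.

```python
import json
import tempfile
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import Callable, List


@dataclass
class Result:
    task: str
    workdir: str
    output: str
    approved: bool


def run_turn(
    discover: Callable[[], List[str]],
    build: Callable[[str, Path], str],
    verify: Callable[[str, str], bool],
    log_path: Path,
) -> List[Result]:
    """Run one turn: discover, hand off, build, verify, persist."""
    results = []
    for task in discover():  # Discovery: the loop finds its own work
        # Handoff: one isolated directory per task
        # (assumption: a stand-in for `git worktree add` in a real repo)
        workdir = Path(tempfile.mkdtemp(prefix="task-"))
        output = build(task, workdir)      # builder agent (hypothetical stub)
        approved = verify(task, output)    # Verification: a separate reviewer decides
        results.append(Result(task, str(workdir), output, approved))
    # Persistence: append results to disk as JSON lines
    with log_path.open("a", encoding="utf-8") as fh:
        for r in results:
            fh.write(json.dumps(asdict(r)) + "\n")
    return results


# Hypothetical stubs so the sketch runs without any real agent.
def demo_discover() -> List[str]:
    return ["fix failing CI", "close open issue"]


def demo_build(task: str, workdir: Path) -> str:
    patch = f"patch for {task}"
    (workdir / "patch.txt").write_text(patch, encoding="utf-8")
    return patch


def demo_verify(task: str, output: str) -> bool:
    # Skeptical reviewer: assumes the work is broken unless it mentions tests.
    return "tests" in output
```

The reviewer is passed in separately from the builder on purpose, so the component that can say no is never the one that wrote the patch. With the stubs above, every patch is rejected because none of them mention tests.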
This 11-page PDF changed how I'm building agentic systems today.
Read it now, then explore the article below.
A senior Google engineer just dropped a 19-page PDF on "Loop Engineering" for LLM and agentic systems.
Act → Observe → Learn → Repeat
• Act: the LLM proposes a code transformation (tile this loop, parallelize that one).
• Observe: a compiler runs it and reports back - is it valid? faster? slower? by how much?
• Learn: the LLM reads that feedback and adjusts its next move.
• Repeat until it stops finding improvements.
The agent gets smarter purely from grounded feedback inside its own context window.
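The Act → Observe → Learn → Repeat cycle can be sketched roughly as follows. This is a toy under stated assumptions: `toy_propose` stands in for the LLM, `toy_measure` stands in for compiling and benchmarking a transformed loop, and the `history` list plays the role of the feedback held in the context window. All names are hypothetical and not taken from the PDF.

```python
from typing import Callable, List, Optional, Tuple

# Each entry is (tile size tried, measured runtime).
History = List[Tuple[int, float]]


def optimize(
    propose: Callable[[History], Optional[int]],
    measure: Callable[[int], float],
    max_steps: int = 20,
) -> Tuple[Optional[Tuple[int, float]], History]:
    """Repeat act/observe/learn until the proposer stops or the budget runs out."""
    history: History = []
    best: Optional[Tuple[int, float]] = None
    while len(history) < max_steps:
        candidate = propose(history)           # Act: propose a transformation
        if candidate is None:                  # proposer found no further improvement
            break
        runtime = measure(candidate)           # Observe: grounded feedback
        history.append((candidate, runtime))   # Learn: feedback joins the context
        if best is None or runtime < best[1]:
            best = (candidate, runtime)
    return best, history


def toy_propose(history: History) -> Optional[int]:
    """Stand-in for the LLM: keep doubling the tile size while it helps."""
    if not history:
        return 1
    if len(history) >= 2 and history[-1][1] >= history[-2][1]:
        return None  # the last move made things worse, so stop
    return history[-1][0] * 2


def toy_measure(tile: int) -> float:
    """Stand-in for compile + benchmark: pretend a tile size of 16 is optimal."""
    return abs(tile - 16) + 1.0
```

Calling `optimize(toy_propose, toy_measure)` tries tile sizes 1, 2, 4, 8, 16 and 32, then stops because 32 is slower than 16. The proposer never sees the cost function, only the measurements in `history`.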
This 19-page PDF totally changed the way I’m building agentic systems today.
Read it now, then explore the article below.
The Ultimate Step-By-Step LLM Engineering Projects Roadmap (2026 Edition)
- Build a tokenizer
- Learn embeddings
- Implement RoPE / ALiBi
- Hand-wire attention
- Build MHA
- Build a Transformer block
- Train a mini-former
- Compare objectives
- Build sampling
- Speculative decoding
- KV cache
- MQA / GQA / MLA
- Long context
- FlashAttention
- Hardware budgets
- Toy MoE
- Sparse model trade-offs
- State-space / linear attention
- Diffusion language models
- Data pipelines
- Synthetic data
- Scaling laws
- SFT / DPO / RLHF / GRPO
- Quantization
- Serving stacks
- Eval harnesses
- RAG
- Tool use / agents
- Vision-language adapters
- Interpretability
- Red-team suite
- Full capstone model system
One request: choose an open-source AI lab when you make it
Open source is where humanity gets to keep the tools
DM me when you've made it ;)
this PhD student had 47 interviews and 4 offers before she was hired at OpenAI.
she practiced with her “notes on LLMs” and math, and they’re a goldmine. super concise, organic, and shared with everyone for free. you can use her notes or her topic list to study on your own.
A 19-year-old student from China, Zhang Wei, developed an AI radar and sold it to Hong Kong for $550,000
He created it using Claude, spending just $20 and a month on development
He walked into the Hong Kong administration office with a flash drive and asked for just 5 minutes of their time. 30 minutes later, he walked out with a check for $550,000
The code, connected to a camera, detects speed in real time. If the speed exceeds the limit, Claude takes a video clip and identifies the owner by the car's license plate. The video and the fine are then automatically sent to the owner's email address
Unlike a conventional radar that only takes a photo and doesn't always work, this AI radar eliminates disputes because it captures video and makes the process fully autonomous by sending out the fines on its own
The article includes the ready-to-use configurations.
Web scraping will never be the same.
(100% open-source visual search at scale)
PixelRAG is a retrieval system that skips HTML parsing completely.
Instead of scraping a page into text and embedding chunks, it screenshots the page and retrieves the image. A vision-language model reads the answer straight off the pixels.
Why that matters: parsing is where web RAG quietly loses information.
- A single HTML-to-text parser can drop 40%+ of a page.
- Tables, charts, and layout get flattened or thrown out.
- Swapping parsers alone can move accuracy ~10 points on the same docs.
PixelRAG indexes the page a person actually sees. The team built a visual index of all of Wikipedia, 30M+ screenshots, and it still beats the strongest text RAG baseline by 18.1% on text-only QA.
The repo also ships a Claude Code plugin that gives Claude eyes.
It lets Claude screenshot any URL and read the rendered page instead of scraping the DOM. So you can hand it a live page, an arXiv paper, or your local site and ask what it actually looks like.
One setup script. No MCP server, no backend.
How the pipeline works:
- Renders each document (web, PDF, image) to image tiles.
- Embeds them with Qwen3-VL-Embedding, LoRA fine-tuned on screenshots.
- Builds a FAISS index and serves a search API.
A stronger reader model lifts accuracy with no re-indexing, since the index is just pixels.
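The pipeline steps above (tile, embed, index, search) can be illustrated with a toy sketch in pure Python. Everything here is an assumption for illustration: a 2D list of grayscale values stands in for a screenshot, `toy_embed` is a hypothetical placeholder and not Qwen3-VL-Embedding, and `TileIndex` is a brute-force cosine search standing in for a FAISS index. It is not the PixelRAG code.

```python
import math
from typing import Callable, List, Tuple

Image = List[List[int]]   # rows of grayscale pixels, a stand-in for a screenshot
Vector = List[float]


def tile_rows(image: Image, tile_height: int) -> List[Image]:
    """Split a rendered page into horizontal tiles of at most tile_height rows."""
    return [image[i:i + tile_height] for i in range(0, len(image), tile_height)]


def normalize(v: Vector) -> Vector:
    """Scale to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


class TileIndex:
    """Brute-force inner-product index over unit vectors."""

    def __init__(self) -> None:
        self.vectors: List[Vector] = []
        self.refs: List[Tuple[str, int]] = []   # (document id, tile number)

    def add(self, doc_id: str, tiles: List[Image],
            embed: Callable[[Image], Vector]) -> None:
        for n, tile in enumerate(tiles):
            self.vectors.append(normalize(embed(tile)))
            self.refs.append((doc_id, n))

    def search(self, query_vec: Vector, k: int = 3) -> List[Tuple[float, Tuple[str, int]]]:
        q = normalize(query_vec)
        scored = [
            (sum(a * b for a, b in zip(q, v)), ref)
            for v, ref in zip(self.vectors, self.refs)
        ]
        scored.sort(key=lambda s: s[0], reverse=True)
        return scored[:k]


def toy_embed(tile: Image) -> Vector:
    """Hypothetical embedding: mean brightness of the left and right halves."""
    half = len(tile[0]) // 2
    left = [p for row in tile for p in row[:half]]
    right = [p for row in tile for p in row[half:]]
    return [sum(left) / len(left), sum(right) / len(right)]
```

The index stores only vectors and tile references, which is why a stronger reader model could be swapped in without re-indexing: the reader is applied to the retrieved tiles after search, not during indexing.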
Everything is open-source under Apache-2.0.
GitHub repo: https://t.co/qun9TjAdmw
Talking about RAG, I recently wrote an article on a new approach that makes retrieval much more efficient by cutting corpus size by 40x, reducing tokens per query by 3x, and improving vector search relevance by 2.3x.
The article is quoted below.
Claude Code creator:
"At Anthropic, 90% of our engineers are running agents with self-improving loops.
in 3-6 months, everyone will be running /loops - this is the future of engineering"
in a 1-hour podcast, Boris Cherny reveals the best tips for building Claude Code automations.
Claude + loops + routines + dynamic workflows - that’s the secret.
Watch the talk, then read how to apply and build the same setup in the article below.
Bro I'm so sick of pretending this isn't weird.
The internet spent 20 years creating tutorials, open-source projects, blog posts & answers for free.
AI companies turned all of it into products worth billions.
And now the same people who created that knowledge are being told they're replaceable.
We built the library.
Someone else started charging admission.
GOLDMAN SACHS open-sourced the most dangerous quant repo on the internet.
THE EXACT FRAMEWORK THEIR INTERNAL DESKS USE TO BUILD & RUN TRADING STRATEGIES.
They even left their Claude skills inside. Plug them in & you have a Goldman Sachs quant building strategies for you. BOOKMARK.
🤯 This is a website, a simple web-based game built with WebGL and Three.js.
Website: https://t.co/G0cDgaWzKe
It's honestly surprising how far web development has come.
@hugolowell @WIRED What does Dario expect when he acts like the wolves are coming to get everyone?? He was spreading fear rhetoric, and now he gives the surprised Pikachu face when the consequences arrive. I can’t with these tech leaders
Anthropic pays $750,000+ a year for engineers who can build LLMs from scratch.
Not how to prompt them.
Not how to fine-tune them.
Not how to build RAG pipelines.
But how to build them from scratch.
This 2-hour Stanford lecture teaches you everything.
Scaling laws.
Data collection.
Architecture design.
Post-training alignment.
Free. From Stanford.
Watch first. Then read this.
The lecture is the theory.
And this article shows you how to actually build it (with code) ↓