Good to see this post from our team PM @thomasgauvin.
I've been _extremely_ deep in agent reliability for a bit now. Seen things you people wouldn't believe. (Attack ships on fire off the shoulder of Orion...) Flue, Think, and I bet many more in the future, can find a reliable _machine_ in using the Agents SDK to build ambitious things. Today is day 1 of that journey. More to come from the platform(s), runtime, libraries, ecosystem.
Back to work.
Second for second, @tylercowen packs more substance into a talk than anyone I'm aware of. This is a clear, non-hysterical, and somewhat soothing discussion of our AI future.
Yesterday, I was building a small Agentic PR reviewer for myself, and I realised how critical topological sort was in its entire application.
Essentially, every AI workflow is fundamentally a dependency problem. We do not just chain the prompts together, but also have situations where we may need to do things in an order that respects the dependencies.
This is where the classic DAG and topological sort come in handy.
I just wrote an essay on why DAGs and topological sorting are the core primitives required to design, debug, and scale AI workflows.
In this article, I break down how the dependency problem breaks linear pipelines, how to unlock "free" parallelism and cycle detection at definition time, and how this mental model scales seamlessly to multi-agent orchestration.
If you are building AI workflows (which I am sure you are), RAG pipelines, or multi-agent systems, this essay will give you a solid first-principles framework.
Give it a read.
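To make the dependency idea concrete, here is a minimal sketch using Python's standard-library `graphlib`. The workflow steps and their names are hypothetical (they are not taken from the essay); the point is only that a topological sort yields both a valid order and groups of steps that could run in parallel, and that cycles are caught before anything runs.

```python
from graphlib import CycleError, TopologicalSorter


def execution_levels(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into levels. Steps in the same level have all their
    dependencies satisfied, so they could be run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        levels.append(ready)
        sorter.done(*ready)
    return levels


# Hypothetical PR-reviewer workflow: each step maps to the steps it depends on.
workflow = {
    "fetch_diff": set(),
    "fetch_guidelines": set(),
    "summarize_changes": {"fetch_diff"},
    "check_style": {"fetch_diff", "fetch_guidelines"},
    "write_review": {"summarize_changes", "check_style"},
}

levels = execution_levels(workflow)


def has_cycle(deps: dict[str, set[str]]) -> bool:
    """Detect a cycle at definition time, before any step is executed."""
    try:
        TopologicalSorter(deps).prepare()
        return False
    except CycleError:
        return True
```

Here the two fetch steps land in the first level, the two analysis steps in the second, and the review step last. That grouping is the "free" parallelism, and `has_cycle` is the definition-time check.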
Database table size impacts performance in more ways than one:
a) B-tree depth. Using 8 KB pages and a 16-byte UUID key:
1 level = ~370 rows
2 levels = ~138k rows
3 levels = ~50m rows
4 levels = ~20b rows
The lookup cost on a table with 100k rows is not the same as on one with 1b rows. This can apply both to the table itself (MySQL clustered index) and to the secondary indexes. Sometimes a single query requires many of them.
b) Small table → fits in RAM → fast reads. The larger the table, the more likely to read from disk plus churn the cache.
c) Number of indexes. Each adds maintenance overhead on inserts and, in Postgres, vacuum overhead as well.
Keep an eye on this! It's useful to take regular stock of your tables + indexes. Clean bloat. Remove unused indexes. Partition if needed.
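The depth figures in (a) follow from the fanout. As a rough sketch, assuming about 370 entries per 8 KB page (the number implied by the one-level figure, not a measured value for any particular engine), the depth needed for a given row count can be estimated like this:

```python
def btree_levels(rows: int, fanout: int = 370) -> int:
    """Estimate how many B-tree levels are needed to index `rows` entries,
    assuming every page holds `fanout` entries (a simplification: real
    pages vary with fill factor, key size, and per-entry overhead)."""
    levels = 1
    capacity = fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels


# Capacity per depth under the assumed fanout: 370, 370**2, 370**3, 370**4.
capacities = [370 ** depth for depth in range(1, 5)]
```

Under that assumption a 100k-row table needs 2 levels and a 1b-row table needs 4, so each lookup touches twice as many pages.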
In the past I usually wrote tests against postgres stuff by creating a transaction and rolling back. This is getting harder and harder, particularly if you have more than one service. I wonder what people do nowadays for tests to run efficiently and concurrently.
You have 800GB worth of unique IDs.
How do you solve the cardinality problem using just 120MB?
Simple. Read the story of how Reddit did it: 👇
In 2017, Reddit wanted to better communicate the scale of its communities to its users.
The easiest way to do that? Show a view counter.
But with scale comes challenges. 😓
Naively storing a set of unique IDs as longs (8 bytes each) can quickly rack up memory and disk usage - a single 10 million view post is 80MB in that implementation.
Considering you need to:
• read
• modify
• persist
this collection every time a user views the post for their first time, you can imagine how it can become expensive. 💰
Now apply this to thousands of such posts (old ones, etc.)
• 10k posts with 10 million views equal 800GB.
Almost a terabyte which needs to be accessed concurrently in the system as views come and go on all posts... It suddenly becomes a very hard problem to solve.
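The arithmetic behind those numbers, as a quick sketch (decimal units, and ignoring any per-entry overhead a real set structure would add):

```python
BYTES_PER_ID = 8             # one long per unique viewer
VIEWS_PER_POST = 10_000_000  # unique views on a single large post
POSTS = 10_000               # number of such posts

per_post_bytes = BYTES_PER_ID * VIEWS_PER_POST
total_bytes = per_post_bytes * POSTS

per_post_mb = per_post_bytes / 1_000_000  # 80.0 MB per post
total_gb = total_bytes / 1_000_000_000    # 800.0 GB across all posts
```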
Thankfully, Reddit realized they can use a very specific set of algorithms that are a perfect fit for this sort of Big Data problem:
✨ Sketch Algorithms ✨
Sketch algorithms are a set of algorithms that trade off accuracy for disproportionately massive efficiency gains.
In other words, their result isn't 100% correct. But that's fine because their main benefits are:
• small & consistent size 📏
• sub-linear space growth - input data grows linearly while space requirement does not. 🐣
• mathematically-proven error bound 🙅♂️
• and more...
Because of that, sketch algos see wide use across the industry.
The sketch algorithm Reddit used is called HyperLogLog. They leveraged its implementation in Redis, which is designed such that:
• it supports up to 2^64 elements - 18 quintillion 🤯
• it uses up to 12KB of memory.
That’s just 0.015% usage of the original naive implementation, and said percentage only becomes smaller the more input elements we add.
In other words, there is no space growth - you can store 18 quintillion objects in 12KB of memory.
• its standard error is 0.81%
This means that if you have a set of 1 million IDs, the estimate will typically land between 991,900 and 1,008,100 (within one standard error).
A very acceptable error for such massive memory savings. One could call it negligible.
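To show where the fixed size and the 0.81% figure come from, here is a toy HyperLogLog in pure Python. It is a simplified sketch, not Redis's implementation: it assumes 2^14 = 16,384 registers (the count Redis uses; at 6 bits each that is 12,288 bytes, i.e. the 12KB above), uses SHA-1 as a stand-in hash, and skips the sparse encoding and bias corrections a production version has.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p: int = 14):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # 16,384 registers when p = 14
        self.registers = [0] * self.m   # a plain list here; Redis packs 6 bits each
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # 64-bit hash: the first p bits pick a register, the rest feed the rank.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> int:
        # Harmonic-mean estimate over all registers.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)


hll = HyperLogLog()
for i in range(200_000):
    hll.add(f"user-{i}")
estimate = hll.count()

# Theoretical standard error for m registers: 1.04 / sqrt(m), about 0.81%.
standard_error = 1.04 / math.sqrt(hll.m)
```

The structure never grows: adding more IDs only raises values inside the same 16,384 registers. At 12KB per post, 10k posts come to roughly 120MB, which is where the headline figure comes from.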
Now... of course they use Kafka 😎
It’s actually the key part of their data pipeline. (are you surprised?)
The end to end flow of their event counting looks roughly like this:
1. a user views a post → an event gets fired into an event collector server.
2. this server batches the events and produces them to Kafka.
3. Nazar, a Kafka consumer app, reads each event and decides whether it should be counted or not (based on rules in Redis).
4. Nazar produces the event back into Kafka with a boolean denoting the decision on whether to count it.
5. Abacus, another Kafka consumer app, reads the processed events and attempts to count each valid event.
6. To execute the counting, Abacus uses the HyperLogLog data structure in Redis and persists it to Cassandra every 10 seconds. This helps restore it in case it’s evicted from Redis’ memory.
That's it.
A very simple pipeline, using a well-abstracted "simple" algorithm which solves a very hard big data problem.
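The shape of the six steps above can be sketched in a few lines. Everything here is a stand-in: plain Python iterables play the role of Kafka topics, a `set` plays the role of the Redis HyperLogLog, and the "blocked users" rule is a hypothetical example of a counting rule, since the post does not say what the real rules are.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class ViewEvent:
    post_id: str
    user_id: str
    should_count: Optional[bool] = None  # filled in by the decision stage


def decide(events: Iterable[ViewEvent], blocked_users: set) -> Iterator[ViewEvent]:
    """Steps 3-4 (the Nazar role): tag each event with a count/skip decision.
    The blocked-user check is a made-up rule for illustration."""
    for event in events:
        yield ViewEvent(event.post_id, event.user_id,
                        event.user_id not in blocked_users)


def count_views(events: Iterable[ViewEvent]) -> dict:
    """Steps 5-6 (the Abacus role): count unique viewers per post.
    A set is exact but grows with input; the real system uses HyperLogLog."""
    viewers = {}
    for event in events:
        if event.should_count:
            viewers.setdefault(event.post_id, set()).add(event.user_id)
    return {post: len(users) for post, users in viewers.items()}


raw_events = [
    ViewEvent("p1", "u1"),
    ViewEvent("p1", "u1"),   # repeat view, same user
    ViewEvent("p1", "u2"),
    ViewEvent("p1", "bot"),  # filtered by the hypothetical rule
    ViewEvent("p2", "u1"),
]
counts = count_views(decide(raw_events, blocked_users={"bot"}))
```

Splitting the decision from the counting, as the real pipeline does, means each stage can be scaled and replayed on its own.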
Cloudflare built a global cache purge system that runs under 150 milliseconds. This is how they did it.
Using RocksDB to maintain the local CDN cache, a peer-to-peer distributed system across data centers, and clever engineering, they went from a 1.5-second purge down to 150 ms.
However, this isn’t the full picture, because that 150 ms is actually the P50.
In this video I explore how Cloudflare's CDN works, and how the old core-based, centralized Quicksilver lazy purge compares to the new coreless, decentralized active purge.
I explore the pros and cons of both systems and give you my thoughts on them.
One of my favorite videos
Video: https://t.co/z7HlOFPX27
Audio: https://t.co/WzZDNPuwM9
I just read an exciting blog from Zerodha about sending signed PDF reports for daily trading transactions! The scale of operations and the quick turnaround time with their new architecture are fascinating!
https://t.co/lIpdNSzBAy
Wrote up some notes on Cloudflare's fascinating new SQLite-backed "Durable Objects" system, which encourages an architectural style where your application creates thousands of tiny read-write SQLite databases scattered across Cloudflare's network https://t.co/ApbEQAuryo
NotebookLM is quite powerful and worth playing with
https://t.co/EMHIjc15iU
It is a bit of a re-imagination of the UIUX of working with LLMs organized around a collection of sources you upload and then refer to with queries, seeing results alongside and with citations.
But the current most new/impressive feature (that is surprisingly hidden almost as an afterthought) is the ability to generate a 2-person podcast episode based on any content you upload. For example someone took my "bitcoin from scratch" post from a long time ago:
https://t.co/7ajZNZ0BGi
and converted it to podcast, quite impressive:
https://t.co/ZZn0LJgsnu
You can podcastify *anything*. I gave it train_gpt2.c (C code that trains GPT-2):
https://t.co/gDrAqix4Iv
and made a podcast about that:
https://t.co/bgcwmQr5d7
I don't know if I'd exactly agree with the framing of the conversation and the emphasis or the descriptions of layernorm and matmul etc but there's hints of greatness here and in any case it's highly entertaining.
Imo LLM capability (IQ, but also memory (context length), multimodal, etc.) is getting way ahead of the UIUX of packaging it into products. Think Code Interpreter, Claude Artifacts, Cursor/Replit, NotebookLM, etc. I expect (and look forward to) a lot more and different paradigms of interaction than just chat.
That's what I think is ultimately so compelling about the 2-person podcast format as a UIUX exploration. It lifts two major "barriers to enjoyment" of LLMs. (1) Chat is hard. You don't know what to say or ask. In the 2-person podcast format, the question asking is also delegated to an AI so you get a much more chill experience instead of being a synchronous constraint in the generating process. (2) Reading is hard and it's much easier to just lean back and listen.
Haha we've all been there. I stumbled upon this tweet earlier today and tried to write a little utility that auto-generates a git commit message based on the git diff of staged changes. Gist:
https://t.co/1SbQsHSNwK
So just typing `gcm` (short for git commit -m) auto-generates a one-line commit message and lets you accept, edit, regenerate, or cancel. Might be fun to experiment with.
Uses the excellent `llm` CLI util from @simonw
https://t.co/LnHeCSfiHc
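For readers who want the flow spelled out, here is a rough Python sketch of the same idea. It is not the code from the gist: the prompt text, function names, and menu wording are all made up for illustration. It assumes `git` and the `llm` CLI are installed and configured, and that the diff is piped to `llm` on stdin with the instruction passed via `-s` (system prompt).

```python
import subprocess

# Hypothetical prompt; the gist may word this differently.
PROMPT = ("Write a concise one-line git commit message for this diff. "
          "Reply with the message only.")


def staged_diff() -> str:
    """Return the diff of staged changes."""
    return subprocess.run(["git", "diff", "--cached"],
                          capture_output=True, text=True, check=True).stdout


def generate_message(diff: str) -> str:
    """Ask the `llm` CLI for a commit message, feeding the diff on stdin."""
    result = subprocess.run(["llm", "-s", PROMPT], input=diff,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def git_commit(message: str) -> None:
    subprocess.run(["git", "commit", "-m", message], check=True)


def gcm(get_diff=staged_diff, generate=generate_message,
        ask=input, commit=git_commit):
    """Generate a message, then loop until the user accepts, edits, or cancels.
    The collaborators are parameters so the loop can be exercised without
    git or a model."""
    diff = get_diff()
    if not diff.strip():
        return None  # nothing staged
    message = generate(diff)
    while True:
        choice = ask(f"{message}\n[a]ccept / [e]dit / [r]egenerate / [c]ancel: ")
        choice = choice.strip().lower()
        if choice == "a":
            commit(message)
            return message
        if choice == "e":
            message = ask("New message: ").strip()
            commit(message)
            return message
        if choice == "r":
            message = generate(diff)
        elif choice == "c":
            return None
```

Calling `gcm()` inside a repository with staged changes would start the interactive loop.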
Anyone who professionally builds for the Web should read the Reckoning series by @slightlylate
Start with this case-study of how the California food stamps signup site takes 29.5s to become interactive on a rural internet connection: https://t.co/dOjEZHUngP
In which order does an SQL query run?
Understanding the order in which SQL queries run is critical to optimizing them.
Typically, SQL queries are processed in a logical order that differs from the one in which the SQL statements are written.
Here is the logical order in which SQL queries are processed:
1. FROM + JOIN
The first step is to process the data sources (tables, views, etc.) specified in the FROM clause. The data is read from the tables and combined from multiple sources if joins exist.
2. WHERE
The WHERE clause filters the rows based on the given conditions. Rows that do not meet the conditions are not considered for further processing.
3. GROUP BY
If a GROUP BY clause is present, rows are grouped based on the specified columns. Aggregate functions (like SUM, COUNT, or AVG) are applied to each group.
4. HAVING
If a HAVING clause is present, the groups are filtered according to aggregate conditions. Groups that meet the conditions are included.
5. SELECT
The SELECT clause is then applied to the result set. Columns are selected to compose the result data.
6. ORDER BY
If an ORDER BY clause is present, the result data is sorted based on the specified columns.
7. LIMIT/OFFSET
If there is a LIMIT or OFFSET clause (used in some database systems), the final result set is trimmed to the requested row count, starting at the given offset.
The main takeaways from this execution order are:
- Use the WHERE clause effectively to reduce the size of the data set early in the query process
- Since the HAVING clause is executed after the WHERE and GROUP BY clauses, move any filter conditions that don't depend on aggregation from HAVING to WHERE
- LIMIT and OFFSET clauses are applied late in the query process and mainly affect the final result set, not the performance of the query execution
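The second takeaway can be tried out with a small sketch using Python's built-in `sqlite3`. The `orders` table and its rows are invented for illustration. Both queries return the same rows, but in the logical order the first one discards non-matching rows before grouping while the second groups everything and filters afterwards. What a given database's optimizer actually does may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("alice", "eu", 50), ("alice", "eu", 70), ("bob", "us", 30),
     ("bob", "us", 200), ("carol", "eu", 20)],
)

# Non-aggregate filter in WHERE: rows are dropped before GROUP BY runs.
where_first = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE region = 'eu'
    GROUP BY customer
    HAVING SUM(amount) > 100
    ORDER BY customer
    """
).fetchall()

# Same filter pushed into HAVING: every row is grouped, then groups are dropped.
having_only = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer, region
    HAVING region = 'eu' AND SUM(amount) > 100
    ORDER BY customer
    """
).fetchall()
```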
I'm creating a list of 5 of the "best books that will change how to see the world" for @Shepherd_books.
I went through the (extensive!) Further Readings list in A Theory of Everyone and have narrowed it down to 15 books.
Please help me shortlist with ♥️, 🔁 & 💬!
🧵
New YouTube video: 1hr general-audience introduction to Large Language Models
https://t.co/Bl4WNuNyFJ
Based on a 30min talk I gave recently; it tries to be a non-technical intro and covers mental models for LLM inference, training, finetuning, the emerging LLM OS, and LLM security.
RabbitMQ is an open source message broker tool often used in distributed and pub-sub systems.
And you can configure a single instance to use different virtual hosts for each app.
In this tutorial, Ridwan walks you through exactly how it all works.
https://t.co/vcx1xrLRye
Just found this monograph on B-trees that takes a fairly holistic perspective, covering the data structures and algorithms as well as the use of B-tree indexes in databases, transactional techniques, and query processing techniques.
Modern B-Tree Techniques - Goetz Graefe
https://t.co/PaHAf88dqS