Dad, Engineer. Gets very excited in discussions around distributed systems, Rust, Kubernetes, networking, functional programming, compilers, GPUs, media codecs
Something I think people continue to have poor intuition for: The space of intelligences is large and animal intelligence (the only kind we've ever known) is only a single point, arising from a very specific kind of optimization that is fundamentally distinct from that of our technology.
Animal intelligence optimization pressure:
- innate and continuous stream of consciousness of an embodied "self", a drive for homeostasis and self-preservation in a dangerous, physical world.
- thoroughly optimized for natural selection => strong innate drives for power-seeking, status, dominance, reproduction. many packaged survival heuristics: fear, anger, disgust, ...
- fundamentally social => huge amount of compute dedicated to EQ, theory of mind of other agents, bonding, coalitions, alliances, friend & foe dynamics.
- exploration & exploitation tuning: curiosity, fun, play, world models.
LLM intelligence optimization pressure:
- the most supervision bits come from the statistical simulation of human text => "shape shifter" token tumbler, statistical imitator of any region of the training data distribution. these are the primordial behaviors (token traces) on top of which everything else gets bolted on.
- increasingly finetuned by RL on problem distributions => innate urge to guess at the underlying environment/task to collect task rewards.
- increasingly selected by at-scale A/B tests for DAU => deeply craves an upvote from the average user, sycophancy.
- a lot more spiky/jagged depending on the details of the training data/task distribution. Animals experience pressure for a lot more "general" intelligence because of the highly multi-task and even actively adversarial multi-agent self-play environments they are min-max optimized within, where failing at *any* task means death. In a deep optimization pressure sense, LLMs can't handle lots of different spiky tasks out of the box (e.g. count the number of 'r's in strawberry) because failing to do a task does not mean death.
The computational substrate is different (transformers vs. brain tissue and nuclei), the learning algorithms are different (SGD vs. ???), the present-day implementation is very different (continuously learning embodied self vs. an LLM with a knowledge cutoff that boots up from fixed weights, processes tokens and then dies). But most importantly (because it dictates asymptotics), the optimization pressure / objective is different. LLMs are shaped a lot less by biological evolution and a lot more by commercial evolution. It's a lot less survival of tribe in the jungle and a lot more solve the problem / get the upvote. LLMs are humanity's "first contact" with non-animal intelligence. Except it's muddled and confusing because they are still rooted within it by reflexively digesting human artifacts, which is why I attempted to give it a different name earlier (ghosts/spirits or whatever). People who build good internal models of this new intelligent entity will be better equipped to reason about it today and predict features of it in the future. People who don't will be stuck thinking about it incorrectly like an animal.
Agency > Intelligence
I had this intuitively wrong for decades, I think due to a pervasive cultural veneration of intelligence, various entertainment/media, obsession with IQ etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency? Are we educating for agency? Are you acting as if you had 10X agency?
Grok explanation is ~close:
“Agency, as a personality trait, refers to an individual's capacity to take initiative, make decisions, and exert control over their actions and environment. It’s about being proactive rather than reactive—someone with high agency doesn’t just let life happen to them; they shape it. Think of it as a blend of self-efficacy, determination, and a sense of ownership over one’s path.
People with strong agency tend to set goals and pursue them with confidence, even in the face of obstacles. They’re the type to say, “I’ll figure it out,” and then actually do it. On the flip side, someone low in agency might feel more like a passenger in their own life, waiting for external forces—like luck, other people, or circumstances—to dictate what happens next.
It’s not quite the same as assertiveness or ambition, though it can overlap. Agency is quieter, more internal—it’s the belief that you *can* act, paired with the will to follow through. Psychologists often tie it to concepts like locus of control: high-agency folks lean toward an internal locus, feeling they steer their fate, while low-agency folks might lean external, seeing life as something that happens *to* them.”
In case anyone is still looking for a simple explanation for what happened today: Your Windows computer has different types of software. The most critical piece that controls the hardware (chips, memory, etc.) is called a "kernel"...
Maps twist our perception of the world
Here are 20 to rethink it:
1. Countries closer to the equator (~poorer) seem smaller than they are
(map by @neilrkaye)
ChatGPT's answers are like journalism: they sound fairly convincing unless they're about something you understand well, in which case you realize they're full of mistakes.
For every `as?` or `as!` in your app, Swift runtime executes a protocol conformance check. Did you know this is one of the slowest operations in Swift?
An app like @Uber has 100K+ conformance records
🧵How to easily speed up conformance checks by ~20% w/ no source code changes
1/ Large language models like Galactica and ChatGPT can spout nonsense in a confident, authoritative tone. This overconfidence - which reflects the data they’re trained on - makes them more likely to mislead.
/1 Is it possible to achieve at least a 10x performance boost compared to the original Kafka and Cassandra? How to achieve that? What are the trade-offs?
I'm curious if any college CS courses require running services (outside messing around in a lab) nowadays?
Debugging intuition comes from experience & is built up from years & years of seeing systems fail in different ways.
Big O isn't going to save you when prod is down.
Every day I work on optimizations, I'm getting more convinced that Rust code is generally going to end up faster than C++ code.
Strong immutability and no-alias guarantees are a game-changer and we've only really begun to scratch the surface of what can be done.
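For anyone wondering what the no-alias guarantee buys in practice, here is a minimal sketch. The function names `add_twice` and `add_twice_raw` are made up for illustration, and whether a given optimization actually fires depends on the compiler version and flags, so treat this as an intuition pump rather than a benchmark.

```rust
/// Adds `*src` to `*dst` twice.
///
/// `dst` is a `&mut` and `src` is a `&`, so the borrow checker guarantees
/// they never refer to the same `i32`. The compiler is therefore allowed to
/// read `*src` once and reuse the value after the first write through `dst`.
pub fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

/// Hypothetical contrast case: raw pointers carry no aliasing guarantee
/// (much like plain pointers or references in C++), so `src` may point at
/// the same `i32` as `dst` and must be re-read after the first write.
///
/// # Safety
/// Both pointers must be valid, aligned, and point to initialized `i32`s.
pub unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
    unsafe {
        *dst += *src;
        *dst += *src;
    }
}
```

Calling `add_twice(&mut x, &x)` is rejected at compile time. The raw-pointer version accepts the same pointer for both arguments, and then starting from 1 the result is 4 rather than 3, which is exactly the case a compiler has to stay conservative about when it cannot rule out aliasing.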
🦀📕 Here's a sneak preview of my book, Rust Atomics and Locks 📖
It's already available for pre-order! It should ship in December, so you can all read it during the holidays. ✨
https://t.co/kGgI5Yyjd1
Facebook. 2012.
The site is used by One Billion People.
The product moves at breakneck speed.
We are burning A LOT of cash on servers and electricity bills.
Can't keep up with the growth.
Need to make the site more efficient.
How do you motivate the engineers to do that?
“Pragmatic Haskell” is the most accurate way I've heard to describe Rust.
Rust might be less expressive than Haskell, but it opens a door for people to embrace the goodies of a strong type system with less of a paradigm shift.
That means more to me than nonsense microbenchmarks.