I loved reading the first edition of "Designing Data-Intensive Applications". It is the most comprehensive book on the types of systems I work on. I recently started reading the second edition. It was a pleasant surprise—and rather humbling—to read the references in Chapter 1.
DuckDB Labs becomes DuckLabs. We ended up working on more than DuckDB, i.e., DuckLake and most recently, Quack. It was time to change the company name to reflect this. Nothing else changes – read our blog post for more details.
https://t.co/iaF2TC6mBT
Words should always be careful, but not necessarily precise. Sometimes the imprecise word is the right one.
Nothing should be too tight, especially your thinking.
AI increases the demand for coordination work but doesn’t reduce the costs of coordination work. Watch for coordination challenges to increase in the near future.
I'm hiring in Palo Alto, Austin, Amsterdam, and Toronto. Reach out if you want to work on distributed systems, databases, or edge computing for scaling solar manufacturing, battery energy storage, vehicle charging, virtual power plants, and more.
@breckcs It’s such an underrated skill.
In an organization everything is in constant movement. If you want to contribute a maximum you need to understand your environment, what is lacking and how you and team can contribute best. Self-Awareness.
Great question.
I don't use LLMs for writing.
I use agents extensively for brainstorming, research, checking facts, handling markup, finding references, indexing data, and so on.
But I think that asking people to read LLM-generated text breaks a kind of social contract.
Pre-object store design permeates the whole architecture. If you swap out storage, you are still left with an architecture that predates object stores. Leadership election, compute, caching.. your entire system needs to be rebuilt.
How have the fundamentals of building large, distributed software systems changed the last decade? A conversation with Martin Kleppmann (author of Designing Data-Intensive Applications) - given that the second, updated edition of the book was just released.
Timestamps:
00:00 Early career
05:46 Building Rapportive
10:47 Working at LinkedIn
14:09 Writing Designing Data-Intensive Applications
23:00 Reliability, scalability, and repeatability
26:24 DDIA: the second edition
30:50 Tradeoffs of using cloud services
39:02 How the cloud changed scaling
42:53 The trouble with distributed systems
49:02 Ethics for software engineers
52:45 Formal verification
1:00:12 Academia vs. industry
1:03:50 Local-first software
1:09:50 Computer science education
1:18:32 Martin’s current research and advice
Brought to you by:
• @statsig – The unified platform for flags, analytics, experiments, and more. https://t.co/ZCSOIcWv31
• @SonarSource – The makers of SonarQube, the industry standard for code verification and automated code review. Check out Sonar's new architecture management capabilities that ensure both humans and AI agents respect your system’s blueprint. https://t.co/3Rycx5lfXE
• @WorkOS – Ship enterprise features – SSO, directory sync, RBAC, audit logs – in days, not months. https://t.co/Xxf2A82ixc
Three things worth considering, as discussed with Martin, in this episode:
1. Multi-region and multi-cloud are risk/cost trade-offs, not best practices.
Martin does not believe that there is a “best practice” in deciding whether to go multi-region or multi-cloud. This decision is a tradeoff between risk and costs. It’s a business decision to be made. Designing Data-Intensive Applications gives engineers the vocabulary to articulate the tradeoffs, not to dictate answers.
2. Replication for fault tolerance is more relevant for most engineers these days than sharding.
Though the book has a full chapter on sharding, Martin said that the cloud has reduced the need for manual sharding for the majority of teams. This is also because machines are increasingly bigger, and more workloads fit on a single machine. Sharding across machines is increasingly a specialist concern; replication for fault tolerance, however, is still relevant at every scale.
3. Knowing system internals as a superpower for application developers.
Martin maintains that Designing Data-Intensive Applications is not a book for people who build databases or even infrastructure, but it’s helpful for application developers to develop an intuition for making good design decisions and debugging performance issues we will eventually encounter.
I loved reading the first edition of "Designing Data-Intensive Applications". It is the most comprehensive book on the types of systems I work on. I recently started reading the second edition. It was a pleasant surprise—and rather humbling—to read the references in Chapter 1.
Supercharger wait times have been reducing through more Superchargers, larger sites, better Trip Planner and better long-term planning.
The best wait is no wait, but our Supercharger build plan forecasts aren't perfect and surge events do happen. In some cases, customers start self-organizing, like passing through lists of who's next in line like this example.
Knowing who's next and being able to leave your car can make this a much better experience, even for the shortest of waits. Through our vertical integration we're generating a dynamic waitlist, integrated into your Tesla and the Tesla app.
We'll be iterating on this based on feedback and, if successful, scaling it across more Superchargers.