We built SmithDB, a database purpose-built for agent observability workloads that now powers many parts of LangSmith.
Agent observability presents a challenging data problem. Agent traces can contain tens of thousands of intermediate spans and large, unbounded payloads. Both are a direct result of agents running over longer time horizons and LLM context windows growing.
Traditional data infrastructure was not built to handle the complexities associated with storing and querying this data.
SmithDB gives LangSmith up to 12x performance improvements across the access patterns most important for agent observability. I’ve been working on SmithDB directly with an amazing team over the past few months, and I’m incredibly proud of the results we’re seeing.
I wrote a bit more about the story and engineering challenges behind SmithDB in this blog post.
Additionally, if you’re a systems engineer interested in building the future of agent observability, please reach out!
Sharing some thoughts this morning about the different paths you can take as a software engineer who's looking for more independence and perhaps, eventually, to become a solopreneur.
It's not just about building a product or becoming a consultant.
https://t.co/PvhhD8BRiy
Data Streaming Academy has tutorials now! The first tutorial covers reading and writing Kafka consumer offsets in Apache Flink using the State Processor API.
https://t.co/GoBloaZcln
Recently, I spent a lot of time benchmarking various Iceberg sinks (Spark, Flink, Kafka Connect) and trying to beat Supermetal for moving data from Postgres to Iceberg. Here's my report: https://t.co/lOmu6XN5w4