Alex Kira

Andrew Lamb @andrewlamb1111

15 days ago

Great write up from A16z on the AI app layer

Joe Schmidt IV

@joeschmidtiv

17 days ago

https://t.co/i7teOUTWgZ

AlexKira retweeted

about 1 month ago

A new Database product based on @ApacheDataFusio was announced today from @LangChain -- focused on agent observability. It is really neat to see how people are building (very) customized data + query systems faster than ever now that they don't have to build the whole stack

28 days ago

Excited to see the SmithDB announcement at Interrupt, our purpose-built distributed database for agent observability! SmithDB is built on top of object storage, written in Rust, and leverages Apache DataFusion and the Vortex file format.

about 1 month ago

Just announced at Interrupt! SmithDB. Agent traces have outgrown the databases built to hold them. That’s why we built SmithDB, a purpose-built distributed database for agent observability. Read the announcement from Co-Founder @ankush_gola11 → https://t.co/mu1zvuujwt

AlexKira retweeted

Jaya Gupta

@JayaGup10

about 1 month ago

https://t.co/5vDrndcmFn

AlexKira retweeted

Viv

@Vtrivedy10

about 1 month ago

we're on an Open Model mission to help builders create world class agents >20x cheaper than what they have today

a couple things have become evident recently:
1. The age of the token subsidy is being pulled back
2. Open Models have crossed an intelligence threshold making them viable for real world agents at a fraction of the cost

As teams get exponentially larger monthly bills from the labs, it's worth exploring how many agents today perform just as well using Open Models

Check out the numbers on external evals + try it yourself by dogfooding and running on internal evals - @OpenRouter and @ArtificialAnlys have great leaderboards and breakdowns of what people are using.

The time investment is definitely worth the massive cost savings
- Instead of Sonnet 4.6 (or even 5.5/Opus) try Kimi-2.6, GLM5.1, Deepseek v4 pro, etc
- Instead of Haiku try DeepSeekv4 Flash, Nemotron, etc

Open models require some tuning to make sure they work well in your harness for your task (another reason why open harnesses are important)

The closed models are excellent, there's no need to full-scale rip them out. Often the first use of Open Models is as subagents or using a closed frontier model as an Advisor to an open driver model

At LangChain we want to make it as easy as possible to build the best agents in the world as cheaply and quickly as possible. We're leaning into open models heavily across our products and libraries

try out an open model in deepagents in just a couple lines and come ride the open model, open harness future

about 1 month ago

Great read on Deep Agents harness profiles, which support customization and tuning per model

Viv

@Vtrivedy10

about 1 month ago

https://t.co/cPCn0wrwf1

AlexKira retweeted

2 months ago

🚀 Interrupt sold out last year, so don't wait: May 13-14 in San Francisco. https://t.co/XFDhS88YFP Last year, hundreds of builders packed a room to share what's actually working in agent production. This year, we're going bigger. @hwchase17, @AndrewYNg, and @cj_mongodb are headlining. Alongside them: real-world sessions from teams shipping agents today, time with LangChain engineers, and a pioneering AI builder community. If you're working on agents or thinking about it, this is two days with the people furthest along.

2 months ago

Nice post on security in the age of agents

sarah guo

@saranormous

2 months ago

https://t.co/xzFDudnHNX

AlexKira retweeted

LangChain OSS

@LangChain_OSS

3 months ago

🔗 Announcing LangChain OSS Skills. LangChain has the most popular frameworks for building AI agents, and now your coding agent can be an expert in them. We're excited to release the first iteration of LangChain OSS Skills, giving your agent expertise in our open source frameworks. The skills include guidance on how to use langchain, langgraph, and deepagents to effectively build agents. ➡️ Install our OSS skills for your coding agent here: https://t.co/PoOgOeLpMQ ➡️ Read more: https://t.co/9spxPkyd7J

3 months ago

New LangChain rebrand!!

3 months ago

What? LangChain is evolving! Meet our final form ➡️ https://t.co/f6eGVFchtB

AlexKira retweeted

Viv

@Vtrivedy10

4 months ago

Building Better Coding Agent Harnesses
at @LangChain we're thinking hard about the science of harness engineering + open research on what works & doesn't

A quick peek on our deepagents X Terminal Bench 2.0 work, shoutout to @alexgshaw & Harbor (they're great). Broad research goals:
1. Find general purpose agent improvement recipes
2. Measure which design changes most affect model performance and how
3. Measure if/how models are non-fungible in their harness

Some previews on what worked well:
- Self-Verification & Iteration as first class citizens. Models are very good at self correction if they get a feedback signal, but they often won't participate in this loop. So designing prompts & deterministic hooks to force them into this helps a lot.
- Automated Context Engineering: Pre-fetching some environment context up front avoids discovery errors for tools/files.
- Large scale reflection over Traces is a powerful general recipe for stratifying errors + validating proposed improvements

We'll be releasing a blog and research artifacts soon on all of this! Will return to measure more vectors of harness design + use codex-5.3

If you're interested in effective harness engineering and building great coding agents, would love to hear from you

4 months ago

Interesting trace analysis for clawdbot!

Sam Crowder

@samecrowder

5 months ago

clawdbot really made the rounds on Twitter over the weekend with people posting about how it's booked flights for them, made dinner reservations, and more. But what are the failure modes? Let's dig into execution traces to find out 🧵

AlexKira retweeted

Hunter Lovell

@huntlovell

5 months ago

Now in openwork 0.2! Organize your agents in a kanban view. See all your threads at a glance: what's running, what's waiting for you, and what's done. Subagents show up as cards too so you can track parallel work. Try it with npx openwork@latest!

AlexKira retweeted

Noam Brown

@polynoamial

5 months ago

I vibecoded an open-source poker river solver over the holiday break. The code is 100% written by Codex, and I also made a version with Claude Code to compare.

Overall these tools allowed me to iterate much faster in a domain I know well. But I also felt I couldn't fully trust them. They'd make mistakes and encounter bugs, but rather than acknowledging it they'd often think it wasn't a big deal or, on occasion, just straight up try to gaslight me into thinking nothing is wrong.

In one memorable debugging session with Claude Code I asked it, as a sanity check, what the expected value would be of an "always fold" strategy when the player has $100 in the pot. It told me that according to its algorithm, the EV was -$93. When I pointed out how strange that was, hoping it would realize on its own that there's a bug, it reassured me that $93 was close to $100 so it was probably fine. (Once I prompted it to specifically consider blockers as a potential issue, it acknowledged that the algorithm indeed wasn't accounting for them properly.) Codex was not much better on this, and ran into its own set of (interestingly) distinct bugs and algorithmic mistakes that I had to carefully work through. Fortunately, I was able to work through these because I'm an expert on poker solvers, but I don't think there are many other people that could have succeeded at making this solver by using AI coding tools.

The most frustrating experience was making a GUI. After a dozen back-and-forths, neither Codex nor Claude Code were able to make the frontend I requested, though Claude Code's was at least prettier. I'm inexperienced at frontend, so perhaps what I was asking for simply wasn't possible, but if that was the case then I wish they would have *told* me it was difficult or impossible instead of repeatedly making broken implementations or things I didn't request. It highlighted to me how there's still a big difference between working with a human teammate and working with an AI.

After the initial implementations were complete and debugged, I asked Codex and Claude Code to create optimized C++ versions. On this, Codex did surprisingly well. Its C++ version was 6x faster than Claude Code's (even after multiple iterations of prompting for further optimizations). Codex's optimizations still weren't as good as what I could make, but then again I spent 6 years of PhD making poker bots. Overall, I thought Codex did an impressive job on this.

My final request was asking the AIs if they could come up with novel algorithms that could solve NLTH rivers even faster. Neither succeeded at this, which was not surprising. LLMs are getting better quickly, but developing novel algorithms for this sort of thing is a months-long research project for a human expert. LLMs aren't at that level yet.

AlexKira retweeted

Simon Willison

@simonw

5 months ago

Here's my enormous round-up of everything we learned about LLMs in 2025 - the third in my annual series of reviews of the past twelve months https://t.co/HD9Zf85SG2 This year it's divided into 26 sections! This is the table of contents:

Kaivalya Apte - The Geek Narrator

7 months ago

Really enjoyed this episode!

@thegeeknarrator

7 months ago

Here we go TigerStyle 🚀 Episode with @jorandirkgreef, Founder & CEO of @TigerBeetleDB, on TigerStyle is now available to watch. https://t.co/J2n3gRVDVe Please like, share and subscribe!

8 months ago

We're also hiring across the board: https://t.co/A6jRU67qOs, come work with us! Also looking for an SDK engineer for LangSmith: https://t.co/r8WPYso3vP

8 months ago

LangChain has raised a $125M Series B, valuing the company at $1.25B 🦜🔗 It's been a great ride so far, seeing LangChain in the last year grow from 1 to 3 products with traction and customers like Replit, Rippling, Clay, and Cloudflare using us to ship AI agents in production!

over 1 year ago

Excited for Interrupt, the AI Agent conference by LangChain! With a great lineup of speakers, join us in SF!

over 1 year ago

Tickets just dropped for Interrupt: The AI Agent Conference by LangChain 🚀 🌒 Join us this May 13-14 in San Francisco for Interrupt, LangChain’s first-ever conference — a space for anyone building or shaping the future of AI agents. Learn more: https://t.co/vl9uNssvlK
