BentoLabs AI (YC P26) @BentoLabsAI - Twitter Profile

Pinned Tweet

BentoLabs AI (YC P26) @BentoLabsAI

4 days ago

We are live on @ycombinator! Building the monitoring and learning layer for long-running agents.

Y Combinator

@ycombinator

4 days ago

.@BentoLabsAI is the monitoring and learning layer for long-running agents. Their learning layer gives agents model-jump gains: Sonnet 4.5 went 42.2%→52.4% on TB2 (Internal). Congrats on the launch, @Abhinavv_soni & @kacppian! https://t.co/LTy5sslfni

39

216

35

57

24K

1

9

1

655

BentoLabs AI (YC P26) @BentoLabsAI

about 8 hours ago

Does your agent actually learn from its failures, or just log them?

0

2

0

18

BentoLabs AI (YC P26) @BentoLabsAI

about 12 hours ago

@Rahul_J_Mathur Thank you @Rahul_J_Mathur🙌🏻, glad Bento made this list.

0

87

BentoLabs AI (YC P26) @BentoLabsAI

about 12 hours ago

Teams running agents in production, what is one thing you wish your monitoring could tell you that it currently cannot?

0

2

1

32

BentoLabs AI (YC P26) @BentoLabsAI

1 day ago

@abhitwt The human in the loop out here doing god's work, all because the toilet paper fumbled the first few steps. we've all been that person :,)

1

7

0

2K

BentoLabs AI (YC P26) @BentoLabsAI

1 day ago

Why do agents that work in staging often degrade in production? It's usually a diagnostic failure. Users use your agents in ways you can't even imagine, which results in failures that are even harder to catch and fix. Our framework helps you spot which layer is actually breaking. Read here 👇🏻 https://t.co/M6542A7cJV

0

4

3

0

124

BentoLabs AI (YC P26) @BentoLabsAI

3 days ago

We ran our recursive learning layer on Terminal-Bench 2.0. Same agent. Same model. Same harness. Same budget. The result: Claude Sonnet went from 42.2% → 52.4%. A +10.2 percentage-point lift, significant at p < 0.05, with a 13:3 task-level win/loss ratio (internal). The only variable was a learning layer. We wrote a full technical breakdown on what changed, why it worked, and what this means for production AI agents. Read it here 👇 https://t.co/VhVCgURuXv
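
The tweet does not say which statistical test produced the p < 0.05 figure. As a rough sanity check only, the sketch below assumes a two-sided sign test on the reported 13:3 task-level win/loss split, ignoring ties. This is an illustrative assumption, not necessarily the method BentoLabs used, and the function name is hypothetical.

```python
from math import comb


def sign_test_two_sided(wins: int, losses: int) -> float:
    """Two-sided sign test p-value under the null that wins and losses are equally likely.

    Assumption: tied tasks are dropped, so only wins + losses are counted.
    """
    n = wins + losses
    k = min(wins, losses)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for the two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Figures quoted in the tweet.
lift_pp = round(52.4 - 42.2, 1)       # percentage-point lift
p_value = sign_test_two_sided(13, 3)  # 13 task-level wins, 3 losses

print(f"lift: +{lift_pp} pp, sign-test p = {p_value:.4f}")
```

Under these assumptions the p-value comes out near 0.021, which is consistent with the "significant at p < 0.05" claim. It says nothing about how the original analysis was actually run.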

0

10

4

0

245

BentoLabsAI retweeted

Kaushik @kacppian

8 days ago

My co-founder @Abhinavv_soni clearly does not like being kept away from his agents. PS: BTS of the BTS. #launchsoon

0

4

2

3

366

BentoLabs AI (YC P26) @BentoLabsAI

8 days ago

Consider this a glimpse of what's coming #launchsoon

Abhinav Soni

@Abhinavv_soni

8 days ago

Almost didn't document the day, my team made sure we did. Some days you just feel the shift. We booked a studio, brought in a team, and spent the day trying to capture what @BentoLabsAI actually is right now. Where we started, where we are, and where we're going. It's one thing to build in silence. It's another to finally show it in a much bigger way. We've been deep in production agent systems, working with some of the top teams running AI at scale today. The problem we set out to solve is more real than ever. Not ready to say more just yet. But it's close. Stay tuned.

2

23

6

2

1K

0

8

1

0

490

BentoLabs AI (YC P26) @BentoLabsAI

9 days ago

We are live on the YC Directory. https://t.co/6nEr0It6Do

0

22

6

4

5K

BentoLabsAI retweeted

Abhinav Soni

@Abhinavv_soni

11 days ago

Stop fine-tuning your model. Your harness is broken. The model was fine.

0

13

4

0

426

BentoLabs AI (YC P26) @BentoLabsAI

11 days ago

The model: “I can do it, I promise.”

The harness:
❌ wrong context
❌ broken retrieval
❌ timeout
❌ hallucinated tool response

Everyone:
“wow, this model is really bad” https://t.co/RDHT5LjwXW

0

7

3

0

339

BentoLabs AI (YC P26) @BentoLabsAI

12 days ago

Everyone wants AGI. You just want the pager to stop going off after deploying one prompt change. Don't worry. We got you. Coming soon to save you some sleep. https://t.co/7U8mt61YNW

1

4

1

0

144

BentoLabs AI (YC P26) @BentoLabsAI

13 days ago

Busy building for the humans behind AI agents. Stay tuned.

0

5

2

0

167

BentoLabs AI (YC P26) @BentoLabsAI

15 days ago

https://t.co/cl8cAsUYU4

0

4

0

122

BentoLabs AI (YC P26) @BentoLabsAI

15 days ago

Other tools show you the known failures. We show you the unknowns as well. But we don't stop at detection. We help prevent them from repeating. Knowledge carries forward. Successful patterns get reinforced. Failures get avoided. Knowledge compounds over time!

1

7

3

0

207

BentoLabs AI (YC P26) @BentoLabsAI

18 days ago

Why we call it 'learning infrastructure':

Traditional monitoring: Log → Alert → Debug → Patch → Repeat
Learning infrastructure: Capture → Learn → Apply → Improve → Compound

The difference between reactive maintenance and proactive improvement.

1

7

2

1

256

BentoLabs AI (YC P26) @BentoLabsAI

19 days ago

If you've ever noticed your production AI agents ignoring instructions mid-run and spent hours debugging it, here's why it might be happening and how you can fix it. https://t.co/OfZ2Z8Lasc

0

8

3

0

193
