Zihao Lin

@ZihaoLin685013

Co-founder of Klavis AI (YC X25)

San Francisco

Joined August 2024

116 Following

298 Followers

154 Posts

Pinned Tweet

Zihao Lin @ZihaoLin685013

6 months ago

Introducing Klavis MCP SaaS (Sandbox-as-a-Service)!

Reinforcement Learning environments are becoming the new labeled datasets. Just as curated data powered the last wave of AI breakthroughs, training environments are powering the next one: AI agents that can actually use tools.

But here's the problem: training models to use Gmail, Salesforce, Slack, or Jira requires things that are complex and painful to build yourself:
→ Managing hundreds of authenticated test accounts
→ Initializing realistic data states for each training episode
→ Resetting state between runs
→ Ensuring isolation across concurrent training sessions
→ Providing verifiable sandbox state for reward signals

Most research teams spend months on this infrastructure before a single training run.

That's why we are launching a managed MCP Sandbox-as-a-Service for RL training on tool use at Klavis AI. One API call → isolated sandbox backed by real service instances.

The training loop becomes simple:
Initialize → Seed sandbox with custom data state
Interact → Model executes actions via MCP tools
Dump → Get final state snapshot
Compute reward → Compare dump vs. target state
Reset → Return to pristine state instantly

Deterministic. Reproducible. Parallelizable.

If you're training models on tool use, we'd love to chat.
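The Initialize → Interact → Dump → Compute reward → Reset loop described in the post can be sketched roughly as follows. This is a minimal illustration only: `InMemorySandbox`, its methods, the `send_email` tool, and the key-matching reward are all hypothetical stand-ins invented for the example, not the Klavis API or an MCP client.

```python
import copy
from typing import Any, Callable

State = dict[str, Any]
Action = tuple[str, dict[str, Any]]


class InMemorySandbox:
    """Hypothetical in-memory stand-in for a managed sandbox (not a real API)."""

    def __init__(self) -> None:
        self._state: State = {}

    def initialize(self, seed_state: State) -> None:
        # Seed the sandbox with a custom data state for this episode.
        self._state = copy.deepcopy(seed_state)

    def call_tool(self, name: str, **args: Any) -> None:
        # One toy tool standing in for real tool calls.
        if name == "send_email":
            self._state.setdefault("sent", []).append(
                {"to": args["to"], "subject": args["subject"]}
            )
        else:
            raise ValueError(f"unknown tool: {name}")

    def dump(self) -> State:
        # Return a copy so callers cannot mutate the sandbox state.
        return copy.deepcopy(self._state)

    def reset(self) -> None:
        # Return to the pristine (empty) state.
        self._state = {}


def compute_reward(dump: State, target: State) -> float:
    """Fraction of target keys whose values match the dump exactly."""
    if not target:
        return 1.0
    matched = sum(1 for key, value in target.items() if dump.get(key) == value)
    return matched / len(target)


def run_episode(
    sandbox: InMemorySandbox,
    seed: State,
    target: State,
    policy: Callable[[State], list[Action]],
) -> float:
    sandbox.initialize(seed)                       # Initialize
    try:
        for tool_name, tool_args in policy(sandbox.dump()):
            sandbox.call_tool(tool_name, **tool_args)  # Interact
        final_state = sandbox.dump()               # Dump
        return compute_reward(final_state, target)  # Compute reward
    finally:
        sandbox.reset()                            # Reset, even on errors


def scripted_policy(state: State) -> list[Action]:
    # A fixed policy in place of a model; always sends one reply.
    return [("send_email", {"to": "a@example.com", "subject": "Re: hello"})]


seed = {"inbox": ["hello"], "sent": []}
target = {
    "inbox": ["hello"],
    "sent": [{"to": "a@example.com", "subject": "Re: hello"}],
}
sandbox = InMemorySandbox()
reward = run_episode(sandbox, seed, target, scripted_policy)
```

Because the reward is computed from the dumped state rather than from the model's own transcript, the same seed and the same actions always yield the same score, which is what makes a run reproducible.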

0

6

0

1

204

Zihao Lin @ZihaoLin685013

4 months ago

@Justin01805921 congrats on the launch guys!!

0

0

0

0

91

Zihao Lin @ZihaoLin685013

5 months ago

@lobehub congrats !

0

2

0

0

69

ZihaoLin685013 retweeted

5 months ago

https://t.co/DmdpIjkUJl

8

52

12

15

114K

Zihao Lin @ZihaoLin685013

5 months ago

@AnthropicAI https://t.co/FRO1RLUkBy

0

2

0

0

59

Zihao Lin @ZihaoLin685013

5 months ago

Most AI agent benchmarks are a total waste of time.

They rely on static, low-fidelity mockups that don't exist in the real world. If you’re training agents on simplified environments, don't be surprised when they fail in production.

@AnthropicAI's latest guide on demystifying evals makes one thing clear: you are "flying blind" without rigorous, repeatable testing.

But here is the problem: evals are only as good as the world they live in.

To build a truly generalist agent, you need more than a mockup. You need:
1> Long-horizon workflows: tasks that require dozens of steps, not just one-shot completions.
2> Authentic ecosystems: real software, real state, and real ambiguity.
3> Noise and state-dependency: the "messiness" that breaks 90% of agents today.

This is why we built @Klavis_AI Universe.

We don't provide "test cases"; we provide scalable universes. By giving models ultra-realistic settings with 300+ MCP Servers, we enable the kind of RL training and long-horizon evaluation that was previously impossible.

Anthropic says the value of evals compounds over the life of an agent. We say the value of the environment is the ceiling of your agent's intelligence.

Stop testing against toy environments. Start building for the real world.

1

4

0

0

88

Zihao Lin @ZihaoLin685013

6 months ago

Most founders think Forward Deployed Engineering (FDE) is just a fancy term for technical support.

They’re wrong.

FDE isn't about fixing bugs for customers; it’s about collapsing the distance between a vision and a production-ready reality.

We spent a full day in a private shack in SF with the @DeepAI team.

No pitch meeting. No polished demo environment. Just system design, whiteboarding, and pair-coding.

We didn’t just talk about how to integrate @Klavis_AI. We went deep on the reality of AI agents in production, tooling infrastructure that actually scales, and solving integration blockers in real time.

The bandwidth of collaboration you get in 6 hours of coding beats 6 months of email threads.

One more thing I learned?

@KevinBaragona shared the story of how @DeepAI became the official sponsor of the Tuvalu National Futsal Team!

It’s these "fascinatingly weird" stories that emerge when you actually get in the trenches with other builders. You don't get that over a Zoom call.

If you aren't spending deep, focused time in the room with your partners, you aren't building; you’re just shipping features.

The future of AI isn't just about better models. It’s about the better stories we build along the way.

Build with your users.

1

5

0

0

203

Zihao Lin @ZihaoLin685013

6 months ago

@emkara sentry RIP?

0

1

0

0

290

Zihao Lin @ZihaoLin685013

6 months ago

Klavis team at MCP Night

https://t.co/Fgfr46mb13

1

3

0

0

132

Zihao Lin @ZihaoLin685013

6 months ago

The #1 rule of early-stage sales is: "Listen more than you talk."

I failed this rule completely with my first customer.

I met Shoya (@pineforesta) at Equator Coffee shortly after we launched @Klavis_AI. I was nervous. I spent the entire conversation stumbling through a long, complicated pitch. I barely let him get a word in.

By all accounts, I should have lost the sale.

But Shoya didn’t walk away. He listened patiently.

When I finally stopped talking, he didn't ask for clarification. Instead, he did something incredible: in just a few simple sentences, he explained his product and exactly where Klavis fit into his workflow. He even opened his laptop to demo my value prop to me.

I learned two massive lessons that afternoon:

1/ True early adopters are rare. They don't need a perfect pitch. They see the vision through the mess and connect the dots themselves.
2/ Great founders simplify complexity. While I was complicating things, Shoya was clarifying them.

Shoya became our first paying customer that day.

He is still on our cheapest "grandfathered" pricing plan. Not just because he is our friend, but because he bet on us when I didn't even know how to sell yet.

Today, Shoya just got accepted into the next @ycombinator batch.

I’m not surprised. He knows how to find the signal through the noise better than anyone I know.

So proud of you, Shoya!

0

8

2

0

3K

Zihao Lin @ZihaoLin685013

7 months ago

@Qual_Gent cooking!

0

4

0

0

65

Zihao Lin @ZihaoLin685013

7 months ago

Having trouble with @canva. Looking to connect with someone from their team to resolve this. Any POC recommendations? 🙏

1

3

0

0

216

ZihaoLin685013 retweeted

7 months ago

Cursor Head of Design Ryo Lu (@ryolu_) has spent his career at the intersection of design and engineering, from building fan sites as a kid to designing products at Stripe, Asana, and Notion. Now he's rethinking how software itself gets made.

On this episode of Design Review, Ryo joins YC's @aaron_epstein to break down how great product websites communicate what a company does. They walk through sites from early-stage startups, calling out the small choices in structure, clarity, and brand that help users understand a product instantly, and the ones that get in the way.

00:00 - Intro
01:00 - Crunched
05:30 - Velvet
09:00 - Klavis AI
14:30 - Code Crafters
20:40 - Slashy
22:50 - Freya
26:00 - Finta
30:30 - Vibeflow

15

320

38

376

172K

ZihaoLin685013 retweeted

7 months ago

Super grateful to @ryolu_ and @aaron_epstein for diving deep on the Klavis website in the latest YC video! That's the kind of feedback that actually shapes our products... back to work now!

1

17

2

3

3K

Zihao Lin @ZihaoLin685013

7 months ago

@auchenberg @pipedream @Klavis_AI can help with that

0

0

0

0

59

ZihaoLin685013 retweeted

Klavis AI (YC X25)

7 months ago

Great collaboration with @FireworksAI_HQ on Reinforcement Learning with MCP servers! Check this out 👇

1

15

4

1

906

Zihao Lin @ZihaoLin685013

7 months ago

DM us and I'm happy to help onboard your team!

7 months ago

Happy to offer 6 months of @Klavis_AI free to all customers from @pipedream. Just DM me and we will support you through the transition!

0

6

2

0

603

0

1

0

0

87

Zihao Lin @ZihaoLin685013

7 months ago

Managing complex permissions kills velocity...

Check out role-based access control (RBAC) at @Klavis_AI
- Organize your entire team effortlessly.
- Fine-grained control down to specific roles.
- Secure every connector individually.

See it in action below. 👇

0

3

1

0

121

Zihao Lin @ZihaoLin685013

7 months ago

We've all been trained to think "bigger is better" when it comes to context windows.

But the reality? Most models start degrading long before they reach the advertised limit...

Around the 200k-token mark, the "context rot" begins:
→ Slower inference
→ Degraded quality
→ Lost context

It's the hidden bottleneck that kills performance.

Stop focusing on the 1M marketing number. The real insight is finding the "pre-rot threshold," which for most models is 128k to 200k tokens.

Use this number as your trigger for context reduction and management.

This is critical when your AI agent interacts with tools or MCP servers. Those interactions consume massive amounts of context and will push you into the "rot" zone faster than you think.
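Using a threshold as a trigger for context reduction might look something like the sketch below. Everything here is an assumption made for illustration: the threshold constant takes the lower end of the range quoted in the post, `estimate_tokens` is a crude characters-per-token heuristic rather than a real tokenizer, and dropping the oldest non-system messages is just one possible reduction strategy (the post does not say which to use; summarization is another).

```python
from dataclasses import dataclass

# Assumption: lower end of the 128k-200k range mentioned in the post.
PRE_ROT_THRESHOLD_TOKENS = 128_000


@dataclass
class Message:
    role: str      # e.g. "system", "user", "assistant", "tool"
    content: str


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); use a real tokenizer in practice.
    return max(1, len(text) // 4)


def total_tokens(messages: list[Message]) -> int:
    return sum(estimate_tokens(m.content) for m in messages)


def reduce_context(
    messages: list[Message],
    threshold: int = PRE_ROT_THRESHOLD_TOKENS,
    keep_recent: int = 4,
) -> list[Message]:
    """Drop the oldest non-system messages once the threshold is crossed.

    System messages are always kept (and moved to the front), and at least
    `keep_recent` of the newest non-system messages survive, so the result
    can still exceed the threshold if those alone are too large.
    """
    if total_tokens(messages) <= threshold:
        return list(messages)

    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]
    while len(rest) > keep_recent and total_tokens(system + rest) > threshold:
        rest.pop(0)  # discard the oldest message first
    return system + rest


# Toy example with a small threshold so the trigger fires.
history = [Message("system", "You are helpful")]
history += [Message("tool", "x" * 400) for _ in range(10)]
trimmed = reduce_context(history, threshold=500, keep_recent=2)
```

Checking the budget before every model call, rather than after a failure, keeps tool-heavy sessions from drifting past the threshold unnoticed.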

1

1

0

0

71

Zihao Lin @ZihaoLin685013

7 months ago

@xiangkaizeng cooking

0

2

0

0

24
