Plurai @pluraiAI - Twitter Profile

Pinned Tweet

2 months ago

We're launching vibe training. Describe what your agent should and shouldn't do. We generate the edge cases, build the test set, train a model calibrated to your policies. In minutes.

3

16

2

0

437K

Plurai

@pluraiAI

2 months ago

@fmerian Thank you Florian!

0

1

0

12

Plurai

@pluraiAI

2 months ago

Within hours of our launch, thousands of agent builders were already live with vibe training. Today we're on Product Hunt. An upvote takes 30 seconds. 👉 https://t.co/0DSKjrXv7s

Ilan Kadar

@ilan_kadar

2 months ago

Yesterday blew past every expectation. I barely slept (2 hours, if I’m honest)… and now we’re heading straight into our #ProductHunt launch and I need you! 🚀

Because something clicked. We launched vibe training - and within hours, thousands of agent builders started creating evals and guardrails for their own use cases! It’s moving fast.

Because the truth is simple: Building agents is easy. Making them reliable in production is not. That’s what vibe training fixes.

If you’ve been following, building with us, or just rooting from the sidelines - we need your support ❤️

• Open the link
• Hit upvote
• Drop a quick comment

This takes 30 seconds and directly impacts our ranking. Let’s push this to the top today https://t.co/Np5GbG2xWm

4

16

3

4

3K

8

16

4

3

2K

Plurai

@pluraiAI

2 months ago

@kunalstwt Thanks @kunalstwt !

0

1

0

13

Plurai

@pluraiAI

2 months ago

@mimu_ai1 Thank you!

0

15

Plurai

@pluraiAI

2 months ago

@aaliya_va Thank you!

0

1

0

14

pluraiAI retweeted

Kunal Kushwaha

@kunalstwt

2 months ago

Air Canada’s chatbot once literally made up its own refund policy in court and won a lawsuit for the customer, not the airline. There’s a new term being coined right now called vibe training by the company @pluraiAI, and they’ve basically built a way to use tiny, fast models as guardrails to catch hallucinations in sub-100ms and the cost is over 8x lower than GPT-5-mini. 🔥👉 They’re live on Product Hunt today: https://t.co/oCrSIlcaMH If you’re building agents, go check them out, grab the free trial, and show them some love on the launch! 🫶 The best part? You don’t need a PhD in AI. Sponsored by Plurai.

0

84

4

50

12K

pluraiAI retweeted

fmerian/launch

@fmerian

2 months ago

This team just coined the concept of vibe training. Build real-time, tailored evals and guardrails for your agent, with high accuracy at a fraction of the LLM cost. Launching today on @ProductHunt.

3

4

1

314

pluraiAI retweeted

Plurai

@pluraiAI

2 months ago

Today we're launching vibe training. Describe what your agent should and shouldn't do. We generate the edge cases, build the test set, train a model calibrated to your policies. In minutes. Start Vibe-training: https://t.co/PQ5AT3X3x3

0

10

4

1

179

pluraiAI retweeted

Daily Dose of Data Science

@DailyDoseOfDS_

2 months ago

Vibe train your AI agents.

This new method can replace LLM-as-a-judge for production agents.

Most teams point a giant LLM at their agent's output and call it evaluation. It works, but it comes with two real costs:

- It's slow and expensive at inference time
- It misses the domain-specific failures that actually matter to your use case

Vibe training flips the whole setup.

Researchers at Plurai distill a small language model that's specialized for your agent's exact behavior, your edge cases, and your failure modes. The SLM becomes your evaluator and your runtime guardrail in one.

Here's why this is a big deal:

- Cheap enough to run inline on every agent step, not just offline batches
- Catches the failures that generic LLM judges shrug off
- Same model guards production and grades it, so eval and runtime stay in sync

A small specialized model beating a giant general one is becoming a pattern. Distillation is quietly turning into one of the most underrated techniques for shipping reliable agents.

Try it here: https://t.co/KKzTfveJ26

Paper: https://t.co/GcIm0PKlQr
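
The "same model guards production and grades it" idea above can be sketched as one checker function called from two places. This is a minimal illustration under stated assumptions, not Plurai's implementation: `toy_guardrail` is a hypothetical keyword rule standing in for a trained small model, and `Verdict`, `run_agent`, `pass_rate`, and the banned phrases are all invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    allowed: bool
    reason: str


def toy_guardrail(step_output: str) -> Verdict:
    """Hypothetical stand-in for a small specialized model: a keyword rule."""
    banned_phrases = ("refund guaranteed", "legal advice")  # illustrative policy
    lowered = step_output.lower()
    for phrase in banned_phrases:
        if phrase in lowered:
            return Verdict(False, f"policy phrase: {phrase}")
    return Verdict(True, "ok")


def run_agent(steps: List[Callable[[], str]],
              guard: Callable[[str], Verdict]) -> List[str]:
    """Runtime use: check every step inline and stop at the first blocked one."""
    transcript: List[str] = []
    for step in steps:
        output = step()
        verdict = guard(output)
        if not verdict.allowed:
            transcript.append(f"[blocked: {verdict.reason}]")
            break
        transcript.append(output)
    return transcript


def pass_rate(logged_outputs: List[str],
              guard: Callable[[str], Verdict]) -> float:
    """Offline use: the same checker grades a batch of logged outputs."""
    if not logged_outputs:
        return 0.0
    allowed = sum(guard(output).allowed for output in logged_outputs)
    return allowed / len(logged_outputs)
```

Because both `run_agent` and `pass_rate` take the same `guard` argument, a change to the checker shows up in runtime blocking and in offline scores at once, which is the "stay in sync" property the tweet describes.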

5

61

9

46

6K

Plurai

@pluraiAI

2 months ago

@akshay_pachaar Thank you Akshay!

0

74

pluraiAI retweeted

Akshay 🚀

@akshay_pachaar

2 months ago

Vibe train your AI agents.

There's a new method that could replace LLM-as-a-judge for production agents.

Most teams rely on a giant LLM as a judge to evaluate and guard their agent. But it has two major drawbacks:

- It's slow and expensive at inference time
- It often misses domain-specific failures

Vibe training flips this.

Researchers at Plurai distill a small language model that's specialized for your agent's exact use case. The SLM becomes your evaluator and your runtime guardrail, both in one.

The training data isn't hand-curated either.

They spin up a swarm of adversarial agents that debate and stress-test every use case your agent is supposed to handle. That synthetic interaction data trains the specialized SLM.

So the judge actually understands what "wrong" looks like in your specific domain.

The reported gains vs. standard LLM-as-a-judge setups:

- ~8x faster inference
- ~50% fewer evaluation errors

Smaller, faster, and more accurate because it's specialized for the job. The SLM-for-agents thesis is playing out in a very concrete way.

If LLM-as-a-judge is your current evaluation layer, this is worth benchmarking against.

Paper link in the replies.
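
The tweet does not say how the agents' debate is resolved into training labels. As a much simpler stand-in, the sketch below labels a piece of text by majority vote across several independent labelers and leaves ties unlabeled. Majority voting is swapped in here for illustration and is not the debate procedure described above; `vote_label` and the three toy labelers are hypothetical names.

```python
from collections import Counter
from typing import Callable, List, Optional

Labeler = Callable[[str], str]


def vote_label(text: str, labelers: List[Labeler]) -> Optional[str]:
    """Return the majority label, or None when the top two labels tie."""
    votes = Counter(labeler(text) for labeler in labelers)
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear winner: leave unlabeled for review
    return ranked[0][0]


# Three toy labelers with different notions of a policy violation.
def strict(text: str) -> str:
    return "violation" if "refund" in text.lower() else "ok"


def lenient(text: str) -> str:
    return "violation" if "guaranteed refund" in text.lower() else "ok"


def keyword(text: str) -> str:
    lowered = text.lower()
    return "violation" if "refund" in lowered and "always" in lowered else "ok"


label = vote_label("You always get a refund", [strict, lenient, keyword])
```

Here `strict` and `keyword` vote "violation" and `lenient` votes "ok", so `label` is "violation". Texts where the labelers split evenly come back as `None` rather than being forced into a class.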

21

157

25

163

12K

Plurai

@pluraiAI

2 months ago

@Mdkhurshed76417 @ilan_kadar 🚀🚀🚀

0

1

0

11

Plurai

@pluraiAI

2 months ago

@InnerCompassHQ @ilan_kadar EXACTLY!

0

1

0

8

Plurai

@pluraiAI

2 months ago

@NirDiamantAI Thank you Nir!

0

22

pluraiAI retweeted

NirD

@NirDiamantAI

2 months ago

THIS IS REVOLUTIONARY

Building agents from now on is going to be a totally different thing.

much more reliable
much faster
much cheaper

and as far as I know, it is currently free.

they train a small language model in a super sophisticated way that makes unlabeled data labeled, using agents that debate to get the best label. it is brilliant...

1

9

3

1

325

pluraiAI retweeted

Chidanand Tripathi

@thetripathi58

2 months ago

I used to pay for the most expensive AI models just to double-check my own agents. It felt like a "safety tax" I had to pay, but it was killing my margins and making everything feel slow. I was basically paying twice for the same result. Plurai finally fixed this. Instead of a giant model, you train a tiny one that only cares about your specific rules. You just type what you want in plain English, and it builds a custom safety net in minutes. It runs instantly and costs almost nothing. This is how you actually move from a prototype to something that works at scale. Check it out:

15

142

41

17

39K

pluraiAI retweeted

Santiago

@svpino

2 months ago

I've made a ton of money helping companies implement LLM-as-a-judge evaluations.

LLM Judges provide a ton of value. But the hard part is choosing the model to implement the judge.

• The family of GPT-5 models is very good, but slow and expensive.
• Models like Gemma and Phi are fast and cheap, but not that good.

Most of the time, you can only run a percentage of your traffic through the model (otherwise it would be too expensive and slow).

But now, there's a better strategy.
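
Running only a percentage of traffic through a judge is usually done with some form of sampling. The sketch below shows one common approach, hashing a request ID so the same request is always in or out of the sample. It is an assumption about how such sampling might be done, not something the tweet specifies; `should_judge` and the `req-N` ID format are hypothetical.

```python
import hashlib


def should_judge(request_id: str, sample_rate: float) -> bool:
    """Deterministically select roughly `sample_rate` of requests for judging."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    threshold = int(sample_rate * 2**64)        # compare as integers
    return bucket < threshold


# Send about 10% of requests to the (slow, expensive) judge.
sampled = [f"req-{i}" for i in range(10_000) if should_judge(f"req-{i}", 0.10)]
```

Hashing instead of calling a random number generator keeps the decision stable across retries and services, so a request is never judged in one place and skipped in another.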

16

216

25

440

30K
