Adaptive ML

Verified account

@AdaptiveML

Build, own, and deploy specialized models. Drive business value through Reinforcement Learning.

Paris & New York

Joined October 2023

42 Following

459 Followers

202 Posts

1 day ago

If you haven't checked it out yet, we built an RL Glossary. Reinforcement learning is how modern AI actually gets trained. But the resources out there are scattered, fragmented, and frankly, confusing. So we organized it. One place. Concepts laid out by where they fit in the pipeline. Read it in order or jump around. Take a look. https://t.co/skWjcB8Jsn

0

3

1

5

238

3 days ago

ICYMI: a 3B open-source model, fine-tuned on Adaptive Engine, matched proprietary accuracy on function calling for @CCSMedical patient support. With over 90% lower latency. Worth a read if you missed it the first time. https://t.co/zVGh2jNFN7

0

1

0

0

40

6 days ago

How do you train one model to be poetic, another to be terse, and a third to be completely unhinged? Reinforcement learning. Same base model, same loop, different rewards. We put together a short video that walks through it, using paint-color naming as the example. Rollouts, rewards, GRPO, and what happens when a model starts gaming the judge. Want to learn more? Check out our RL Glossary: https://t.co/FkHDBXVydf

0

1

0

0

58

8 days ago

900,000 call transcripts. Every day. That's the volume @ATT processes through its call center AI pipeline. Each one needs accurate summarization, PII removal, regulatory flagging, and bilingual support. They were running it on a general-purpose LLM with a long system prompt. It worked, but it was expensive, inflexible, and slow to iterate on. We worked with AT&T to fine-tune a specialized model purpose-built for the job. 30% faster. More accurate. Read the full story on our website. https://t.co/HiABBG8oF9

0

0

0

0

35

10 days ago

We built a visual walkthrough of how reinforcement learning actually works. Not theory. Just one example: training a model to name paint colors. From a single rollout, to a group, to a reward, to GRPO, to what happens when the model starts gaming the judge. If you've ever wanted RL explained without the math wall, start with this post below. https://t.co/mdhe0rKFmi

0

2

1

0

63

16 days ago

Want to know what we learnt from processing trillions of tokens at Fortune 500 companies? Our Co-Founder & Chief Customer Officer Alessandro Cappelli’s talk at @aiDotEngineer covers just that. An insight into what it really takes to deploy large-scale AI agents in production. Watch the full talk on the Startup Hub website 👇 https://t.co/is0Aux1wi2

0

0

0

0

100

17 days ago

Not all meetings are the same, especially when they happen on a sunny rooftop. Last week the commercial team came together at the Toronto office for an offsite. As a remote-first company, we see bringing teams together regularly as fundamental to building a great culture. Want to learn more about Adaptive ML? Check out our website: https://t.co/kbvN4Lqnt5

0

1

0

0

105

20 days ago

In patient support, every second matters. With @CCSMedical, we fine-tuned a 3B @metaai Llama model (on @AWSstartups) on Adaptive Engine to match proprietary model accuracy on function calling, with over 90% lower latency. Smaller, faster, and tuned for the task. Read the case study: https://t.co/KLA9JNw0fr

0

1

0

0

150

22 days ago

You don't choose RL. Your workload does.

A Fortune 500 customer-operations team we worked with started where most teams do: prompting. It got them to 90% accuracy on a frontier model and a $1.6M annual bill.

Then they specialized. An 8B model trained with SFT cut roughly 80% of the cost. RL extended it further, learning from outcomes the demonstrations couldn't anticipate.

Prompting defined early behavior. SFT stabilized it. RL pushed past the demonstration ceiling.

This progression isn't a stylistic preference. It's where high-volume production LLM systems converge.

Our latest piece breaks down when to use each technique, where each one hits a structural limit, and a simple decision rule for figuring out what your system actually needs.

Read the full article on our website: https://t.co/Dd0DaYyogQ

0

0

0

0

81

23 days ago

Latest product update is out.

Highlights include:
- Function graders for deterministic evaluation logic in RL and eval workflows
- Constrained decoding to enforce structured outputs at generation time
- Checkpoint promotion to evaluate and ship intermediate training states

As post-training workflows mature, evaluation quality, output reliability, and model promotion become core parts of the stack.

Read the full update here: https://t.co/qvGgQ6F3MD
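To make "function grader" concrete, here is a minimal sketch of deterministic evaluation logic written as a plain Python function. It is not Adaptive Engine's API: the name `grade_function_call`, its parameters, and the JSON call format (`name` plus `arguments`) are all assumptions chosen for illustration. The point is only that the same completion always receives the same score, with no model-based judge involved.

```python
import json


def grade_function_call(completion: str, expected_name: str, required_args: set) -> float:
    """Hypothetical deterministic grader for a function-calling task.

    Returns 1.0 when the completion is a JSON object that names the expected
    function and supplies every required argument, otherwise 0.0.
    """
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0  # not valid JSON at all
    if not isinstance(call, dict) or call.get("name") != expected_name:
        return 0.0  # wrong shape or wrong function
    args = call.get("arguments")
    if not isinstance(args, dict):
        return 0.0  # arguments missing or malformed
    # Every required argument name must be present in the call.
    return 1.0 if required_args.issubset(args) else 0.0


good = '{"name": "lookup_order", "arguments": {"order_id": "A1"}}'
bad = "sure, let me look that up"
print(grade_function_call(good, "lookup_order", {"order_id"}))
print(grade_function_call(bad, "lookup_order", {"order_id"}))
```

A grader like this could serve as a reward signal during RL or as a pass/fail check in an eval run; which of the two it feeds is a workflow decision, not something the function itself encodes.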

0

0

0

0

96

24 days ago

New case study with @awscloud and @AIatMeta. We fine-tuned a Llama 3.2 3B model on Adaptive Engine to power a patient support agent for @CCSMedical. Result: 90%+ latency reduction. ~230ms end-to-end responses. Specialized small models beat generalists where it counts.

27 days ago

90% reduction in response latency. That's what enterprise AI should look like. @AdaptiveML deployed a Llama-powered AI agent on AWS to transform patient service operations for CCS, making chronic care support faster and more reliable than ever.

21

423

106

85

16M

0

1

0

1

281

24 days ago

Introducing Recipes. The latest post from our DevRel, Dylan Ebert, explains how a single Python file can define a full AI workflow on Harmony, Adaptive Engine's compute backend. https://t.co/YrxIQTWaw1

https://t.co/YrxIQTWaw1 https://t.co/WNOe0ONzpN

0

0

0

0

89

28 days ago

As a post-training platform for specialized agents, we’re often asked how teams should choose between prompting, SFT, and RL. These techniques are not mutually exclusive and are often used together, but they apply to different constraints in the system. We break down a practical framework for how these approaches fit together and how to evaluate their tradeoffs. If you’re building with LLMs, this helps clarify what moves performance and cost. https://t.co/Dd0DaYyogQ

0

2

1

0

58

29 days ago

Same paint color. Same base model. Four different names: Sandy Beige, Sunset Dust, Burnt Sienna, Cinnamon Toast.

The difference is what each model was rewarded for.

Our latest piece visualizes reinforcement learning using a simple task: naming paint colors from hex codes. One base model, three judges, three trained 'painters'.

The Poet rewards evocative names. The Architect rewards terse, material-led ones. The Unhinged rewards the vivid and off-register. Nobody wrote a 'be more poetic' rule. The reward did the work.

That's the shape of RL. A reward, a loop, and outputs nudged toward what the reward says is good. Anything you can score becomes a training signal: math, code, conversation. Same loop, different reward.

The piece also covers where this breaks down. Reward hacking, Goodhart's law, and how to avoid both.

Find the full interactive post on our website here - https://t.co/mdhe0rK7wK
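A rough sketch of the "reward, loop, nudge" idea, using the group-relative normalization commonly described for GRPO: each rollout's reward is compared with the mean of its own group, so above-average outputs get a positive advantage and below-average ones a negative advantage. Everything here is illustrative. The `terse_judge` reward is a hypothetical stand-in for an Architect-style judge, and treating the four names from the post as one group of rollouts is an assumption, not how the original piece generated them.

```python
from statistics import mean, pstdev


def terse_judge(name: str) -> float:
    """Hypothetical reward: shorter names score higher."""
    return 1.0 / len(name)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against division by zero when every reward is identical.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed group of rollouts for a single hex-code prompt.
rollouts = ["Sandy Beige", "Sunset Dust", "Burnt Sienna", "Cinnamon Toast"]
rewards = [terse_judge(name) for name in rollouts]
advantages = group_relative_advantages(rewards)

for name, adv in zip(rollouts, advantages):
    print(f"{name:15s} advantage={adv:+.2f}")
```

Swapping `terse_judge` for a different scoring function changes which names get positive advantages while the rest of the loop stays the same, which is the "same loop, different reward" point. A reward this crude is also easy to game (a one-letter name would score best), a small instance of the reward hacking the post mentions.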

0

1

0

0

190

about 1 month ago

9/9 Full engineering write-up on how Adaptive Engine unifies post-training, evaluation, and inference: https://t.co/y9mYbbDKDV

0

0

0

0

30

about 1 month ago

1/9 A 12B Gemma 3 model now summarizes 600,000 customer care calls a day at a major telecom. It replaced a generalist model. Judge-evaluated accuracy: 97.69%. ~5 points higher on the same benchmark. Here's the infrastructure that made it possible. 🧵👇

https://t.co/41Ikh4vJlr

1

1

0

2

107

about 1 month ago

8/9 The end result at 600k calls a day: Higher accuracy. Lower cost. Compliance-grade outputs the previous system couldn't produce. And a model that keeps improving from production feedback. Infrastructure that appreciates over time.

1

0

0

0

34
