TrainLoop @trainloop_ai - Twitter Profile

23 days ago

We partnered with @mercor_ai to test a simple idea: What if knowledge-work agents were just… coding agents?

Result: +25% performance, 2x faster, cheaper, and new SOTA on APEX-Agents. @josancamon19 https://t.co/TEnkp0lsj2

TrainLoop

@TrainLoop_ai

about 1 month ago

+48% from a single RL step, and 100k rollouts from a single policy. @LoganGrasby wrote up his findings that off-policy training with OAPL is more robust than we thought!

Jackson Stokes

@jackson_stokes

about 1 month ago

Can we train a model in a single RL step? During recent experiments, @Logangrasby found that a single step of OAPL increased model performance from ~0 to 48% on a clinical reasoning and prediction task.

Turns out, data staleness might matter less than we think. With @pathos: https://t.co/uuqQmblj6N

TrainLoop

@TrainLoop_ai

about 2 months ago

Task-specific training can be far more efficient than we realize. This work by @hasith_v explores the LoRA-trained solution space for GSM8k, finding a massive "plane" of solutions, representable in a single rank.

Jackson Stokes

@jackson_stokes

about 2 months ago

We trained LoRA adapters of different ranks to understand training dynamics, finding that adapters for GSM8k live in a surprisingly vast, low-rank solution space.

This hints that some model skills are easy to learn, and training is more forgiving than we think. @hasith_v 1/6 🧵 https://t.co/jmhXaBrmjT

TrainLoop

@TrainLoop_ai

about 2 months ago

Specialized models for the most important tasks!

Jackson Stokes

@jackson_stokes

about 2 months ago

We post-trained MedGemma to be SoTA in visual medicine ddx, outperforming Opus 4.6, Gemini 3.1 and GPT-5.4 while running at ~1/30th the cost. @getnolla Part 1 - improving visual reasoning 🧵1/6 https://t.co/ri6InzBeca

TrainLoop_ai retweeted

Joan Cabezas

@josancamon19

6 months ago

🧵 Labs and VCs are throwing cash at RL environments, especially for computer and browser use. Yet, with just 4 customers and 30+ vendors, is cloning every website in the world really the path to scale? Of course not. Introducing TRACE: Trajectory Recording and Capture of Environments.