Rafael Pardinas @muchomuchacho - Twitter Profile

Pinned Tweet

Rafael Pardinas

@muchomuchacho

3 days ago

So exciting to have our work featured among top class players!

SemiAnalysis

@SemiAnalysis_

3 days ago

RL Systems Mind the Gap: Matching Trainer and Generator Throughput RL Training Infrastructure, GRPO, PipelineRL, Async RL, Policy Staleness, RL Sandbox Infra, CPU Requirements, TCO Analysis, Thinking Machines Tinker https://t.co/yr5oH99h4B

muchomuchacho retweeted

Alex Gurung @AlexAag1234

about 12 hours ago

Been thinking recently about how to improve credit assignment in long horizon RL? Our new MosaicLeaks blog post describes our method to accurately value actions via situational rewards, improving our privacy-aware research agent over outcome-only rewards! https://t.co/xCbOKEcdwa

Rafael Pardinas

@muchomuchacho

about 14 hours ago

@eisokant @eliebakouch Base models are amazing for RL tuning and reasoning. If a company is looking to shape some behaviour into their models I would expect them to start with a base stage.

Rafael Pardinas

@muchomuchacho

1 day ago

@johnschulman2 PipelineRL: rejected from NeurIPS 2025

17 days ago

@cwolferesearch Fantastic compilation. Thanks!

muchomuchacho retweeted

Alex Gurung @AlexAag1234

18 days ago

Excited to share my recent work @ServiceNowRSRCH ! We introduce a new privacy-centric deep research dataset and show models frequently leak enterprise information. However, training with dense _situational_ rewards efficiently learns to jointly optimize performance and privacy

Rafael Pardinas

@muchomuchacho

18 days ago

Led by @AlexAag1234 at ServiceNow AI Research, with @gspandana , @alexandredrouin , @ILaradji , @PerouzT and me. Paper: https://t.co/2lGKyiJzW2

Rafael Pardinas

@muchomuchacho

18 days ago

MosaicLeaks is now on arXiv. The Mosaic Effect captures a simple idea: small fragments can look harmless alone, but become revealing in aggregate. Deep research agents can leak enterprise information in exactly this way. 1/9

Rafael Pardinas

@muchomuchacho

18 days ago

The core idea: Enterprise agent privacy failures will not only come from copying private text. They can also come from the external actions agents take while trying to be useful. Privacy shouldn't come at the cost of utility, we can optimise for both. 8/9

Rafael Pardinas

@muchomuchacho

25 days ago

this is too good

テコまる @tecomalupepepe

25 days ago

A reflection system aimed back at whoever is shaking the bench

Rafael Pardinas

@muchomuchacho

29 days ago

This is becoming really powerful. More to come for high latency agentic pipelines

Rafael Pardinas

@muchomuchacho

2 months ago

Better reasoning does not have to mean longer reasoning. Apriel OpenReasoner: fully reproducible multi-domain RL post-training using public datasets. 30-50% shorter traces, no quality trade-off. @ServiceNowRSRCH @ehsk0 @dvazquezcv @alexandredrouin
