RanW @_RanW_ - Twitter Profile

RanW @_RanW_

10 days ago

@pietromazzaglia @1x_tech Welcome to the bay Pietro!

1

0

111

RanW @_RanW_

about 1 month ago

@natolambert Olympic park good place to run!

0

1

0

258

_RanW_ retweeted

Conor Heins @conorheins

3 months ago

pymdp 1.0.0 is here: batched, autodifferentiable, JIT-compiled active inference in JAX: https://t.co/Hhzsh1wOv5
This release brings:
GPU/TPU-ready active inference
autodiff through inference, planning and learning
easy parallelization and batching with vmap()

2

98

25

75

9K

RanW @_RanW_

4 months ago

@agarwl_ Actually R1 optimality I think

0

29

RanW @_RanW_

4 months ago

@agarwl_ Use R2 as reward shaping if multi step? Guaranteed R1 improvement I think

1

0

1

547

RanW @_RanW_

5 months ago

@Waymo into 2026💨

0

44

RanW @_RanW_

6 months ago

@SarahChieng Honestly the other run this morning was a bit too fast 😅 slow is great

0

1

0

99

RanW @_RanW_

6 months ago

@grx_xce Hi, can’t attend but is there a way to learn more about this?

0

67

RanW @_RanW_

7 months ago

Sometimes it’s very useful to write down the Bayes net/factor graph/causal DAG of your env. Helped me quite a lot e.g. studying Alchemy: https://t.co/2WKAGmxhif

Pablo Samuel Castro @pcastr

7 months ago

🚨The Formalism-Implementation Gap in RL research🚨 Lots of progress in RL research over last 10 years, but too much performance-driven => overfitting to benchmarks (like the ALE). 1⃣ Let's advance science of RL 2⃣ Let's be explicit about how benchmarks map to formalism 1/X

2

153

28

102

9K

0

259

_RanW_ retweeted

Harshit Sikchi

@harshit_sikchi

10 months ago

At @RLBRew_RLC today we are presenting 2 works on unsupervised RL and 1 work on inverse RL. Stop by the poster session to learn more! Details below:

1

8

1

2

583

_RanW_ retweeted

Alec Tschantz @a_tschantz

12 months ago

New paper from @VERSESAI - AXIOM is a world model that learns to play pixel-based arcade games in minutes. Preprint: https://t.co/DnqwZDaN6l Blog: https://t.co/bIsschojXA Code: https://t.co/nxQKlm8swF 🧵

6

69

29

33

11K

RanW @_RanW_

about 1 year ago

https://t.co/ezFJy4mW9I

0

64

RanW @_RanW_

about 1 year ago

New post studying the empowerment objective for the assistance game in human-AI collaboration. What is empowerment optimizing? Is it aligned with human preference? What's the ultimate objective for human-AI collaboration? 👇

1

0

121

RanW @_RanW_

about 1 year ago

We also found some useful implementation tricks, tips, and observations along the way. These details are documented in this blog (https://t.co/wXXyakWhhP).

0

1

0

86

RanW @_RanW_

about 1 year ago

Hi imitation learning friends, I am excited to introduce CleanIL (https://t.co/jacC3COKXZ), a repo of high quality single-file implementations of imitation learning and inverse RL algos inspired by CleanRL and built on @torchrl1.

1

21

5

11

2K

RanW @_RanW_

about 1 year ago

CleanIL aims to address this by gathering SOTA algos scattered all over the internet into a single repo. We implemented 7 algos as a starting point. Future plans are outlined in this blog post (https://t.co/Bsunt8kwtB) along with interesting use cases of IL and IRL.