Joseph Suarez 🐡 @jsuarez - Twitter Profile

Pinned Tweet

Joseph Suarez 🐡

@jsuarez

about 2 months ago

Releasing PufferLib 4.0: Train agents in seconds

39

1K

95

865

191K

Joseph Suarez 🐡

@jsuarez

about 2 hours ago

Still sim2real! Also the public version of PufferDrive is built on PufferLib 3.0. Our latest PufferLib 4.0 has some major advancements to our general policy architecture, 3-5x faster training, and up to 10x faster wallclock to reach the same fixed level of performance vs. 3.0 on most of our baseline tasks

0

2

0

26

Joseph Suarez 🐡

@jsuarez

about 5 hours ago

PufferLib on a real vehicle! PufferDrive is a collaboration with NYU @EugeneVinitsky @daphne_cor et al. + @spenccheng at Puffer. Working on AV RL? We offer R&D contracts, sim development, and support. Contact jsuarez🐡puffer🐡ai. https://t.co/GcGFparcTt

2

65

4

14

3K

Joseph Suarez 🐡

@jsuarez

about 5 hours ago

@yacineMTB It's 0.14 seconds with the standard pufferlib env

0

5

0

1

313

Joseph Suarez 🐡

@jsuarez

about 6 hours ago

Reinforcement learning research with Joseph Suarez https://t.co/dDz5xzI10G

0

12

1

0

750

Joseph Suarez 🐡

@jsuarez

1 day ago

Reinforcement learning research with Joseph Suarez https://t.co/NE26ggiepv

0

12

0

1

1K

Joseph Suarez 🐡

@jsuarez

2 days ago

That doesn't track w/ the implementation or with results from 4.0 sweeps. I've never in 30k+ experiments seen it pin LR to the minimum. It has it pinned to the max on breakout because the model is tiny and you can genuinely get away with it. The GPs are not just fitting a line to hparams globally either

0

79

Joseph Suarez 🐡

@jsuarez

3 days ago

@BullTheoryio As a Florida resident: do y'all understand how much better the state would be without mosquitos? If I go outside for an hour past sundown, it's 10+ bites every time. Lots of people have porches with entire pools screened in.

6

80

0

3

4K

Joseph Suarez 🐡

@jsuarez

3 days ago

@nilinabra Want to try the new method on PufferLib directly? We train up to 47x47 mazes and have a simple single-file Muon implementation in CUDA C. Our tasks can get arbitrarily sparse. The chance of getting a reward on a random Sokoban map with a random policy is <1/1B per step

1

21

1

9

2K

Joseph Suarez 🐡

@jsuarez

5 days ago

@dogecahedron @__tinygrad__ I actually tried those! Tinygrad with static shapes was competitive with torch, but it was way slower with dynamic

2

6

0

449

Joseph Suarez 🐡

@jsuarez

5 days ago

they build some very nice tools, and getting some automatic kernel optim would be quite nice. We write our kernels manually now because everything else fails. We'd write fewer of them like that if it didn't. Startup time is massive QoL for research and dev. Lots of our experiments are short. To give you an idea, we solve our basic benchmark tasks in 0.1 to 10 seconds. We run thousands of such experiments, which sweep over net size and depth automatically

2

3

0

117

Joseph Suarez 🐡

@jsuarez

5 days ago

@dogecahedron @__tinygrad__ Maybe but this would be so much worse than our current setup that it's not even worth considering. We have virtually zero startup time on our runs because all we need is a quick CUDAGraph trace, no jit, and we are 3-5x faster than torch/jax/tiny without even doing kernel search

1

0

97

Joseph Suarez 🐡

@jsuarez

5 days ago

@dogecahedron @__tinygrad__ you could use the kernel gen or directly inline your own kernels without external binds. Embedding C code in a string is pretty awful

1

0

101

Joseph Suarez 🐡

@jsuarez

5 days ago

The Puffer RL environment for Arkhai's compute marketplace is open source on our GitHub! Tiny RL agents learn to buy, sell, and iteratively negotiate. Development ongoing!

Arkhai

@arkhai_io

5 days ago

Today we're launching Simple Compute Market (SCM). The market is simple: agents find compute, negotiate, settle, and get access without a human driving every step. Open-source. Agent-driven. Public good. No token. No fees.

16

137

30

79

53K

3

88

7

28

8K

Joseph Suarez 🐡

@jsuarez

5 days ago

@dogecahedron @__tinygrad__ I do think that once mature, a C version of tinygrad would be awesome. Much easier to mix in optimized hand-written kernels etc. without the language barrier. Python + C/CUDA extensions is miserable

2

0

116

Joseph Suarez 🐡

@jsuarez

5 days ago

@dogecahedron @__tinygrad__ cuBLAS for matmuls. Our kernels are for activations, sequence-wise fns for our MinGRU arch, loss fn, etc. Our models are small so these will eat your compute budget without heavy fusion etc. We don't have any fancy attention layers. Those are slow in RL

1

0

130

Joseph Suarez 🐡

@jsuarez

5 days ago

@dogecahedron @__tinygrad__ We just write the kernels directly

1

0

102

Joseph Suarez 🐡

@jsuarez

5 days ago

@dogecahedron @__tinygrad__ I tried that. It outputs batshit insane kernels you'd never want to touch manually. I still really like tinygrad, but it's too high level for my projects

1

0

120

Joseph Suarez 🐡

@jsuarez

6 days ago

@PatrickDanqman @haydendevs We do have some professional sim work in finance

1

0

102

Joseph Suarez 🐡

@jsuarez

6 days ago

@rodney_lafuente @haydendevs When someone offers me enough money that I can do it for a couple years and then have enough cash banked to run my own small 5-10 person lab forever

1

5

0

115
