Michael Matthews @mitrma - Twitter Profile

Pinned Tweet

9 days ago

Hindsight Experience Replay has become the ubiquitous method for goal-conditioned reinforcement learning, but leaves open the question of which goal to relabel with. In this work, accepted at ICML, we propose instead simply Learning Everything All at Once (LEO). 1/

4

210

31

147

25K

mitrma retweeted

Clarisse Wibault @ClarisseWibault

3 days ago

Mean Field Games provide a framework for modelling large populations. ICML26 Spotlight: Introducing Recurrent Structural Policy Gradient for partially observable MFGs with common noise, benefitting from faster convergence than model-free RL, but remaining tractable, unlike DP.

1

93

22

48

22K

Michael Matthews @mitrma

9 days ago

Work done at @FLAIR_ox with @JacksonMattT, @mcbeukman, Thomas Foster, @_aletcher, Scott Fujimoto, @cedcolas and @j_foerst. Blog post: https://t.co/3ONUTnuJxI GitHub: https://t.co/OihCR4rUqE Paper: https://t.co/QW50ARXwCi end/

0

12

0

12

773

Michael Matthews @mitrma

9 days ago

While the focus of our work is on finite goal sets, we also adapt LEO for continuous goal sets through goal quantisation, achieving competitive results with Hindsight Experience Replay in continuous control tasks. 8/

1

7

0

744

mitrma retweeted

Mikael Henaff @HenaffMikael

about 1 month ago

Happy to share that SOL has been accepted as spotlight to ICML :) come hear about SOL in Seoul!

1

32

2

7

3K

Michael Matthews @mitrma

about 2 months ago

@jsuarez I think the group would definitely be interested - after the NeurIPS deadline? I can preload my question: NLE is unsolved + already in C + people would be very impressed if it was solved. Do you think puffer could solve it?

2

6

0

1

404

Michael Matthews @mitrma

about 2 months ago

@elliotarledge This looks really cool! How does the agent trained with pufferlib perform? + can it transfer back to original Craftax? (i.e. is this a 1-1 exact environment remake?)

0

1

0

805

mitrma retweeted

Kimbo

@kimbochen

about 2 months ago

RL researchers when they try to think of a name for their new algorithm:

11

264

26

28

33K

Michael Matthews @mitrma

about 2 months ago

Very well deserved! Can't wait to see what you work on there :)

nathan monette @nathanrmonette

about 2 months ago

Excited to announce I’ll be joining @EugeneVinitsky at @nyutandon this autumn for a PhD! I will be working on the intersection of game theory, reinforcement learning, and autonomous vehicles. Thanks to everyone who helped me get to this point, especially from @FLAIR_Ox :)

19

116

4

13K

1

6

0

1

610

mitrma retweeted

Michael Beukman @mcbeukman

2 months ago

1/ As compute continues to grow and simulators continue to improve, it is becoming feasible to train RL agents for billions or trillions of timesteps. However, this is only useful if agents can continue learning over such long training horizons, which is far from given 👇

6

353

46

412

99K

mitrma retweeted

Tim Rocktäschel

@_rockt

2 months ago

"The only unsaturated agentic intelligence benchmark in the world" Excuse me? @NetHack_LE is unsaturated since 2020.

13

223

27

47

51K

mitrma retweeted

Oscar Michel

@ojmichel4

3 months ago

📢Current world models aren't really modeling the world; they're modeling one agent's view of it. Partial observations ≠ world state. Future world models will be independent of any one agent's perspective. You will be able to “drop in” any number of agents at any point in time, and a persistent world state will evolve with their interactions. Imagine a neural MMORPG server. 🧵[1/10]

13

612

87

355

126K

mitrma retweeted

Benjamin Spiegel

@superspeeg

about 1 year ago

Why did only humans invent graphical systems like writing? 🧠✍️ In our new paper at @cogsci_soc, we explore how agents learn to communicate using a model of pictographic signification similar to human proto-writing. 🧵👇

25

1K

181

767

155K
