TalkRL Podcast @TalkRLPodcast - Twitter Profile

Pinned Tweet

TalkRL Podcast

@TalkRLPodcast

7 months ago

E73: @danijarh (ex-@GoogleDeepMind Research Scientist) in-depth on Dreamer v4. Audiogram edition 📡

0

12

2

3

956

TalkRL Podcast

@TalkRLPodcast

6 months ago

@kjaved_ @how_uhh @velocizapkar @danijarh Agree this seems to defeat the purpose of small sample benchmarks! Still hoping there is another solution to this issue other than slow envs... seems a case of Goodhart's law

1

0

188

TalkRL Podcast

@TalkRLPodcast

6 months ago

@kjaved_ @how_uhh @velocizapkar @danijarh Purposely wanting slow envs because we want sample-efficient algos seems a bit like throwing the baby out with the bathwater. Then those with the most compute have even more of an advantage. Why not just pay attention to sample complexity/HP sensitivity, and also have fast envs?

1

0

150

TalkRL Podcast

@TalkRLPodcast

6 months ago

@how_uhh @kjaved_ @velocizapkar Or maybe... Craftax? (Accelerated variant of @danijarh's Crafter)

1

2

0

307

TalkRL Podcast

@TalkRLPodcast

7 months ago

Kai was our guest back on Episode 11. https://t.co/v6Qvv9Kidi

0

2

1

259

TalkRL Podcast

@TalkRLPodcast

7 months ago

Glad to meet up with @kaixhin, research scientist at @SakanaAILabs, for lunch in Tokyo!

1

5

0

1K

TalkRL Podcast

@TalkRLPodcast

7 months ago

@danijarh Always a pleasure Danijar! 🙏

0

2

0

233

TalkRL Podcast

@TalkRLPodcast

7 months ago

Apple https://t.co/aVfIauHuzu

0

5

0

685

TalkRL Podcast

@TalkRLPodcast

7 months ago

E73: Danijar Hafner on Dreamer v4
@danijarh (ex-@GoogleDeepMind RS) on offline world models for safe robotics, Shortcut Forcing for fast diffusion video models, outperforming OpenAI’s VPT with 100× less data, his “APD” theory unifying exploration and empowerment, and more! https://t.co/noRt5uNCHk

5

24

5

19

16K

TalkRL Podcast

@TalkRLPodcast

7 months ago

Spotify https://t.co/fLd60Kn21D

1

4

1

0

833

TalkRL Podcast

@TalkRLPodcast

7 months ago

@jesswhittles Always enjoy your writing! Is there some tension between integrity and diplomacy (which is often less about integrity than interests)?

0

23

TalkRL Podcast

@TalkRLPodcast

7 months ago

@jaseweston @jackjingyuzhang @ericsmithnyc @DanielKhashabi @ben_vandurme Congrats! Would you be open to a short interview on this?

0

59

TalkRL Podcast

@TalkRLPodcast

7 months ago

@sirbayes @alexinch_ai @karpathy @RichardSSutton @dwarkesh_sp And how/if it differs on this level, after pre-training vs after post-training.

0

53

TalkRL Podcast

@TalkRLPodcast

7 months ago

@sirbayes @alexinch_ai @karpathy @RichardSSutton @dwarkesh_sp It's interesting how central this is to the current paradigm, yet how non-obvious to most (including me). Has this formulation been spelled out more fully somewhere by you or others?

1

0

101

TalkRL Podcast

@TalkRLPodcast

7 months ago

@snowclipsed Just to be safe, let's start with n↑↑4

0

56

TalkRL Podcast

@TalkRLPodcast

7 months ago

@jackjingyuzhang Congrats! Would you or another co-author be open to speaking about it on TalkRL Pod?

0

38

TalkRL Podcast

@TalkRLPodcast

8 months ago

@NielsRogge @tw_killian True https://t.co/HqTaeHZriq

0

1

0

62

TalkRL Podcast

@TalkRLPodcast

8 months ago

@CsabaSzepesvari @karpathy My personal hot take is very different:
1. RL, as a family of conceptual frameworks, is timeless.
2. Frustrations with modern deep RL algo performance are mostly due to limitations of deep learning function approx.
tl;dr: Give RL FAs that generalize better (plus algos) :D

0

8

1

0

2K