Tal Daniel @TalDaniel8 - Twitter Profile

Pinned Tweet

3 months ago

🚀 #ICLR2026 Oral 💥 How can we design world models that capture object interactions directly from pixels? Introducing Latent Particle World Models: the first end-to-end self-supervised, object-centric world model, trained from videos, supporting action, image, and language conditioning. 1/n


TalDaniel8 retweeted

Aviv Tamar @AvivTamar1

13 days ago

A VLA is 95% certain about its current action. Will it succeed at the task 95% of the time? Obviously, not necessarily. But if you’re clever, you can *calibrate* action probability to task success. Our #ICML2026 paper formulates this and gives SOTA algorithms based on a new connection to RL temporal differences
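
To make the gap between action confidence and task success concrete, here is a minimal sketch of a standard binned calibration check (expected calibration error). It is not the paper's algorithm; the function name `calibration_gap`, the bin count, and the toy numbers are all assumptions made for illustration.

```python
def calibration_gap(confidences, successes, n_bins=5):
    """Expected calibration error: bin-weighted |mean confidence - success rate|.

    confidences: predicted probabilities in [0, 1] (hypothetical model outputs).
    successes:   1 if the task succeeded, 0 otherwise.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, successes):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    gap = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        success_rate = sum(s for _, s in members) / len(members)
        gap += (len(members) / total) * abs(mean_conf - success_rate)
    return gap


# Toy data: the policy is always 95% confident, but only half the episodes succeed.
confidences = [0.95, 0.95, 0.95, 0.95]
successes = [1, 0, 1, 0]
gap = calibration_gap(confidences, successes)
```

A well-calibrated predictor would drive this gap toward zero; in the toy data the confidence overstates the success rate by roughly 0.45.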


Tal Daniel @TalDaniel8

about 1 month ago

@AvivTamar1 ❤️


Tal Daniel @TalDaniel8

about 1 month ago

I’m attending #ICLR2026 in Rio this week to present LPWM! Friday, April 24: Poster Session 3 at 10:30 and Oral Session 4B at 3:15pm. Happy to chat about self-supervised object-centric learning and world models. I’ll be on the job market soon and am looking for exciting opportunities! #ICLR @iclr_conf



TalDaniel8 retweeted

Dan Haramati @DanHrmti

about 1 month ago

I will be at #ICLR2026 this week to present our work on Hierarchical Entity-centric Reinforcement Learning! Come by our poster (Thursday Poster Session 2 P4-#4712) and reach out anytime to talk about #ReinforcementLearning #WorldModels #HierarchicalRL


TalDaniel8 retweeted

ML@CMU @mlcmublog

about 1 month ago

Check out our latest post on CMU@ICLR 2026! https://t.co/TzSd2zs4sT


TalDaniel8 retweeted

Sandeep Routray @SandeepRoutra11

about 2 months ago

🚀 Excited to share ViPRA: Video Prediction for Robot Actions 📍 Accepted to #ICLR2026 @iclr_conf 🏆 Best Paper, #NeurIPS2025 Embodied World Models Workshop. Robot learning today still needs millions of action-labeled videos. Videos from humans and the web are abundant but lack action labels. Meanwhile, pretrained video models already learn rich dynamics. ViPRA is a recipe for turning pretrained video models into robot policies while enabling robot learning to scale with actionless videos. 🧵 Thread ↓


Tal Daniel @TalDaniel8

2 months ago

Join us for a live overview of the paper (Latent Particle World Models) + Q&A today (March 24) at 4pm PT / 7pm ET! https://t.co/KRgXI2RGWe

Cecile Tamura @ceciletamura

2 months ago

🎤 Dr. @TalDaniel8 (@CarnegieMellon) in a deep dive w/ @ceciletamura of @ploutosai: How can AI discover objects, model uncertainty, and predict the future from raw video alone? 🔴 https://t.co/yimiKyJeYV


Tal Daniel @TalDaniel8

3 months ago

@DanHrmti @carl_qi98 @LambdaAPI @AvivTamar1 @davheld @pathak2206 🌐 Learn more about Latent Particle World Models (LPWM): Website: https://t.co/nqDE1UyviV Paper: https://t.co/cMFrfsN00p Code (+ tutorial): https://t.co/0kJskfdm8O 16/n


Tal Daniel @TalDaniel8

3 months ago

During training, the posterior latent actions condition the dynamics module that predicts the next-frame prior. A KL regularization term aligns this prediction with the latent policy’s output, forming a VAE-style objective over particle transitions. 7/n
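
As a rough illustration of the KL term described in this tweet, the sketch below computes the closed-form KL divergence between two diagonal Gaussians and sums it over particles. Treating the per-particle latent actions as diagonal Gaussians is an assumption made here, and the names `kl_diag_gaussians` and `particle_kl` are hypothetical; see the linked paper and code for the actual objective.

```python
import math


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """Closed-form KL(q || p) for diagonal Gaussians given means and log-variances."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def particle_kl(posterior, prior):
    """Sum the KL over particles; each entry is (mu, logvar) for one particle."""
    return sum(
        kl_diag_gaussians(mu_q, lv_q, mu_p, lv_p)
        for (mu_q, lv_q), (mu_p, lv_p) in zip(posterior, prior)
    )


# Toy example: two particles, each with a 2-D latent action.
# Particle 0: posterior mean differs from the prior; particle 1: identical distributions.
posterior = [([0.5, -0.2], [0.0, 0.0]), ([1.0, 0.0], [-1.0, -1.0])]
prior = [([0.0, 0.0], [0.0, 0.0]), ([1.0, 0.0], [-1.0, -1.0])]
kl = particle_kl(posterior, prior)
```

Only the first particle contributes here, because the second particle's posterior and prior match exactly and their KL is zero.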


Tal Daniel @TalDaniel8

3 months ago

Huge thanks to @DanHrmti, @carl_qi98, Amir Zadeh and Chuan Li (@LambdaAPI), @AvivTamar1, @davheld, and @pathak2206 — incredible collaborators who made LPWM possible. 15/n


Tal Daniel @TalDaniel8

3 months ago

The inverse dynamics head observes particles at t and t+1 and infers the latent actions that caused the change. The latent policy head sees only particles at t and outputs a distribution over possible latent actions from the current state. 6/n
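
The difference between the two heads is mostly a difference in what each one is allowed to see. The interface sketch below uses toy stand-ins (a displacement and a fixed zero-mean, unit-std distribution) in place of learned networks; the function names, the list-of-coordinates particle format, and the toy outputs are all assumptions, not LPWM's implementation.

```python
def inverse_dynamics(particles_t, particles_next):
    """Posterior head: sees particles at t AND t+1, returns one latent action per particle.

    Toy stand-in: the "latent action" is simply the per-particle displacement.
    """
    return [
        [after - before for before, after in zip(p_t, p_next)]
        for p_t, p_next in zip(particles_t, particles_next)
    ]


def latent_policy(particles_t):
    """Prior head: sees ONLY particles at t, returns (mean, std) per particle.

    Toy stand-in: a fixed zero-mean, unit-std distribution for every particle.
    """
    return [([0.0] * len(p), [1.0] * len(p)) for p in particles_t]


# Two particles described by 2-D positions (toy numbers).
particles_t = [[0.0, 0.0], [1.0, 1.0]]
particles_next = [[0.5, 0.0], [1.0, 2.0]]

posterior_actions = inverse_dynamics(particles_t, particles_next)
prior_params = latent_policy(particles_t)
```

Because the prior head never sees the next frame, it can still be queried when no future frame is available, which is the situation the tweet describes for the latent policy.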


Tal Daniel @TalDaniel8

3 months ago

To address this, we introduce a context module that predicts latent actions per particle, enabling fine-grained, multi-entity dynamics. It has two heads: (1) an inverse dynamics head (posterior) and (2) a latent policy (prior). 5/n


Tal Daniel @TalDaniel8

3 months ago

Building a world model means capturing stochastic particle dynamics. Existing “latent action” models help, but they (1) need strong regularization (e.g., VQ) and (2) rely on a single global latent, which misses interactions among multiple entities. 4/n


Tal Daniel @TalDaniel8

3 months ago

DLP decomposes scenes into particles with several attributes (keypoints, bounding boxes, masks), fully unsupervised. These act as visual “tokens,” making cross-modal long-horizon reasoning (vision ↔ language) far more natural than with standard pixel patches. 3/n

