Joe Clinton

@JoeClinton02

Developing better long-horizon behaviour models for robots.

Joined January 2018

99 Following

27 Followers

48 Posts

Joe Clinton @JoeClinton02

15 days ago

@LeRobotHF @allen_ai @UW @cole__ai @amazon When it comes to "zero shot" rewards, it would be much more interesting if the task wasn't the most saturated one imaginable. Could you try it on something harder and less seen in the literature, like bimanual LEGO 2x4 stacking?

1

0

0

0

96

Joe Clinton @JoeClinton02

16 days ago

@serenaa_ge The best Chinese model for coding right now is arguably Qwen 3.7 max. Will you be testing that?

0

1

0

0

171

Joe Clinton @JoeClinton02

about 2 months ago

@VilleKuosmanen "casually training a world model to predict subtask images" is similar to my planning transformer paper's main idea of training autoregressive behaviour policies to predict the long-term future first, before predicting the next actions. https://t.co/BdoS0C7GRL

0

2

1

2

274

Joe Clinton @JoeClinton02

4 months ago

@PgChiyo If you've kept the same motors, your arms now have a rated payload of 0g and a peak payload of 250g. You need to replace at least the first few motors with STS3250.

1

1

0

0

98

Joe Clinton @JoeClinton02

5 months ago

I've recently been working with image-to-video generation models a lot more, so I put together this graph to help determine the best video model for any price point. Seedance-v1.5-pro stands out the most to me as the optimal choice to balance quality and cost.

JoeClinton02's tweet photo: https://t.co/dHdSGtabNd

0

5

1

2

575

Joe Clinton @JoeClinton02

5 months ago

@ihorbeaver Perhaps you could speed up the model and then learn a residual network to adapt the action decoder to the wobble with online RL?

0

1

0

0

200

JoeClinton02 retweeted

5 months ago

NEO’s Starting to Learn on Its Own

297

3K

411

1K

6M

Joe Clinton @JoeClinton02

5 months ago

@thealexbanks This doesn't account for developers moving to untrackable local agents like Claude Code, Codex, Cursor and Copilot in the same timeframe. Claude Code is far ahead of Codex, which is in turn ahead of Gemini.

1

1

0

0

41

Joe Clinton @JoeClinton02

5 months ago

@Ciszek @chris_j_paxton The VAE has a 16x16x4 compression. The model begins with a 480x640 input, so 4 frames are compressed to 600 tokens. The input is 5 context frames + 4 noisy latent frames. The DiT generates this in a single step, then passes it to the action head. This is not a problem.

0

0

0

0

18

Joe Clinton @JoeClinton02

5 months ago

@chris_j_paxton VLAs with a video model backbone are my PhD topic. I wholeheartedly believe they are the way forwards and will share some exciting progress on this front later in the year.

1

11

2

5

435

Joe Clinton @JoeClinton02

5 months ago

@ycombinator @gentrajectory So it's affordance-driven grasping like https://t.co/oiAlnpbLEc ?

1

0

0

0

30

Joe Clinton @JoeClinton02

5 months ago

@joshuabelofsky Doesn't look accurate enough to be useful unfortunately. I think the data collected would be low quality and it would impact the resulting policy.

0

0

0

0

20

Joe Clinton @JoeClinton02

6 months ago

@mo_danesh @k7agar You can't guess anything from such a tiny amount of information. Why even bother trying to help? There are hundreds of possible reasons a VLA model might underperform.

0

0

0

0

12

Joe Clinton @JoeClinton02

6 months ago

@k7agar Loss curves are useful for comparisons between models that have the same datasets and the same loss function.

0

0

0

0

20

Joe Clinton @JoeClinton02

6 months ago

@chrisgpt A 45s Christmas ad for McDonald's with no speaking roles, 18 locations, 45 actors, 90 extras and 3 CGI shots would require a budget of >$1 million. It's likely they spent about 10x less on this ad, and even negative attention is still attention.

0

0

0

0

18

Joe Clinton @JoeClinton02

7 months ago

@lukas_m_ziegler I think this could have been done way cheaper by just waiting for the heated bed to cool down and then repeatedly ramming the part with the flat side of the extruder head until it unsticks and then pushing it off the ledge onto a cushion below.

1

2

0

1

406

Joe Clinton @JoeClinton02

7 months ago

@KLieret Hi, when will you update with GLM 4.6, Kimi K2 Thinking and MiniMax M2? Would love to know how they compare.

1

1

0

0

338

Joe Clinton @JoeClinton02

7 months ago

@liyitengx @RemiCadene Hi, first off, this is amazing! Secondly, I wanted to ask two questions: 1. Why didn't you go for an off-the-shelf telescopic lift solution? 2. What is the payload of the LeKiwi base, and do you think it's overloaded?

1

2

0

0

335

Joe Clinton @JoeClinton02

9 months ago

@vbingliu Could you test models with the agent each one recommends (Claude with Claude Code, GPT-5 with Codex, Gemini with Gemini CLI, Qwen with Qwen Code)? The right agent pairing should significantly boost performance.

0

0

0

0

418

Joe Clinton @JoeClinton02

10 months ago

gpt-oss is now on @ArtificialAnlys, and is absolutely dominating the Pareto frontier of intelligence vs cost!

JoeClinton02's tweet photo: https://t.co/JrvUZjWUdj

0

0

0

0

121
