Kevin Black @kvablack - Twitter Profile

Kevin Black @kvablack

8 days ago

@akoustov guy who wants a visit from Sam Kriss

0

2

0

810

Kevin Black @kvablack

9 days ago

same failure from GPT 5.5, no spatial awareness

1

7

0

1K

Kevin Black @kvablack

9 days ago

this is why you can't use frontier models for robotics

6

41

0

9

6K

kvablack retweeted

donald

@donaldjewkes

about 2 months ago

POV you doordashed an air fryer and it's about to get zero-shot

7

116

6

5

15K

Kevin Black @kvablack

about 2 months ago

can VCs stop asking us about world models now?

Physical Intelligence

@physical_int

about 2 months ago

π0.7 handles diverse prompts that don't just say what to do, but also how to do it, including rich language and multimodal information, such as visual subgoal images. At test time, these images can be produced by a lightweight world model.

3

102

3

23

33K

10

252

6

62

27K

Kevin Black @kvablack

2 months ago

@Miles_Brundage @davidshustin @jasminewsun no ill will towards the analysis itself, but just to correct the record, it's not affiliated with Physical Intelligence in any way :)

0

2

0

57

Kevin Black @kvablack

3 months ago

oh you're using VLAs? everyone's using GRPs now. just kidding we're all on LBMs. world models are the future so we developed our own WAM. we're using DVAs. we were using UWMs but our robot caught on fire so we switched to DreamUMVLAPs. we're shipping a robot that passes butter.

18

484

38

127

31K

Kevin Black @kvablack

3 months ago

@chris_j_paxton @notmahi what evidence is there that the aux loss stuff made a huge difference? from my reading of the paper, there are no ablations that test a Wan backbone with no video prediction loss

1

10

0

543

Kevin Black @kvablack

3 months ago

@chris_j_paxton I used 238 words/min, which is an average reading speed

0

1

0

89
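The conversion in the reply above (word counts to hours at 238 words/min) can be sketched as follows. This is a minimal illustration, not the calculation behind the slide: only the 238 words/min figure comes from the tweet, and the function name and the example corpus size are hypothetical.

```python
# Reading speed quoted in the tweet above.
WORDS_PER_MINUTE = 238


def words_to_reading_hours(num_words, wpm=WORDS_PER_MINUTE):
    """Convert a word count into hours of reading at a fixed speed."""
    minutes = num_words / wpm
    return minutes / 60


if __name__ == "__main__":
    # Hypothetical corpus size, chosen only to show the arithmetic.
    example_words = 1_000_000_000
    print(f"{words_to_reading_hours(example_words):,.0f} hours")
```

Expressing text in hours this way makes it directly comparable to robot datasets, which are usually measured in hours of recorded experience.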

Kevin Black @kvablack

3 months ago

@chris_j_paxton the sheer number of hours is still absolutely minuscule compared to language, which is more significant I think https://t.co/77plg82xiL

Kevin Black @kvablack

over 1 year ago

My favorite slide that I made for my talk last weekend -- a very silly thought experiment in which we compare language datasets to robotics datasets (in the most shallow way possible). Yes it is to scale; I learned that the maximum shape size in Keynote is 20,000pts

5

91

4

32

21K

1

3

1

0

891

Kevin Black @kvablack

3 months ago

this is probably our most important release so far, even though it has nothing to do with research progress

Physical Intelligence

@physical_int

3 months ago

General-purpose AI models are behind some of the most exciting applications we now can't live without. We envision that an analogous “physical intelligence layer” built with models like π0.6 will similarly spur a new wave of applications for the physical world. We’ve recently begun working with a handful of companies that have deployed their robots to do real-world, useful things. https://t.co/udVO9fV0PH

9

739

91

361

176K

0

41

1

8

9K

kvablack retweeted

SAIL Media

@readsail

4 months ago

Robots have a "latency" problem. 🤖 💨 @kvablack explains how to use diffusion models and "Action Chunking" to make robot movements seamless—even when the AI is still "thinking." Watch the full clip on YT! Link in replies.

1

20

1

11

2K

Kevin Black @kvablack

6 months ago

@kenbwork sure, I mean that "the literal error bar is symmetric when it consists of +-1 SEM". I think most would know that's what I (or Generalist) mean when we say "plotting the standard error".

1

0

303

Kevin Black @kvablack

6 months ago

I know I'm the only robot learning researcher to ever care about statistical rigor, but technically you shouldn't use standard error for a binary success rate. The binomial distribution isn't symmetrical 😅

Generalist

@GeneralistAI

6 months ago

More pretraining improves GEN-0 real-robot performance (via blind A/B evals with closed-loop rollouts). Improvements are significant in the low-data regime, but the best models thrive with both pretraining and ample post-training. See blog addendum: https://t.co/LVBdzMxn0f

5

186

27

97

81K

8

167

5

68

25K

Kevin Black @kvablack

6 months ago

@kenbwork I mean that the literal error bar is symmetric about the sample mean when it's based on SE

1

0

326

Kevin Black @kvablack

6 months ago

@kenbwork you're right that it depends on how they're pooling though. if they're averaging multiple proportions then it's no longer binomial. not sure what you can do then besides do a lot more trials. or maybe just presenting the data per-task (unpooled) is better.

0

1

0

406

Kevin Black @kvablack

6 months ago

@kenbwork the SE is symmetric bc it relies on the CLT, which is fine for arbitrary distributions and a large enough sample size. but if you have a smaller sample size and you know the distribution is binomial, you can do better (e.g., Wilson score interval)

2

4

0

847
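To make the point in the reply above concrete, here is a minimal sketch comparing a symmetric standard-error bar with the Wilson score interval for a binary success rate. It uses the standard Wilson formula; the function names are illustrative, the trial counts are made up, and z=1 is assumed so the Wilson interval is comparable to a +-1 SEM bar.

```python
import math


def standard_error_interval(successes, trials):
    """Symmetric +-1 standard error bar around the sample proportion."""
    p = successes / trials
    se = math.sqrt(p * (1 - p) / trials)
    return p - se, p + se


def wilson_interval(successes, trials, z=1.0):
    """Wilson score interval for a binomial proportion.

    z is the normal quantile; z=1.0 is an assumption made here to
    match a +-1 SEM bar (z=1.96 would give roughly 95% coverage).
    """
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical evaluation results: (successes, trials).
    for successes, trials in [(9, 10), (10, 10)]:
        print(successes, trials,
              standard_error_interval(successes, trials),
              wilson_interval(successes, trials))
```

With 10 successes out of 10, the standard-error bar collapses to zero width, while the Wilson interval still extends below 1. With 9 out of 10, the Wilson interval is asymmetric about the sample proportion, reaching further toward 0.5 than toward 1.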

Kevin Black @kvablack

6 months ago

@aliuahma if you look it up it seems like the rule of thumb is np>10, which it doesn't seem like they have. but in practice I don't see a reason to ever use the normal approximation, especially with proportions near 0 or 1

1

6

0

853
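The rule of thumb mentioned in the reply above can be checked with a few lines. This is a sketch under assumptions: the tweet only cites np>10, while the version below also checks the failure count (a common form of the rule) and uses observed counts as stand-ins for np and n(1-p). The function name and example counts are hypothetical.

```python
def normal_approx_ok(successes, trials, threshold=10):
    """Rough check of whether a normal approximation is reasonable.

    Uses observed successes and failures in place of np and n(1-p).
    The threshold of 10 follows the rule of thumb cited in the tweet;
    checking failures as well is an assumption added here.
    """
    failures = trials - successes
    return successes > threshold and failures > threshold


if __name__ == "__main__":
    # Hypothetical results: a high success rate on a small number of trials.
    print(normal_approx_ok(18, 20))
    # Hypothetical results: a balanced outcome with many trials.
    print(normal_approx_ok(50, 100))
```

A proportion near 0 or 1 fails the check unless the number of trials is large, which is the regime where the reply argues against the normal approximation.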

Kevin Black @kvablack

6 months ago

@Christian061145 it all depends on your constraints. inference-time RTC is still more convenient. however, we already do a lot of post-training so we may as well add something there, and this simple method seems to work well enough. I'm sure ppl will find other methods that work better.

1

0

286

Kevin Black @kvablack

6 months ago

Last week I presented real-time chunking (RTC) at NeurIPS, and we did a live coffee demo the very same evening. To celebrate, we're releasing a (very short) follow-up paper describing a training-time variant of RTC, which is what we've actually been using in our demos!

13

438

34

217

34K

Kevin Black @kvablack

6 months ago

@m0hitsharma "d" in the paper includes network, but yeah, you need to know the rough range of delays at training time

0

5

0

1

397
