Jerry Zhi-Yang He @_herobotics_ - Twitter Profile

Pinned Tweet

over 2 years ago

Is your robot 🤖 safe around humans🚸? Or will it fail catastrophically🤕? Excited to introduce "Natural Adversarial Frontier", a framework for probing the robustness of Human-Robot Interactions. To appear at #CoRL2023 🧵(1/8) Joint work with @ancadianadragan @daniel_s_brown @ZackoryErickson https://t.co/JbAc2tdKVz https://t.co/EQGCRuXvaL

1

51

10

17

11K

Jerry Zhi-Yang He @_herobotics_

1 day ago

@GuanyaShi Congrats!

0

1

0

111

Jerry Zhi-Yang He @_herobotics_

7 days ago

@dwarkesh_sp comedy

0

1

0

59

_herobotics_ retweeted

Sasha Rush

@srush_nlp

15 days ago

Talk: Training Composer https://t.co/diSjKhP4g5 Overview of the methods that we use at Cursor to build our model.

6

681

85

709

97K

Who to follow

David Held

@davheld

Associate Professor at Carnegie Mellon University | he/him

Anca Dragan

@ancadianadragan

Google DeepMind • AI safety, alignment, collaboration • post training • associate professor @ UC Berkeley EECS

Jiajun Wu

@jiajunwu_cs

Assistant Professor at @Stanford CS

_herobotics_ retweeted

Po-Shen Loh

@PoShenLoh

16 days ago

Whoa. This breakthrough is going to fundamentally affect the structure of how universities select and retain professors. And more generally the structure of work and creativity. AI has managed to discover a significant result in research mathematics in a one-shot query (!!!), disproving a conjecture that a huge number of people have tried working on (including myself, although I am not anywhere as good as the other experts who have thought about it). @wtgowers (Fields Medalist who has been thinking a lot about this space of AI and math) wrote: "if a human had written the paper and submitted it to the Annals of Mathematics and I had been asked for a quick opinion, I would have recommended acceptance without any hesitation. No previous AI-generated proof has come close to that." [The Annals of Mathematics is perhaps the most prestigious math journal in the world.] OpenAI's announcement: https://t.co/faCjJFkY43 Mathematicians' discussion: https://t.co/RF0PzhRonM In college classes, we already have a major issue where take-home assignments are susceptible to students using AI. Even with in-person exams, we're having issues where students use AI in the bathroom during the exam. Now, how will universities decide which professors to hire and promote? Universities used to judge on the basis of whether you got papers published in highly prestigious journals. Should the person who can use AI to generate massive quantities of publishable results be picked, if it can be done with one-shot prompts? More generally, everyone (not just universities) needs to rethink their objectives for hiring and promotion. I am also an entrepreneur, and have been refining my hiring process in this age of AI. Before, I used to be particularly impressed by academic competition performance. Today, I search for people who hold 2 certain principles very strongly: they enjoy (1) delighting other people, and (2) achieving understanding through their own thought. I find these people are good at figuring out what makes customers/partners tick (hence able to identify good directions to run without micromanagement), and also curious enough to learn forever. They also tend to already be pretty strong skills-wise, because those two principles inherently drive them to build skills. I actually think the whole world would be better off with more Thought Full people: https://t.co/cPdqAhQRz3. We'll need people like that to figure out how to help humanity survive. If you'd like to collaborate on ways to make a future, feel free to reach out. That's all I work on nowadays.

8

434

62

216

67K

_herobotics_ retweeted

Sam Sheffer

@samsheffer

17 days ago

i don’t think you understand how insane omni is

116

3K

231

1K

289K

_herobotics_ retweeted

Dwarkesh Patel

@dwarkesh_sp

21 days ago

New blackboard lecture w @ericjang11 He walks through how to build AlphaGo from scratch, but with modern AI tools. Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play. You have to go back to 2017 to get insight into how the more general AIs of the future might learn. Once he explained how AlphaGo works, it gave us the context to have a discussion about how RL works in LLMs and how it could work better – naive policy gradient RL has to figure out which of the 100k+ tokens in your trajectory actually got you the right answer, while AlphaGo’s MCTS suggests a strictly better action every single move, giving you a training target that sidesteps the credit assignment problem. The way humans learn is surely closer to the second. Eric also kickstarted an Autoresearch loop on his project. And it was very interesting to discuss which parts of AI research LLMs can already automate pretty well (implementing and running experiments, optimizing hyperparameters) and which they still struggle with (choosing the right question to investigate next, escaping research dead ends). Informative to all the recent discussion about when we should expect an intelligence explosion, and what it would look like from the inside. Timestamps: 0:00:00 – Basics of Go 0:08:06 – Monte Carlo Tree Search 0:31:53 – What the neural network does 1:00:22 – Self-play 1:25:27 – Alternative RL approaches 1:45:36 – Why doesn’t MCTS work for LLMs 2:00:58 – Off-policy training 2:11:51 – RL is even more information inefficient than you thought 2:22:05 – Automated AI researchers

65

3K

286

3K

683K

_herobotics_ retweeted

Amir Zamir

@zamir_ar

23 days ago

Test-time scaling, reasoning, and generally search-like processes clearly drive significant gains in LLMs. Largely owed to the structure of language. One would think the same could apply to non-linguistic domains, like image generation, but that obviously depends on whether the structure of the domain's representation lends itself to search. 1D ordered tokens (e.g., image FlexTok, video FlexTok) seem like a natural fit since they enable a step-by-step coarse-to-fine generation. We investigated that and found they indeed enable search and scale far better with test-time compute than 2D grids. See the visuals on the webpage. Appearing in @icmlconf 2026. 🔗 https://t.co/yOFqeIJrEz 📄 https://t.co/WFZCihp1m4,

5

138

31

86

15K

Jerry Zhi-Yang He @_herobotics_

25 days ago

@MillionInt maybe if LLM is analogous to the “brain”, then HS are the “spine”. spine is fast, high-freq around perception. It can stabilize, dodge, react faster than an LLM loop, but it’s also reflex-based, local, and hacky. Evolution discovered the spine early on because it won the game

0

69

_herobotics_ retweeted

Dwarkesh Patel

@dwarkesh_sp

about 1 month ago

Did a very different format with @reinerpope – a blackboard lecture where he walks through how frontier LLMs are trained and served. It's shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk. It’s a bit technical, but I encourage you to hang in there - it’s really worth it. There are less than a handful of people who understand the full stack of AI, from chip design to model architecture, as well as Reiner. It was a real delight to learn from him. Recommend watching this one on YouTube so you can see the chalkboard. 0:00:00 – How batch size affects token cost and speed 0:31:59 – How MoE models are laid out across GPU racks 0:47:02 – How pipeline parallelism spreads model layers across racks 1:03:27 – Why Ilya said, “As we now know, pipelining is not wise.” 1:18:49 – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal 1:32:52 – Deducing long context memory costs from API pricing 2:03:52 – Convergent evolution between neural nets and cryptography

151

7K

600

9K

1M

_herobotics_ retweeted

Pulkit Agrawal

@pulkitology

about 1 month ago

Until now, robotics stopped where the human hand begins. Strong, but not delicate. Precise, but not adaptive. Repetitive, but not creative. The human hand wasn’t a benchmark — it was a boundary. We are crossing it. @EkaRobotics. Coming soon.

24

488

39

171

73K

_herobotics_ retweeted

Michelle Lee

@michellearning

about 1 month ago

Welcome to the scientific revolution. 100s of robots. Zero coffee breaks. America’s largest autonomous lab, open today.

159

3K

393

1K

677K

_herobotics_ retweeted

Ziqian Zhong

@fjzzq2002

about 2 months ago

https://t.co/9YS942pplc

5

112

11

77

14K

Jerry Zhi-Yang He @_herobotics_

2 months ago

@FakePsyho lol

0

1

0

105

_herobotics_ retweeted

Tenobrus

@tenobrus

3 months ago

Donald Knuth is vibemathing now. real tough day for the stochastic-parrot crew.

76

3K

429

1K

520K

_herobotics_ retweeted

Frank Yan @FrankYan2

4 months ago

As promised, here's the short film Jia Zhangke produced using Seedance 2.0 for Chinese New Year and his take on AI filmmaking

FrankYan2's tweet photo. As promised, here's the short film Jia Zhangke produced using Seedance 2.0 for Chinese New Year and his take on AI filmmaking https://t.co/czwORR06Uf

41

2K

255

1K

571K

_herobotics_ retweeted

Eric Jang

@ericjang11

4 months ago

As Rocks May Think: an interactive essay on thinking models, automated research, and where I think they are headed. Enjoy! https://t.co/dcFYrXUQYg

51

1K

182

2K

536K

Jerry Zhi-Yang He @_herobotics_

6 months ago

@DorsaSadigh Congrats Dorsa! Really happy for you!

0

199

_herobotics_ retweeted

Dwarkesh Patel

@dwarkesh_sp

6 months ago

The @ilyasut episode 0:00:00 – Explaining model jaggedness 0:09:39 - Emotions and value functions 0:18:49 – What are we scaling? 0:25:13 – Why humans generalize better than models 0:35:45 – Straight-shotting superintelligence 0:46:47 – SSI’s model will learn from deployment 0:55:07 – Alignment 1:18:13 – “We are squarely an age of research company” 1:29:23 – Self-play and multi-agent 1:32:42 – Research taste Look up Dwarkesh Podcast on YouTube, Apple Podcasts, or Spotify. Enjoy!

403

9K

1K

8K

4M

Jerry Zhi-Yang He @_herobotics_

8 months ago

@WinstonGu_ very cool!

0

61

_herobotics_ retweeted

Jason Peng @xbpeng4

8 months ago

I have always been surprised by how few positive samples adversarial imitation learning needs to be effective. With ADD we take this to the extreme! A differential discriminator trained with a SINGLE positive sample can still be effective for a wide range of tasks.

4

156

21

68

17K

Jerry Zhi-Yang He

@_herobotics_

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users