Bin Ren @hello_renbin - Twitter Profile

22 days ago

Glad to be recognized as an Outstanding Reviewer by #CVPR 2026!

#CVPR2026 @CVPR

23 days ago

We are grateful to all of the 17,491 reviewers who helped make #CVPR2026 possible. We are especially pleased to recognize the following Outstanding Reviewers, whose high-quality reviews (as judged by their Area Chairs) placed them among the top 5% of reviewers. https://t.co/YjQppx6a8K

5

223

43

30

96K

2

7

0

1K

Hello_RenBin retweeted

Thinking Machines

@thinkymachines

30 days ago

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. https://t.co/AFJZ5kH7Ku

465

16K

2K

12K

8M

Bin Ren @Hello_RenBin

about 1 month ago

Fantastic talk, looking forward to Physical AGI!

Jim Fan

@DrJimFan

about 1 month ago

I promise this will be the best 20 min you spend today! Robotics: Endgame, the sequel to my last year's Sequoia AI Ascent talk, "Physical Turing Test". I laid out the roadmap for solving Physical AGI as a simple parallel to the LLM success story. Be a good scientist, copy homework ;) And stay till the end, more easter eggs and predictions for your polymarket!

00:30 DGX-1 origin story at OpenAI, I was there in 2016 signing with Jensen and Elon. Heading to the Computer History Museum!
01:42 The Great Parallel
03:31 Robotics, the Endgame
03:39 Why VLAs fall short
04:32 Video world models as the 2nd pretraining paradigm
06:09 World Action Models (WAM)
07:46 Strategies for robot data collection and the FSD equivalent to physical data flywheel for robot manipulation
11:06 EgoScale and the Dexterity Scaling Law we discovered recently
14:00 Physical RL: bridging the last mile
15:39 DreamDojo: an end-to-end neural physics engine for scaling RL in silico
17:00 Civilizational Technology Tree and my predictions for the near future. Spoiler: it's closer than you think.

Thanks to my friends at Sequoia for inviting me back to AI Ascent this year! I had a blast! Last year's talk is attached in the thread if you missed it.

162

3K

546

4K

565K

0

54

Bin Ren @Hello_RenBin

about 1 month ago

Impressive

Genesis AI

@gs_ai_

about 1 month ago

We are back. After one year of quiet building. Introducing GENE-26.5, our first robotic brain that takes a major step toward human-level capability.

For years, robotics has struggled to learn from the world’s largest and valuable data source: Humans. Solving it means rethinking the whole stack from the ground up:
- A robotics-native foundation model.
- A 1:1 human-like robotic hand.
- A noninvasive data collection glove for motion, force, and touch.
- A simulator that turns weeks of experiments into minutes.

GENE-26.5 is trained across language, vision, proprioception, tactile, and action. We designed a set of tasks to test how far we can go with this new paradigm. Fully autonomous, 1x speed, one model, same weights. (Enjoy with sound on)

We are approaching the endgame for robotics. And this is just a beginning.

281

6K

1K

3K

3M

0

1

0

57

Hello_RenBin retweeted

Songyou Peng @songyoupeng

3 months ago

I gave an award talk @3DVconf that might be of interest to some people. I took a step back and shared a few personal stories from my 10-year journey, reflecting on the profound impact of people, luck (you need a lot!), grit, and the art of giving up. (1/2) https://t.co/y7qDm0MADT

9

347

44

170

20K

Hello_RenBin retweeted

Andrej Karpathy

@karpathy

3 months ago

Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project.

This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (i forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism.
https://t.co/WAz8aIztKT

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges.

And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

962

20K

2K

11K

4M

Hello_RenBin retweeted

Junfan Zhu 朱俊帆 ✈️ CVPR

@junfanzhu98

4 months ago

https://t.co/sCG3gdPbnY

1

209

20

209

18K

Hello_RenBin retweeted

Zeyi Liu

@Liu_Zeyi_

4 months ago

For video generation in robotic applications, looking pretty is usually not enough. Robot manipulation requires understanding how visual observations and 3D geometry evolve over time under agent actions, with temporal coherence and geometric consistency across camera views. We study this challenge in our work (recently accepted by @iclr_conf ), 4D Video Generation for Robot Manipulation, which enforces multi-view 3D consistency via geometric supervision to generate spatio-temporally aligned videos.

9

310

40

136

54K

Bin Ren @Hello_RenBin

6 months ago

Amazing

Carlo Sferrazza

@carlo_sferrazza

6 months ago

Sim-to-real learning for humanoid robots is a full-stack problem. Today, Amazon FAR is releasing a full-stack solution: Holosoma. To accelerate research, we are open-sourcing a complete codebase covering multiple simulation backends, training, retargeting, and real-world inference.

20

598

132

319

213K

0

122

Hello_RenBin retweeted

Yi Ma

@YiMaTweets

7 months ago

I have met many students and young researchers lately who claim to be working on World Models or Embodied AI but do not even know the basics of 3D Vision or linked rigid body motions. When did we start to give students the illusion that they can *do* things right without *learning* anything right?

83

1K

65

532

244K

Bin Ren @Hello_RenBin

7 months ago

@xusy2333 Congrats!

0

1

0

30

Bin Ren @Hello_RenBin

12 months ago

@altndrr 666

1

0

84

Bin Ren @Hello_RenBin

about 1 year ago

@paolorotaphd Bravo!

0

87

Bin Ren @Hello_RenBin

over 1 year ago

@Koichi_N_ Congrats!

1

0

90

Hello_RenBin retweeted

MrNeRF

@janusch_patas

over 1 year ago

Paper: https://t.co/AkgNLodxNE

0

3

1

1K

Hello_RenBin retweeted

MrNeRF

@janusch_patas

over 1 year ago

CityLoc: 6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation

Contributions:
• Experimental Setup and Benchmarking: We develop a comprehensive experimental setup designed to evaluate city-scale, text-based 6DoF localization.

• Novel Approach for Text-Based 6DoF Localization: We propose a diffusion-based method for text-based 6DoF localization that operates effectively at the city scale.

• Pose Refinement Technique: We employ Gaussian splatting rendering for pose refinement, filtering out poorly matched poses and optimizing them by maximizing cosine similarity with text features. This guides the pose to the most relevant location for the given text description.

• State-of-the-Art Results: Our approach delivers superior performance, surpassing baseline methods in both pose estimation accuracy and distribution modeling.

1

17

2

11

2K

Hello_RenBin retweeted

INSAIT Institute

@INSAITinstitute

almost 2 years ago

🔬 Researchers from INSAIT, ETH Zurich, University of Amsterdam and the Università di Pisa and Trento created the first of its kind large-scale dataset for understanding 3D Gaussian splats. Links in comments!

🎉 Congratulations to all authors! https://t.co/GnYSJuftH2

1

10

2

566

Bin Ren @Hello_RenBin

almost 2 years ago

@Koichi_N_ @amsabour @FidlerSanja @seungkim0123 nice work!

0

1

0

37

Hello_RenBin retweeted

Dino Pedreschi @DinoPedreschi

almost 2 years ago

Looking for a PhD at the frontier of Human-centered Artificial Intelligence? 19 newly added open positions at Italy's National Phd on AI! https://t.co/qLKiSX2AUa

0

26

18

3

2K

Bin Ren @Hello_RenBin

almost 2 years ago

@zhang1632201 Cool! Awesome work!

0

1

0

29
