Zhenzhen Weng @JenWeng4 - Twitter Profile

JenWeng4 retweeted

over 1 year ago

Just published in @ScienceAdvances, our work demonstrating the ability of AI and 3D computer vision to produce automated measurement of human interactions in video data from early child development research -- providing over 100x time savings compared to human annotation and enabling quantitative, big data studies. We use our method, HARMONI, to characterize longitudinal trends in infant and toddler interaction with caregivers, in over 500 hours of video data. Work led by @JenWeng4 together with co-PI @SandersMDMPH and @K_L_Humphreys, and with a great interdisciplinary team including Laura Bravo Sanchez, @bergelsonlab, @akanazawa, @StanfordCERC, and many others! https://t.co/ouyazVXqIH

2

53

13

11

6K

JenWeng4 retweeted

Jing-Jing Li @drjingjing2026

over 1 year ago

1/3 Today, an anecdote shared by an invited speaker at #NeurIPS2024 left many Chinese scholars, myself included, feeling uncomfortable. As a community, I believe we should take a moment to reflect on why such remarks in public discourse can be offensive and harmful. https://t.co/LEB0tN8eZV

175

4K

549

462

1M

JenWeng4 retweeted

Serena Yeung-Levy

@yeung_levy

over 1 year ago

Our lab at Stanford has postdoc openings! Candidates should have expertise and interests in one or multiple of: multimodal large language models, video understanding (including video-language models), AI for science / biology, or AI for surgery. Please send inquiries by email and see https://t.co/3DB5k5AC2o for more information.

4

193

37

71

37K

JenWeng4 retweeted

Zhaorun Chen

@ZRChen_AISafety

almost 2 years ago

🤗First benchmark on multimodal judge’s feedback for text-to-image generation!! 🏃Come and pick up your personal advice and package to choose the best judge to fine-tune your diffusion model 👉 https://t.co/7ENkz1Fbmx Paper: https://t.co/F77BCQlJtA Code: https://t.co/baiK2H1yuq

0

19

4

5

4K

JenWeng4 retweeted

Huaxiu Yao

@HuaxiuYaoML

almost 2 years ago

🌟NEW Paper Alert 🌟 👩‍⚖️MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? (https://t.co/hKQaAOzrf4) 🧐Also wonder about the best judge model to provide feedback for your diffusion models? We evaluate multimodal judges in providing feedback for image generation models across four key perspectives: alignment, safety, image quality, and bias. Key findings: 👉1. While closed-source VLM judges typically perform better, smaller CLIP-based models offer better text-image alignment and image quality feedback due to extensive pre-training on text-vision corpus. Conversely, VLMs provide more accurate feedback on safety and generation bias, thanks to their stronger reasoning capabilities. 👉2. VLM judges can provide more accurate and stable feedback in natural language (e.g. Poor, Average, Good) than numerical scales. Led by @ZRChen_AISafety, Yichao Du, Zichen Wen, @AiYiyangZ. https://t.co/6cwN9yrVOm

3

139

36

40

21K

Zhenzhen Weng @JenWeng4

almost 2 years ago

🌟Just completed my PhD at @Stanford! 🌟 A huge thanks to my advisor @yeung_levy, my family and friends, committee and collaborators, and everyone who supported me along the way. Excited to start my next chapter at @Waymo, working on foundation models for self-driving cars! https://t.co/LRPsrdyem6

7

148

4

11

26K

Zhenzhen Weng @JenWeng4

about 2 years ago

@charles_rqi Congrats Charles!

0

77

JenWeng4 retweeted

Jonathon Luiten

@JonathonLuiten

about 2 years ago

If you’re in Davos, we just started giving a tutorial on Gaussian Splatting at 3DV. With @GKopanas @Snosixtytwo @antoine_guedon https://t.co/zMVqL2s7z9 https://t.co/MJwOOuXBYu

2

71

10

24

8K

JenWeng4 retweeted

Xiaohan Wang

@XiaohanWang96

about 2 years ago

Thanks @_akhaliq for sharing our work! Letting LLM be an agent and long-form videos as an environment, and allowing LLM to interact with videos and decide where to look iteratively, we achieve SoTA zero-shot performance and show potential on processing extremely long videos!

1

36

12

20

24K

JenWeng4 retweeted

Judy Shen @judyhshen

over 2 years ago

Are you hiring top AI talent? Here is a list of Ph.D. students affiliated with @StanfordAILab who are on the industry and academic job markets this year! This list showcases diverse research areas and 41% of these graduates are URMs! Check it out: https://t.co/WiTN8FKHhO

4

216

50

102

55K

Zhenzhen Weng @JenWeng4

over 2 years ago

@_akhaliq Arxiv: https://t.co/n3wPQY3ikU Project page: https://t.co/AK6rRdtlBW

1

2

1

0

321

Zhenzhen Weng @JenWeng4

over 2 years ago

Check out our recent work on generalizable human NeRF prediction! Arxiv: https://t.co/n3wPQY3ikU Project page: https://t.co/AK6rRdtlBW

AK

@_akhaliq

over 2 years ago

Single-View 3D Human Digitalization with Large Reconstruction Models. Paper page: https://t.co/JRrI8By7U5 We introduce Human-LRM, a single-stage feed-forward Large Reconstruction Model designed to predict human Neural Radiance Fields (NeRF) from a single image. Our approach demonstrates remarkable adaptability in training using extensive datasets containing 3D scans and multi-view capture. Furthermore, to enhance the model's applicability for in-the-wild scenarios especially with occlusions, we propose a novel strategy that distills multi-view reconstruction into single-view via a conditional triplane diffusion model. This generative extension addresses the inherent variations in human body shapes when observed from a single view, and makes it possible to reconstruct the full-body human from an occluded image. Through extensive experiments, we show that Human-LRM surpasses previous methods by a significant margin on several benchmarks.

2

111

18

46

36K

2

36

7

5

13K

JenWeng4 retweeted

Serena Yeung-Levy

@yeung_levy

over 2 years ago

What are differences between image datasets? (e.g. ImageNet & ImageNetv2) Errors by one model vs. another? (e.g. CLIP & ResNet) Correct vs. incorrect predictions? VisDiff can answer by describing differences in image sets w/ language. Work led by @Zhang_Yu_hui and @lisabdunlap!

0

10

2

1

4K

JenWeng4 retweeted

Jimei Yang @jimei_yang

over 2 years ago

Sneak peek in video GenAI projects we’re working on @AdobeResearch : 1/3 compositing videos in 3D with NeRF

0

24

4

2

8K

JenWeng4 retweeted

CG Channel @theCGchannel

over 2 years ago

Check out Adobe's Project Scene Change. The interesting experimental #AI tech automatically composites an actor from one shot into the environment from another without the need for #rotoscoping or #cameratracking. https://t.co/gP3iK2mTXR #compositing #VFX #motiongraphics

0

31

8

6

6K

Zhenzhen Weng @JenWeng4

almost 3 years ago

Pls stop at #CVPR2023 poster *Tue AM 110* to learn about GC-KPL: a novel method for learning 3D human keypoints from point clouds w/o human labels. Project: https://t.co/XYgT2MtAEW Joint work w/ awesome folks @gorban, Jingwei Ji, @MahyarNajibi, Yin Zhou, Dragomir Anguelov, @Waymo

Alexander Gorban @gorban

about 3 years ago

Check out our #CVPR2023 paper on 3D Human Keypoints Estimation From Point Clouds in the Wild Without Human Labels https://t.co/KspaOpK9Fd Huge shout out to @JenWeng4 who interned in our team last summer and did all the work!

0

4

1

2

2K

0

8

5

0

2K

JenWeng4 retweeted

Jackson Wang

@kcjacksonwang

almost 3 years ago

Have videos of your tennis practice and wish you could put your own motion in 3D? 🎾 👟 🏋🏻 #CVPR2023 We present NeMo, a 3D motion recovery method that is more accurate by leveraging information shared across multiple instances/repetitions! 👇🏻Resources in 🧵

2

60

19

12

9K

JenWeng4 retweeted

Alexander Gorban @gorban

about 3 years ago

Check out our #CVPR2023 paper on 3D Human Keypoints Estimation From Point Clouds in the Wild Without Human Labels https://t.co/KspaOpK9Fd Huge shout out to @JenWeng4 who interned in our team last summer and did all the work!

0

4

1

2

2K

JenWeng4 retweeted

Nick Greenawalt @motionbynick

about 3 years ago

SO much potential for this:

50

2K

221

269

189K

JenWeng4 retweeted

Yuhui Zhang

@Zhang_Yu_hui

over 3 years ago

(1/8) Can you diagnose and rectify a #vision model using #language? Check our work in #ICLR2023! Our analysis reveals when and how text embeddings can be used as a proxy for image embeddings to debug vision models. Paper: https://t.co/8sGkwOQhhz Code: https://t.co/VFzu1AmvUs

1

104

16

43

26K
