Ta-Ying Cheng @ChengTim0708 - Twitter Profile

Pinned Tweet

2 months ago

🎞️✨ Have you ever imagined what happens if an object is NOT there? Check out VOID, work led by our amazing intern @sammtmd to see how video diffusion models rewrite scenes after objects are removed!

Sam Motamed

@sammtmd

2 months ago

🎥🪄 What should happen when you remove an object from a video?
Example 1: A domino chain is falling → remove the middle blocks → the last block should remain standing
Example 2: Two cars are about to crash → remove one car → the other should drive away 🚙
Current video object removal models fail at these dynamic scenarios. We introduce VOID: a model that removes objects and updates the scene as if they were never there.
🏆 Preferred 64.8% of the time vs Runway Aleph, Gen-Omnimatte, ProPainter, and more.
🌐 Project page: https://t.co/PBAWjuwUea
💻 GitHub: https://t.co/nYTv4miPSt
🤗 Demo: https://t.co/9DZpYCBUeN
📄 arXiv: https://t.co/UymkQC6Yku
w/ @willarvey @ZhuoningYuan @ChengTim0708 and collaborators at @NetflixResearch and @INSAITinstitute

3

64

8

21

5K

0

10

3

0

1K

ChengTim0708 retweeted

DailyPapers

@HuggingPapers

2 months ago

Netflix just released VOID on Hugging Face
A video inpainting model that removes objects and their physical interactions
Not just shadows, but things like objects falling when a person is removed
https://t.co/jzJvOrDbce

4

53

8

24

4K

ChengTim0708 retweeted

Wildminder

@wildmindai

2 months ago

Netflix dropped some useful stuff. VOID: Video Object and Interaction Deletion.
- removes objects while realistically simulating physical consequences;
- beats Runway/ProPainter;
- CogVideoX-5B + SAM 2;
looks good, no smudges/artifacts
https://t.co/cEOc6pJpaF

54

3K

359

3K

365K

ChengTim0708 retweeted

AK

@_akhaliq

2 months ago

VOID: Video Object and Interaction Deletion
paper: https://t.co/zgAZjL7mfL
model: https://t.co/hOF11E9Ion
app: https://t.co/Wh7TzrKnEb

3

36

8

22

6K

Ta-Ying Cheng @ChengTim0708

2 months ago

@cwolferesearch Thanks for sharing Cameron!

1

0

47

ChengTim0708 retweeted

Sylvain Filoni

@fffiloni

2 months ago

Netflix just dropped their first public model on @huggingface 👀

104

4K

305

2K

430K

ChengTim0708 retweeted

Phillip Isola @phillip_isola

12 months ago

Slides from my talk on "Language as a Visual Format" at the Visual Generative Modeling workshop at CVPR (mostly derived from slides made by @hyojinbahng and @carolinemchan): https://t.co/CpKe1CZ5Ow

1

301

45

225

18K

Ta-Ying Cheng @ChengTim0708

12 months ago

Thanks @_akhaliq ‼️ Be sure to check out our Hugging Face Demo 🤗: https://t.co/kg1PeIDyAt

AK

@_akhaliq

12 months ago

Stability AI just released MARBLE on Hugging Face
Material Recomposition and Blending in CLIP-Space

2

187

36

83

18K

0

4

0

364

Ta-Ying Cheng @ChengTim0708

12 months ago

Check out more about MARBLE 👇
🌐 Project Page: https://t.co/VOHGwUGFVZ
📝 Paper: https://t.co/Qpvqk9WMCa
🧑‍💻 Code: https://t.co/5bStkFYMqu
🤗 HF demo: https://t.co/kg1PeID0KV

0

3

0

1

101

Ta-Ying Cheng @ChengTim0708

12 months ago

Imagine a Van Gogh-style teapot turning into glass with one simple slider 🎨 Introducing MARBLE: material edits made by simply changing the CLIP embedding!
🔗 https://t.co/VOHGwUGFVZ
👏 Internship project with @prafull7, @markb_boss, @jampani_varun at @StabilityAI

1

25

5

7

3K

Ta-Ying Cheng @ChengTim0708

12 months ago

Dial roughness down, crank metallic up, or stack multiple attributes at once, all in a single forward pass!

1

2

0

111

ChengTim0708 retweeted

Chun-Hsiao (Daniel) Yeh

@danielyehhh

about 1 year ago

🚀 Glad to see our All-Angles Bench (https://t.co/2GeMZmS31b) being adopted to evaluate 3D spatial understanding in Seed-1.5-VL-thinking along with OpenAI (o1) and Gemini 2.5 Pro..!

0

23

7

3K

ChengTim0708 retweeted

Yi Ma

@YiMaTweets

about 1 year ago

It seems there is still a long way to go for multi-modal large models to truly understand space and scene.

2

53

11

23

10K

ChengTim0708 retweeted

Chun-Hsiao (Daniel) Yeh

@danielyehhh

about 1 year ago

❗️❗️ Can MLLMs understand scenes from multiple camera viewpoints, like humans?
🧭 We introduce All-Angles Bench: 2,100+ QA pairs on multi-view scenes.
📊 We evaluate 27 top MLLMs, including Gemini-2.0-Flash, Claude-3.7-Sonnet, and GPT-4o.
🌐 Project: https://t.co/yT9aHD3fwm
https://t.co/RggeoYNNOs

2

79

26

54

18K

ChengTim0708 retweeted

The Nobel Prize

@NobelPrize

over 1 year ago

BREAKING NEWS
The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.” https://t.co/94LT8opG79

979

32K

13K

3K

13M

ChengTim0708 retweeted

AI at Meta

@AIatMeta

over 1 year ago

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.
More details and examples of what Movie Gen can do ➡️ https://t.co/M19x2ndwnr
🛠️ Movie Gen models and capabilities
Movie Gen Video: 30B parameter transformer model that can generate high-quality and high-definition images and videos from a single text prompt.
Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.
Precise video editing: Using a generated or existing video and accompanying text instructions as an input it can perform localized edits such as adding, removing or replacing elements, or global changes like background or style changes.
Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video.
We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

527

7K

2K

2M

Ta-Ying Cheng @ChengTim0708

over 1 year ago

Amazing work combining a variety of 3D models with LLMs for better spatial reasoning! @ChenyangMa119

Chenyang Ma

@ChenyangMa119

over 1 year ago

#NeurIPS #NeurIPSConf
Thrilled to share that our paper SpatialPIN has been accepted at #NeurIPS2024!
We introduce a modular plug-and-play framework that progressively enhances VLMs' 3D reasoning by prompting and interacting with 3D foundational models.
(1/8) https://t.co/dXpnKXKV8q

2

3

0

1K

0

2

0

247

