Ryan Li

28 days ago

ComfyUI wrapper for LiTo. Single RGBA image to 3D Gaussians in ComfyUI https://t.co/RqR8v92DA7

6

660

83

628

48K

RyanLi1992 retweeted

29 days ago

LTX2.3 OmniNFT RL-LoRA generates high-quality video/audio. Visuals and sound are perfectly synchronized, with no laggy or mismatched audio.
- realistic Lip-Sync
- action-matched sound
- reduces synchronization errors by 52%
Really nice output. https://t.co/MNut4JPPtE

1

169

23

175

17K

RyanLi1992 retweeted

⚡AI Search⚡

@aisearchio

about 1 month ago

This is the future of gaming. DreamX-World - new interactive world generator based on Wan https://t.co/XnXjln2zcR

5

54

12

23

4K

RyanLi1992 retweeted

Feiteng

@FeitengLi

about 1 month ago

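These problems can all be solved well now.
1. Drop whisper and switch to an ASR model. Qwen3-ASR is quite good and hallucinates very little, and there are other ASR options too. whisper hallucinates a lot and also requires 30s segments, whereas Qwen3-ASR gets more accurate the longer the audio you feed it, up to a maximum of 20 minutes.
2. For text timestamps, drop whisper as well, since it is not very accurate. Qwen/Qwen3-ForcedAligner-0.6B also works, but in actual testing the timestamps become a mess beyond 180s. You can use @LattifAI_HQ https://t.co/ukwzM7NMeS, which stays accurate even at 4 hours with ease. See https://t.co/zNDkOQZrui, where even the karaoke subtitles are very accurate. There is also a skill https://t.co/bR8Nw1ahze, and speaker diarization and naming are handled well too.
3. For VAD slicing I recommend my own project https://t.co/l5yvfDJUOs, which is top in accuracy.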

43

836

137

1K

143K

RyanLi1992 retweeted

R@aiaicreate

@aiaicreate

about 1 month ago

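"LTX Director", a timeline editor for ComfyUI, has been released. For video generation with LTX2.3, it integrates fine-grained prompt control, audio addition, I2V, and more into a single node. Its intuitive usability is drawing attention, and it is regarded as a tool that greatly simplifies managing multiple frames. #ComfyUI #LTXVideo URL in the reply ⬇️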

2

106

6

117

6K

RyanLi1992 retweeted

about 1 month ago

NVIDIA just dropped another cool world model - SANA-WM. You give it one image and a camera path, and it spits out a full minute of 720p on a single GPU.
> handles a full 60s
> 1-minute clip in just 34 seconds (on a 5090).
> follows precise 6-DoF paths, movement feels intentional and grounded.
> Based on SANA 2.6B World generator with cinematic fly-throughs for any fantasy setting https://t.co/N6KxXv0LL7

6

338

51

284

20K

RyanLi1992 retweeted

about 1 month ago

Wan2.2 again. SwiftI2V: Efficient 2K I2V video gen with 21GB VRAM.
- uses 200x less GPU-time than CineScale
- exact image fidelity
- decoupled processing
No models yet. https://t.co/UmfRrwq3IY

1

226

20

225

16K

RyanLi1992 retweeted

歸藏(guizang.ai)

@op7418

about 2 months ago

https://t.co/ip9kyn2Trk

126

4K

706

7K

2M

about 2 months ago

@0x0funky That familiar hyperframes video

0

1

0

764

about 2 months ago

@oxniubi Will plus be restocked? The one I bought is about to expire...

1

0

176

RyanLi1992 retweeted

Zain Shah

@zan2434

about 2 months ago

Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see. @eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)

1K

29K

4K

25K

6M

3 months ago

@InSpatio_AI Has anyone tried this on a 5090?

0

94

3 months ago

4/ Feels like browser-operator AI + local AI tools is an underrated combo. Curious if anyone else is experimenting with this.

0

18

3 months ago

1/ You might not need Qwen-VL anymore for image→video prompts. I found a free trick using Gemini in Chrome that can auto-caption images and run my I2V workflows. Works with tools like Wan2GP and ComfyUI. And it worked better than I expected 👇

1

0

149

3 months ago

3/ Result: Gemini acts like a free vision model + workflow operator. Great workaround if you want Qwen-VL style captioning but don’t have enough VRAM. Limitation: this is UI automation, so it won’t work for API pipelines — but for local workflows it’s surprisingly powerful.

1

0

46

RyanLi1992 retweeted

4 months ago

OK, video AI is finally moving from aesthetic vibes to world understanding. VBVR:
- a foundational scale-up for video reasoning;
- shifts focus from visual quality to spatiotemporal intelligence, reasoning over motion, interaction, causality.
https://t.co/53C32C2ExU

2

143

26

117

19K

4 months ago

This is what I need