john yue

@Luck2john

Hello, I'm new here and I'm from China. If you like science, food, video games, travel, and so on, let me know. We can be good friends.

Shanghai, People's Republic of China

Joined February 2017

184 Following

20 Followers

4K Posts

Luck2john retweeted

Wildminder

@wildmindai

19 days ago

More cool stuff from NVIDIA. LocateAnything - high-speed visual search engine. You provide a text prompt and it instantly pinpoints that object's exact location in an image. - 10x speedup for dense object detection - Qwen2.5-3B + Moon-ViT - Fast/Slow/Hybrid modes - trained on 138M samples for UI, docs, generic grounding. https://t.co/bEvD6pRKaR

152

51K

Luck2john retweeted

Wildminder

@wildmindai

20 days ago

SUPIR upscaler is outdated. ASASR- turns blurry, low-quality photos into sharp, high-res images. Prevents the fake hallucinated details. - improves OCR - high segmentation accuracy - based on FLUX.1 dev this looks sweet https://t.co/kx4A4N1wDc

542

622

26K

john yue @Luck2john

3 months ago

@ljsabc Looking forward to next week

Luck2john retweeted

向阳乔木

@vista8

3 months ago

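Alibaba's open-source OCR parsing model Logics-Parsing V2 looks very capable. It can even recognize sheet music accurately, as well as flowcharts, mind maps, code, and pseudocode. github: https://t.co/PmQ9cH8fvw demo: https://t.co/TBwwRX5ZAO Model: https://t.co/bcumWO1PyB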

410

550

31K

Luck2john retweeted

3 months ago

Last year, I got to collaborate on a number of serious projects at the intersection of Diffusers x optimization ⚡️ First, NONE of them were bootstrapped with any AI agents but pure domain knowledge and expertise. So, besides just feeling good, it's also very reassuring to me to know how important those two traits are. Now, coming to the projects that I think are worth mentioning: * `flux-fast`: Showing a combination of `torch.compile` + unscaled FP8 FA3 + no CPU-GPU sync + dynamic FP8 is great for accelerating Flux.1-*. https://t.co/Fagw9bkFPA * `torch.compile` x Diffusers: What does it take to get the most out of `torch.compile` in Diffusers across different user workloads? https://t.co/J8bPgBFK1y * `lora-fast`: How to hotswap LoRAs into compiled models without incurring (slow) recompilation issues? How to set it up for success? https://t.co/FhY8ATz4c0 * `zerogpu-brrr`: How to optimize a ZeroGPU HF Space with AOT + FA3 and other goodies? This helps save 💰 and improve the user experience of your ZeroGPU applications. https://t.co/DdPsS6O5Ky Hopefully, this will make you realize there's still a LOT that you can do (preferably pairing with AI) if you're curious and deeply invested in stuff you care about.

14K

Luck2john retweeted

Gorden Sun

@Gorden_Sun

3 months ago

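daVinci-MagiHuman: an open-source audio-video model released by China's GAIR lab, with 15B parameters. It generates video with audio (the only comparable open-source model currently available is LTX 2.3) and supports Chinese, English, Japanese, and other languages. The results are decent, though the motion in the generated videos seems limited. Try it online: https://t.co/xsYdDP2QvA Model: https://t.co/tYK1VRWxaV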

126

110

11K

Luck2john retweeted

Victor M

@victormustar

3 months ago

Go try it: https://t.co/cbV5BGrqTu

127

10K

Luck2john retweeted

angrizan @chainsmoker89

3 months ago

Track 1 : AI for Production Letter Lora : Qwen Image Edit 2511 A custom trained LoRA that reverse-engineers typography from real-world photographs. Lora Link : https://t.co/DvB46WCO4F @Ali_TongyiLab @Alibaba_Qwen @ModelScope2022 #HappyQwensDay #QwenImageLora

12K

Luck2john retweeted

Wildminder

@wildmindai

3 months ago

EffectMaker by Tencent. Wan2.2 + Qwen3-VL does zero-shot effect transfer from reference videos to images without per-effect LoRAs. > clones complex video effects to static images > outperforms VFX-Creator and Omni-Effect. Professional-grade cinematic effects for film post-production, game design, etc. Just point at a reference and it clones the physics. https://t.co/XeOAoA6ekP

Luck2john retweeted

Tengfei Wang @DylanTFWang

3 months ago

Autoregressive diffusion models drift for long videos? 📉 We fixed it.🚀 Speed + Stability = ✅ Meet *Test-Time Correction (TTC)*. We stop error accumulation in its tracks without any retraining. ✅ Training-free ✅ 1 minute+ stable generation ✅ Negligible overhead

230

165

17K

Luck2john retweeted

Huaxiu Yao

@HuaxiuYaoML

3 months ago

Meet MetaClaw 🦞— Just talk to your agent, it learns and evolves. 💬 Conversations become training trajectories ⚡ Models update live with hot-swapped weights 🧠 Failures generate new reusable skills 💻 No GPU cluster required Under the hood: 🔄 Online SkillRL training Training runs asynchronously while the model continues serving. 🐚 Skill evolution When the agent fails, an LLM analyzes the trajectory and generates new reusable skills. 📌 Skill injection Relevant skills are retrieved and injected into the system prompt at each step to guide behavior in real time. Built on Kimi-2.5 via Tinker cloud LoRA → fine-tuning costs about $10 and requires no GPU cluster 💻 Fully open-sourced https://t.co/RR24ZJvZ7T Built with @openclaw and @thinkymachines Kudos to the team @richardxp888, Jianwen Chen, @Xinyu2ML, @lillianwei423, @StephenQS0710, Zeyu Zheng, @cihangxie!

296

302

28K

Luck2john retweeted

@_akhaliq

3 months ago

MatAnyone 2 is out on Hugging Face Scaling Video Matting via a Learned Quality Evaluator paper: https://t.co/KPMaG8teJ2 app: https://t.co/wkMpaOdoCh

359

316

35K

Luck2john retweeted

PaddlePaddle

@PaddlePaddle

3 months ago

🔥 PaddleOCR-VL is now available in the llama.cpp ecosystem This brings document parsing VLMs closer to local and lightweight deployment workflows — making it easier for developers to explore portable, community-friendly multimodal document AI. Why it matters 🔹 Easier access to PaddleOCR-VL in GGUF-based workflows 🔹 More flexible paths for local inference and lightweight deployment 🔹 A simpler way to experiment with structured document understanding beyond traditional OCR stacks ⚠️ Important note for developers If llama-cli works but llama-server throws errors, try explicitly passing --chat-template-file when launching llama-server. Chat template / GGUF resources 👉 PaddleOCR-VL-1.5-GGUF: https://t.co/F8qxYL0b3X 👉 PaddleOCR-VL-GGUF: https://t.co/BLorWbajVl A meaningful step toward making document intelligence more open, more portable, and more developer-ready. #PaddleOCR #llamacpp #GGUF #DocumentAI #MultimodalAI #OCR #OpenSourceAI

Luck2john retweeted

Machine Delusions

@Machinedelusion

4 months ago

⏰Released the PixelGen Repo for @ComfyUI This is a pixel diffusion model. No Vae! All diffused in pixel space! LoRa training nodes included as well. Link Below👇 https://t.co/FpB9B5M8hW

129

112

Luck2john retweeted

Michael Yuan

@juntao

4 months ago · Shady Hollow

Rust implementation for Speech-to-Text based on open-source Qwen3 models * Self-contained binary build — no external dependencies * Uses libtorch on Linux with optional Nvidia GPU support * Uses MLX on MacOS with Apple GPU/NPU support 🔨 CLI for AI agents and humans: https://t.co/knsZlastgQ 🖥️ OpenAI compatible API server: https://t.co/qjDqCf9hor 🤖 OpenClaw skill: https://t.co/tE6lzTjYpy Why and how https://t.co/VxRt9oSZ8a

542

528

32K

Luck2john retweeted

Ling Yang

@LingYang_PU

4 months ago

What if your AI agent got better just by talking to you? Introducing OpenClaw-RL — a fully async RL framework that turns your everyday conversations into training signals. Your agent learns your habits, your workflows, your preferences. Privately. Continuously. #Clawdbot #openclaw 🔑 Two learning modes: • Binary RL — likes/dislikes become rewards • On-Policy Distillation — your textual feedback becomes token-level guidance Self-hosted. Zero API keys. Your data never leaves your machine. 👉 https://t.co/ry18qekutm

429

509

30K

Luck2john retweeted

Niels Rogge @NielsRogge

4 months ago

Opus 4.6 made a @Gradio demo for it too! It uses a "chunked window" approach, allowing it to run up to 160 frames per second (FPS) I'm really impressed by coding agents - porting a model + demo in less than 2 days

126

17K

Luck2john retweeted

Sayak Paul

@RisingSayak

4 months ago

Editing images is a series of state transitions between the source image and the edited image that we want. Yet, the existing paradigm doesn't explicitly include any transitioning priors in the editing process. This becomes particularly prevalent for edits involving causal dynamics (e.g., refraction, deformation). To model this kind of physics-informed information, we leverage the rich priors present in videos and introduce PhysicEdit 🔥 TL;DR: We fine-tune QwenImage Edit on a curated dataset of videos with reasoning traces and fixed-length transition queries to do solid physics-aware image editing! In the process, we introduce a cool dataset "PhysicTran38K", consisting of 38K transition trajectories across five physical domains and devise a method to provide supervision from it to QwenImage Edit. Hop in to learn more ⬇️

361

212

51K

Luck2john retweeted

liz tan

@liztansz

4 months ago

Say you have trained your deep learning model. It works. But do you know what it has actually learned? 🚀 We’ve built SymTorch: a library that translates deep learning models into human-readable equations. I've attached here a quick video demonstrating how SymTorch works.

375

135K

Luck2john retweeted

Sayak Paul

@RisingSayak

4 months ago

If you stuck around till here, thank you! Please check out the work here: https://t.co/tRpiaC9Xed. We're also open-sourcing everything -- checkpoints, code, and the dataset ❤️ Best collaborators ever: @ben_nebulous, @zhuole1025, and others!
