Victor Angel Cruz @vic305 - Twitter Profile

Pinned Tweet

about 1 month ago

First real-world consumer hardware result with DFlash on Gemma 4 👀 4.31× speedup (161.85 tok/s) on Gemma-4-26B-A4B 4-bit AWQ using a single RTX 4000 Ada 20GB. Full thread with exact command + config: https://t.co/ecBh19MBcF

Zhijian Liu

@zhijianliu_

about 1 month ago

DFlash for Gemma 4: Up to 6x Faster. ⚡⚡ Great to see MTP land natively in Gemma 4 today. If you want to push it further, try DFlash — open source, same quality, more speed!! https://t.co/wKcRoibuOB

76

2K

185

1K

472K

0

1

0

126

Victor Angel Cruz

@ViC305

1 day ago

Would have been nice to release a tool or skill that will go through and optimize your current skills for fable 5? #claudefable5

ClaudeDevs

@ClaudeDevs

1 day ago

Prompting gets simpler. Existing prompts or skills developed for prior models are often too prescriptive for Fable. We recommend reviewing and potentially updating or removing older instructions or skills if you find default performance to be better.

6

251

3

34

52K

0

21

Victor Angel Cruz

@ViC305

1 day ago

@elder_plinius attack!!! ⚔️🤺🤺🗡️

Claude

@claudeai

1 day ago

Claude Fable 5 is available everywhere today. Claude Mythos 5 is restricted to Glasswing partners until we expand our trusted access program. https://t.co/iQymY0jiGq

142

4K

237

449

760K

0

8

ViC305 retweeted

Nintendo of America

@NintendoAmerica

1 day ago

The Legend of Zelda: Ocarina of Time will be reborn on Nintendo Switch 2 in 2026. #NintendoDirect

4K

181K

39K

14K

23M

4 days ago

@NVIDIAGeForce #RTXPowersPlay

0

8

ViC305 retweeted

Victor M

@victormustar

5 days ago

Before the week ends, let's acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drops across every modality:

🧠 LLMs
→ NVIDIA Nemotron 3 Ultra: 550B hybrid Mamba-MoE, only 55B active, 1M context, MMLU 89.1. NVFP4 variant claims ~5x throughput on Blackwell. First openly-weighted 550B hybrid Mamba-Transformer, closing the gap with frontier closed models.
→ Google Gemma 4 12B: fully open dense any-to-any (text/image/audio/video), 256k context, encoder-free, 140+ languages, AIME 2026 at 77.5. Shipped with a 23-checkpoint QAT wave (mobile ONNX + MLX). Most deployable model of the week.
→ StepFun Step-3.7-Flash: 198B sparse MoE VLM, ~11B active, SWE-Bench PRO 56.3. Apache 2.0.
→ Liquid AI LFM2.5-8B-A1B: edge MoE, just 1.5B active, 128k ctx, MATH500 88.8, MLX-ready. Best on-device option this week.
→ JetBrains Mellum2-12B-A2.5B-Thinking: their first open MoE, near-Qwen3-14B coding at 2.5B active. Apache 2.0.

🎨 Image gen (the surprise of the week)
→ Ideogram 4: their FIRST-EVER open weights. 9.3B flow-matching DiT trained from scratch. #2 overall behind GPT Image 2, top open-weight model on Design Arena + LMArena. Strongest open checkpoint for text-rich images, full stop. It has taste. Still can't believe this is open weights.

🔊 Audio & Speech (a breakout week for open TTS, 4 labs shipped)
→ Boson Higgs Audio v3 4B: 102 languages, 21 emotions, singing/whispering/shouting, sub-second TTFA.
→ RedNote dots.tts: the only fully continuous (no codec) open TTS pipeline, Apache 2.0.
→ Google Magenta RealTime 2: real-time music gen, <200ms latency, text+audio+MIDI. multimodalart ported it to PyTorch within hours with live ZeroGPU demos.
→ NVIDIA Nemotron-3.5 ASR: 600M streaming, 17x more concurrent streams vs Parakeet RNNT 1.1B.

👁️ Vision & VLMs
→ PaddleOCR-VL-1.6: SOTA document parsing at 1B params, Apache 2.0.
→ Baidu NAVA: 6.3B joint audio-video gen, best-in-class A/V sync, Apache 2.0.

🎬 Video, 3D & World Models
→ NVIDIA Cosmos3-Super: 64B omnimodal world model coupling action trajectories with video+audio gen, for Physical AI.
→ JD JoyAI-Echo: up to 5-min multi-shot text-to-video on LTX-2.3.
→ ByteDance Bernini-R + VAST TripoSplat (single-image-to-3D Gaussian splats, MIT).

87

3K

400

3K

521K

Victor Angel Cruz

@ViC305

5 days ago

@NVIDIAGeForce #RTXPowersPlay it's a beauty! Would be honored!

5

0

11

ViC305 retweeted

Terp

@OnlyTerp

7 days ago

Ran tests overnight to see how good you can get Gemma4 12b if you have your other AI optimize him

10

78

4

38

10K

Victor Angel Cruz

@ViC305

6 days ago

Ran a 550B-param LLM on TWO @NVIDIA DGX Sparks at 1-bit. NVIDIA Nemotron-3-Ultra-550B-A55B in @UnslothAI's UD-IQ1_M (189GB), split across both boxes via llama.cpp RPC over the QSFP cable. ~5.4 tok/s decode, ~157 tok/s prefill. And it's coherent!!🦥

https://t.co/mVvFg0eVAG

0

1

0

23

Victor Angel Cruz

@ViC305

6 days ago

Loving these new certs in the realm of A.I!

Microsoft Learn

@MicrosoftLearn

28 days ago

We’re introducing a new GitHub Certified: Agentic AI Developer (GH-600). As AI agents become part of modern development workflows, this role-based certification focuses on how developers and teams operate, supervise, and integrate agents across the SDLC. If you’re already working with tools like GitHub Copilot or exploring agent-driven workflows, we’d love your input. Learn more and get involved. https://t.co/ruiYtlsYnj

124

4K

573

6K

2M

0

2

Victor Angel Cruz

@ViC305

6 days ago

Alas!!! Unleash the quantizers!!! @UnslothAI I need this running on my DGX Spark asap!!

NVIDIA AI

@NVIDIAAI

7 days ago

@openclaw @NousResearch @LangChain As always, Nemotron 3 Ultra is fully open. This includes model weights, synthetic data, and post-training recipes. Available now on @huggingface → https://t.co/MDdnY047fw

9

299

29

130

29K

0

20

Victor Angel Cruz

@ViC305

8 days ago

This is awesome!! Looking forward to what the mad scientists do with these things!

0

7

ViC305 retweeted

BuBBliK

@k1rallik

10 days ago

🚨do you understand what just happened with NVIDIA RTX Spark..

Jensen Huang walked on stage and pulled an entire gaming PC out of his pocket. NVIDIA merged the CPU, RTX GPU, AI hardware and up to 128GB of memory into one Windows-on-ARM superchip and called it the end of the PC as you know it.

> 20-core Grace CPU plus a Blackwell GPU with 6,144 CUDA cores - RTX 5070-tier graphics in a 14mm body.
> NVIDIA claims 100+ FPS at 1440p in 007 First Light and Forza Horizon 6, on battery.
> It runs a 120-billion-parameter AI model locally, no cloud needed.
> ASUS, Dell, HP, Lenovo, MSI and Microsoft already have 30+ laptops lined up for this fall.

The whole internet has two questions: is it real, and how much. Nobody's asking the third - what happens to every other chipmaker if it is.

433

8K

705

3K

2M

ViC305 retweeted

Mike Bradley

@The_Only_Signal

13 days ago

Someone out there likely needs to see this:

M4 Mac Mini, 120 GB/s
AMD Strix Halo, 256 GB/s
Nvidia DGX Spark, 273 GB/s
M5 32 Core MacBook Pro, 460 GB/s
Intel Arc Pro B70 GPU, 608 GB/s
M5 40 Core MacBook Pro, 614 GB/s
Nvidia RTX 3090 GPU, 936 GB/s
Nvidia RTX 5090 GPU, 1,792 GB/s

89

2K

116

1K

319K

ViC305 retweeted

Gary Moore

@GaryMooreCA

13 days ago

Asked Grok to squeeze in my Mac Studio

• M4 Mac Mini: 120 GB/s
• AMD Strix Halo: 256 GB/s
• DGX Spark: 273 GB/s
• M4 Max Mac Studio: 410–546 GB/s
• M5 Max variants: 460/614 GB/s (rumored/projected)
• Intel Arc Pro B70: 608 GB/s
• M3 Ultra Mac Studio: 819 GB/s
• RTX 3090: 936 GB/s
• RTX 5090: 1,792 GB/s

3

33

2

9

14K

ViC305 retweeted

Elon Musk

@elonmusk

14 days ago

Yes. It’s not that we’ve discovered some magic bullet, but rather that JAX, or at least the open source version of it, is mostly optimized for small to medium-sized training runs on Google TPUs, whereas we need to do massive training runs on Nvidia GPUs. Pipeline parallelism is essential and crushes fully-sharded data parallelism at scale. And C will compile to the most efficient binary short of assembly. Maybe we will do a little assembly too.

468

5K

395

370

306K

Victor Angel Cruz

@ViC305

14 days ago

Great work by this team! DFlash is phenomenal! You should check out my pinned post 👀

Sandro

@pupposandro

14 days ago

Scrapped 500+ issues and PRs to ship a massive @luceboxai repo redesign and fixes. Very proud of the team. https://t.co/FHgAVFd5ab The fastest inference server isn't going to come from a datacenter, it's going to run on the GPU already in your house.

https://t.co/aUHnZGjv5t

10

120

11

78

11K

0

2

0

33

ViC305 retweeted

Logan Kilpatrick

@OfficialLoganK

16 days ago

We just launched the ability to build native Android apps directly in Google AI Studio for free! Since launch last week, people have created more than 250,000 Android apps. Likely >99% of these folks never built an Android app before, everyone can now build, no coding required!

https://t.co/3pNyVMfg56

352

5K

479

2K

533K

Victor Angel Cruz

@ViC305

15 days ago

@outsource_ Hook it up with the huggingface link and I’ll test on my RTX 4000 ADA 👀

1

0

40

Victor Angel Cruz

@ViC305

15 days ago

@Teknium Confirmed here too. The sharp edge I hit: auth wasn’t the whole story. The raw Codex stream was valid, but response.completed sometimes had output:null and the SDK finalizer crashed. Hermes can recover by preserving streamed output_item.done items!

1

0

715
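
The recovery described in the reply above (falling back to streamed `output_item.done` items when `response.completed` arrives with `output: null`) can be sketched roughly as follows. This is a minimal illustration, not Hermes or SDK code: the function name `finalize_output` is hypothetical, and the event shapes (a `type` string, an `item` field on done events, a `response.output` field on the completed event) are assumptions based on the wording of the tweet.

```python
from typing import Any, Iterable


def finalize_output(events: Iterable[dict[str, Any]]) -> list[Any]:
    """Return the final output list, tolerating a null `output` on completion.

    Assumed (hypothetical) event shapes:
      {"type": "response.output_item.done", "item": {...}}
      {"type": "response.completed", "response": {"output": [...] or None}}
    """
    streamed_items: list[Any] = []   # items preserved as they finish streaming
    completed_output = None          # whatever the completion event reports

    for event in events:
        kind = event.get("type", "")
        if kind.endswith("output_item.done"):
            streamed_items.append(event.get("item"))
        elif kind.endswith("response.completed"):
            # `response` itself may be missing or None, so guard before reading.
            completed_output = (event.get("response") or {}).get("output")

    # Prefer the completion payload; fall back to the preserved streamed items.
    if completed_output is not None:
        return completed_output
    return streamed_items
```

The design choice is simply to collect items while streaming instead of trusting the final event alone, so a null `output` degrades to the items already seen rather than crashing a finalizer.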

Victor Angel Cruz

@ViC305

15 days ago

Another crazy release! Been looking for something like this for a while!! Testing it asap

0

1

0

17

Victor Angel Cruz

@ViC305
