Fine-tuning models starts with better datasets.
In this short demo, we show how https://t.co/HtPzNiNKqe helps you build fine-tuning datasets in 3 simple steps:
1. Filter interesting events
2. Build your dataset
3. Explore dataset analysis
Better data, better models.
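A rough sketch of what those three steps can look like in plain Python. The event fields (`prompt`, `response`, `latency_ms`, `label`) and the function names are assumptions made up for illustration; this is not the product's actual schema or API.

```python
import json
from collections import Counter

# Hypothetical inference events; every field name here is an assumption.
events = [
    {"prompt": "Turn on the AC", "response": "set_climate(on=True)", "latency_ms": 120, "label": "ok"},
    {"prompt": "Navigate home", "response": "play_music()", "latency_ms": 340, "label": "wrong_action"},
    {"prompt": "Call Sam", "response": "", "latency_ms": 2100, "label": "timeout"},
]

def filter_events(events, max_latency_ms=1000):
    """Step 1: keep 'interesting' events, here failures or slow responses."""
    return [e for e in events if e["label"] != "ok" or e["latency_ms"] > max_latency_ms]

def build_dataset(events):
    """Step 2: turn each event into one JSONL line with prompt/completion keys."""
    return [json.dumps({"prompt": e["prompt"], "completion": e["response"]}) for e in events]

def analyze(events):
    """Step 3: count how often each label occurs in the selected events."""
    return Counter(e["label"] for e in events)

interesting = filter_events(events)
dataset_lines = build_dataset(interesting)
summary = analyze(interesting)
```

In practice the failing responses would need corrected targets before training; the sketch only shows the shape of the pipeline.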
Want to teach Gemma to master chess?
Check out this awesome community project showing how to fine-tune Gemma 4 12B on your own data, 100% locally!
Running text, images, and audio on just 8GB VRAM makes custom models more accessible than ever.
Does your app use AI?
Whether it
- relies on cloud API calls
- or runs fully offline on-device
WildEdge lets you inspect every inference and workflow step.
Check out our short demo showing an app that turns speech into actions for an in-car assistant:
You can now fine-tune Qwen3.5.
You just need 5GB VRAM to train Qwen3.5-2B LoRA locally, 1.5x faster with 50% less VRAM.
Qwen3.5-4B Colab: https://t.co/YYTtHQJ0eJ
https://t.co/2wlamIRtZ6
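For a feel of why LoRA training is light on memory: adapting a weight matrix of shape d_out x d_in with rank r trains only r * (d_out + d_in) parameters instead of d_out * d_in. The sketch below uses made-up dimensions (2048 x 2048, rank 16), which are assumptions for illustration and not the real Qwen3.5-2B configuration, and it does not try to reproduce the 5GB figure.

```python
def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter (matrices B: d_out x r, A: r x d_in)."""
    return rank * (d_out + d_in)

def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the full weight matrix that LoRA leaves frozen."""
    return d_out * d_in

# Hypothetical layer size and rank, chosen only to make the arithmetic easy to follow.
d = 2048
r = 16

lora = lora_params(d, d, r)
full = full_params(d, d)
ratio = lora / full  # fraction of the layer's parameters that are actually trained
```

With these assumed numbers the adapter trains about 1.6% of the layer's parameters, so gradients and optimizer state only have to be kept for that small slice.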
Seeing how SOTA models are evolving (more restrictive in usage, as decided by the company; less transparent, since you cannot tell if the AI lab nerfed your model; and less private, since your prompts are stored with no opt-out) makes me much more interested in open models + local inference
Before the week ends, let's acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drops across every modality:
LLMs
→ NVIDIA Nemotron 3 Ultra: 550B hybrid Mamba-MoE, only 55B active, 1M context, MMLU 89.1. NVFP4 variant claims ~5x throughput on Blackwell. First openly-weighted 550B hybrid Mamba-Transformer, closing the gap with frontier closed models.
→ Google Gemma 4 12B: fully open dense any-to-any (text/image/audio/video), 256k context, encoder-free, 140+ languages, AIME 2026 at 77.5. Shipped with a 23-checkpoint QAT wave (mobile ONNX + MLX). Most deployable model of the week.
→ StepFun Step-3.7-Flash: 198B sparse MoE VLM, ~11B active, SWE-Bench PRO 56.3. Apache 2.0.
→ Liquid AI LFM2.5-8B-A1B: edge MoE, just 1.5B active, 128k ctx, MATH500 88.8, MLX-ready. Best on-device option this week.
→ JetBrains Mellum2-12B-A2.5B-Thinking: their first open MoE, near-Qwen3-14B coding at 2.5B active. Apache 2.0.
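A back-of-envelope sketch for reading the numbers above: the share of parameters active per token in an MoE, and a rough weight-only memory footprint at a given bit width. It assumes memory is simply parameters times bits per weight, ignoring quantization scale factors, KV cache, and activations, so treat the results as lower bounds rather than measured figures.

```python
def active_fraction(active_billions: float, total_billions: float) -> float:
    """Share of total parameters used per token in a sparse MoE."""
    return active_billions / total_billions

def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * bits_per_weight / 8

# Parameter counts are the ones quoted in the list above; bit widths are assumptions.
nemotron_active = active_fraction(55, 550)   # 550B total, 55B active
nemotron_4bit_gb = weight_gb(550, 4)         # 4-bit weights, overhead ignored
gemma_4bit_gb = weight_gb(12, 4)             # 12B dense model at 4 bits
gemma_16bit_gb = weight_gb(12, 16)           # same model at 16 bits
```

Under these assumptions the 550B model activates a tenth of its weights per token, and a 12B dense model drops from roughly 24 GB at 16 bits to roughly 6 GB at 4 bits.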
🎨 Image gen (the surprise of the week)
→ Ideogram 4: their FIRST-EVER open weights. 9.3B flow-matching DiT trained from scratch. #2 overall behind GPT Image 2, top open-weight model on Design Arena + LMArena. Strongest open checkpoint for text-rich images, full stop. It has taste. Still can't believe this is open weights.
🔊 Audio & Speech (a breakout week for open TTS, 4 labs shipped)
→ Boson Higgs Audio v3 4B: 102 languages, 21 emotions, singing/whispering/shouting, sub-second TTFA.
→ RedNote dots.tts: the only fully continuous (no codec) open TTS pipeline, Apache 2.0.
→ Google Magenta RealTime 2: real-time music gen, <200ms latency, text+audio+MIDI. multimodalart ported it to PyTorch within hours with live ZeroGPU demos.
→ NVIDIA Nemotron-3.5 ASR: 600M streaming, 17x more concurrent streams vs Parakeet RNNT 1.1B.
👁️ Vision & VLMs
→ PaddleOCR-VL-1.6: SOTA document parsing at 1B params, Apache 2.0.
→ Baidu NAVA: 6.3B joint audio-video gen, best-in-class A/V sync, Apache 2.0.
🎬 Video, 3D & World Models
→ NVIDIA Cosmos3-Super: 64B omnimodal world model coupling action trajectories with video+audio gen, for Physical AI.
→ JD JoyAI-Echo: up to 5-min multi-shot text-to-video on LTX-2.3.
→ ByteDance Bernini-R + VAST TripoSplat (single-image-to-3D Gaussian splats, MIT).
Unlock local, agentic workflows with Gemma 4 12B and Google AI Edge, directly on your laptop. Experience 100% on-device AI:
• Generate code in AI Edge Gallery (new to Mac)
• Dictate and edit text via AI Edge Eloquent (new to Mac)
• Serve Gemma 4 12B locally with LiteRT-LM
Dive in: https://t.co/gr7tOrZmc0
Most teams start with an LLM API. Then the bill shows up. Or latency breaks UX. Or accuracy problems surface that your evals never caught.
Fine-tuning fixes those, but you need data you haven't been collecting.
WildEdge captures inference events as they run. Filter failures, build a dataset