DeepLearning.Hub @dldothub - Twitter Profile

DLdotHub retweeted

about 10 hours ago

2-bit Gemma 4 12B GGUF, only 4.66 GB on disk, managed to cite 15 sites from a single prompt. Try this locally on >6GB RAM via Unsloth Studio. GitHub: https://t.co/aZWYAtakBP

DLdotHub retweeted

Unsloth AI

@UnslothAI

1 day ago

Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs. Google's new model, Gemma 4 12B Unified supports image, audio and 256K context. You can run and train the model via Unsloth Studio. GGUF: https://t.co/8cL321pVDh Guide: https://t.co/odRo9WjRpA

DLdotHub retweeted

Google

@Google

1 day ago

Today we’re introducing Gemma 4 12B — our latest open model that brings advanced agentic reasoning, vision and audio directly to your laptop. It delivers performance nearing our larger Gemma models with a much smaller total memory footprint, while being small enough to run locally with just 16GB of VRAM. It’s open and accessible for everyone to use under a permissive Apache 2.0 license. This is all made possible by our new, unified architecture that removes separate multimodal encoders. Here’s how we did it 🧵

DLdotHub retweeted

merve

@mervenoyann

15 days ago

Cohere dropped Command A+ 🔥

> 25B/219B MoE vision language model
> supports 48 languages with efficient tokenizer
> tool-calling/agentic + 128k context window
> transformers day-0 support 🤗
free license 💗

DLdotHub retweeted

Unsloth AI

@UnslothAI

16 days ago

4-bit Qwen3.6 MTP GGUF managed to search 70+ sites from a single prompt. Try this locally on 20GB RAM via Unsloth Studio. Unsloth now supports auto MTP + speculative decoding & auto-selects the best MTP settings for your device (Mac, CPU, GPU). GitHub: https://t.co/aZWYAtakBP

DLdotHub retweeted

Unsloth AI

@UnslothAI

17 days ago

Qwen3.6 now runs 2x faster with MTP GGUFs! Run locally on just 18GB RAM. ⚡️ MTP enables Qwen3.6 to generate ~1.4–2.2× faster with no accuracy change. Qwen3.6-27B MTP runs at 160 tokens/s. 35B-A3B reaches 240 t/s. GGUFs: https://t.co/7gWhKnseZo Guide: https://t.co/7qzk6ypWDQ

DLdotHub retweeted

Adina Yakup

@AdinaYakup

21 days ago

Ant group just dropped Ring-2.6-1T 🔥 1T reasoning model, built for real world agent workflows.

✨ MIT license
✨ 128K >> 256K context (YaRN)
✨ Async RL + IcePop training architecture
✨ Dual reasoning effort: "high" for fast agent loops, "xhigh" for deep reasoning = Better cost/performance tradeoff 👀

DLdotHub retweeted

Daniel Han

@danielhanchen

22 days ago

We released experimental MTP Qwen3.6 Unsloth GGUFs!

Qwen3.6 27B MTP now runs at 140 tokens/s. Qwen3.6 35B-A3B MTP gets 220 tokens/s generation on a single GPU.

Qwen3.6 27B and 35B-A3B have >1.4x speed-up over the original GGUFs without any change in accuracy.

Guide + GGUFs + Benchmarks: https://t.co/x9BYC3iXCL

In terms of average speedup, we see a 1.4x for dense models at draft tokens = 2 and for the MoE around 1.15 to 1.2x.

We do not recommend more than 2 draft tokens because the acceptance rate drops precipitously from 83% to 50% with 4 draft tokens, and the forward passes for MTP become less beneficial.

Use `--spec-type mtp --spec-draft-n-max 2`

Thanks to Aman for https://t.co/0WKkIC0kyW!
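
Editor's sketch, not part of the tweet: one rough way to see why a lower acceptance rate can cancel out the benefit of drafting more tokens. It assumes the quoted 83% and 50% are per-token acceptance probabilities, that each drafted token is accepted independently, and that every verification pass always yields one extra token from the main model. The function name `expected_tokens_per_step` is hypothetical, and this is not how the figures above were measured.

```python
def expected_tokens_per_step(accept_prob: float, draft_tokens: int) -> float:
    """Expected tokens emitted per verification pass (simplified model).

    Assumption: each drafted token is accepted independently with
    probability accept_prob, and drafting stops at the first rejection.
    The main model always contributes one token per pass, so the
    expectation is the geometric sum of accept_prob**k for k = 0..draft_tokens.
    """
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be between 0 and 1")
    if draft_tokens < 0:
        raise ValueError("draft_tokens must be non-negative")
    return sum(accept_prob ** k for k in range(draft_tokens + 1))


if __name__ == "__main__":
    # Figures taken from the tweet; their interpretation is an assumption.
    for prob, n in [(0.83, 2), (0.50, 4)]:
        tokens = expected_tokens_per_step(prob, n)
        print(f"accept={prob:.2f}, draft tokens={n}: {tokens:.4f} tokens/pass")
```

Under these assumptions, 2 draft tokens at 83% give about 2.52 tokens per pass, while 4 draft tokens at 50% give about 1.94, despite the extra drafting work. That is in line with the recommendation to stay at 2 draft tokens.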

DLdotHub retweeted

Nous Research

@NousResearch

24 days ago

Computer use with any model Hermes Agent × @trycua

DLdotHub retweeted

23 days ago

Trending repository of the day 📈 hermes-agent by nousresearch The agent that grows with you Last 24h: 2,065 ⭐ Total: 145,790 ⭐️ https://t.co/JzUlvlbLbr

DLdotHub retweeted

merve

@mervenoyann

24 days ago

🆕 Hugging Face 🤝 Hermes Agent 🔥

> we added Hermes Agent to local apps: run it locally with any compatible GGUF/MLX model
> shipped native traces support for Hermes Agent: visualize your Hermes traces directly on the Hub

Very soon most agents will run locally and we want to accelerate things as much as we can ⚔️

DLdotHub retweeted

Omar Sanseviero

@osanseviero

about 1 month ago

Excited to introduce Gemma 4 Multi-Token Prediction Drafters ⚡️ Accelerated inference right in your pockets

- Up to a 3x speedup
- Same quality guarantees
- Available in your favorite open-source tools

DLdotHub retweeted

Andrew Carr 🤸

@andrew_n_carr

about 1 month ago

somebody made a huggingface model visualizer!! just plug in the url and explore at any granularity

DeepLearning.Hub @DLdotHub

about 1 month ago

RT @UnslothAI: You can now run open LLMs in Claude Code, Codex and OpenClaw via Unsloth. Use Gemma 4 and Qwen3.6 GGUFs for local agentic c…

DLdotHub retweeted

Asimov

@asimovinc

about 1 month ago

We're open-sourcing Asimov v1, a humanoid robot. With Asimov v1, you can build, train on, and make it your own humanoid robot. It's the first step of building a humanoid labor force for the rest of us. Asimov v1 is 1.2 m tall, 35 kg, with 25 actuated degrees of freedom. Structural parts machined in 7075 aluminium and 3D-printed in MJF PA12 nylon. We're releasing the mechanical design and simulation files. Ready for locomotion policy training out of the box. The BOM is open too. Source everything yourself, or order the DIY Kit. All components, ready to assemble. $499 deposit, $15,000 target price. Ships end of summer 2026. GitHub: https://t.co/kjqkny2oqW Manual: https://t.co/9tjkteOcxO DIY Kit: https://t.co/tzvzNyXQfA Most humanoid robots are controlled by the companies that build them. Asimov v1 is built for the rest of us. Build it, test it, and share your feedback with the community.

DLdotHub retweeted

ollama

@ollama

about 1 month ago

Nemotron 3 Nano Omni is available locally on Ollama! This requires the latest Ollama 0.22 release.

DeepLearning.Hub @DLdotHub

about 1 month ago

RT @UnslothAI: NVIDIA releases Nemotron-3-Nano-Omni, a new 30B open multimodal MoE model. Nemotron-3-Nano-Omni-30B-A3B is the strongest om…

DLdotHub retweeted

Unsloth AI

@UnslothAI

about 1 month ago

DeepSeek releases DeepSeek-V4. 🐋

- DeepSeek-V4-Pro: 1.6T params
- DeepSeek-V4-Flash: 284B params

DeepSeek-V4-Pro rivals Claude-Opus-4.6-Max, GPT-5.4-xHigh and Gemini-3.1-Pro-High.

They support 1M context length, thinking and set new records for Codeforces.

DLdotHub retweeted

Unsloth AI

@UnslothAI

about 1 month ago

2-bit Qwen3.6-27B GGUF made 26 tool calls, triaged 15 GitHub issues and fixed, tested + reproed our repo’s 3 latest issues. 🔥 Try this locally in Unsloth Studio with just 12GB RAM. Studio also has a new look! GitHub: https://t.co/aZWYAtakBP

DLdotHub retweeted

Unsloth AI

@UnslothAI

about 1 month ago

Qwen3.6-27B can now run locally! 💜 Run on 18GB RAM via Unsloth Dynamic GGUFs. Qwen3.6-27B surpasses Qwen3.5-397B-A17B on all major coding benchmarks. GGUFs: https://t.co/ykKgwh2zI9 Guide: https://t.co/ITLNq20WJp
