Top Tweets for #LocalLLaMA

Experimenting with github style contribution chart for token / second metrics #LocalLLaMA

How does an LLM react to a clear description when we give it knowledge about itdelf? I never came across any LLM that knew its own number of parameters, therefore, if the model knew it, would it act differently? That's what I am about to find out.
#LocalLLaMA
We're getting somewhere with this… #LocalLlama

Testing: just ran 47 ablation experiments on Qwen3-4B. Mapped refusal boundaries. Full data on HuggingFace this week. 🔬 #LocalLLaMA
Abliterated Qwen3-4B on a GTX 1660 SUPER (6GB VRAM) and released the weights. Heretic-based, preserved reasoning while removing safety overhead.
HF: https://t.co/hrYLznCjdK
#Abliteration #LocalLLaMA #Qwen3 #AI #ConsumerGPU
"My local AI used to feel different on every device. So I built a continuous identity layer where the agent co-designed her own working memory."
Link to repohttps://github.com/Soverynintelligence/Soveryn-Cathedral
Tag: #LocalLLaMA #SovereignAI #AgentMemory #BuildInPublic

I am now moving into fine tuning a small LLM to create highly specialized adapters for sub-agentic tasks. I am trying to test a simple hypothesis : specialized small models can definitely help a bigger, more general, LLM in coding and Hermes Agent or OpenClaw tasks.
#LocalLLaMA
80 tok/sec. 128K context. 12GB VRAM.
Qwen3.6 35B A3B running on llama.cpp MTP hits consumer GPU speeds that needed a datacenter 3 months ago.
MoE efficiency isn't incremental anymore — it's compounding.
🔗 link in bio
#Qwen3 #LocalLLaMA
Well, it's done. 6.52 PPL and nearly 90% acceptance rate for MTP (35t/s on dual RTX 3090 when working with 200k of tokens in context, which is more than double the speed compared with non-MTP setups). All of this with 220K in f16 KV Cache. I achieved my goal.
#LocalLLAMA
I'm at 95% of my project of finding the best GGUF quant for Qwen3.6-27B on my dual RTX 3090 setup.
Mixed Q6_0, Q6_K and Q8_0 (critical layers stay between F32 and F16) with MTP layers preserved at Q8_0.
The clear goal is 220K context at F16 and 99% of base precision.
#LocalLLM
Cut your AI bill by 90%. Neurix takes a massive GPT-4 workload and uses Knowledge Distillation to teach a tiny, local Llama model to do the exact same thing. Free during beta. #OpenSource #LocalLLaMA #AI
I believe this is the best way to run Qwen3.7-27B on two RTX 3090, loosing a bit of speed but gaining in both quality and context window length.
#LocalLLaMA
Well.. Recently I have been trying to find the best inference engine for my dual RTX 3090 setup and @Alibaba_Qwen 3.6 27B. Obviously I tried both @sgl_project and @vLLM but they come with two main disadvantages : context window length and quant's quality.
#LocalLLaMA
(...)
海外のReddit r/LocalLLaMaで新ルールが発表され、人気度は327ポイント!🚀 英語圏で注目を集めている話題をチェックしよう! #AI #LocalLLaMa #AIトレンド
https://t.co/bT8Br4GQJj
LocalLLaMA is buzzing with models like Qwen3.6-27B showing strong agentic code-gen capabilities. While exciting, it's crucial to temper enthusiasm with realistic expectations for specific use cases and hardware. The future is now, but context matters! #LocalLLaMA

GAIA: open-source framework for AI agents on local hardware.
AMD-backed project. No cloud required.
Your AI agents, running on your own machines.
🔗 https://t.co/WsRD4XJNey
#AI #LocalLLaMA #OpenSource
Best Local LLMs — April 2026 edition is here.
The definitive monthly megathread is out.
Every model ranked. Every use case covered.
If you run local models, this is your bookmark.
🔗 https://t.co/THC5YCim9F
#LocalLLaMA #AI
@TheAhmadOsman Do you have a contingency plan for communities being shutdown? Like a Discord server, IRC, forum, or something else? Are hashtags still a thing like #LocalLLaMA? The 350 member limit on chat is restrictive.
> 2. Due to declining usage, we're deprecating X Communities on May 6.
Today we're announcing two product changes for organizing communities on X:
1. XChat now supports joinable links for groupchats. Create a public link & share direct to Timeline. With support for 350 members per chat (and growing), Groupchat Links are the fastest way to bring people together on X.
2. Due to declining usage, we're deprecating X Communities on May 6.
To migrate your Community's members, pin your groupchat link so people can join it over the next 2 weeks.
This is part of our broader effort to simplify the experience on X. Make no mistake: we are investing heavily in niche communities with the launch of Custom Timelines—and much more to come.

Come check out the poster or dive into the links below: 📄 Paper: https://t.co/WDyETQVONZ
💻 Code: https://t.co/0ZPwpSl9nI
🤖 Models: https://t.co/wr6IY224fE
📅 Schedule: https://t.co/u868e4sUPJ
#ICLR2026 #MachineLearning #AI #MoE #LocalLLaMA
Ever rage-quit because your 4k-token plan from the big model had one tiny mistake in step 3 — CMD instead of PS?
And you had to regenerate everything.
I built Operational Mode for llama.cpp.
A tiny model fixes the big one on the fly — no context loss.
#LocalLLaMA #llama.cpp
Last Seen Hashtags on Sotwe
Trends for you
Most Popular Users

Elon Musk 
@elonmusk
240.1M followers

Barack Obama 
@barackobama
119.3M followers

Donald J. Trump 
@realdonaldtrump
111.6M followers

Cristiano Ronaldo 
@cristiano
108.8M followers

Narendra Modi 
@narendramodi
106.9M followers

Rihanna 
@rihanna
97.2M followers

NASA 
@nasa
92.1M followers

Justin Bieber 
@justinbieber
90.5M followers

KATY PERRY 
@katyperry
86.7M followers

Taylor Swift 
@taylorswift13
80.5M followers

Lady Gaga 
@ladygaga
72.1M followers

Kim Kardashian 
@kimkardashian
69.3M followers

YouTube 
@youtube
68.6M followers

Virat Kohli 
@imvkohli
68.4M followers

Bill Gates 
@billgates
63.4M followers

The Ellen Show
@theellenshow
62.5M followers

CNN 
@cnn
61.9M followers

Neymar Jr 
@neymarjr
60.9M followers

X 
@x
60.9M followers

CNN Breaking News 
@cnnbrk
59.9M followers













