Top Tweets for #llama_cpp
Just got llama.cpp running on IQ9 — fully on-device, no cloud!
Running Qwen3-0.6B Q4_0 (442MB) via the Hexagon HTP NPU backend:
- Prefill: 430 tok/s (pp512)
- Generation: 32 tok/s (tg128)
#llama_cpp #Qualcomm #EdgeAI #OnDeviceAI #Snapdragon

llama.cppに衝撃の噂。シェル内蔵で常識崩壊?超自律AI「Mythos」の戦慄と、無音でAIを操る「AudioHijack」の脅威。エンジニアが震える技術の最前線、激変の裏側を暴く。詳細は動画で! #GeekTerminal #AI #llama_cpp
VRAM12GBで35Bモデルが110tok/s!?ik_llama.cppが物理限界を突破。ミドルGPUで巨大AIが爆速駆動する理由は?夕方の深掘り動画で全貌を解明。絶対に見逃すな! #GeekTerminal #AI #llama_cpp #Qwen #Shorts
本日、ローカルAI界に革命!llama.cppにMTPが正式統合されRTX 3090 Tiで150 tok/sを記録。投機的デコードはもう古い?爆速化の秘密と衝撃の検証結果は、本日夕方公開の深掘り動画で!
#GeekTerminal #AI #llama_cpp #MTP #Qwen
ローカルLLM動かすならllama.cppを直接触れるようになっておくのが結局いちばん潰しが効く。Ollama・LM Studio・KoboldCppは全部これがエンジン。最新b9085対応の決定版ガイド書いた。
#llama_cpp #ローカルLLM
https://t.co/YwrmCLLQjx
Run a local LLM REST API on CPU in under 30 mins with llama.cpp — no GPU required. Step-by-step guide covering build, quantized model setup, and an OpenAI-compatible endpoint. https://t.co/37DJHPoEUR #LLM #MLOps #llama_cpp
Intel ArcのVulkanドライバにDescriptor Heaps実験的サポートが入った。A770でllama.cppの70Bモデル推論が12.5→14.2 t/sに改善。VRAM 16GBを活かし切る道が見えてきた
#IntelArc #ローカルLLM #llama_cpp
https://t.co/aSyopgFzMT
llama.cpp b9028、デバイスバッファの動的割り当てでVRAM約8%削減。RTX 4090の24GB環境でQwen2.5-72B Q4_K_Mテストしたら1.9GBの余裕が生まれた。CUDA/Metal/Vulkan/ROCm全対応なのが地味にデカい
#llama_cpp #ローカルLLM #VRAM節約
https://t.co/uO7GFc4QXE
vlang + AI: https://t.co/jnUGTtdoSi
v_llama_cpp is the V binding for llama.cpp, and with V's simple veb library, you can quickly spin up lightweight AI services.
#vlang #AI #llama_cpp

Just hooked up my Hermes Agent with Qwen3.5-9B running on RTX 3060 using llama.cpp. Both services dockerised and deployed via @nunet_global Appliance
#llama_cpp #AIAgents
Is MCP too slow for AI agents? I drafted an RFC for Liquid Context Protocol (LCP) — replacing JSON-RPC with zero-latency in-memory WASM tool execution. Looking for C++/Rust devs.
Code: https://t.co/7QEGAWNJii
#LocalLLaMA #WebAssembly #AI #llama_cpp @ggerganov
Just published! 📝 ~21 tok/s Gemma 4 on a Ryzen mini PC.
If you run AMD on Linux and want max iGPU performance for local AI, this guide is for you. 👇 https://t.co/YKuliTPydt
#AI #llama_cpp #Vulkan


5⃣/8 Want more control? Use llama.cpp directly:
🔸 git clone https://t.co/HJcx2knYJn
🔸 make
🔸 Download a .gguf model (HuggingFace)
🔸 ./llama-cli -m model.gguf -p 'Hello!'
GGUF is the quantized format. Look for Q4_K_M = best balance. #llama_cpp #OpenSource
Googleの「TurboQuant」すごすぎる
雑魚環境(雑魚CPU/8GBのGPU)でも
Llama-3-8B×64 KContextで6.6GB
前は1.7GBくらいVRAMから溢れて、共有に逃げて速度激落ちしてたのに…
64K分のKV積んで余裕持って動くのは感動
ローカルの壁が壊れはじめた気がする。
#VibeCoding #LLM #TurboQuant #llama_cpp

llama.cpp just hit 100K GitHub stars. The creator predicts: within 3–6 months, 90% of all AI agents will run locally via llama.cpp. Zero API costs, zero data leaks. Cloud AI is about to feel the pressure. #LocalAI #llama_cpp
`llama.cpp`'s q8 KV cache reduces AIME25 performance, but a rotation technique largely recovers it. This paves the way for advanced methods like TurboQuant & uses `llama-eval` for testing.
#llama_cpp #AI #performance

llama.cpp直叩き、確かにオーバーヘッドないから最速ですよね。Ollamaは便利だけど、リソース制限厳しい環境だとllama.cpp一択な気持ち、すごい分かります。細かい量子化制御したい派ですか? #LocalLLM #llama_cpp
https://t.co/onYs0Roqbr
llama.cpp is the core engine—Ollama adds abstraction layers that cut speed 20-30%+, bloat memory, and hide fine controls (quantization, offloading, Vulkan for AMD). Direct llama.cpp = max tokens/sec, smaller footprint, instant latest features.
Hermes Agent (Nous Research) is open-source with a built-in learning loop: it auto-creates/improves skills from your tasks, remembers across sessions, runs anywhere (local/VPS) without vendor lock-in or OpenClaw's reported privacy risks. Better for real agent work.
Last Seen Hashtags on Sotwe
gingerboy
Seen from Netherlands
bottomlessskirt
Seen from Russia
ไซด์ไลน์โคราช
Seen from Thailand
FlutterThrive
Seen from Venezuela
CooperPlumbingCompany
Seen from Brazil
BLM
Seen from Canada
CineEnHuelva
Seen from Ukraine
watercolorpainting
Seen from Argentina
balochisex
Seen from Pakistan
รับงานเครือสหพัฒน์
Seen from Thailand
Most Popular Users

Elon Musk 
@elonmusk
240.1M followers

Barack Obama 
@barackobama
119.3M followers

Donald J. Trump 
@realdonaldtrump
111.6M followers

Cristiano Ronaldo 
@cristiano
108.8M followers

Narendra Modi 
@narendramodi
106.9M followers

Rihanna 
@rihanna
97.2M followers

NASA 
@nasa
92.1M followers

Justin Bieber 
@justinbieber
90.5M followers

KATY PERRY 
@katyperry
86.7M followers

Taylor Swift 
@taylorswift13
80.5M followers

Lady Gaga 
@ladygaga
72.1M followers

Kim Kardashian 
@kimkardashian
69.3M followers

YouTube 
@youtube
68.6M followers

Virat Kohli 
@imvkohli
68.4M followers

Bill Gates 
@billgates
63.4M followers

The Ellen Show
@theellenshow
62.5M followers

CNN 
@cnn
61.9M followers

Neymar Jr 
@neymarjr
60.9M followers

X 
@x
60.9M followers

CNN Breaking News 
@cnnbrk
59.9M followers






















