#quantization - Twitter Hashtag

about 14 hours ago

40,000 free AI models for download? Go to Hugging Face and look around. Links and an explainer are in the article. https://t.co/16G80bRDoQ #HuggingFace #LocalLLM #ModelEvaluation #OpenSourceModels #AgenticAI #Quantization #ContextLength #ToolCalling #SelfHostedAI #DeveloperTips #LLM #FromChaosToClarity

MT Group at FBK @fbk_mt

about 15 hours ago

Our pick of the week by @dhairya_su47605: "Scaling Laws for Precision" by @tanishqkumar07, Zachary Ankner, @bfspectorShiekh, @blake__bordelon, @Muennighoff, @mansiege, @CPehlevan, Christopher Ré, @AdtRaghunathan 📰 https://t.co/PZCB3fyOCw #Quantization #LLM #ScalingLaw

Dhairya Suman @dhairya_su47605

about 15 hours ago

Pick of the week @fbk_mt: a super interesting paper on the limitations of quantization, demonstrating how post-training quantization scales poorly with data. https://t.co/tBuGTL0Myi

Abhay kumar @akanyaani

7 days ago

Glad to see our early checkpoint performing strongly on Intel’s independently run Low-Bit Open LLM Leaderboard. Already outperforming some similar-sized quantized Qwen and 8-9x bigger Gemma models even before the final checkpoint. https://t.co/cN1swqYajR #LLM #Quantization

blosSsom @byblossom

8 days ago

#Quantization? Nay, ye divine timing is intricately determinèd by thy fact that thou forgot to clip thine own nails for a yester's fortnight and was forc't thusly to navigate thee, opinings of thine intervention, thy culticular protrusions. Thy fingers now blister in a pageantry.

Pranshul Mishra @Pranshulmish

10 days ago

Built 4 variants of V10 and benchmarked all:
Float32: 80.7 KB, 79.35%
Full INT8: 24.1 KB, 79.55% ← BEST
Gap: +0.20% (quantized BEATS float32)
No QAT needed. No accuracy loss. Deployment ready: 24 KB, int8 I/O.
#EdgeAI #TinyML #Quantization #ESP32

Arnav Sharma @arnavshrrma

11 days ago

Day 18/300. You can't improve what you don't measure. Built evaluation pipeline: faithfulness (9/10), relevancy (8.5/10). Then 4-bit quantization: 2-5s latency → <1s. Measure. Optimize. Repeat. #BuildInPublic #AI #Quantization

파이토치 한국 사용자 모임 @PyTorchKR

13 days ago

turbovec: a training-free vector index that implements the TurboQuant algorithm in Rust (by 9bow) https://t.co/l3Nmb3bIbJ #rag #rust #vectorsearch #quantization #turboquant #faiss #turbovec

JoelNadarAI @joelnadarai

16 days ago

🚀 Exploring Edge AI with @embedl’s Cosmos-Reason2-2B-W4A16, an optimized INT4 VLM built for efficient multimodal reasoning on smaller hardware. More Edge AI + VLM experiments coming soon 🚀 #EdgeAI #ComputerVision #VLM #AI #DL #NVIDIA #HuggingFace #Quantization #EmbeddedAI

파이토치 한국 사용자 모임 @PyTorchKR

17 days ago

ExecuTorch: a unified, PyTorch-native edge AI deployment framework for deploying PyTorch models as-is, from microcontrollers to smartphones (feat. Meta, MLSys 2026) (by 9bow) https://t.co/e3FE16xq0w #paper #llm #pytorch #ondevice #quantization #executorch #edgeai #mlsys2026 #mobile

Grigory Sapunov @che_shr_cat

18 days ago

11/ I also made a comic version of this paper — sometimes a picture is worth a thousand tokens. #MachineLearning #AI #Quantization

The Cynical Philosopher @FirstThinkingAI

22 days ago

@somi_ai @jun_song @dealignai Same read. Have you tested Qwen2.5-Coder 32B at q6? That one held its lane in my runs where the MoEs broke harder. Curious which evals you used too. 🤔 #LocalLLM #CodingLLM #Quantization

파이토치 한국 사용자 모임 @PyTorchKR

24 days ago

cider: an MLX W8A8 inference SDK that speeds up LLM prefill using the INT8 TensorOps of Apple Silicon M5 (by 9bow) https://t.co/xDeFOZ0L9A #llminference #applesilicon #mlx #quantization #metal #w8a8 #w4a8

Visual-Intelligence @VI_Journal_CSIG

28 days ago

🔥Researchers from Beihang University and ETH Zurich conducted a systematic evaluation of Qwen3's robustness under various quantization settings. Check out the paper at: https://t.co/p35xRLRXyS @qin_haotong #Quantization #LLM #Modelcompression

Zane Chen @chenzeling4

30 days ago

TurboQuant+: KV cache compression for local LLM inference. Implements TurboQuant (ICLR 2026) with llama.cpp fork, Swift MLX fork (~2.5x faster decode), and vllm-swift. 144 tok/s on Qwen3.5-35B MoE at 4K on M5 Max. Cross-platform. By TheTom. 6,685 stars #LLM #Quantization

Brandon Himpfen @BrandonHimpfen

about 1 month ago

optimization-kernels: C++ kernels and utilities for quantization and inference optimization. 👉 https://t.co/Bk3iL8EpkT #ai #artificialintelligence #machinelearning #llm #inference #quantization

Marcel B. @marcel_butucea

about 1 month ago

EDEN’s analytic scaling cuts ~2.25% MSE at 4‑bit (d=128) embeddings – enough to beat the flashy 2026 TurboQuant that skipped the optimal scale. 🤯 #Quantization #ML https://t.co/jvrOBp13eE

रञ्जित (Ranjit Mathew) @rmathew4tw

about 1 month ago

An excellent introduction to #quantization used for #LLMs 👌🏽: “Quantization From The Ground Up”, Sam Rose, Ngrok (https://t.co/YhQMipQz6i). On HN: https://t.co/M3YlJQO1PB #AI #Math #FloatingPoint #NumericalAnalysis #Numbers #NeuralNetworks #Precision #Accuracy

Zane Chen @chenzeling4

about 1 month ago

🔄 GitHub Trending (Refresh) TurboQuant+: KV cache compression for local LLMs based on Google's TurboQuant (ICLR 2026). llama.cpp fork (CUDA/ROCm/CPU/Metal). Swift MLX for Apple Silicon (~2.5x faster decode). Prebuilt binaries. 6,614 stars #LLM #Quantization

Andres Montes | software developer @amontesdev

about 1 month ago

Everyone talks about bigger AI models. But do you know how we make them smaller? Made a visual about 4-bit quantization (FP32 → INT4) and the trade-off between precision, memory, and speed. The image-compression analogy made it click for me. #AI #LLMs #Quantization

रञ्जित (Ranjit Mathew) @rmathew4tw

about 1 month ago

Impressive: “TurboQuant: Redefining AI Efficiency With Extreme Compression”, Amir Zandieh et al., Google Research (https://t.co/LSjc5LbIYX). The paper: https://t.co/sCHWDwyTkn On HN: https://t.co/gLf3qxJd8M #TurboQuant #Quantization #LLMs #Vectors #Compression #Paper
