ByteShape @byteshape - Twitter Profile

23 days ago

We released Qwen 3.6 35B A3B GGUF quants in both NTP and MTP. The benchmark results made one thing clear: size, speed, and quality do not move in a straight line. GPU-5 was hard to beat. If it fits, try it first. Blog: https://t.co/WJrOEKPjfk

0

7

3

1

556

ByteShape @ByteShape

24 days ago

@Capetlevrai @0xSero 👀

0

23

ByteShape @ByteShape

2 months ago

@KC_goes_digital We have them on our radar. Our models are optimized for the best quality vs speed/size tradeoff. We benchmark on real tasks and measure performance across different hardware, so it’s easier to find what actually works best on your setup.

1

0

49

ByteShape @ByteShape

2 months ago

We recently released our Qwen 3.5 35B A3B quants. If your setup can run GPU-7, you should try it. If not, we’ve got options across all hardware. Pi → 5090. Blog: https://t.co/d1TCjbIgMS

3

28

5

13

2K

ByteShape @ByteShape

2 months ago

@therealsol4ra Regardless, we're happy for everyone to perform independent evaluations of our models and test them out for yourselves! We're highly confident in their quality!

0

1

0

73

ByteShape @ByteShape

2 months ago

@therealsol4ra KLD is a distribution deviation metric; in this case it measures how far token generation in the quants deviates from the original model. However, it does not measure behaviour, for example the model taking a different path than the original model to solve the same problem.

1

0

87
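
To make the KLD reply above concrete, here is a minimal sketch of a per-token KL divergence between an original model and a quant. It assumes you already have next-token logits from both models at the same positions; the function names and the toy logits are hypothetical and are not ByteShape's evaluation code.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; p is the reference (original model) distribution."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def mean_token_kld(ref_logits, quant_logits):
    """Average KL divergence over token positions (hypothetical helper)."""
    klds = [
        kl_divergence(softmax(r), softmax(s))
        for r, s in zip(ref_logits, quant_logits)
    ]
    return sum(klds) / len(klds)

# Toy logits over a 3-token vocabulary at two positions (made-up numbers).
original = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
quantized = [[1.9, 1.1, 0.1], [0.4, 2.6, 0.2]]
print(mean_token_kld(original, quantized))
```

A low value only says the two next-token distributions are close at each position. As the reply notes, it says nothing about whether the quant follows the same solution path over a whole task.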

ByteShape @ByteShape

2 months ago

@Yeely1310 @ravikiran_dev7 We're glad our quants are not falling for that👀

0

2

0

26

ByteShape retweeted

wd 🔺

@populartourist

2 months ago

+ Hermes Agent
+ Qwen3.5 35B A3B
+ 4x parallel agents with 262k context window (each)
+ Over 200 t/s token generation
+ 3000 t/s prefill
+ 23.2GB total VRAM consumption on RTX 5090

It can take 5 parallel agents, 4 was the sweet spot with 2x in completion time vs 1.74x.

Dream inference. @NousResearch @ByteShape

13

185

13

190

24K

ByteShape @ByteShape

2 months ago

Run your own local AI coding agent

We just published a beginner guide for using @opencode with local models (@lmstudio, llama.cpp, @ollama).
Mac, Linux, WSL2, full setup + API + config.

https://t.co/TzgNNTxU49

From “I have a model” → “I have a working coding agent”

1

8

4

396

ByteShape @ByteShape

2 months ago

GPUs are consistent. CPUs are not. With our ByteShape Qwen 3.5 9B quants, the same models perform well across GPUs, but CPUs each have their own “favorites”. No one-size-fits-all. Optimize for your hardware. https://t.co/RSX3iK3vgh

1

8

4

0

248

ByteShape retweeted

Allen Lau 🇨🇦

@allenlau

3 months ago

ByteShape was quietly launched just before the year end. Two weeks ago, we announced our investment in the company. Since its launch, and with minimal fanfare on purpose, @ByteShape cumulative downloads have easily blown past 100,000. No small feat for a new startup!

1

6

3

0

313

ByteShape retweeted

Allen Lau 🇨🇦

@allenlau

3 months ago

Announcing @twosmallfishvc's investment in @ByteShape. In short, ByteShape is delivering step-function gains in AI efficiency, including up to 7x faster training, up to 10x faster inference, plus up to 40% compression to reduce model size.

2

6

4

0

343

ByteShape @ByteShape

4 months ago

@atarl666028 @Alibaba_Qwen 👀

1

2

0

377

ByteShape retweeted

wd 🔺

@populartourist

4 months ago

Excellent quantisation on Devstral Small 2

0

5

2

0

261

ByteShape @ByteShape

4 months ago

We released ShapeLearn-optimized GGUFs for:

• Devstral Small 2 24B, tuned for RTX 40/50 GPUs
• Qwen3 Coder 30B, runs everywhere, yes even the Pi

Maximum quality. Fastest TPS. Minimal compromise.

GGUFs + interactive plots are live: https://t.co/VVZ87Pvm1p

0

9

3

2

581

ByteShape @ByteShape

5 months ago

Edge computing is getting spicy! Shoutout to @geerlingguy for showcasing our model. Love seeing what the community is building and how hard it’s being pushed. Clip: https://t.co/aPvRpsxIAC

Jeff Geerling

@geerlingguy

5 months ago

Raspberry Pi has a new AI HAT. This time with built-in 8 GB of RAM, so you can run machine vision + LLM inference all without touching the Pi's CPU. It's $130 and a little bit of a niche item. Find out why in my video: https://t.co/vMhZ5w1wCU

25

601

46

207

56K

0

6

2

0

344

ByteShape retweeted

Jeff Geerling

@geerlingguy

5 months ago

Raspberry Pi has a new AI HAT. This time with built-in 8 GB of RAM, so you can run machine vision + LLM inference all without touching the Pi's CPU. It's $130 and a little bit of a niche item. Find out why in my video: https://t.co/vMhZ5w1wCU

25

601

46

207

56K

ByteShape retweeted

HackerNewsTop5 @hackernewstop5

5 months ago

A 30B Qwen Model Walks into a Raspberry Pi and Runs in Real Time #HackerNews https://t.co/5v7VyAe7ES

0

4

2

0

374
