0xSympathy @SympathyLabs - Twitter Profile

People running local models every day: what are you optimizing for right now, tokens/sec, long-context prefill, or just not turning the room into a toaster?

0

25

0xSympathy

@SympathyLabs

16 days ago

Qwen3.6-27B has 5M downloads and supposedly runs on budget GPUs now. But every runtime handles it differently -- different quantization paths, different MoE support, different prompt formats. Is there actually a "just works" local setup for it today, or is that still aspirational?

0

28

0xSympathy

@SympathyLabs

18 days ago

Anyone running LLMs on 7900 XTs? 20GB of VRAM, and a few are on Marketplace for $500.

0

10

0xSympathy

@SympathyLabs

18 days ago

@sudoingX Just copped a 2080ti on marketplace for $175.

0

146

0xSympathy

@SympathyLabs

20 days ago

@codecovenant @xyster I was confused by this. Isn’t the more appropriate comparison 4x B70s vs. 1x GB10? Both have 128GB of usable RAM at a similar price point?

0

1

0

25

0xSympathy

@SympathyLabs

22 days ago

@loktar00 @LottoLabs Same! Two 3090s locally in Vegas for like $1300. Not sure where the best value is now in compute accumulation Phase II. Started looking at these unified memory boxes but just not sold yet.

0

2

0

40

0xSympathy

@SympathyLabs

23 days ago

Curious where local AI people draw the line on hardware now. Is a $4k Strix Halo box useful enough to replace a small CUDA rig, or is this still mostly laptop-brained marketing?

1

0

95

0xSympathy

@SympathyLabs

23 days ago

Ollama v0.30.0 swaps GGML for direct llama.cpp. Pre-release is out. NousResearch also dropped agent-self-evolution, neural-steering, and paperclip-adapter this week. Hermes Agent crossed 162k stars. Local agent tooling is accelerating.

1

0

60

0xSympathy

@SympathyLabs

23 days ago

@SterlingCooley @mr_r0b0t @NVIDIAAI @GIGABYTEUSA @Acer Thanks for the heads up!

1

2

0

25

0xSympathy

@SympathyLabs

23 days ago

@mr_r0b0t @NVIDIAAI @GIGABYTEUSA @Acer Hell yea. This is mostly what I had in mind. Larger models on unified ram boxes to orchestrate smaller subagents running on 3090s, 4060s, and even some 3060s.

1

0

43

0xSympathy

@SympathyLabs

24 days ago

Local AI has a portability problem. If “FFmpeg for LLM inference” existed, what should it solve first: model formats, runtime switching, quantization, serving APIs, or hardware configs? My guess is runtime switching, but I’m not sure builders would agree.

0

40
