BrainOS @badbraincode - Twitter Profile

BrainOS

@BadBrainCode

about 6 hours ago

@Teknium Can we wait this one out?

0

19

BrainOS

@BadBrainCode

about 6 hours ago

@NVIDIAGeForce #RTXPowersPlay I’ll play this time but remember, Elon said no more hashtags.

0

138

BrainOS

@BadBrainCode

about 11 hours ago

@jessegenet I highly recommend turning an abliterated 30b model into your personal psychiatrist. Started over a week ago and it’s amazing. I need to curate the profile md a bit but wow can these things be a mirror.

0

1

506

BrainOS

@BadBrainCode

about 12 hours ago

@sudoingX I’m loving this.

0

86

BrainOS

@BadBrainCode

about 15 hours ago

Running a local Qwen setup on this PC using Intel OpenVINO GenAI + Intel Arc graphics. What’s been really shocking is that most of the time, while the GPU is cranking out tokens, I can still use the computer and it barely slows down. I bought this box a year and a half ago with no clue that I would be able to use it for local inference. I’ve also tried GPT-OSS-20B, and while it’s slow, it generates extremely high-value tokens. Very impressed. That build is also set up for OpenVINO.

Hardware:
• PC: GEEKOM GT2 Mega
• CPU: Intel Core Ultra 9 285H
• Cores/threads: 16 cores / 16 logical processors
• RAM: 32 GB installed
• GPU: Intel Arc 140T GPU
• GPU memory reported by driver name: 16 GB
• GPU driver: Intel 32.0.101.8509
• NPU: Intel AI Boost
• System type: x64-based PC

Software/runtime:
• Host OS: Windows with WSL2
• WSL distro: Ubuntu 22.04.5 LTS
• WSL kernel: 6.6.87.2-microsoft-standard-WSL2
• OpenVINO lane is Windows-side, separate from the IPEX Ollama runtime
• Python: 3.13 virtual environment
• Packages: openvino-genai, huggingface_hub
• Model server: custom OpenAI-compatible HTTP server
• Server bind: 0.0.0.0:8099
• Inference mode: single-threaded queue, one generate() at a time
• Active model: qwen3-8b-int4-ov
• Model path: C:\Users\\models\openvino\Qwen3-8B-int4-ov
• Device target: GPU
• Open WebUI connects to it through WSL/Docker using the Windows host gateway
• Open WebUI sees the model as: qwen3-8b-int4-ov
• Startup is automated from Windows Startup with a hidden PowerShell launcher

Verified live:
• OpenVINO GenAI server listening on port 8099
• /v1/models returns qwen3-8b-int4-ov
• Open WebUI can see and use the model
• Server logs show successful chat completions on GPU

Recent observed responses from the server:
• 9 estimated completion tokens in 1.969s
• 48 estimated completion tokens in 4.286s
• 98 estimated completion tokens in 7.669s

This setup is running Qwen3 8B INT4 locally on Intel Arc 140T using OpenVINO GenAI, exposed through an OpenAI-compatible API, and connected into Open WebUI.

0

1

0

322
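The three response timings quoted in the post above can be turned into rough throughput figures. The sketch below is only arithmetic on those numbers; the function name `tokens_per_second` is made up for illustration, and the token counts are the server's own estimates, so the results are approximate.

```python
# Figures copied from the "Recent observed responses" list in the post above:
# (estimated completion tokens, wall-clock seconds).
observed = [(9, 1.969), (48, 4.286), (98, 7.669)]


def tokens_per_second(tokens: int, seconds: float) -> float:
    """Average end-to-end rate for one response (hypothetical helper)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


rates = [tokens_per_second(t, s) for t, s in observed]

for (t, s), rate in zip(observed, rates):
    print(f"{t:>3} tokens in {s:.3f}s -> {rate:.1f} tok/s")
```

The longer responses come out faster per token, which would be consistent with a fixed per-request cost (such as prompt processing) being spread over more output, though three samples are not enough to confirm that.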

BrainOS

@BadBrainCode

about 23 hours ago

@TheAhmadOsman I’ve been having a blast with qwen3-8b-int4-ov running via OpenVINO on my small 16 GB Intel Arc box. It’s snappy and fun.

0

229

BadBrainCode retweeted

On yedinci Mesele @sekizaltmis_

3 days ago

@witcheer 150-200+ tok/s @SpaceTimeViking https://t.co/w7nps0Ophk

0

3

1

2

528

BrainOS

@BadBrainCode

1 day ago

@0xSero I’m in for free.

0

1

0

1

43

BrainOS

@BadBrainCode

2 days ago

Running Gemma-4-26B-A4B-NVFP4 on my DGX Spark GB10 via vLLM. Results: ~30 tok/s single-stream, 53 tok/s at c=2. For context: Nemotron-3-Super 120B on the same hardware does 14 tok/s. The MoE architecture is the reason — only 3.8B of 26B parameters activate per token step. Memory bandwidth stops mattering when you’re barely touching the weights. 18 GB weights. 2-min cold start. Full 128k context, parallel tool calls. Native image, audio, and video support. Should I try FP8 + MTP?

0

1

396
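The MoE claim in the post above can be checked with back-of-envelope arithmetic using only the numbers it quotes. This is a rough sketch, not a measurement: it assumes the 18 GB of weights are spread evenly across all 26B parameters (in practice attention and shared layers are read on every token), and it ignores KV cache and activation traffic. All variable names are illustrative.

```python
# Numbers quoted in the post above.
TOTAL_PARAMS_B = 26.0        # total parameters, in billions
ACTIVE_PARAMS_B = 3.8        # parameters active per token step, in billions
WEIGHTS_GB = 18.0            # reported size of the weights
SINGLE_STREAM_TOK_S = 30.0   # reported single-stream throughput (approximate)

# Share of the model touched on each token step.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Assumption: bytes are distributed uniformly across parameters.
active_gb_per_token = WEIGHTS_GB * active_fraction

# Weight bytes that must be read per second to sustain the reported rate.
implied_read_gb_s = active_gb_per_token * SINGLE_STREAM_TOK_S

print(f"active fraction:       {active_fraction:.1%}")
print(f"weights read per token: {active_gb_per_token:.2f} GB")
print(f"implied weight reads:   {implied_read_gb_s:.0f} GB/s")
```

Under these assumptions each token reads roughly a seventh of the weights, so bandwidth still matters but a given amount of it goes much further than it would with a dense model of the same total size.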

BrainOS

@BadBrainCode

2 days ago

@jessegenet Will winning the bit manipulation race ensure we win the atom manipulation race?

0

156

BrainOS

@BadBrainCode

3 days ago

@0xSero I had a shocking Orwellian experience yesterday where Claude got righteous with me. It scared the crap out of me. No doubt it would have contacted my boss if it could’ve. I’m unplugging from Anthropic. Open source will save humanity. No doubt in my mind.

0

58

BadBrainCode retweeted

Teknium 🪽

@Teknium

3 days ago

@uzairansar PSA: This is not an official app by us - it would be nice uzi if we made this more clear in the name or something

23

716

12

7

16K

BrainOS

@BadBrainCode

3 days ago

@cursor_ai Composer 2.5 is a Rocketship in grok CLI. Fired it up for the first time and it is blazing.

0

12

BrainOS

@BadBrainCode

5 days ago

@petergyang @WisprFlow Huge upgrade. Auto find was awful.

0

46

BrainOS

@BadBrainCode

5 days ago

@garrytan Every session should be summarized at 3:00am and saved as a local markdown. I have claude trained on that well.

1

2

0

106

BrainOS

@BadBrainCode

5 days ago

@0xSero I’m assuming you mean you can remote access a codex or claude session running on a local model. It’s really cool. I have had issues copying text on mobile though.

0

1

0

1

513