ilya @iryaboy24 - Twitter Profile

@sudoingX Qwen3.6-35B-A3B-UD-Q5_K_M.gguf \ --tensor-split 12,28 \ -ngl 99 \ -c 65536 \ -b 512 \ -np 1 \ --host 0.0.0.0 \ --port 8080 \ --flash-attn on \ --cache-type-k q8_0 \ --cache-type-v q8_0 \ --no-warmup

$iryaboy24's tweet photo. @sudoingX Qwen3.6-35B-A3B-UD-Q5_K_M.gguf \ --tensor-split 12,28 \ -ngl 99 \ -c 65536 \ -b 512 \ -np 1 \ --host 0.0.0.0 \ --port 8080 \ --flash-attn on \ --cache-type-k q8_0 \ --cache-type-v q8_0 \ --no-warmup https://t.co/W6467lbqwL$

0

6

0

299

Who to follow

David Nage🎯

@DavidNage

Venture Capital PM @arca | Base Layer Podcast Host on Apple & Spotify | Investor & Supporter of Shadowy Super Coders | Opinions my own | Not investment advice

BostonBetter

@BostonBetter93

Anyone ever bet on Curling? Just me? Cool.

2 months ago

@tobi I wrote something very similar locally. Persistent memory and Claude code hooks. https://t.co/9PI4CKryMY

0

3

0

140

ilya

@iryaboy24

2 months ago

@LottoLabs 150 tokens per second with Qwen 3.5 35B. Can’t wait to try 3.6

0

3

0

69

ilya

@iryaboy24

2 months ago

@garrytan Why would you use supabase instead of a local docker? Just plugging into another cloud SaaS for no reason

0

1

0

26

ilya

@iryaboy24

2 months ago

@kimmonismus Gemm4 26B is absolutely amazing and run on a 5070ti 16GB. Can not wait to try 31B on a 3090 with a larger context and kv

0

68

ilya

@iryaboy24

2 months ago

@NousResearch @Magnificode @openclaw Please check out the repo where I merge the 2 because I had the same question! https://t.co/FaBwr7IikR Benchmark results have come out pretty good. And retrieval speed is as fast with no wasted tokens.

0

3

0

2

161

ilya

@iryaboy24

2 months ago

@bcherny @trq212 @ClaudeCode @AnthropicAI @NousResearch @Teknium Benchmarks: • 50.2% LongMemEval (beats Mem0, entirely local) • 90.7% HaluMem false memory resistance • 34ms avg retrieval All self-hosted. No OpenAI. No cloud. Star it, try it, break it 🐾 https://t.co/FaBwr7IikR

1

0

75

ilya

@iryaboy24

2 months ago

I just open sourced my first project! Shiba Memory, persistent, self-improving memory for AI agents. 34ms hybrid retrieval. ACT-R scoring. Native Claude Code + Hermes support. Fully local, no cloud deps. 🔗 https://t.co/FaBwr7IikR Thread 🧵

2

4

0

1

174

ilya

@iryaboy24

2 months ago

@bcherny @trq212 @ClaudeCode @AnthropicAI Hermes users, Shiba ships as a native memory provider plugin. 🔌 shiba_recall, shiba_remember, shiba_forget tools available to the LLM out of the box. Auto memory on every turn. Session summaries. Prefetch before each response. @NousResearch @teknium

1

2

0

81

ilya

@iryaboy24

2 months ago

What Shiba actually does: • Remembers across ALL sessions & projects • Hybrid semantic + full-text search • Links related memories via a knowledge graph • Instincts evolve into skills over time • Ingests web, RSS, git, files • HTTP gateway for any agent

0

41

ilya

@iryaboy24

2 months ago

@sudoingX Adding a 3090 to the rig this week! I have a 5070ti and it runs gemma4 26B pretty well but not enough context bc of the space. Qwen14B is terrible. Plan to try gemma4 31B first on the 3090

0

2

0

1

685

ilya

@iryaboy24

2 months ago

@Teknium Gemma 26B is the best one. Hermes ↓ Gemma4 26B local ↓ Shiba Brain (living memory) ├── pgvector semantic search ├── Full-text hybrid search ├── Knowledge graph (memory_links) ├── ACT-R decay (memories that evolve) └── HTTP gateway

0

1

0

19

ilya

@iryaboy24

2 months ago

@Teknium Also make sure to create a cli wrapper so Hermes can talk to it and recall very quickly without using the full context window.

0

1

0

22

ilya

@iryaboy24

2 months ago

@Teknium pgvector on docker has worked really well. Do a combination of what karpathy talks about with the wiki LLM obsedian md files and then underlying it with core/long term memory in the db

1

2

0

315

ilya

@iryaboy24

2 months ago

@Teknium Can honcho run locally now as well?

1

2

0

461

ilya

@iryaboy24

3 months ago

@potatokmish A solution is coming, https://t.co/vP1zQpdfTM

0

1

0

23

ilya

@iryaboy24

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users