Josh Stevenson | RecursiveIntell

Verified account

@RecursiveIntell

AI systems engineer building AiDENs: Rust-native agents with receipts, replay, permits, and provenance-first memory. turbo-quant · semantic-memory

Albertville, AL

Joined February 2025

3.4K Following

1.4K Followers

8.9K Posts

Josh Stevenson | RecursiveIntell

@RecursiveIntell

about 7 hours ago

I'm getting 17 tok/s from a 2.5M-parameter model on an ESP32-S3 microcontroller with my custom inference engine and my method of scoring and reading from the still-compressed KV cache. That drastically cuts CPU usage, which in turn lets me put more into inference. This blows the previous record completely out of the water: a guy got 2 tok/s on a ~250K-parameter model. Not only that, but it opens up lots of on-device AI options that were never possible before. This is on a $4 chip from Amazon, btw.

1

1

0

0

6

Josh Stevenson | RecursiveIntell

@RecursiveIntell

about 11 hours ago

@awnadam0 @AnthropicAI Don't understand time zones? Also, they never said when.

1

1

0

0

873

Josh Stevenson | RecursiveIntell

@RecursiveIntell

about 18 hours ago

Yup, that's what I do. I just built a custom KV cache compression that allows scoring and recalling from the still-compressed cache. Built the custom scorer and everything. It's about to bring a 4x speed increase, along with a few more things, to everyone on ESP32/ESP32-S3. I have a 6.5-million-parameter model running at 2 tok/s on it, with the model using only 4.5MB. All with GPT 5.5 and my custom Hermes agent.

0

1

0

0

75
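The post above mentions scoring against a KV cache that stays compressed. The author's actual format and scorer are not shown, so the following is only a minimal Rust sketch of one generic way to do that, assuming symmetric per-vector int8 quantization. The names `QuantKey`, `quantize`, `score`, and `top_k` are hypothetical and are not taken from the author's crates.

```rust
/// One cached key vector kept as int8 values plus a single f32 scale.
/// (Hypothetical layout, assumed for illustration only.)
pub struct QuantKey {
    pub scale: f32,
    pub values: Vec<i8>,
}

/// Symmetric per-vector int8 quantization: x ≈ scale * q, q in [-127, 127].
pub fn quantize(x: &[f32]) -> QuantKey {
    let max_abs = x.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // Avoid dividing by zero for an all-zero vector.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let values = x
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    QuantKey { scale, values }
}

/// Dot-product score of `query` against a key that stays in int8 form.
/// The scale is applied once after the sum, so no dequantized copy
/// of the key vector is ever allocated.
pub fn score(query: &[f32], key: &QuantKey) -> f32 {
    let acc: f32 = query
        .iter()
        .zip(&key.values)
        .map(|(q, k)| q * f32::from(*k))
        .sum();
    acc * key.scale
}

/// Indices of the `k` highest-scoring cached keys, best first.
pub fn top_k(query: &[f32], cache: &[QuantKey], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = cache
        .iter()
        .enumerate()
        .map(|(i, key)| (i, score(query, key)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

In this sketch the cache costs one byte per element plus one scale per vector, and the only per-key floating-point multiply beyond the dot product is the final scale. A real microcontroller implementation would likely use integer accumulation and a tighter packing, which this example does not attempt.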

Josh Stevenson | RecursiveIntell

@RecursiveIntell

2 days ago

Using what I learned working on all my quantization crates and building my stack, I built an advanced context compactor that does a better job and does it a lot faster. Combine this with my semantic memory and you get an agent that is meaningfully better than most.

Photo: https://t.co/jMKvKiTI5P

1

1

0

0

32

Josh Stevenson | RecursiveIntell

@RecursiveIntell

4 days ago

My memory system is 66-100x faster at retrieval than Zep, while also achieving higher-quality retrieval. Nothing else even comes close. Sub-ms simple top-5 retrieval. 3 ms advanced graph traversal. It allows you to rely heavily on the memory without a performance cost.

Photo: https://t.co/9xkzfDnTOn

0

1

0

0

28

Josh Stevenson | RecursiveIntell

@RecursiveIntell

5 days ago

Please check out https://t.co/s7Wl3klrwc, @NousResearch. It will help transform any agent into a next-gen learning machine. It's fully local, and it beats any other agent memory in features and retrieval quality. Not necessarily in speed, but it's in the 20-130 ms range for most skills. I've poured over 6 months into the project this is from, and the crate it wraps is very mature. I use it and couldn't live without it anymore.

0

0

0

0

7

Josh Stevenson | RecursiveIntell

@RecursiveIntell

5 days ago

A graphic explaining the different profiles and the number of tools available in semantic-memory-mcp.

Photo: https://t.co/GbvXJ129TY

0

0

0

0

18

Josh Stevenson | RecursiveIntell

@RecursiveIntell

5 days ago

Benchmarks of semantic-memory-mcp. For a local-first database, I'm quite happy with how things are developing. I may even have a big upgrade for database size soon that dramatically shrinks it both on disk and in RAM.

Photo: https://t.co/FJh3zmswqX

0

0

0

0

12

Josh Stevenson | RecursiveIntell

@RecursiveIntell

5 days ago

A diagram that covers how everything connects and is used with my semantic-memory MCP server.

Photo: https://t.co/dieisHV6TM

1

2

0

0

26

Josh Stevenson | RecursiveIntell

@RecursiveIntell

6 days ago

Short demo of the autonomous closed-loop learning built with AiDENs. It requires no human intervention, but you can give it goals if you like. Otherwise it finds gaps in its knowledge and figures them out itself. You can see information about the runs at https://t.co/0u4o18TLaA

0

1

0

0

23

Josh Stevenson | RecursiveIntell

@RecursiveIntell

6 days ago

https://t.co/0u4o18TLaA

0

0

0

0

10

Josh Stevenson | RecursiveIntell

@RecursiveIntell

6 days ago

I'll be dropping some demos of AiDENs later on. This is the first impressive thing I've done with it. It's done, and it's a thing to behold. It could run and learn indefinitely if I could afford the usage, lol. The greatest part is that everything, every crate, is modular and able to work with any other. @anthropicai I know this is something you are interested in, and I had to wait until my memory system was mature enough to handle it. Fully autonomous, closed-loop learning system. You can give it a goal to work toward or let it figure out what to do on its own.

1

0

0

0

20

Josh Stevenson | RecursiveIntell

@RecursiveIntell

6 days ago

I'm building the entire AI ecosystem in Rust for speed, memory savings, reliability, and determinism not easily achieved in other languages. I've shipped a fully local memory system that has every feature the competitors have, plus a whole lot more. It's part of my stack, but it's most useful as is. Next-gen is the idea. https://t.co/bufwxCGcMo

0

1

0

0

14

Josh Stevenson | RecursiveIntell

@RecursiveIntell

7 days ago

@ClaudeDevs @OpenAIDevs @NousResearch and nothing leaves your machine. no cloud dependencies. fully local.

0

0

0

0

15

Josh Stevenson | RecursiveIntell

@RecursiveIntell

7 days ago

My memory system now integrates flawlessly with @ClaudeDevs (Claude Code), @OpenAIDevs (Codex), @NousResearch (Hermes), and more. It transforms them from stateless tools into agents that don't make the same mistake twice and are able to learn and forget. All the latest is on git. Easy install.

Photo: https://t.co/1OQhNG6YTs

2

1

0

0

34

Josh Stevenson | RecursiveIntell

@RecursiveIntell

7 days ago

@ClaudeDevs @OpenAIDevs @NousResearch https://t.co/qJZ5lwchJs

0

0

0

0

14

Josh Stevenson | RecursiveIntell

@RecursiveIntell

7 days ago

@dmkai98 @karpathy They do with actual memory. I wrote an agent-agnostic advanced semantic-memory system that's fully local, with no cloud embedding or vector database. It's much more advanced than RAG. The README isn't fully updated with all the features. https://t.co/qJZ5lwchJs

2

3

0

8

519

Josh Stevenson | RecursiveIntell

@RecursiveIntell

10 days ago

https://t.co/CKLa62meRm This is a plugin for Claude Code that automatically installs and sets up semantic-memory so Claude never forgets anything relevant again. It also comes with repo ingestion commands that automatically populate the database with relevant info for an easy start.

0

0

0

0

27

Josh Stevenson | RecursiveIntell

@RecursiveIntell

11 days ago

Video of a cold-start Hermes agent answering a personal, complex question using my memory system.

0

1

0

0

31

Josh Stevenson | RecursiveIntell

@RecursiveIntell

11 days ago

An example of Codex using semantic-memory-mcp for the first time. All I did was point Codex at it, and it did the rest.

Photo: https://t.co/kTldf9ZnOF

0

0

0

0

26
