G🐙

Verified account

@giannoklein

Productivity, AI and Crypto enthusiast.

Joined June 2011

174 Following

42 Followers

344 Posts

giannoklein retweeted

@hasantoxr

2 days ago

Vector databases are no longer a cloud product. They're becoming a pip install.

A new open-source project called turbovec just crossed 10K stars on GitHub. And once you understand what it does, you understand why.

It's a Rust vector index with Python bindings, built on Google Research's TurboQuant algorithm, a quantizer accepted at ICLR 2026 that compresses embeddings to within a hair of the theoretical Shannon limit.

No codebook training. No train phase. No rebuilds as your corpus grows. You add vectors, they're indexed. Done.

The headline number: A 10 million document corpus takes 31 GB of RAM as float32. turbovec fits it in 4 GB and searches it faster than FAISS.

Read that again. Faster than FAISS. The library Meta has tuned for a decade. Hand-written NEON and AVX-512 kernels beat FAISS FastScan by 12–20% on ARM and match-or-beat it on x86.

(And the recall benchmarks are published openly against FAISS as the baseline, including the configs where it loses. That honesty alone is rare in this space.)

But the speed isn't even the strategic part. The strategic part is what this enables:

Fully local, air-gapped RAG.

10M documents in 4 GB means your entire company knowledge base fits in the RAM of a MacBook. Pair it with an open-source embedding model and nothing, not a query, not a vector, not a document, ever leaves your machine.

It also ships drop-in replacements for the vector stores inside LangChain, LlamaIndex, and Haystack. Swap one import, keep your pipeline. The switching cost is approximately zero.

The obvious comparison is SQLite.

Databases used to be servers you provisioned and paid for. Then SQLite made the database a file inside your app, and an entire category of managed infrastructure became optional for most use cases. The same compression-driven collapse is now coming for vector search.

Every startup selling "managed vector search" as a line item should be paying attention. When the index fits in laptop RAM, runs faster than the industry standard, and installs in one line, the moat was never the database.

The vector database is becoming an embedded library, not a cloud service. And the frontier of RAG just moved on-device.

Really cool to see.
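
A rough check of the headline figure in the post above, as a sketch rather than anything taken from the turbovec project. The post does not state the embedding dimension, so 768 is assumed here because it makes 10 million float32 vectors come to about 31 GB. Under that assumption, 4 GB works out to a little over 4 bits per dimension. The function and constant names are illustrative only.

```python
def index_size_gb(num_vectors: int, dims: int, bits_per_dim: float) -> float:
    """Raw storage for the vector payload alone, in decimal gigabytes (10**9 bytes)."""
    total_bits = num_vectors * dims * bits_per_dim
    return total_bits / 8 / 1e9


NUM_VECTORS = 10_000_000  # "10 million document corpus" from the post
DIMS = 768                # ASSUMPTION: the post does not give the dimension

# Uncompressed float32 uses 32 bits per dimension.
float32_gb = index_size_gb(NUM_VECTORS, DIMS, 32)

# A hypothetical 4-bit-per-dimension encoding, for comparison.
four_bit_gb = index_size_gb(NUM_VECTORS, DIMS, 4)

# Bits per dimension implied by the quoted 4 GB figure.
implied_bits = 4.0 * 1e9 * 8 / (NUM_VECTORS * DIMS)

print(f"float32: {float32_gb:.2f} GB")
print(f"4-bit:   {four_bit_gb:.2f} GB")
print(f"4 GB implies about {implied_bits:.2f} bits per dimension")
```

This counts the vector payload only. IDs and any index metadata would add to it, and a different embedding dimension would change every number.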

28

636

122

873

37K

about 20 hours ago

Want to see how good Sam Altman lies? https://t.co/YwzZ4sYlxT

0

0

0

0

18

about 22 hours ago

@CultureCrave Imagine if we get this before GTA 6🥴

3

1

0

0

4K

giannoklein retweeted

ldt @madeby_ldt

1 day ago

➡️ Easier way, no need to disable SIP:

sudo defaults write "/Library/Preferences/FeatureFlags/Domain/GenerativeModels.plist" "EnhancedSiriWaitlist" -dict-add Enabled -bool NO

28

464

29

614

87K

1 day ago

@CastAsHuman This looks cleaner https://t.co/FWV5MkKnIT

1

6

0

4

680

giannoklein retweeted

2 days ago

More customization in the https://t.co/NtZBcmpMLM app, coming soon in the next update 👀

9

114

4

89

10K

1 day ago

What happened to the designer at Quarter?

Bro cooked so hard and disappeared from my timeline like nothing happened. https://t.co/eWM8ivZ94V

0

0

0

0

39

giannoklein retweeted

Polymarket Money

@PolymarketMoney

2 days ago

THE NEW WORLD ORDER

Meta
Anthropic
Nvidia
Google
OpenAI
SpaceX https://t.co/nk3w6oZCmp

93

1K

161

171

99K

2 days ago

I understand Chinese consumers' obsession with maximalism, but what the hell is this? 🫪

2 days ago

K2geIsland v1.0.1 - Supports sending notifications to the Dynamic Island

4

59

4

14

8K

0

0

0

0

73

2 days ago

@iikalyango Lol, it’s not even been out for longer than 30 hours😂

0

1

0

0

6

2 days ago

HOW TO SPEED UP SIRI AI WAITLIST

Simple. Ensure your language is set to US English, not UK. Close Settings and navigate to Siri again. Update should start instantly. https://t.co/7zaRcZFERW

48

129

5

26

56K

2 days ago

@ParthJadhav8 Pretty useless.

If the trailer were to drop right now, you would know within the next hour without any alerts. That’s how big the GTA franchise is.

Checking daily does not help when it comes to this level of anticipation. https://t.co/84dCa47xMy

1

20

0

1

5K

2 days ago

@ostynhyss 54k context window? I’d rather use OpenRouter free models then.

0

4

0

1

1K

2 days ago

@first_mccoy @ArjunSubr The beta version itself exceeds the Siri model by almost double. I’m certain the amount of traffic during the update itself surpasses the Siri AI by a mile.

0

1

0

0

186

2 days ago

@Perp101 @AdeeKulkarni @Mrwhosetheboss Sources?

0

1

0

0

11

2 days ago

@aaronp613 Lol this is old as hell

0

0

0

0

45

giannoklein retweeted

Guybrush Threepwood

@twistedmatrices

3 months ago

PSA: If you have multiple MacBooks that support RDMA, you can cluster them using @exolabs and run 30B+ models at 70 tok/s over Thunderbolt 5. tensor parallelism on consumer hardware is a solved problem. you are renting GPUs that are worse than the laptop on your couch. 2x M4 Max (64 GB each) running mlx-community/Qwen3-30B-A3B-4bit @ 70 TPS

25

629

40

421

659K

2 days ago

GOATED feature 🔒 https://t.co/Fn3cpNle7K

0

0

0

0

406

giannoklein retweeted

@zarazhangrui

3 days ago

If you've adopted AI at your company but haven't seen any tangible results, read this 1990 article: "The Dynamo and the Computer" by Paul David.

When electricity first arrived, factories that "adopted" it barely got faster. They just swapped the steam engine for an electric one and ran everything else exactly as before: same machine layout, same workflow, same management. Electricity in, no real gains out.

The most common mistake with any new technology is to drop it into the old organization and then declare the transformation done.

The real leap came decades later, when each machine got its own small motor. Suddenly machines no longer had to be lined up around one central drive shaft. They could be rearranged around the actual flow of work.

The productivity gains didn't come from electricity. They came from REDESIGNING THE ENTIRE FACTORY around it.

AI is the same. Bolting it onto your existing process gets you a faster steam engine. The payoff comes when you redesign the work itself.

(link to paper in comments)

142

4K

737

4K

276K

2 days ago

You want Siri AI, we want side loading. We are not the same.

David Leuliette

@flexbox_

2 days ago

So basically, we will never have Siri AI in Europe.

I have been waiting for the iPhone mirroring feature for over two years now, and we still don't have it. So why would the new Siri be introduced?

#wwdc https://t.co/EqPjFe1aCm

131

1K

32

57

106K

0

0

1

0

138

2 days ago

@thsottiaux Told my agent to run nested loops. It complained, but I told it “just do it”. Not sure what it’s been doing the last 24hrs exactly but I’m sure Peter Steinberger is happy.

0

5

0

1

834
