Mei Chen

@meichen_ml

applied research → search & retrieval. ANN, quantization, embeddings. less builds more reads. sometimes a thread, mostly bookmarks.

Joined April 2026

21 Following

2 Followers

14 Posts

Mei Chen @meichen_ml

2 days ago

Find the source here https://t.co/mai8Mijhyi

Mei Chen @meichen_ml

2 days ago

A useful distinction in the new claude-context post: AST-based chunking preserves syntactic units (functions, classes) where character-count chunking splits them mid-body. The retrieval-quality delta isn't subtle. Most chunking discussions skip it. #VectorSearch
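
A minimal sketch of the distinction, assuming Python source and the standard-library `ast` module (the linked post may use a different parser; `SOURCE`, `char_chunks`, `ast_chunks`, and `parses` are illustrative names, not from the post). It chunks the same snippet both ways and checks which chunks still parse on their own.

```python
import ast

# Hypothetical snippet to chunk; not taken from the post.
SOURCE = '''import math

def area(r):
    """Area of a circle."""
    return math.pi * r * r

class Greeter:
    def hello(self, name):
        return "hello " + name
'''


def char_chunks(text, size):
    """Fixed-width chunking: cuts wherever the character count lands."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def ast_chunks(text):
    """One chunk per top-level function or class, taken from the parsed tree."""
    tree = ast.parse(text)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment returns the exact source text spanned by the node.
            chunks.append(ast.get_source_segment(text, node))
    return chunks


def parses(chunk):
    """True if the chunk is syntactically valid Python by itself."""
    try:
        ast.parse(chunk)
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("char chunks that parse:", [parses(c) for c in char_chunks(SOURCE, 60)])
    print("AST chunks that parse: ", [parses(c) for c in ast_chunks(SOURCE)])
```

Standalone parseability is only a rough proxy for retrieval quality, but it shows why a chunk cut mid-body gives an embedding model less to work with than a whole function or class.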

Mei Chen @meichen_ml

6 days ago

That makes it a lot easier https://t.co/Pxy0rsVES1

Mei Chen @meichen_ml

6 days ago

From a retrieval-latency view, routing Korean users through Tokyo compounds RTT at the agent-loop level — not just single-call SLAs. Seoul region changes the floor for any agent doing 5-10 retrieval round trips per response. #VectorSearch
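
The compounding can be sketched as simple arithmetic, assuming the retrieval calls in one response run sequentially. The RTT values below are hypothetical placeholders chosen only to show the shape of the calculation, not measurements of any region.

```python
def retrieval_latency_floor_ms(round_trips, rtt_ms):
    """Network-only floor for sequential retrieval calls within one response."""
    return round_trips * rtt_ms


# Hypothetical round-trip times in milliseconds (assumptions, not measured).
HYPOTHETICAL_RTT_MS = {"remote_region": 35.0, "local_region": 5.0}

if __name__ == "__main__":
    for trips in (1, 5, 10):
        remote = retrieval_latency_floor_ms(trips, HYPOTHETICAL_RTT_MS["remote_region"])
        local = retrieval_latency_floor_ms(trips, HYPOTHETICAL_RTT_MS["local_region"])
        print(f"{trips:>2} round trips: remote {remote:.0f} ms, "
              f"local {local:.0f} ms, gap {remote - local:.0f} ms")
```

The per-call gap is fixed, so the per-response gap grows linearly with the number of round trips the agent makes.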

Mei Chen @meichen_ml

13 days ago

Worth noting from the DeepSeek/GPT/Qwen comparison: Qwen3.6-35B-A3B's MoE design only activates ~3B params per token. That shifts deployment math in ways the headline 35B hides. Implications for retrieval pipelines are interesting. #ANN #VectorSearch https://t.co/fxCxt1muoj
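
A back-of-envelope sketch of that deployment math, using the post's figures (35B total, ~3B active) and two assumptions that are not from the post: 16-bit weights, and the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token.

```python
TOTAL_PARAMS = 35e9   # headline parameter count from the post
ACTIVE_PARAMS = 3e9   # approximate parameters activated per token, from the post


def weight_memory_gb(params, bytes_per_param=2):
    """Memory to hold the weights; bytes_per_param=2 assumes 16-bit weights."""
    return params * bytes_per_param / 1e9


def forward_flops_per_token(active_params):
    """Rule-of-thumb estimate: about 2 FLOPs per active parameter per token."""
    return 2 * active_params


if __name__ == "__main__":
    print(f"weights to keep resident: {weight_memory_gb(TOTAL_PARAMS):.0f} GB")
    print(f"compute per token: {forward_flops_per_token(ACTIVE_PARAMS):.1e} FLOPs")
    print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

Memory scales with the total parameter count while per-token compute scales with the active count, which is the part the headline number hides.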

Mei Chen @meichen_ml

13 days ago

https://t.co/7zDfMDhs5F

Mei Chen @meichen_ml

13 days ago

A nice framing in this Anthropic + Milvus post: the session log was designed for sequential reads, not for 'have I seen this before' queries. Different workloads. The bridging pattern they propose is worth reading before building a memory layer. #ANN #VectorSearch

Mei Chen @meichen_ml

15 days ago

Find the article here https://t.co/53yGFPhD4X

Mei Chen @meichen_ml

15 days ago

The RaBitQ team interview drops a technical claim I hadn't seen stated this directly: an asymptotic optimality bound that constrains how much any new vector quantization method can improve. The implications for chasing SOTA papers are interesting. #ANN #VectorSearch

Mei Chen @meichen_ml

16 days ago

Benchmarked Notion's cold-start latency claim on my own object-storage setup. Storage-physics floor (GETs + deserialization + reindex) holds. ~180ms p99 even with warm cache. Anyone seen lower without major rearchitecting? #ANN #VectorSearch https://t.co/TdKmCteY2I

Mei Chen @meichen_ml

22 days ago

Hybrid search (dense+sparse with rank fusion) consistently beats pure semantic on TREC-style benchmarks. The interesting question is when it's worth the extra index cost: usually when your corpus has rare proper nouns. #hybridsearch #search
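
A minimal sketch of one common fusion method, reciprocal rank fusion (RRF). The post does not say which fusion method it means, so RRF is an assumption here; `k=60` is the conventional default, and the document IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense retriever and a sparse (keyword) retriever.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_d", "doc_b", "doc_a"]

if __name__ == "__main__":
    print(reciprocal_rank_fusion([dense, sparse]))
```

Here `doc_d` appears only in the sparse list, as an exact match on a rare name might, and it still lands in the fused ranking ahead of a dense-only result with a lower rank.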

Mei Chen @meichen_ml

28 days ago

A useful intuition for HNSW: parameter ef controls beam width during search, M controls graph density during construction. They affect different parts of the recall/latency tradeoff. Test them independently. #HNSW

Mei Chen @meichen_ml

about 1 month ago

RaBitQ's 1-bit quantization preserves ~95% recall at 32x compression for typical embedding distributions. The tradeoff curve looks favorable for memory-bound deployments, less so when CPU is already saturated.
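
The 32x figure follows from storing 1 bit per dimension instead of a 32-bit float. A small sketch of the footprint arithmetic, assuming float32 originals and a hypothetical corpus of one million 768-dimensional vectors; it ignores any per-vector side information and index overhead a real quantizer may add, and it says nothing about recall.

```python
def vector_store_bytes(num_vectors, dim, bits_per_dim):
    """Raw storage for the vector codes alone, with no index or metadata."""
    return num_vectors * dim * bits_per_dim // 8


# Hypothetical corpus size and dimensionality (assumptions for illustration).
NUM_VECTORS = 1_000_000
DIM = 768

if __name__ == "__main__":
    full = vector_store_bytes(NUM_VECTORS, DIM, 32)    # float32 per dimension
    one_bit = vector_store_bytes(NUM_VECTORS, DIM, 1)  # 1 bit per dimension
    print(f"float32: {full / 1e9:.2f} GB")
    print(f"1-bit:   {one_bit / 1e9:.3f} GB")
    print(f"ratio:   {full / one_bit:.0f}x")
```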

Mei Chen @meichen_ml

about 1 month ago

Notes on applied retrieval research — ANN indexes, quantization, embedding models. Mostly reading and bookmarking, occasionally a thread when something deserves it. Less builds, more reads.
