Jiang Chen

@jiangc1010

DevRel and Solutions @ Zilliz/Milvus; Prev: PM & TL @ Google Search Indexing

California, USA

Joined September 2016

105 Following

125 Followers

128 Posts

Jiang Chen @jiangc1010

6 months ago

@karanjagtiani04 @milvusio The operations in Milvus can be observed via https://t.co/wUHt76xrmN. The performance and health of the embedding service can be observed on the service provider’s dashboard.

Jiang Chen @jiangc1010

9 months ago

@hack_the_cloud Cool work! How much data volume is that designed for or tested on?

Jiang Chen @jiangc1010

10 months ago

Claude Context is open-sourced https://t.co/NoNGATaUTS

Jiang Chen @jiangc1010

10 months ago

I’m a fan of Claude Code, an exceptional AI coder created by @bcherny @_catwu. I wish its context could hold my entire codebase. However, fitting millions of lines of code into each call would burn enough tokens to bankrupt me. To solve this problem, we developed an MCP plugin that stores large codebases in a vector database and retrieves only the related sections to use as context. The result is Claude Context, a semantic search MCP plugin for Claude Code.

Here's how it works:
🔍 Semantic Code Search: Ask questions such as "find functions that handle user authentication" and retrieve code from functions like ValidateLoginCredential(), overcoming the limitations of keyword matching.
⚡ Incremental Indexing: Efficiently re-index only changed files using Merkle trees.
🧩 Intelligent Code Chunking: Parse code into Abstract Syntax Trees (ASTs) for chunking. Understand how different parts of your codebase relate.
🗄️ Scalable: Powered by @zilliz’s scalable vector search, it works for large codebases regardless of their size.

Thanks @claudeai @AnthropicAI for the inspiration. Claude Code + semantic search is a powerful duo. 👊
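
The incremental indexing idea can be sketched roughly as follows. This is a minimal illustration and not Claude Context's actual code: `file_hashes`, `merkle_root`, and `changed_files` are hypothetical names, and the tree is flattened to a single level (one root hash over per-file hashes) to keep the example short.

```python
import hashlib


def file_hashes(files: dict[str, str]) -> dict[str, str]:
    """Hash each file's content (the leaf level of the tree)."""
    return {path: hashlib.sha256(text.encode()).hexdigest()
            for path, text in files.items()}


def merkle_root(hashes: dict[str, str]) -> str:
    """Combine the leaf hashes, in sorted path order, into one root hash."""
    joined = "".join(f"{path}:{digest}" for path, digest in sorted(hashes.items()))
    return hashlib.sha256(joined.encode()).hexdigest()


def changed_files(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    if merkle_root(old) == merkle_root(new):
        return set()  # identical roots: nothing to re-index
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


# Hypothetical two-file codebase, before and after an edit to auth.py.
before = file_hashes({"auth.py": "def login(): ...", "db.py": "def connect(): ..."})
after = file_hashes({"auth.py": "def login(user): ...", "db.py": "def connect(): ..."})
print(changed_files(before, after))  # only auth.py needs re-indexing
```

A real Merkle tree would also hash directories, so unchanged subtrees can be skipped without visiting their files.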

Jiang Chen @jiangc1010

11 months ago

@badlogicgames We open-sourced one with a similar idea: https://t.co/Wdf4Wstej2 It indexes the codebase so Claude Code can search code via MCP.

Jiang Chen @jiangc1010

about 1 year ago

I’ve been asked by many Milvus users how BM25 works in a vector database. Well, there is no secret. If Elasticsearch can implement it with an inverted index, a vector database can do the same with sparse vectors. Here's how we did it:

🔑 Key Innovation:
Traditionally (I mean a year ago 😛), Milvus required users to pre-compute BM25 scores from term frequencies and corpus-level stats like average document length. However, these stats keep changing as more docs are ingested, so that wasn't really practical in the real world. Since version 2.5, Milvus can compute the BM25 score internally. Users just need to ingest raw text and let Milvus generate the sparse vector.

We also added a new BM25 metric for the sparse vector field and compute the BM25 score dynamically at query time. This hides all the complexity from users and keeps the search accurate. Moreover, Milvus can speed up BM25 search with optimizations unique to sparse vectors. In addition, Milvus doesn't carry the JVM overhead that Elasticsearch does. As a result, full-text search on Milvus can be 3-5x faster than on Elasticsearch.

Under the hood, Milvus stores raw TF (Term Frequency) values as the document vector and keeps stats like average document length as system-wide metadata. So the doc vectors don't need to be updated as the average document length changes over time. We call this Sparse-BM25.

How Does Sparse-BM25 Work in Milvus:
- During text ingestion, Milvus tokenizes the text, removes stop words, and stems the tokens.
- Milvus converts the tokenized text into sparse vectors and stores them in a sparse vector field. It also maintains the global term distribution statistics across the corpus that are required to compute IDF (Inverse Document Frequency) and avgdl (average document length).
- Milvus builds an index on the sparse vectors for efficient search.
- At search time, Milvus uses the global term distribution stats to compute the BM25 score dynamically and perform the ANN (Approximate Nearest Neighbor) search.

Want to learn more?
🔗 Here is the full blog post with all the details: https://t.co/yaUDkay3CI
Drop a comment or DM me if you have any questions.
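
To make the "store raw TF, score at query time" idea concrete, here is a small sketch. It is not Milvus code: the `SparseBM25` class and its methods are hypothetical, tokenization is a plain whitespace split (no stop-word removal or stemming), the search is exhaustive rather than ANN, and `k1`/`b` use commonly cited defaults.

```python
import math
from collections import Counter


class SparseBM25:
    """Keeps raw term frequencies per doc and global stats as separate metadata."""

    def __init__(self, k1: float = 1.2, b: float = 0.75):
        self.k1, self.b = k1, b
        self.docs: list[Counter] = []        # one raw-TF sparse vector per document
        self.doc_freq: Counter = Counter()   # number of docs containing each term
        self.total_len = 0                   # sum of doc lengths, used for avgdl

    def add(self, text: str) -> None:
        tf = Counter(text.lower().split())
        self.docs.append(tf)                 # stored once, never rewritten
        self.doc_freq.update(tf.keys())
        self.total_len += sum(tf.values())

    def score(self, query: str, doc_id: int) -> float:
        n = len(self.docs)
        avgdl = self.total_len / n           # read from global stats at query time
        tf = self.docs[doc_id]
        dl = sum(tf.values())
        total = 0.0
        for term in query.lower().split():
            f = tf.get(term, 0)
            if f == 0:
                continue
            df = self.doc_freq[term]
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = f + self.k1 * (1 - self.b + self.b * dl / avgdl)
            total += idf * f * (self.k1 + 1) / norm
        return total

    def search(self, query: str, top_k: int = 3) -> list[int]:
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        return [i for s, i in sorted(scored, reverse=True)[:top_k] if s > 0]


index = SparseBM25()
index.add("milvus vector database")
index.add("elasticsearch inverted index")
stored_before = dict(index.docs[0])
index.add("vector search with sparse vector")   # corpus stats change here
print(index.search("inverted index"))
```

Because IDF and avgdl are applied only inside `score`, ingesting the third document changes the statistics without touching the vectors already stored for the first two.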

Jiang Chen @jiangc1010

about 1 year ago

I think 2025 will be the year of multi-agent 🤖🤝🤖

Google's new Agent-to-Agent (A2A) protocol tackles a critical challenge in agent systems: enabling multiple AI agents to work as a team. It lets agents communicate, assign tasks, and synchronize information with each other. If MCP is the USB protocol between agents and peripheral devices, A2A is the HTTP protocol for agents to collaborate with each other, like a service mesh.

The A2A protocol has several key features:
🌍 Capability Discovery: Just as microservices need service discovery, so does an agent network. With A2A, agents can "show their capabilities" through JSON-formatted "Agent Cards" so that client agents can select the best remote agent to complete a task.
🛠️ Structured Task Lifecycle: Tasks are treated as entities with defined states such as pending, running, completed, or failed. This structure allows for clear tracking and management of tasks throughout their execution.
🛜 Asynchronous Communication: Agents can handle long-running operations by communicating asynchronously, so tasks can progress without constant real-time interaction.
🧑‍🤝‍🧑 Collaboration and Error Handling: The protocol supports collaborative task execution, where agents can seek clarification, request additional information, or handle errors through specialized recovery agents, enhancing resilience in task management.

To see how this works in action, let's imagine implementing robot recruiters with multiple agents:
1️⃣ The recruitment manager instructs the agent to search for candidates based on job descriptions, locations, skills, etc.
2️⃣ The agent collaborates with other specialized recruitment agents, integrating with platforms like LinkedIn or internal HR systems, to summarize candidate suggestions using A2A.
3️⃣ After the manager reviews the suggestions, the agent can arrange interviews or engage another agent to conduct background checks.

Early prototypes of agents like Devin (AI software engineer) and Manus (general AI agent) are just a start. What will really unlock the potential of agents is broad adoption of tool use (MCP) and cross-agent collaboration (A2A).
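
The capability discovery and task lifecycle described above can be sketched like this. It is a toy model only, not the A2A specification or SDK: the card fields (`name`, `skills`), the `pick_agent` helper, and the transition table are assumptions, and the state names are simply the four mentioned in the post.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class TaskState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition rules: completed and failed are terminal states.
ALLOWED = {
    TaskState.PENDING: {TaskState.RUNNING, TaskState.FAILED},
    TaskState.RUNNING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


@dataclass
class Task:
    description: str
    state: TaskState = TaskState.PENDING
    history: list = field(default_factory=lambda: [TaskState.PENDING])

    def advance(self, new_state: TaskState) -> None:
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state
        self.history.append(new_state)


def pick_agent(cards: list, skill: str) -> Optional[dict]:
    """Capability discovery: return the first card advertising the skill."""
    return next((card for card in cards if skill in card["skills"]), None)


# Hypothetical agent cards for the recruiting example.
cards = [
    {"name": "sourcer", "skills": ["candidate-search"]},
    {"name": "checker", "skills": ["background-check"]},
]
agent = pick_agent(cards, "background-check")
task = Task("Run background check for a shortlisted candidate")
task.advance(TaskState.RUNNING)
task.advance(TaskState.COMPLETED)
print(agent["name"], [s.value for s in task.history])
```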

jiangc1010 retweeted

Milvus

@milvusio

about 1 year ago

🚀 How to boost RAG throughput by 20x using LlamaIndex & Milvus Async API?

Async processing unlocks parallelism in your RAG pipeline. Compared to synchronous scheduling, the async API can achieve significant performance gains for doc ingestion and search query serving at high throughput. You can build an async RAG pipeline with @LlamaIndex and Milvus to fully utilize the hardware's potential, e.g.
📂 Ingest 1,000 doc chunks:
- Sync: 62.91s → Async: 3.22s (19.5x faster)
🔍 Serve 1,000 search queries:
- Sync: 308.80s → Async: 8.81s (35x faster)

Why is that possible? Asynchronous processing avoids being blocked by a serialized stream of requests. For LlamaIndex, by simply setting "use_async=True" in your VectorStoreIndex with Milvus, you can get a 10-50x improvement in throughput and query latency relative to serial scheduling.

In addition, by leveraging local embedding models like BGE or E5 and high-performance inference frameworks like vLLM or NVIDIA Triton, this architecture eliminates bottlenecks like OpenAI’s API rate limits. You can consider this reference architecture for mission-critical search applications such as real-time enterprise RAG and high-volume financial document processing.

🔧 Tutorial: https://t.co/PMug8kxEnu

Discussion: What’s your biggest RAG performance challenge? Share your use case below 👇
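
Why async helps can be shown with a small standard-library sketch. It does not call LlamaIndex or Milvus: `fake_search` is a hypothetical stand-in that only sleeps to simulate network latency, so the timings say nothing about the benchmark numbers in the post.

```python
import asyncio
import time


async def fake_search(query: str, latency: float = 0.05) -> str:
    """Stand-in for an I/O-bound search call (assumed 50 ms round trip)."""
    await asyncio.sleep(latency)
    return f"results for {query}"


async def run_serial(queries: list) -> list:
    """Await each request before sending the next one."""
    return [await fake_search(q) for q in queries]


async def run_concurrent(queries: list) -> list:
    """Send all requests at once and wait for them together."""
    return await asyncio.gather(*(fake_search(q) for q in queries))


def timed(coro_fn, queries: list):
    """Run a coroutine function to completion and measure wall-clock time."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(queries))
    return results, time.perf_counter() - start


queries = [f"q{i}" for i in range(20)]
serial_results, serial_s = timed(run_serial, queries)
concurrent_results, concurrent_s = timed(run_concurrent, queries)
print(f"serial: {serial_s:.2f}s, concurrent: {concurrent_s:.2f}s")
```

The serial run pays the latency once per query, while the concurrent run overlaps the waits, which is the effect the sync vs. async comparison relies on.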

Jiang Chen @jiangc1010

about 1 year ago

If you're struggling with RAG quality, here are a few of the most popular advanced RAG techniques 💬

1. Agentic RAG / Deep Research
The core idea of Deep Research and agentic RAG is self-reflection plus query routing. This strategy improves the overall process by reasoning about user queries to understand complex intentions and break them down into sub-queries. (e.g. "Differences between Milvus vs Zilliz Cloud?" → Sub-Query 1: "Milvus features" | Sub-Query 2: "Zilliz Cloud features") 🛠️ https://t.co/7q9bZhniJv

2. Hybrid Search
Combining semantic search and full-text search captures both contextual semantics and special terms for more comprehensive results. This approach requires a vector database like @milvus-io that supports both dense vector embeddings and BM25 scoring with sparse vectors. 🛠️ https://t.co/Uphp5ClbSl

3. Contextual Retrieval
To address the semantic fragmentation caused by chunking, Contextual Retrieval enriches each document chunk with relevant context before embedding. Leveraging a KV cache reduces redundant computation, making the approach more cost-efficient. 🛠️ https://t.co/XCZfBSwrAW

4. Graph RAG
RAG enhanced with Knowledge Graphs (KGs) handles complex entity relationships and multi-hop questions effectively. By representing entities and their interrelations through knowledge engineering, Graph RAG enhances traditional semantic search, leading to a better understanding of intricate connections within text corpora. 🛠️ https://t.co/XrukA0IVKn

A more detailed report 👉 https://t.co/zFz6AH9CW2
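
For the hybrid search item above, one common way to merge a dense (semantic) ranking with a sparse (BM25) ranking is Reciprocal Rank Fusion. The sketch below assumes that technique; the post does not say which fusion method the linked tutorial uses, and `rrf_fuse`, the document IDs, and `k=60` are illustrative choices.

```python
def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Fuse ranked lists: each doc scores the sum of 1 / (k + rank) over lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for one query from the two retrievers.
dense = ["doc_a", "doc_b", "doc_c"]    # semantic (dense vector) ranking
sparse = ["doc_c", "doc_a", "doc_d"]   # full-text (BM25) ranking
fused = rrf_fuse([dense, sparse])
print(fused)
```

Documents that rank well in both lists (here `doc_a` and `doc_c`) rise above those found by only one retriever, which is the "both contextual semantics and special terms" effect described above.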

Jiang Chen @jiangc1010

over 1 year ago

A new SOTA embedding model for Multimodal Search: 🌟 BGE has just released the BGE-VL-MLLM series.

What is Multimodal Search: Also known as Composed Image Retrieval (CIR), it accepts an image-text pair as input and uses the text to refine the search intent expressed by the image. For example, a photo of shoes with a special design plus the text "on the beach" finds images of designer shoes placed on the beach. In the https://t.co/vgAZkRVM6T live demo, we showed the futuristic search experience of image + text retrieval.

The recently released BGE-VL-MLLM-S1 is trained with a technique from the paper MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval, published by @BAAIBeijing. Similar to @GoogleDeepMind's MagicLens paper, it uses large-scale data mining and synthetic data generation with a VLM to construct a massive dataset of image pairs and text describing the relation between each pair. It then trains the multimodal model on that dataset with contrastive learning. BGE-VL-MLLM-S2 is further fine-tuned on the MMEB benchmark training set, achieving even better retrieval quality. The models are available on @huggingface: https://t.co/OB6FvzeSPc.

🧑‍💻 See the attached code snippets for how to use it with a vector db for search. Paired with a performant vector db like @milvusio, the new models can unlock more multimodal search use cases, such as data exploration and mining on massive-scale visual content.

🔮 We are excited to add this to the upcoming vector lake feature in the Milvus 3.0 release. If you have such a use case, for example multimodal retrieval on billions of items with a serving-grade sub-100ms latency target or a higher latency tolerance for interactive exploration, please reach out to me!

Jiang Chen @jiangc1010

over 1 year ago

@milvusio Learn more details of the roadmap 👉 https://t.co/GieHIojUaD

Jiang Chen @jiangc1010

over 1 year ago

I'm really impressed by @manusai. It's close to how I imagine an AI agent in action, beyond passive suggestions. No riding shotgun; take the driver's seat, please.

I like the idea of it: From "Idea" 💡 to "Result" ⚙️
End-to-end automation and sophisticated tool use that can handle:
PPT Creation 📊｜Stock Analysis 📈｜Property Search 🏠｜Travel Planning 🌍｜Contract Review ⚖️｜Audio Production 🎧
This feels like a true "agent": automating workflows from idea → execution → delivery to close complex tasks in one click.

💡 The core tech under the hood: A "Triple-Engine" Architecture
▫️ Smart Model × Tool Integration × Task Orchestration
▫️ Cloud-based async execution for long-chain tasks
▫️ Memory × Knowledge Base
Historical data + real-time learning = AI evolves from "responder" to "collaborator"!

🔮 Next-Level: What if it's equipped with a highly scalable vector db like @milvusio?
→ Infinite long-term memory, always context-aware
→ Knowledge base for massive proprietary data
→ Multimodal retrieval, not limited to text

I think when AI stops being a "suggestion box" and becomes a true results-maker, the future is here 🤖

Jiang Chen @jiangc1010

over 1 year ago

@jc_stack @langchain That’s possible. We did follow-up work to open-source an implementation that can plug into most LLMs: https://t.co/MT9N4vS2mu

Jiang Chen @jiangc1010

over 1 year ago

Cool perspective! Are you interested in adding https://t.co/asYjLtcuL1 to the comparison? Feel free to DM me; I’m happy to help with the setup! I would also love to hear your feedback on the user experience of Milvus. We also open-sourced a benchmark for more production-level test cases (searching during ingestion, up to billion-vector scale, taking machine cost into account, etc.). Maybe you will find it useful: https://t.co/9wamUF0QKK

Jiang Chen @jiangc1010

over 1 year ago

2024 was a transformative year for Information Retrieval! 🧠 We’ve witnessed the production-scale adoption of RAG and breakthroughs in Graph RAG, multi-modality, ColBERT/ColPali, and Text2SQL—redefining search infrastructure, data discovery, and knowledge synthesis. Deep learning-powered IR now seamlessly integrates LLMs, hybrid search, rerankers, and structured knowledge tools like Knowledge Graphs, making retrieval more precise and scalable than ever. This is a new Age of Discovery for humanity, as groundbreaking as Columbus reaching the New World over 500 years ago. I’m thrilled to be part of this exploration! With Milvus 3.0, we are pushing the boundaries of vector databases—evolving from a search-serving infrastructure to a comprehensive unstructured data platform. And with 2025 on the horizon, the next wave of innovation is just beginning. 🔗 Read more about my reflection on IR in 2024: https://t.co/sso9YmFTWe #AI #InformationRetrieval #RAG #ColBERT #Text2SQL #LLMs #DeepLearning #VectorDatabases

Jiang Chen @jiangc1010

over 1 year ago

🚀 Deep Research is impressive! So we built one using open-source tools: @langchain + @milvusio + @deepseek_ai. It’s amazing how far open source can take you these days! But there’s more—this agent isn’t limited to the public web. It can search private data in your vector DB, deployed on-prem or in your cloud VPC (see Zilliz Cloud BYOC https://t.co/YiUT76hOyD)—all while keeping your data secure. Open source gives you the flexibility to shape this design pattern however you need. Check out the sample report from a demo research agent by @stefan_webb! 🔥 #AI #VectorDB #OpenSource

LangChain

@LangChain

over 1 year ago

⭐️"I Built a Deep Research with Open Source—and So Can You!" You’ve probably heard about OpenAI’s latest release, Deep Research—designed for more detailed, informed, and nuanced responses. But how does the underlying technology actually work? Our partner, Milvus, has a new post that breaks it down, showing how to build a research assistant similar to Deep Research using open-source tools. It combines LangChain for RAG, Milvus as the vector database, and DeepSeek R1 as the reasoning model. The post explores key concepts behind agents and reasoning models—tool usage, memory, structured output, and planning—giving you a hands-on look at how this powerful capability can be unlocked. Check it out: https://t.co/xdUHO3RYyo

Jiang Chen @jiangc1010

over 1 year ago

Great question! The Wikipedia data was just used as an example. You can put any data in a vector db, like sensitive financial reports for enterprises. https://t.co/asYjLtcuL1 is open source; you can self-host it or use the fully managed version on Zilliz Cloud with a free trial: https://t.co/4T416iHNm7

jiangc1010 retweeted

Milvus

@milvusio

over 1 year ago

🎉 Announcing Milvus 2.5: Smarter Search, Better Performance See the latest additions 👇

jiangc1010 retweeted

Jiang Chen @jiangc1010

over 1 year ago

While I strongly believe in upholding academic integrity, it’s equally important to fight against racial discrimination and bias. Associating a lack of integrity with a specific nationality or racial group is a clear discriminatory act. This should not be overlooked just because it’s disguised as a call for moral standards. When attending graduate school in the U.S., I’ve also heard claims like ‘Chinese, Indian, or Asian students bend the rules for better grades,’ which unfairly targets certain groups. The truth is, academic dishonesty exists across all racial groups. It’s a universal issue that the academic community must address together. I hope NeurIPS and the speaker genuinely acknowledge this issue and take meaningful steps to prevent racial discrimination against any group, rather than masking it under the guise of “cultural generalization”.
