David Chou

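A RAG video from the LangChain team that is very much worth watching: once an LLM's context is long enough, is RAG no longer needed?

RAG for long-context LLMs: this is a talk @rlancemartin recently gave at several meetups on using RAG in the era of long-context LLMs. With context windows growing past 1 million tokens, many people have questioned whether RAG is obsolete. We draw on several recent projects to examine the question. We discuss the current limitations of long-context LLMs in fact reasoning and retrieval (using multi-needle analysis), and also how larger context windows may change the way RAG is applied, such as document-centric indexing and improvements to the RAG flow.

Slides: [details](https://t.co/9R200tHtVq)

Key references:
1/ Multi-needle analysis, with @GregKamradt [read more](https://t.co/qKawZwFmtY)
2/ RAPTOR, by @parthsarthi03 and others [project page](https://t.co/NtpXg1ieDG) [video](https://t.co/Ncu8ltjKLa)
3/ Dense-X / multi-representation indexing, by @tomchen0 and others [paper](https://t.co/RmC25jQgr0) [blog](https://t.co/uP6II4AtlM)
4/ Long-context embeddings, by @JonSaadFalcon, @realDanFu, @simran_s_arora [overview](https://t.co/ywr8FrvjfP) [tutorial](https://t.co/a74Obw78xv)
5/ Self-RAG (@AkariAsai et al.) and C-RAG (Shi-Qi Yan et al.) [paper 1](https://t.co/OuaLCdh2D2) [paper 2](https://t.co/2pUK1jy0rv) [updates](https://t.co/N4t89gcDww)

0:20 - Context windows are getting longer
2:10 - The multi-needle challenge
9:30 - How RAG may change
12:00 - Query analysis
13:07 - Document-centric indexing
16:23 - Self-reflective RAG
19:40 - Summary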

Love programming and cloud infrastructure. #kubernetes #sdn #golang Workout addiction #freediving #cook #powerlift

David Chou @david74chou

about 2 years ago

@kakashiliu House 🏠

David Chou @david74chou

over 2 years ago

Get your GopherDays submissions in~

Evan Lin@LINE DevRel

@Evan_Lin

over 2 years ago

#GopherDay TW 2024! #Golang Submission link: https://t.co/6Xe2QQZnk8 1. Submission deadline: March 24, 2024 2. Conference date: May 25, 2024 Location: Institute for Information Industry Living Lab+ (4F., No.133, Sec. 4 Minsheng E. Rd., Songshan District, Taipei City 105, TW)

david74chou retweeted

Jim Fan

@DrJimFan

over 2 years ago

If you think OpenAI Sora is a creative toy like DALLE, ... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths.

I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!

Let's breakdown the following video. Prompt: "Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee."

- The simulator instantiates two exquisite 3D assets: pirate ships with different decorations. Sora has to solve text-to-3D implicitly in its latent space.
- The 3D objects are consistently animated as they sail and avoid each other's paths.
- Fluid dynamics of the coffee, even the foams that form around the ships. Fluid simulation is an entire sub-field of computer graphics, which traditionally requires very complex algorithms and equations.
- Photorealism, almost like rendering with raytracing.
- The simulator takes into account the small size of the cup compared to oceans, and applies tilt-shift photography to give a "minuscule" vibe.
- The semantics of the scene does not exist in the real world, but the engine still implements the correct physical rules that we expect.

Next up: add more modalities and conditioning, then we have a full data-driven UE that will replace all the hand-engineered graphics pipelines. https://t.co/7BikSgE7iN

david74chou retweeted

Jeff Dean

@JeffDean

over 2 years ago

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length

Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long context capabilities, supporting millions of tokens of multimodal input. The multimodal capabilities of the model means you can interact in sophisticated ways with entire books, very long document collections, codebases of hundreds of thousands of lines across hundreds of files, full movies, entire podcast series, and more.

Gemini 1.5 was built by an amazing team of people from @GoogleDeepMind, @GoogleResearch, and elsewhere at @Google. @OriolVinyals (my co-technical lead for the project) and I are incredibly proud of the whole team, and we’re so excited to be sharing this work and what long context and in-context learning can mean for you today!

There’s lots of material about this, some of which are linked to below.

Main blog post:
https://t.co/QAsDKXBdao

Technical report:
“Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context”
https://t.co/CTzTHNDCdo

Videos of interactions with the model that highlight its long context abilities:
Understanding the three.js codebase: https://t.co/yq7d6OSD6c
Analyzing a 45 minute Buster Keaton movie: https://t.co/adyMgDYHoK
Apollo 11 transcript interaction: https://t.co/Pqvq3Eac1R

Starting today, we’re offering a limited preview of 1.5 Pro to developers and enterprise customers via AI Studio and Vertex AI. Read more about this on these blogs:
Google for Developers blog:
https://t.co/x73Vun0kVS
Google Cloud blog:
https://t.co/OlaTW6PYGn

We’ll also introduce 1.5 Pro with a standard 128,000 token context window when the model is ready for a wider release. Coming soon, we plan to introduce pricing tiers that start at the standard 128,000 context window and scale up to 1 million tokens, as we improve the model.

Early testers can try the 1 million token context window at no cost during the testing period. We’re excited to see what developer’s creativity unlocks with a very long context window.

Let me walk you through the capabilities of the model and what I’m excited about!

David Chou @david74chou

over 2 years ago

@kakashiliu So there is now a compiler like Cosmopolitan that really lets you build once and run on all kinds of platforms

david74chou retweeted

Browny

@brownylin

almost 3 years ago

#耐讀 (worth rereading) "This completely matches what I have been realizing lately from leading a team 🤣": How to do Nothing - https://t.co/ZUJMWQ2iC0

david74chou retweeted

Mengxin Liu

@liumengxinfly

almost 3 years ago

Big TCP looks like a very promising performance optimization https://t.co/u1FjTOfnXE

david74chou retweeted

Simon Willison

@simonw

about 3 years ago

@mitchellh One of the fun things about ChatGPT plugins is that the prompts for all of those are public - I collected a few of them here: https://t.co/svPtPPl1rT

david74chou retweeted

马东锡 NLP

@dongxi_nlp

over 3 years ago

1/ After OpenAI released plugins, I belatedly realized how important the Toolformer paper is, so I reread it and am writing up some of my takeaways as a thread.

Other Twitter users such as @Tisoga have written similar summaries that explain, in accessible terms, what Toolformer can do.

This thread takes a language modeling perspective and explains in plain terms how Toolformer learns to use tools.

🧵 https://t.co/g8HIAh07gZ

david74chou retweeted

Jiayuan (JY) Zhang

@jiayuan_jy

over 3 years ago

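OpenAI just released GPT-4. GPT-4 is a large multimodal model that accepts image and text input and produces text output. This thread gathers some information about GPT-4 (including key points from the paper and hands-on experience). 🧵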

David Chou @david74chou

over 3 years ago

@GTB_moonshadow Huh, that's rough QQ

David Chou @david74chou

over 3 years ago

The Inference Cost Of Search Disruption – Large Language Model Cost Analysis, by @dylan522p https://t.co/ucEvG6eutY

david74chou retweeted

Hao Chen @haoel

over 3 years ago

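Quite a few people have asked me over the past few days what I think of ChatGPT, and since I had time this weekend, I wrote this article about it. It mainly discusses what kind of chemistry might result from marrying ChatGPT, which is built on "content-generation patterns" rather than on correct and valuable content, with search engines, which are built on correct and valuable content but do not generate content. Discussion welcome... https://t.co/wzBtfgT0me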

david74chou retweeted

Pyroscope @PyroscopeIO

over 3 years ago

In #Golang 1.20 the Go team introduced an experimental new method of memory management called Go arenas. In this blog post we show how we combined continuous profiling with memory arenas to improve performance of one of our cloud services by ~8% ! https://t.co/0diYIUH9hc

david74chou retweeted