🚀Announcing LangGraph Studio: The first agent IDE
LangGraph Studio offers a new way to develop LLM applications by providing a specialized agent IDE that enables visualization, interaction, and debugging of complex agentic applications
With visual graphs and the ability to edit state, you can better understand agent workflows and iterate faster. LangGraph Studio integrates with LangSmith so you can collaborate with teammates to debug failure modes
LangGraph Studio is available for free to all LangSmith users on any plan tier during its early development.
Read more about it here: https://t.co/VKnJkOpusv
Watch a YouTube walkthrough: https://t.co/1qruVoJXe9
Try out LangGraph Studio for free here: https://t.co/3pwnQLkDTd
Sign up for a LangSmith account: https://t.co/NZbDhzp8xX
🚪UX for Agents, Part 2: Ambient
One of the UXs for agents that we are most interested in is ambient agents
This in theory allows us to have teams of agents helping us without needing to kick them off... but it can be a tricky UX to nail.
Thoughts here: https://t.co/sq63jprjen
🚀 It's finally here! We've just launched https://t.co/CT4KHXLP87 Fuzer v0.1 on @huggingface Space via @Gradio! 🎉
We're tackling a major challenge in image generation AI & ControlNet: achieving image consistency while preserving text labels and maintaining control of input shape.
Check it out and let us know what you think! Can't wait to see what y'all will create!! #AI #MachineLearning #ControlNet #ImageGeneration 👉
https://t.co/kHx15zdLC5
I've been saying that the future of document RAG is multimodal indexing/retrieval, and ColPali by @ManuelFaysse is a big step in that direction.
⚠️ No more text-extraction
✅ Use a fine-tuned vision model with a late interaction mechanism
✅ The model is small and outperforms text-based and text-image embedding methods.
We're doing a webinar with the first author @ManuelFaysse and our very own @disiok this Friday at 9am PT. Check it out: https://t.co/LNQMtaJ9mv
ColPali Paper https://t.co/s9ma6wcCAg
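For readers unfamiliar with the "late interaction mechanism" mentioned above, here is a minimal sketch of ColBERT-style MaxSim scoring, the general technique that name usually refers to. It is not ColPali's actual code: the function names, the toy 2-D vectors, and the page names are all assumptions made for illustration. The idea is that each query vector is matched against its best-scoring vector on the page, and those per-vector maxima are summed.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_vecs, doc_vecs):
    """MaxSim: for each query vector, keep its best match among the
    document vectors, then sum those maxima over the query."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def rank_pages(query_vecs, pages):
    """Rank pages (hypothetical name -> list of patch vectors) by MaxSim score."""
    scored = [
        (name, late_interaction_score(query_vecs, vecs))
        for name, vecs in pages.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: in a real system these would be model-produced embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "page_b": [[1.0, 0.0], [1.0, 0.1]],
}
ranking = rank_pages(query, pages)
```

Because every query vector gets its own best match, a page only scores well if it covers all parts of the query, which is the property that distinguishes late interaction from comparing a single pooled embedding per page.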
🧠Planning for agents
One of the big limitations for agents working reliably is their ability to plan. I talk about that in a new blog post. Covers:
- What is planning?
- How to improve planning?
- Generic vs custom cognitive architectures
and more!
https://t.co/GfVotZxFUL
Leju Robotics just launched a full-size humanoid robot called Kuafu
The bot is powered by Huawei's multimodal LLM PanGu to allow it to understand natural language, plan tasks, and execute with bimanual coordination
Agility Robotics just signed a multi-year deal with GXO Logistics to bring the company’s humanoid robots to warehouses
The agreement is the first Robots-as-a-Service (RaaS) deal and ‘formal commercial deployment’ of the humanoids
Are subscription-based robots the future?
🤖"In the loop"
Everyone is trying to figure out how to best build agentic applications
I've decided to start sharing some of my thoughts and learnings on how to do so in a new blog series (titled "In the loop")
First one: what is an agent?
https://t.co/KCwYmzSd5Z
🧬🧬 LLM Generated UIs
We've added a series of templates and documentation showing off how to build generative UI applications using LangChain JS/TS & Next.js. These templates include:
- 🌆 generative UI in Next.js
- 🤖 streaming agent events
- 🛠️ streaming tool calls
and more!
Check out our Next.js template repo for these examples and extensive docs: https://t.co/V1VNzd6JoC
Building always-on, business-critical AI applications or agents on a constantly updating and growing volume of unstructured data requires resilient and fast data infrastructure.
I am super excited to finally announce @tensorlake's open-source, real-time data framework, Indexify.
Real-time processing: Optimized for tasks like summarization, extraction, embedding, and parsing, Indexify works well with frequently updated data. It can ingest any data modality at scale, with incremental updates that don't require re-processing entire documents.
Reliability, Multi-Cloud and Hardware Acceleration: Indexify reliably processes data even during transient infrastructure failures, ensuring high availability. Extracted data is automatically stored in storage systems. Pipelines can run on GPUs, CPUs, and across multiple clouds for flexibility and resilience.
Observability: Fully observable, Indexify allows you to identify bottlenecks in extraction pipelines and retrieval APIs for semantic searches and SQL queries.
Indexify has been tested on AWS with hundreds of thousands of documents and images to ensure production-readiness.
It comes with retrieval APIs for RAG applications, autonomous agents or any AI application. It's fully extensible, allowing you to bring any model into pipelines.
Blog Post: https://t.co/wG6ofubaC5
GitHub: https://t.co/as17Pi2UM2
Website: https://t.co/rQLevPKGeE
Discord Community: https://t.co/mofdmarOZj
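To make the "incremental updates that don't require re-processing entire documents" idea concrete, here is one common way such behavior can be achieved: hashing each chunk and only running extraction on chunks whose hash has not been seen before. This is a sketch under stated assumptions, not Indexify's API or implementation; `IncrementalIndex`, `update`, and the `extract` callback are hypothetical names.

```python
import hashlib


class IncrementalIndex:
    """Hypothetical helper: caches extraction results per chunk hash so an
    updated document only pays for the chunks that actually changed."""

    def __init__(self, extract):
        self._extract = extract  # any callable: chunk text -> extracted value
        self._cache = {}         # doc_id -> {chunk_hash: extracted value}

    def update(self, doc_id, chunks):
        """Re-index a document; return how many chunks were newly processed."""
        previous = self._cache.get(doc_id, {})
        current = {}
        processed = 0
        for chunk in chunks:
            digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
            if digest in previous:
                # Unchanged chunk: reuse the earlier result.
                current[digest] = previous[digest]
            elif digest not in current:
                # New or edited chunk: run the (possibly expensive) extractor.
                current[digest] = self._extract(chunk)
                processed += 1
        # Chunks that disappeared from the document are dropped here.
        self._cache[doc_id] = current
        return processed


index = IncrementalIndex(extract=str.upper)  # stand-in for a real extractor
first = index.update("doc-1", ["intro", "body"])
second = index.update("doc-1", ["intro", "body", "appendix"])
third = index.update("doc-1", ["intro", "appendix"])
```

In this toy run the first call processes both chunks, the second processes only the added chunk, and the third processes nothing because every remaining chunk is already cached.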
🚀Cohere Toolkit
There are a lot of toolkits for getting started with building RAG apps, but this may be one of the best:
🖌️Beautiful UI
🧑🏫Uses all of @cohere's advanced RAG features
🚀Built by the Cohere team themselves (they know what they're doing!)
https://t.co/tPa3mhDe5l
🧮Regression testing your LLM applications
We've made big improvements to the "regression testing" flow in LangSmith to allow you to quickly spot changes in your application's performance:
🟥Automatically highlighting changes
⛪️Filter to changed datapoints
📄Different ways of viewing the outputs
🔎Sidepanel view
To go from prototype to production, you need to experiment with lots of prompts, models, retrieval strategies, cognitive architectures, and more
Having the tools to explore your data quickly and efficiently is paramount to that
Check out the blog post here: https://t.co/tBTVdxo8GQ
And YouTube Walkthrough here: https://t.co/Mekz2yAKtz
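As a rough illustration of what "highlighting changes" and "filter to changed datapoints" mean in a regression test, the sketch below compares per-example scores from two experiment runs and keeps only the examples whose score moved. It does not use the LangSmith SDK; `find_changes`, the score dictionaries, and the example IDs are assumptions made up for this example.

```python
def find_changes(baseline, candidate, tolerance=0.0):
    """Compare per-example scores from two runs and return only the
    examples whose score changed by more than `tolerance`."""
    changes = []
    # Only examples present in both runs can be compared.
    for example_id in sorted(baseline.keys() & candidate.keys()):
        delta = candidate[example_id] - baseline[example_id]
        if abs(delta) > tolerance:
            changes.append(
                {
                    "id": example_id,
                    "delta": delta,
                    "status": "improved" if delta > 0 else "regressed",
                }
            )
    return changes


# Hypothetical correctness scores (1.0 = correct) for the same dataset
# evaluated with an old prompt and a new prompt.
baseline_scores = {"q1": 1.0, "q2": 0.0, "q3": 1.0}
candidate_scores = {"q1": 1.0, "q2": 1.0, "q3": 0.0}

changed = find_changes(baseline_scores, candidate_scores)
```

Here `q1` is filtered out because it did not change, while `q2` is reported as improved and `q3` as regressed, which is the view you would want before deciding whether the new prompt is a net win.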
⚙️How Dosu used LangSmith to achieve a 30% accuracy improvement with no prompt engineering
One of the goals of LangSmith is to help teams set up a data flywheel. By capturing LLM outputs alongside user feedback, we can help developers automatically use that feedback to improve their application
That's exactly what @devstein64 and the @dosu_ai team did. They have a GitHub bot that auto-labels your issues and PRs (among other features).
We sat down with them to understand how they used logging, datasets, and our new automations feature to automatically improve their application by 30% purely by capturing user feedback. We then set up a cookbook so ANY LangSmith user can replicate what they did.
Blog: https://t.co/aWNs0NOvk7
YouTube: https://t.co/zoBqhuTc3L
Cookbook: https://t.co/iHHQTsrhdM
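One plausible shape for a feedback-driven flywheel like the one described above is to keep the logged runs that users approved and feed them back into the prompt as few-shot examples. The sketch below is only an illustration of that general pattern: it is not the Dosu pipeline or the LangSmith API, and the run format, `feedback_score` field, and function names are all hypothetical.

```python
def build_few_shot_examples(runs, min_score=1.0, limit=5):
    """Keep the most recent runs whose user feedback meets the threshold."""
    accepted = [
        run
        for run in runs
        if run.get("feedback_score") is not None
        and run["feedback_score"] >= min_score
    ]
    return [
        {"input": run["input"], "output": run["output"]}
        for run in accepted[-limit:]
    ]


def format_prompt(examples, new_input):
    """Render approved examples ahead of the new input, few-shot style."""
    lines = []
    for example in examples:
        lines.append(f"Input: {example['input']}")
        lines.append(f"Label: {example['output']}")
    lines.append(f"Input: {new_input}")
    lines.append("Label:")
    return "\n".join(lines)


# Hypothetical logged runs for an issue-labeling bot.
logged_runs = [
    {"input": "App crashes on start", "output": "bug", "feedback_score": 1.0},
    {"input": "Add dark mode", "output": "bug", "feedback_score": 0.0},
    {"input": "Typo in README", "output": "docs", "feedback_score": None},
    {"input": "Support CSV export", "output": "enhancement", "feedback_score": 1.0},
]

examples = build_few_shot_examples(logged_runs)
prompt = format_prompt(examples, "Login page returns 500")
```

The prompt text itself never changes by hand; only the pool of approved examples grows as feedback arrives, which is one way to read "no prompt engineering".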
Ukraine’s Ministry of Foreign Affairs just introduced the world's first AI diplomat, ‘Victoriya Shi’.
The new spokesperson is an AI-generated avatar that will deliver pre-prepared official statements on behalf of the ministry.
To improve RAG, you generally need to do two things: 1️⃣ improve retrieval, and 2️⃣ improve generation.
@_nerdai_ has a fantastic cheatsheet below showing you all the strategies for optimizing each independently, or optimizing them jointly 👇
To optimize retrieval, you can tune metadata/chunk sizes, do sliding windows, do mixed retrieval, add knowledge graphs, and more.
To optimize generation, you can rerank the retrieved results, compress context, and optimize the prompt.
To do both, you can interleave the two in a multi-hop setting or integrate the models.
Check out the introduction to RAG bootcamp: https://t.co/XokZYDoApO
Check out the video: https://t.co/meiEoAQ9Az
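Two of the strategies listed above, sliding windows and reranking, are easy to show in a few lines. The sketch below is a toy, dependency-free version: chunks are overlapping token windows, and the reranker scores chunks by term overlap with the query as a stand-in for a real reranking model. The function names and parameters are assumptions for illustration, not an API from any RAG library.

```python
def sliding_window_chunks(tokens, size, overlap):
    """Split tokens into windows of `size` that share `overlap` tokens
    with the previous window, so context is not cut at chunk borders."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


def rerank(query, chunks, top_k):
    """Toy reranker: order chunks by how many query terms they contain.
    A production system would typically use a trained reranking model."""
    query_terms = set(query.lower().split())

    def overlap_score(chunk):
        return len(query_terms & {token.lower() for token in chunk})

    return sorted(chunks, key=overlap_score, reverse=True)[:top_k]


tokens = "a b c d e f g".split()
chunks = sliding_window_chunks(tokens, size=4, overlap=2)
best = rerank("e f g", chunks, top_k=1)
```

Window size and overlap are the tunable knobs here: a larger overlap preserves more cross-boundary context at the cost of more chunks to embed and store.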
🕷️ ScrapeGraphAI: You Only Scrape Once
ScrapeGraphAI is a web scraping Python library that uses LLMs to create scraping pipelines for websites, documents and XML files. Just say which information you want to extract and the library will do it for you!
https://t.co/h9CnTwKHWi
Memory is now available to all ChatGPT Plus users. Using Memory is easy: just start a new chat and tell ChatGPT anything you’d like it to remember.
Memory can be turned on or off in settings and is not currently available in Europe or Korea. Team, Enterprise, and GPTs to come.
OpenAI presents The Instruction Hierarchy
Training LLMs to Prioritize Privileged Instructions
Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow adversaries to overwrite a model's original instructions with their own malicious prompts.