Big update: the @GetContextAI team is joining @OpenAI to build the future of evals!
It's clear that evals are a key requirement for building high-performing AI applications - but they’re hard to get right.
We are thrilled to be joining the incredible OpenAI team to build the tools developers need to succeed
We’ve been working on something fun - Hype Squad
It generates a podcast hype squad (or roast) from your LinkedIn
It hypes you up and crafts a story that explains your success
Or it roasts - poking holes and making fun. It doesn't hold back
Listen to mine
https://t.co/o68C9CpV6n
It’s been a big first week for Podial!! 🎙️
We've had thousands of signups, we got our first paying customers, we won education product of the week on Product Hunt, and we’ve listened to a ton of incredible generated podcasts
We’ve also made the product much better:
Today we’re launching a totally new product: ✨Podial✨
Podial generates engaging educational podcasts on any topic and from any documents
This is completely new for us! And it’s our first consumer product!! So why are we doing it?
What products are enabled by multi-agent frameworks? 🤖👥🤖👥
Everyone is excited about multi-agent frameworks - even though their performance isn’t quite ready for prime-time.
Much has been written about the frameworks and how they work - but the product layer is underexplored. This new technology is only going to deliver value by enabling great products, and as a product manager this is the layer I find most interesting.
I spoke to a number of developers in the space to understand what they’re building - here are some of the most interesting use cases:
https://t.co/agBfQdlHaW
People should be talking about the Retool State of AI Report 💼
It contains a ton of data on LLM adoption in businesses - just like last year 📊
My takeaways:
How do you know if the guardrails on your LLM product are working? 🛡️🎯
Some people wait until they show up in The New York Times - like McDonald's, Air Canada, or Chevrolet
LLMs haven’t significantly improved since GPT4: is progress slowing? 🐢
Dramatically more powerful model training clusters are being built: 15 of them, with 31 times more power than what trained GPT4
This means models much more powerful than GPT4 are coming 🐇
How do the best tech companies find product market fit for their LLM products? 🔎 🎯
It starts with deeply understanding their users 💬🔬
Yet many people building in the LLM ecosystem still don’t know how or why people use their products.
How to ship amazing LLM products (without getting fired when they go wrong):
Building LLM products requires threading a needle. Product people trade off iteration speed, product quality, and safety risks - and the best PMs know how to do it 🧵🪡
We now support self-hosted deployment!
Desperate to understand how people are using your LLM product, but don’t want to share your transcripts with a third party for analysis?
Good news! You can now run the @getcontextai product within your own cloud instance
Today we’re launching the ecosystem’s best support for evaluating multi-call chains 🔎⛓️
This allows you to evaluate multi-stage workflows with many calls to LLMs and functions using https://t.co/s6LF1JgIcZ, and you can evaluate these both end-to-end and across any stage of the chain
https://t.co/vEP2lmx955
What did we ship in March? 🚢
Lots of improvements to evals!
We now allow users to repeat LLM generations and evaluations to get more certainty in evaluation results, we version our custom evaluators, we’ve improved support for large test sets and added search, we now support Mistral models and a new Haystack integration, and our comparison page and global evaluator assignment have been improved.
Got feedback or ideas for the team? Please get in touch
https://t.co/yb5no3CuCp
ICYMI, here are your Haystack updates for the week.
🛠️ up first, integrations:
- @milvusio is now supported in Haystack 2.0: https://t.co/5DuVNJ0ZHH
- New @getcontextai integration shows analytics about how your Haystack pipelines are being used: https://t.co/TJhWFgMW1M
- Haystack now supports @AnthropicAI Claude 3 models via the Amazon Bedrock integration https://t.co/V2384IpXn3
https://t.co/bU8qyHBgoX is now integrated with Haystack by @deepset_ai! 🚢
This provides developers using Haystack with product analytics to monitor the performance of their LLM product with real users
https://t.co/JscU9zO1Iu