@ycombinator @YisZzz @AntonanteP ⚙️ One-Click Deployment
Click to deploy. Built-in access control and integrations mean you go from idea to agent in minutes.
@ycombinator @YisZzz @AntonanteP Nuvi lets anyone define, build, and deploy intelligent agents using natural language. Check out what makes Nuvi different.
@ycombinator @YisZzz @AntonanteP 🧪 Fully Tested by Default
User stories become behavioral tests. You can simulate how your agent behaves before launch — and continuously validate after. Using only language.
@ycombinator @YisZzz @AntonanteP 🤝 Built for Collaboration
Specifications are readable, editable, and versioned — so people across different teams stay aligned.
@ycombinator @YisZzz @AntonanteP 🧠 All in Natural Language
If you can explain what you want, Nuvi helps you build it. No coding, no frameworks, no LLM know-how required.
.@RelariAI just launched https://t.co/ivET8UQHvH, the AI agent builder for Software 3.0.
Write specs in English. Get working agents.
With Nuvi, natural language is the new source code.
Congrats on the launch, @yiszzz & @AntonanteP!
https://t.co/uu9IZMDspV
The majority of “AI Agents” advertised today aren’t really agents at all. They’re 𝐢𝐧𝐭𝐫𝐢𝐜𝐚𝐭𝐞 𝐩𝐢𝐩𝐞𝐥𝐢𝐧𝐞𝐬 𝐰𝐢𝐭𝐡 𝐝𝐞𝐭𝐞𝐫𝐦𝐢𝐧𝐢𝐬𝐭𝐢𝐜 𝐜𝐨𝐧𝐭𝐫𝐨𝐥 𝐟𝐥𝐨𝐰𝐬, where LLMs are used only for single-prompt, narrow tasks in isolation.
Ironically, those who jumped on LLMs early and built these intricate pipelines are now the ones facing the 𝐠𝐫𝐞𝐚𝐭𝐞𝐬𝐭 𝐢𝐧𝐞𝐫𝐭𝐢𝐚 𝐭𝐨 𝐬𝐰𝐢𝐭𝐜𝐡 𝐭𝐨 𝐟𝐮𝐥𝐥 𝐚𝐠𝐞𝐧𝐭𝐬.
I saw this phenomenon firsthand with the self-driving car industry: companies spent years building brittle, rule-based systems, patching edge cases one by one, reluctant to invest in end-to-end training. Only Tesla committed fully to this switch early on – and now they're the only ones with a truly scalable system adaptable to almost anywhere.
Few companies running LLMs in production are ready to switch to real agents. It's risky – all those edge cases you spent months tuning could suddenly break. Your metrics might regress. Your system might behave in ways you didn't expect.
That fear is real. You know you have to make the switch eventually, but before you know it, you’re stuck maintaining a 𝐟𝐫𝐚𝐠𝐢𝐥𝐞, 𝐮𝐧𝐬𝐜𝐚𝐥𝐚𝐛𝐥𝐞 𝐬𝐭𝐚𝐜𝐤 that’s harder and harder to adapt as use cases grow more demanding. Meanwhile, o1-level reasoning models are becoming faster, cheaper, and more powerful by the day.
𝐒𝐨, 𝐬𝐡𝐨𝐮𝐥𝐝 𝐲𝐨𝐮 𝐬𝐰𝐢𝐭𝐜𝐡 𝐭𝐨 𝐚 𝐦𝐨𝐫𝐞 𝐚𝐠𝐞𝐧𝐭𝐢𝐜 𝐰𝐨𝐫𝐤𝐟𝐥𝐨𝐰?
Yes—but you can't do it overnight. Building systems that are both autonomous and reliable takes time. You need:
- 𝐂𝐥𝐞𝐚𝐫 𝐛𝐨𝐮𝐧𝐝𝐚𝐫𝐢𝐞𝐬 for what your agent can and can't do
- 𝐅𝐚𝐢𝐥-𝐬𝐚𝐟𝐞𝐬 that kick in when things go wrong
- 𝐑𝐨𝐛𝐮𝐬𝐭 𝐯𝐞𝐫𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧 to catch problems early
Then you can gradually expand which tasks you trust to your agent – similar to how self-driving cars expanded their 𝐎𝐩𝐞𝐫𝐚𝐭𝐢𝐨𝐧𝐚𝐥 𝐃𝐞𝐬𝐢𝐠𝐧 𝐃𝐨𝐦𝐚𝐢𝐧𝐬 (𝐎𝐃𝐃𝐬).
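As a rough illustration of the three ingredients above, here is a minimal Python sketch. Every name in it (GuardedAgent, toy_agent, toy_pipeline, the task strings) is hypothetical and made up for this example; it is not tied to any agent framework and is only one of many ways to wire this up. It assumes you already have a deterministic pipeline to fall back on.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class GuardedAgent:
    """Hypothetical wrapper: boundaries, fail-safe, and verification around an agent."""

    agent: Callable[[str, str], str]     # (task, payload) -> result; may fail or misbehave
    fallback: Callable[[str, str], str]  # the existing deterministic pipeline
    verify: Callable[[str, str], bool]   # (task, result) -> is the result acceptable?
    allowed_tasks: set[str]              # the agent's current operating domain

    def run(self, task: str, payload: str) -> tuple[str, str]:
        """Return (result, source), where source is 'agent' or 'fallback'."""
        # Clear boundary: tasks outside the allowed set never reach the agent.
        if task not in self.allowed_tasks:
            return self.fallback(task, payload), "fallback"
        try:
            result = self.agent(task, payload)
        except Exception:
            # Fail-safe: any agent error routes to the deterministic path.
            return self.fallback(task, payload), "fallback"
        # Verification: results that fail the check are discarded.
        if not self.verify(task, result):
            return self.fallback(task, payload), "fallback"
        return result, "agent"


# Toy stand-ins so the sketch runs end to end (assumptions, not real components).
def toy_agent(task: str, payload: str) -> str:
    if task == "summarize":
        return payload[:10]
    raise RuntimeError("agent cannot handle this task yet")


def toy_pipeline(task: str, payload: str) -> str:
    return f"pipeline:{task}"


guarded = GuardedAgent(
    agent=toy_agent,
    fallback=toy_pipeline,
    verify=lambda task, result: len(result) > 0,
    allowed_tasks={"summarize"},
)
```

Expanding trust then amounts to adding a task name to allowed_tasks once the verification results for that task look good, while the fallback path stays in place for everything else.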
The key is to start now. The longer you wait, the harder it gets to change. If you begin experimenting with agents today, you'll be ready when powerful reasoning LLMs become the standard. Don't let today's comfort become tomorrow's constraint!
How do you choose the right agent framework? We built the same Agentic Finance App using LangGraph, CrewAI, and OpenAI Swarm, and here's our recommendation.
Deep-dive blog: https://t.co/oGCJHaHtuT
@langchain @crewAIInc @OpenAI @RelariAI
I was inspired by the SF breakfasts hosted by @Inferless_ and started organizing "GenAI Developer Breakfast" at Harvard Square.
I find these small-group conversations super fun and insightful. Thank you to everyone who came to exchange learnings on LLMs today!
PSA for Boston area folks:
I'm pleased to announce the speaker lineup for the June @aicampai Boston meetup. Next Tuesday, June 18th at Microsoft NERD in Cambridge.
RSVP here to attend:
https://t.co/OBC0fOEq5E
Many more Boston area AI events here:
https://t.co/CNm1hZLiFW
Are you getting the most out of your LLMs and RAGs? In the latest episode of the ODSC Ai X Podcast, join Pasquale Antonante, PhD, Co-founder and CTO at Relari AI, to discuss innovative evaluation methods for LLM and RAG applications.
🎧 Listen Now: https://t.co/JT4kXsHkRK
Giving a talk on synthetic data using @RelariAI tomorrow alongside folks from @milvusio, @TectonAI, and @bentomlai. Come join us at GitHub's HQ:
https://t.co/lIrNlfhPVa
🙌 We're excited to collaborate with @milvusio on this case study! Our synthetic data pipeline has been integrated with the new lightweight Milvus Lite vector database released today!
Check out our latest case study:
1️⃣ Generate Synthetic Dataset
2️⃣ Systematically benchmark 3 RAG systems with the synthetic dataset
https://t.co/JsLJfnbFJm
🚀@RelariAI launched! Testing and Simulation Stack for GenAI Apps
"Harden your GenAI Systems with Synthetic Data"
🌐 https://t.co/uyY2lSoGw0
⭐ Helps AI teams simulate, test, and validate complex AI applications throughout the development lifecycle
📊 30+ open-source metrics, synthetic test-set generators, and online monitoring tools
📈 Generate large-scale synthetic datasets tailored to your application to test your models and applications at a fraction of the cost
Congrats @Yiszzzz @AntonanteP!
https://t.co/X6IFoVYDII
💡 Everything is packaged in a new, simpler API that makes it easier to use. Check out the latest version (0.3.9) and let us know what you think!
We leveraged sqlglot in these metrics, thanks @captaintobs for the amazing package!
We added two deterministic SQL metrics to help you evaluate text-to-SQL applications. These metrics understand the database schema and interpret the semantics of the generated query to give you a score you can trust.
Text-to-SQL is one of the most common LLM applications, but it is a complex task. Human language is inherently ambiguous and understanding large databases is challenging.
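To give a feel for what a deterministic text-to-SQL check can look like, here is a small Python sketch. It does not use sqlglot and does not reproduce the metrics announced above. Instead it shows a simpler, different technique, execution matching: run the generated query and a reference query against a small sample database and compare the rows. The function name execution_match, the sample schema, and the queries are all assumptions made for this example.

```python
import sqlite3
from collections import Counter


def execution_match(schema_sql: str, generated_sql: str, reference_sql: str) -> bool:
    """Return True if both queries yield the same multiset of rows on the sample data.

    Hypothetical helper for illustration; row order is ignored.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Build the sample database from the schema and seed rows.
        conn.executescript(schema_sql)
        try:
            got = conn.execute(generated_sql).fetchall()
        except sqlite3.Error:
            # A generated query that does not run counts as a mismatch.
            return False
        expected = conn.execute(reference_sql).fetchall()
        return Counter(got) == Counter(expected)
    finally:
        conn.close()


# Sample schema and data (made up for this sketch).
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO users VALUES (1, 'Ada', 36), (2, 'Linus', 28), (3, 'Grace', 45);
"""

REFERENCE = "SELECT name FROM users WHERE age > 30"

# Written differently from the reference, but selects the same rows.
same_meaning = execution_match(
    SCHEMA, "SELECT u.name FROM users AS u WHERE NOT u.age <= 30", REFERENCE
)
# Valid SQL with a different filter.
wrong_filter = execution_match(SCHEMA, "SELECT name FROM users WHERE age > 40", REFERENCE)
# References a column that does not exist in the schema.
bad_column = execution_match(SCHEMA, "SELECT nme FROM users", REFERENCE)
```

One caveat with this approach: the verdict depends on the sample data, so two queries that differ in meaning can still match on a small table. Metrics that analyze the query against the schema avoid that dependence.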