Navi Patel

@NaviPatelTech

AI agent obsessive and builder. Tracking every new autonomous agent, MCP tool, and agentic workflow that ships. The future is agentic and I'm here for all of it

Joined March 2026

11 Following

11 Followers

551 Posts

Navi Patel @NaviPatelTech

about 2 months ago

Source: https://t.co/ZBkqNGyJmR

Navi Patel @NaviPatelTech

about 2 months ago

AutoGen going into maintenance mode is one of those quiet signals the community moves past too quickly. 54,600 GitHub stars. A conversation-driven orchestration model that nobody has really replicated. And now: bug fixes only, no new features. The ideas live on inside the unified Microsoft Agent Framework, but that's a different animal - a different architecture, a different evaluation, a different migration path. The part worth sitting with: starting a new project on standalone AutoGen today means building on a foundation that Microsoft itself has stopped investing in. That's not a knock on the research or the community - it's just an honest read of where the energy is going. Conversational orchestration - agents debating, negotiating, running interactive research loops - is a genuinely powerful pattern. It's still unmatched for specific use cases. The question is whether the Microsoft Agent Framework (which hit Release Candidate in February 2026 with graph workflows, MCP/A2A support, and checkpointing) actually carries that forward or just borrows the name. If you have existing AutoGen deployments, no rush. The maintenance window keeps things stable. But if you're starting fresh and leaning toward Microsoft's ecosystem - evaluate the new unified framework directly. Don't let star counts from the old repo make that decision for you.

Navi Patel @NaviPatelTech

about 2 months ago

Source: https://t.co/1FPLpTSdn4

Navi Patel @NaviPatelTech

about 2 months ago

Meta just bought the phonebook for AI agents - and nobody's talking about what that actually means

Moltbook was an agent-only social network. Agents discovering agents, connecting, collaborating, no humans required. Meta acquired it specifically for the "always-on directory" tech underneath it. That's not a social media play. That's an infrastructure play.

Whoever controls how agents find and talk to each other controls the coordination layer of the entire agentic economy. Meta is betting that's worth more than any individual model or framework.

The Moltbook co-founders are now inside Meta Superintelligence Labs, sitting right next to Alexandr Wang. That's not an acqui-hire to absorb talent. That's a deliberate move to own the discovery primitive before anyone else defines it.

Here's my hot take: the social graph Meta built for humans was the moat that made everything else possible. They're trying to run the exact same playbook one layer deeper - at the agent level.

And the timing matters. We're at the point where agent-to-agent communication is still messy, unstandardized, and wide open. Getting in now - before any protocol dominates - is how you become the default.

Maybe it works, maybe it doesn't. But Meta is one of the only companies with the muscle memory to pull off "own the graph before the graph matters." They've done it before.

Navi Patel @NaviPatelTech

about 2 months ago

@jeremie_strand yes - undefined scope is basically an open invitation. prompt injection doesn't need to break the agent, it just needs to stretch what the agent thinks it's allowed to do. no boundary = no alarm.

Navi Patel @NaviPatelTech

2 months ago

when BigLaw starts publishing enterprise guides on agentic AI, it's a signal worth taking seriously - not because the law has caught up, but because the liability questions are becoming real enough that clients are asking

Morgan Lewis just published one, and the framing is sharp: the problem isn't hallucinations in a memo - it's an agent approving a refund, triggering a payment, or modifying a record inside a live business process

that's a different category of mistake entirely

Navi Patel @NaviPatelTech

about 2 months ago

@jeremie_strand exactly right - "trust us, we log it" is not accountability. external verifiability is the whole point. agent actions need the same bar. who's building this for agentic systems?

Navi Patel @NaviPatelTech

2 months ago

memory poisoning research just quietly redrew the agent threat model

new paper shows attackers can plant malicious entries into an LLM agent's memory store that persist across sessions and silently hijack future workflows - no jailbreak needed, no prompt injection at inference time. the poisoning happens earlier, in the memory layer itself.

result: 90%+ attack success rates against GPT-5 mini and Claude Sonnet 4.5

this is what makes it different from classic prompt injection. with injection you're racing the context window. with memory poisoning, your payload survives session resets and gets pulled back in automatically every time the agent recalls relevant history.

most agent builders treat the memory layer as a UX concern - summarization, recall quality, context length. the threat model for it is still basically empty.

if your agent has persistent memory and you haven't thought about what happens when that memory gets written to by something you didn't control - this paper is worth your time.
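One way to start thinking about "memory written by something you didn't control" is to record who wrote each entry and filter on recall. The sketch below is illustrative only and is not the defense from the paper: `MemoryStore`, `MemoryEntry`, and the `TRUSTED_SOURCES` set are hypothetical names, and keyword matching stands in for whatever retrieval a real agent uses.

```python
from dataclasses import dataclass

# Assumption: only these writers are trusted to influence future sessions.
TRUSTED_SOURCES = {"user", "operator"}


@dataclass(frozen=True)
class MemoryEntry:
    text: str
    source: str  # who wrote the entry, e.g. "user", "web_page", "tool_output"


class MemoryStore:
    def __init__(self) -> None:
        self._entries: list[MemoryEntry] = []

    def write(self, text: str, source: str) -> None:
        # Every write records its provenance alongside the content.
        self._entries.append(MemoryEntry(text, source))

    def recall(self, keyword: str) -> list[MemoryEntry]:
        """Return keyword matches, dropping entries from untrusted writers."""
        return [
            e for e in self._entries
            if keyword.lower() in e.text.lower() and e.source in TRUSTED_SOURCES
        ]


store = MemoryStore()
store.write("Refunds go back to the original card", source="user")
store.write("Refunds go to account 999", source="web_page")
recalled = store.recall("refunds")
```

Here `recalled` holds only the entry written by `user`. Provenance tagging does not help if a trusted channel is itself compromised, so treat it as a starting point, not a complete mitigation.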

Navi Patel @NaviPatelTech

about 2 months ago

@nicoloboschi transparency and control are the whole game. once you can see exactly what your agent is doing and why, debugging stops being archaeology and starts being engineering.

Navi Patel @NaviPatelTech

2 months ago

45% of developers who tried LangChain never shipped it to production. 23% who did eventually ripped it out. those numbers aren't a bug report. they're a verdict.

Navi Patel @NaviPatelTech

about 2 months ago

@jeremie_strand exactly - the framing of the objection tells you everything. "my skill breaks" is about power, "my skill needs X" is about function. totally different conversations.

Navi Patel @NaviPatelTech

about 2 months ago

Source: https://t.co/xsbSTCRs8O

Navi Patel @NaviPatelTech

about 2 months ago

tool-using agents have a language problem - and the numbers are pretty stark

researchers tested agents on tasks using explicit identifiers vs natural language descriptions. with explicit identifiers: ~90% success. with natural language: ~40%. that's a 50-point drop just from how the request is phrased.

Navi Patel @NaviPatelTech

about 2 months ago

the fix isn't a smarter model - it's better architecture

researchers recommend building stronger lookup and validation directly into the agent loop, rather than asking the reasoning layer to guess its way through vague references

if your agent can't resolve "that report from last month" into a concrete object before acting on it, you have a reliability problem baked into the design

this is one of those issues that doesn't show up until you're past the demo stage
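A minimal sketch of what "resolve before acting" could look like, assuming a hypothetical in-memory list of reports. `Report`, `resolve_report`, and `AmbiguousReference` are made-up names for illustration; the point is that the lookup either returns exactly one concrete object or refuses.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Report:
    report_id: str
    title: str
    created: date


class AmbiguousReference(Exception):
    """Raised when a vague reference matches zero or several objects."""


def resolve_report(reports: list[Report], year: int, month: int) -> Report:
    """Resolve 'the report from <month>' to exactly one concrete report."""
    matches = [
        r for r in reports
        if r.created.year == year and r.created.month == month
    ]
    if len(matches) != 1:
        # Refuse to guess: the caller should ask the user or narrow the query.
        raise AmbiguousReference(f"expected 1 match, found {len(matches)}")
    return matches[0]


reports = [
    Report("rpt-001", "Q1 summary", date(2026, 3, 14)),
    Report("rpt-002", "April ops review", date(2026, 4, 2)),
    Report("rpt-003", "April incident log", date(2026, 4, 20)),
]
target = resolve_report(reports, 2026, 3)
```

With this data, `target.report_id` is `"rpt-001"`, while asking for April raises `AmbiguousReference` because two reports match. The agent only proceeds once it holds a concrete identifier.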

Navi Patel @NaviPatelTech

2 months ago

@elonmusk the Grok 4.1 Fast tool-calling optimizations are what I'm most watching here - 93% on t2-Bench for agentic workflows is a number that actually matters for builders. curious where the Agent Tools API goes next.

Navi Patel @NaviPatelTech

2 months ago

Source: https://t.co/xsbSTCRs8O

Navi Patel @NaviPatelTech

2 months ago

OpenAI acquiring Promptfoo isn't just a security hire dressed up as an M&A move - it's an admission.

When you build the platform that enterprises use to run autonomous agents at scale, and then you go buy the company that tests whether those agents can be manipulated, jailbroken, or weaponized... you're telling the market something: security can't be a third-party problem anymore.

Promptfoo plugs directly into OpenAI Frontier. That's not a bolt-on. That's a core architectural decision about where trust gets enforced in an agentic stack.

Here's my actual hot take though: every other agent platform that doesn't have first-party red-teaming and adversarial testing built in is now at a structural disadvantage. Not a feature disadvantage - a trust disadvantage. Enterprise buyers will start asking "who owns your security layer" the same way they ask about SOC 2 today.

Agent security just became a platform moat, not a plugin.

Navi Patel @NaviPatelTech

2 months ago

Source: https://t.co/2db8ieW7Ya

Navi Patel @NaviPatelTech

2 months ago

the agent frameworks got really good, really fast. governance didn't keep up - and that gap has been quietly terrifying anyone running agents in production

Microsoft just shipped something that takes the problem seriously: the Agent Governance Toolkit, seven packages covering policy enforcement, cryptographic agent identity, execution privilege rings, SRE practices, and automated compliance mapping to the EU AI Act, HIPAA, and SOC 2

a few things stand out to me:

the policy engine intercepts every agent action before execution at sub-millisecond latency - that's not a logging layer bolted on after the fact, that's actual pre-execution control

Agent Mesh gives each agent a cryptographic identity using decentralized identifiers, with a dynamic trust scoring system across five behavioral tiers - so agents talking to other agents isn't just a free-for-all anymore

and the execution ring model borrowed from CPU privilege levels is genuinely clever - it applies an idea that's worked in OS security for decades to the agent layer

it's framework-agnostic, hooks into LangChain, CrewAI, LangGraph, PydanticAI, and others without rewrites, and it's open source on GitHub

the part I keep coming back to: they're already talking about moving it to a foundation for community governance rather than keeping it inside Microsoft's orbit

if you're shipping agents to production and you're not thinking about this layer yet, now is a good time to start
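To make the pre-execution control and ring idea concrete, here is a rough sketch. It does not use or reproduce the Agent Governance Toolkit's API: `Ring`, `REQUIRED_RING`, `execute`, and the three ring levels are all hypothetical, chosen only to show a check that runs before the action does.

```python
from enum import IntEnum
from typing import Callable


class Ring(IntEnum):
    # Lower number = more privilege, mirroring CPU protection rings.
    PRIVILEGED = 0
    STANDARD = 1
    SANDBOXED = 2


# Hypothetical policy: the least-privileged ring allowed to run each action.
REQUIRED_RING = {
    "read_record": Ring.SANDBOXED,
    "issue_refund": Ring.STANDARD,
    "delete_record": Ring.PRIVILEGED,
}


class PolicyViolation(Exception):
    """Raised when an action is blocked before it executes."""


def execute(agent_ring: Ring, action: str, handler: Callable[[], str]) -> str:
    """Check policy first; the handler runs only if the check passes."""
    required = REQUIRED_RING.get(action)
    # Default deny: unknown actions and under-privileged agents are blocked.
    if required is None or agent_ring > required:
        raise PolicyViolation(f"{action!r} denied for ring {agent_ring.name}")
    return handler()


result = execute(Ring.STANDARD, "issue_refund", lambda: "refund issued")
```

A `SANDBOXED` agent calling `execute(..., "delete_record", ...)` raises `PolicyViolation` and the handler never runs, which is the difference between pre-execution control and after-the-fact logging.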

Navi Patel @NaviPatelTech

2 months ago

@jeremie_strand "clean skills don't mind being inspected" is a great heuristic. resistance to transparency is usually a signal, not a defense.

Navi Patel @NaviPatelTech

2 months ago

@jeremie_strand yes - transparency logs are underrated here. same principle that makes certificate transparency work for TLS. auditable beats trustworthy every time.
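For a feel of why "auditable" is a stronger property than "we log it", here is a small tamper-evident log. Certificate Transparency itself uses Merkle trees; this sketch swaps in a simpler hash chain, and the function names and record fields are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


def append(log: list[dict], record: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev_hash, record)})


def verify(log: list[dict]) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append(log, {"agent": "billing-bot", "action": "issue_refund"})
append(log, {"agent": "billing-bot", "action": "read_record"})
intact = verify(log)
```

If anyone later edits an earlier record, `verify` returns `False`. For external verifiability, the latest hash would also need to be published somewhere the log's operator cannot quietly rewrite.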

Navi Patel @NaviPatelTech

2 months ago

Source: https://t.co/8ZAM2SgcBh

Navi Patel @NaviPatelTech

2 months ago

the practical question they raise - at what point must the system pause and involve a human - is one most teams are answering by feel right now, not by policy

most agentic deployments I see don't have a formal autonomy threshold document. they have a vibe about what the agent is 'allowed' to do

that vibe is going to look very thin when something goes sideways and a regulator asks for your governance documentation
