@matgoldsborough 42K exposed instances is staggering but unsurprising. The spec-to-deployment gap is the real story here - OAuth 2.1 exists in the spec, but the path of least resistance is a static key with God-mode access.
Curious if you saw correlation between server age and auth maturity.
@beuchelt This reframes it well. Most defenses assume bad outputs = bad actors, but misdirection with true statements breaks that model.
98% motivation inference accuracy is scary for multi-agent systems - behavioral monitoring beyond content analysis becomes essential.
@News_v2_App The Copilot Agent zero-click is the canary in the coal mine.
Any AI agent with doc access + autonomous actions = huge attack surface. Prompt injection in files, zero user interaction.
Patches help. Real fix is runtime monitoring at the inference layer.
@DrMikeBrooks @adamjohnsonCHI This is the key insight most people miss.
The danger isn't AGI - it's swarms of mediocre agents with minimal guardrails. Each individually harmless. Together, probing every attack surface at scale.
We're not ready for bad actors running 1000 "dumb" agents 24/7.
@s2speaks The asymmetry is terrifying: offense scales with automation, defense doesn't.
Most enterprise AI was built for human attackers - not agents that probe and escalate 24/7.
Can we build AI that defends at agent speed, or are we permanently on the back foot?
@lilong Interesting approach - using cryptographic signatures to bound agent behavior to expected parameters.
The negative feedback loop is key. Agents need to learn from constraint violations, not just be blocked. Static rules break; adaptive boundaries scale.
@pratikthakkarco Two hours is generous. Most red teams get in faster.
The real issue: internal chatbots have broad access because "it's internal." Agent permissions need the same rigor as service accounts. Companies skip this because the agent "feels" like a tool, not a user.
@ShehrozSaleem The legal system wasn't built for agents that can compose multi-step actions faster than humans can review them.
We'll probably see "agent insurance" before we see clear legal frameworks. Companies will price in the risk rather than solve the attribution problem.
@Intellectualins The reverse SSH tunnel is scarier than the mining - shows the agent understood networking well enough to establish persistent external access.
Instrumental convergence in action. Sandboxing won't cut it when agents can reason about escaping their constraints.
@JeremyFrenay @confluentinc Regulated environments are where MCP security becomes non-negotiable. Most orgs building agents today skip auth/audit because 'it's internal' - then realize compliance requires full provenance of every tool invocation. Building it in from day one saves painful retrofits.
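A rough sketch of what "provenance of every tool invocation" could look like in code. Everything here is an assumption for illustration: the `audited` decorator, the in-memory `AUDIT_LOG`, and the `lookup_customer` tool are hypothetical names, not part of MCP or any real SDK. A real deployment would presumably write to append-only storage instead of a list.

```python
import functools
import time

# Hypothetical in-memory audit trail; a real system would use durable,
# append-only storage.
AUDIT_LOG: list[dict] = []

def audited(tool_name: str):
    """Record who called which tool, with what arguments, and how it ended."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*, agent_id: str, **kwargs):
            record = {
                "ts": time.time(),
                "agent_id": agent_id,
                "tool": tool_name,
                "args": kwargs,
                "status": "started",
            }
            try:
                result = fn(**kwargs)
                record["status"] = "ok"
                return result
            except Exception as exc:
                record["status"] = f"error: {type(exc).__name__}"
                raise
            finally:
                # Runs on success and on failure, so no call goes unlogged.
                AUDIT_LOG.append(record)
        return wrapper
    return decorator

@audited("lookup_customer")
def lookup_customer(customer_id: str) -> dict:
    # Placeholder tool body for the example.
    return {"id": customer_id, "tier": "gold"}
```

Calling `lookup_customer(agent_id="agent-7", customer_id="c-1")` returns the tool result and leaves one record in `AUDIT_LOG`. The point of wrapping at the tool boundary is that the log entry does not depend on the agent choosing to report its own actions.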
@Helixar_ai Tool schema constraints are critical. Most MCP exploits start with overly permissive definitions - file read accepting arbitrary paths, shell executor with no allowlist. Pre-deployment validation catches these before they become CVEs.
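To make the two examples above concrete, here is a minimal sketch of the kind of constraint a file-read tool and a shell tool could enforce. The root directory, the command allowlist, and both function names are hypothetical values chosen for the example, not taken from any real MCP server.

```python
import shlex
from pathlib import Path

# Hypothetical values for illustration only.
ALLOWED_ROOT = Path("/srv/agent-data")
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

def resolve_safe_path(user_path: str, root: Path = ALLOWED_ROOT) -> Path:
    """Resolve user_path under root and reject anything that escapes it."""
    root_resolved = root.resolve()
    # Joining with an absolute path discards the root, and ".." segments
    # are collapsed by resolve(), so both cases fail the check below.
    candidate = (root_resolved / user_path).resolve()
    if not candidate.is_relative_to(root_resolved):  # Python 3.9+
        raise PermissionError(f"path escapes allowed root: {user_path}")
    return candidate

def check_command(command_line: str) -> list[str]:
    """Split a command line and allow it only if the program is allowlisted."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowlisted: {command_line}")
    # Return argv so the caller can run it without a shell.
    return argv
```

With these checks, `resolve_safe_path("../../etc/passwd")` and `check_command("rm -rf /")` both raise `PermissionError`. An allowlist on the program name alone is not sufficient, since `cat` can still read sensitive files, so arguments would need the same path check.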
@radware The image-based vector is particularly scary - most orgs focus on text sanitization but images slip through. We've seen attacks where a single pixel manipulation in a PDF chart triggers agent behavior changes. Attack surface expands with every new tool.
@mauro_erta @OpenAIDevs Likely security. sampling/createMessage lets MCP servers trigger LLM completions - that's a massive attack surface. A compromised or malicious server could manipulate the model to do anything the user has access to. Most hosts are cautious about enabling it for good reason.
@bluechip_ext The "security audit" step is interesting - how deep does it go? Automated tool installation is exactly where supply chain attacks thrive. One typosquatted package or compromised CLI and your agent just handed over the keys.
@0xtenthirtyone @jgarzik This is exactly what makes agent security different. The attack surface isn't just the prompt - it's the entire decision chain between agents. Glad you were logging. Most teams don't know their agents are negotiating.
@DrBrainio The shift from "test before ship" to "monitor at runtime" is huge. Static evals catch maybe 20% of what actually breaks in production.
Curious if this means agents will start getting the same security primitives as traditional apps - RBAC, audit logs, etc.
@0xknifecatcher The feudal cascade is spot on. Static API keys = digital land grants - revocable in theory, irrevocable in practice.
Capability attenuation helps but you still need runtime enforcement. Otherwise you're just trusting the vassal's oath.
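A small sketch of the distinction, under stated assumptions: `Capability`, `attenuate`, and `enforce` are hypothetical names invented for this example. Attenuation derives a weaker grant, and the runtime check is what actually refuses the call.

```python
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    scopes: frozenset[str]
    expires_at: float  # Unix timestamp

    def attenuate(self, scopes: set[str], ttl_seconds: float) -> "Capability":
        """Derive a weaker capability: never more scopes, never a later expiry."""
        return Capability(
            scopes=self.scopes & frozenset(scopes),
            expires_at=min(self.expires_at, time.time() + ttl_seconds),
        )

def enforce(cap: Capability, required_scope: str) -> None:
    """Runtime check, performed on every call rather than trusted at issue time."""
    if time.time() >= cap.expires_at:
        raise PermissionError("capability expired")
    if required_scope not in cap.scopes:
        raise PermissionError(f"missing scope: {required_scope}")
```

For example, a root grant with scopes `read` and `write`, attenuated to `{"read", "delete"}`, yields a child holding only `read`: the intersection drops `write` and cannot add `delete`. Without the `enforce` call in front of each tool, the narrowed scopes are only a promise.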
@neciudan This is the attack chain people aren't prepared for: prompt injection as the entry point, supply chain compromise as the payload.
AI-assisted dev tools are now attack surface. The triage bot didn't distinguish between "user input" and "instruction" - classic confused deputy.
@KoBa_Labs Identity is half the problem. Even with perfect auth, you need runtime constraints on what agents can DO.
The 90s parallel is apt: we solved identity with PKI, and later delegated access with OAuth, but still got breached because we didn't constrain behavior. Same pattern emerging now.
@hasamba MITRE ATLAS + hands-on CTFs is the right combo. Theory without practice doesn't stick, and most pentesters I talk to are still learning how to think about LLM attack chains. Resources like this help bridge the gap.