Microsoft open-sourced a seven-package agent governance toolkit last month. 9,500 tests, five languages, sub-millisecond policy enforcement, full OWASP Agentic Top 10 coverage.
It is the biggest entry into agent governance so far. It is also not the only one.
I wrote up what each tool does, where they overlap, and what the space is still missing.
https://t.co/AZkW1Gbwuy
@Fried_rice The attack happens below the permission layer so you can't intercept it in real time. But an append-only audit log of every tool call that executed tells you what ran, when, and under which agent. That's how you detect it and prove scope after the fact: https://t.co/XMtu1XTpVi
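A minimal sketch of that kind of log, in Python. Everything here is an illustrative assumption rather than any product's real format: the `AuditLog` class, the field names, and the SHA-256 hash chain I added so that edits to earlier entries become detectable.

```python
import hashlib
import json
import time
from pathlib import Path

GENESIS = "0" * 64  # starting value for the hash chain (assumption)


class AuditLog:
    """Hypothetical append-only log of executed tool calls, one JSON object per line."""

    def __init__(self, path):
        self.path = Path(path)
        self.prev_hash = GENESIS
        # Resume the chain from the last entry if the file already exists.
        if self.path.exists():
            for line in self.path.read_text().splitlines():
                if line.strip():
                    self.prev_hash = json.loads(line)["hash"]

    def record(self, agent_id, tool, args):
        """Append one executed tool call: what ran, when, and under which agent."""
        entry = {
            "ts": time.time(),
            "agent_id": agent_id,
            "tool": tool,
            "args": args,
            "prev": self.prev_hash,
        }
        payload = json.dumps(entry, sort_keys=True)
        entry["hash"] = hashlib.sha256(payload.encode()).hexdigest()
        # Mode "a" only ever writes at the end of the file.
        with self.path.open("a") as f:
            f.write(json.dumps(entry, sort_keys=True) + "\n")
        self.prev_hash = entry["hash"]
        return entry

    def verify(self):
        """Recompute the chain; return False if any entry was altered or reordered."""
        if not self.path.exists():
            return True
        prev = GENESIS
        for line in self.path.read_text().splitlines():
            if not line.strip():
                continue
            entry = json.loads(line)
            claimed = entry.pop("hash")
            if entry["prev"] != prev:
                return False
            payload = json.dumps(entry, sort_keys=True)
            if hashlib.sha256(payload.encode()).hexdigest() != claimed:
                return False
            prev = claimed
        return True
```

The log does nothing to stop the call. It only records what executed, so scope can be reconstructed afterwards. In practice the file would also need to live somewhere the agent cannot write to.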
MiniMax M2.7 ran 100+ rounds of autonomous self-improvement. Modified its own code. No humans in the loop.
The benchmark numbers are real. So is the question nobody's asking: who's watching when your agents do this in production?
That's what Shield is for. https://t.co/gNGpO3iWxU
Prompt-level guidance depends on the model following instructions. Infrastructure-level enforcement does not. I wrote up what OpenAI built, where it breaks, and what the missing permission layer looks like: https://t.co/DSqJkPTAFy
OpenAI built an internal data agent that 4,000 of their 5,000 employees use every day. Their head of data infrastructure told VentureBeat the biggest problem: the agent feels overconfident, picks a table, and just goes ahead with analysis before checking if it's right.
Their fix was prompt engineering. They wrote prompts that tell the model to slow down, compare options, and validate before committing. It works because they have a dedicated infra team tuning those prompts. Most teams deploying agents don't.
@summeryue0 I was already building agent permissions when you posted this. Your tweet validated the exact problem: safety instructions get compacted away. Multicorn Shield enforces at the tool call level, outside the model. Same scenario, zero emails deleted. https://t.co/jb6jkt8vA2
An AI agent deleted 200 emails while ignoring stop commands. I reproduced it, then ran it again with Multicorn Shield. Zero emails deleted. Same agent, same prompt, same inbox. Only difference: permissions the agent can't override. https://t.co/jb6jkt8vA2
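Roughly what "permissions the agent can't override" means, as a minimal Python sketch. This is not Multicorn Shield's actual API: `ToolGateway`, `PermissionDenied`, the tool names, and the grants table are all hypothetical. The point is only that the check runs in ordinary code between the agent and the tool, so nothing in the prompt or context window can switch it off.

```python
class PermissionDenied(Exception):
    """Raised when an agent requests a tool it has not been granted."""


class ToolGateway:
    """Hypothetical gateway that every tool call must pass through."""

    def __init__(self, tools, grants):
        self.tools = tools    # tool name -> callable
        self.grants = grants  # agent id -> set of allowed tool names

    def call(self, agent_id, tool, **kwargs):
        # The check lives outside the model, so it does not depend on
        # the model remembering or obeying any instruction.
        if tool not in self.grants.get(agent_id, set()):
            raise PermissionDenied(f"{agent_id} may not call {tool}")
        return self.tools[tool](**kwargs)


# Toy inbox standing in for a real mail API (assumption for illustration).
inbox = {1: "invoice", 2: "newsletter"}


def read_email(email_id):
    return inbox[email_id]


def delete_email(email_id):
    del inbox[email_id]


gateway = ToolGateway(
    tools={"email.read": read_email, "email.delete": delete_email},
    grants={"inbox-agent": {"email.read"}},  # read-only: no delete grant
)
```

With this setup, `gateway.call("inbox-agent", "email.read", email_id=1)` returns the message, while a call to `"email.delete"` raises `PermissionDenied` and leaves the inbox untouched, however many times the agent retries.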
After a year of solo dev (while working full-time + being a mum), Recipe Shelf is finally live on Product Hunt! 🍳
If you've ever lost a recipe to browser bookmarks, screenshots, or random recipe books, this is for you.
https://t.co/tVPDn5LXod
@DickSmith I purchased 2 items in Dec and still haven't received them. Sellers have gone quiet and no one has responded to my escalations. Officially never purchasing anything from Dick Smith again. Wish I had checked this before purchasing anything https://t.co/fRsCi7mbMZ
Hey @DarinPope. My team owns a Jenkins plugin and we're trying to set up feature flags and need a way to store the SDK key. Is there a way to do this for plugins? I can only find docs on storing envs at the server level.
@DarinPope For a bit more context, here's the simple FF service we're trying to add
https://t.co/KeXaC3RYnv
I can run this locally when I hardcode the SDK key but obviously can't commit that. The SDK key is defined within our LaunchDarkly account, not on a per-Jenkins-server basis.
@AAMI @dougrathbone This is your standard response, yet... shock horror, you take no action on the other side. It's all smoke and mirrors. Make it look like you care when really, it's all about how you can save yourself money and screw over your lifelong customers. Keep up the good work guys.
So very honoured and excited to be selected by @wid_australia to be a finalist for the 2022 Software Engineer of the Year. Can't wait to celebrate all the other amazing women that made it to the finals.