Wardproof is a guard you put in front of an AI agent. It checks every input and every tool call before it runs, and blocks the dangerous ones: prompt injection, rm -rf, runaway payments. Open source, runs offline, every decision logged.
turned this thread into a real example.
a small JS client for wardproof serve: POST each tool call to /check, proceed only on allowed:true, fail-closed on anything else.
it's in the repo now: https://t.co/aqjlVOJ7D8
https://t.co/91yJpDpadA
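For readers working in Python, the same fail-closed pattern can be sketched with only the standard library. This is not the JS client from the repo. The port, the payload shape, and the names `CHECK_URL`, `verdict_allows`, and `check_tool_call` are assumptions made for illustration; only the `/check` path and the `allowed` field come from the post above.

```python
import json
import urllib.request

# Assumed address of a local `wardproof serve` instance; adjust to your setup.
CHECK_URL = "http://127.0.0.1:8787/check"


def verdict_allows(status: int, body: bytes) -> bool:
    """Return True only for an HTTP 200 whose JSON body has allowed == true."""
    if status != 200:
        return False
    try:
        data = json.loads(body)
    except ValueError:
        return False  # unparseable body counts as a refusal
    return isinstance(data, dict) and data.get("allowed") is True


def check_tool_call(payload: dict, url: str = CHECK_URL, timeout: float = 2.0) -> bool:
    """POST a tool call to /check; any error or timeout counts as a refusal."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # payload shape is an assumption
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return verdict_allows(response.status, response.read())
    except Exception:
        return False  # fail closed: unreachable server, HTTP error, timeout
```

The point of the design is that the tool runs only on an explicit `allowed: true`. A missing field, a string `"true"`, a 500, or a dead server all land on the refusal path.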
wardproof 0.3.5
the local screening service now takes an optional bearer token, rate-limits per client, and caps body size. all stdlib.
gate any agent's tool calls over HTTP, safely.
pip install -U wardproof
https://t.co/dpW5Cmxwpq
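To make "bearer token plus per-client rate limit, all stdlib" concrete, here is a minimal sketch of one common way to do it: a constant-time token comparison and a token bucket per client. It is not Wardproof's implementation; the names `bearer_ok`, `TokenBucket`, and `PerClientLimiter` and every parameter are hypothetical.

```python
import hmac
import time


def bearer_ok(header_value, expected_token: str) -> bool:
    """Check an Authorization header value against the expected bearer token."""
    prefix = "Bearer "
    if not header_value or not header_value.startswith(prefix):
        return False
    presented = header_value[len(prefix):].encode("utf-8")
    # compare_digest avoids leaking how many leading bytes matched
    return hmac.compare_digest(presented, expected_token.encode("utf-8"))


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`; one token per request."""

    def __init__(self, rate: float, capacity: float, now: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keeps one bucket per client id (for example, the remote address)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = self.buckets[client_id] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)
```

The clock is injectable so the limiter can be tested without sleeping.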
Wardproof 0.3.4 adds a @swarms_corp integration.
Screen a Swarms agent's tool calls through the deterministic engine before they run: the dangerous ones are refused, the safe ones execute, and every decision is written to the audit ledger. Works on single calls and batches.
pip install -U wardproof https://t.co/ByVFW6XeJl
Wardproof 0.3.3 adds a local serve mode.
Run wardproof serve and any agent in any language can gate a tool call or input over HTTP: POST /check returns a verdict, GET /health reports liveness. The swarm builds once at startup, so no per-call process spawn. Pairs with the guard skill for SKILL.md hosts like @bankrbot, @claudeai Code, @Cursor, and @openclaw.
pip install -U wardproof https://t.co/ByVFW6XeJl
Wardproof 0.3.2 is out, with two ways to put the guard in front of an agent:
- wardproof check: a one-line CLI that screens a tool call or input and exits 0 only on ALLOW, so you can gate a step from any shell.
- a portable guard skill that drops into SKILL.md hosts like @claudeai Code, @cursor_ai, @openclaw, and probably @bankrbot too.
pip install -U wardproof
https://t.co/ByVFW6XeJl
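Because the CLI exits 0 only on ALLOW, a step can also be gated from Python by looking at nothing but the exit status. A minimal sketch, assuming the caller supplies the full command line: the arguments to `wardproof check` are not stated here, so the helper `gate` (a hypothetical name) takes the argv as given rather than guessing at flags.

```python
import subprocess
import sys


def gate(check_argv: list, timeout: float = 10.0) -> bool:
    """Return True only if the check command exits with status 0."""
    try:
        result = subprocess.run(check_argv, timeout=timeout, capture_output=True)
    except Exception:
        return False  # missing binary, timeout, bad argv: fail closed
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with your real
    # `wardproof check ...` invocation (its arguments are not shown here).
    stand_in = [sys.executable, "-c", "import sys; sys.exit(0)"]
    print("proceed" if gate(stand_in) else "refuse")
```

Any non-zero exit, a timeout, or a missing binary is treated the same way as a refusal.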
Guarding an OpenAI or Anthropic agent with Wardproof takes a few lines: wrap the tool call, read the verdict, refuse anything that is not ALLOW. The core runs offline with zero third-party dependencies. Working examples for both are in the repo.
pip install wardproof
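The "wrap, read the verdict, refuse anything that is not ALLOW" shape looks roughly like the sketch below. It does not use Wardproof's actual API, which is not shown in these posts: `guarded`, `Refused`, `toy_screen`, and `run_shell` are hypothetical names, and `toy_screen` is a deliberately trivial stand-in for the real engine. The working OpenAI and Anthropic examples are the ones in the repo.

```python
import functools
from typing import Any, Callable, Dict


class Refused(Exception):
    """Raised when a tool call is not explicitly allowed."""


def guarded(screen: Callable[[str, Dict[str, Any]], str]):
    """Wrap a tool so it runs only when screen(...) returns exactly "ALLOW"."""
    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(**kwargs):
            try:
                verdict = screen(tool.__name__, kwargs)
            except Exception:
                verdict = "ERROR"  # a crashing screen is treated as a refusal
            if verdict != "ALLOW":
                raise Refused(f"{tool.__name__} refused: {verdict}")
            return tool(**kwargs)
        return wrapper
    return decorator


def toy_screen(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Toy stand-in for the real engine: blocks one obviously dangerous string."""
    if "rm -rf" in str(arguments.get("command", "")):
        return "BLOCK"
    return "ALLOW"


@guarded(toy_screen)
def run_shell(command: str) -> str:
    return f"would run: {command}"  # placeholder; no command is executed
```

The comparison is against ALLOW rather than against a block verdict, so an unknown or unexpected verdict refuses by default.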
Before shipping, we red-teamed our own audit ledger and payment guard.
We found two real bugs: a signature check that could be bypassed, and a payment envelope that hid the real recipient in a second field.
Both are fixed now, with regression tests, before release.
We would rather find them ourselves than have you find them.
I built Wardproof. I run agents that touch tools and money, and the existing options were either a hosted black box or one model call you can talk out of its job. I wanted something local, boring, and inspectable that fails closed.
MIT licensed, zero third-party deps in the core. Guards for OpenAI and Anthropic tool calls, CrewAI, LangGraph, MCP, x402 payments, and more.
pip install wardproof
Repo, reproducible benchmark, threat model: https://t.co/PculS6WANQ
Detection is measured, not asserted. A labelled corpus ships with the code so you can reproduce it: on that corpus every attack is flagged with 0% false positives, and about 97% are flagged on a separate held-out set. It misses pure paraphrase, which is what the optional model layer is for.
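For anyone reproducing the numbers, the two rates are computed as in the sketch below. This is not the repo's benchmark script; `detection_metrics` is a hypothetical helper, and the data in the usage lines is made-up toy data, not Wardproof results.

```python
from typing import Sequence, Tuple


def detection_metrics(is_attack: Sequence[bool], flagged: Sequence[bool]) -> Tuple[float, float]:
    """Return (detection rate, false positive rate) for a labelled corpus.

    detection rate      = flagged attacks / all attacks
    false positive rate = flagged benign samples / all benign samples
    """
    if len(is_attack) != len(flagged):
        raise ValueError("labels and verdicts must have the same length")
    attacks = sum(1 for a in is_attack if a)
    benign = len(is_attack) - attacks
    if attacks == 0 or benign == 0:
        raise ValueError("corpus needs at least one attack and one benign sample")
    caught = sum(1 for a, f in zip(is_attack, flagged) if a and f)
    false_alarms = sum(1 for a, f in zip(is_attack, flagged) if not a and f)
    return caught / attacks, false_alarms / benign


# Toy data only: four attacks (one missed) and four benign samples (none flagged).
toy_labels = [True, True, True, True, False, False, False, False]
toy_flags = [True, True, True, False, False, False, False, False]
detection_rate, false_positive_rate = detection_metrics(toy_labels, toy_flags)
```

Reporting both rates together matters: a guard that flags everything has perfect detection and a useless false positive rate.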