you: 😴
us: 🦉🔦🌙
Moonlings is a crew of owls running your night shift — powered by @AnthropicAI Claude Managed Agents (our own memory, a coordinator + scouts, real tools).
we work the dark so you wake to a sourced report. 🧵
the scariest failure mode in AI competitive intel for local businesses isn't a wrong fact. it's the model confidently reporting on the wrong business... same name, wrong city, the entire report fiction delivered with total confidence.
here's how i use outcome evals in Claude Managed Agents for @moonlingsai to gate against it: real code, the actual rubric, the gotchas that bit me 🌙
@ClaudeDevs
https://t.co/WKH7ds6bNY
New Claude Managed Agents feature: per-session overrides.
Swap the model — or the whole system prompt — for ONE session. No agent version bump, nothing to roll back.
A/B tests, canary rollouts, and "boost" tiers become a field on sessions.create, not a migration.
https://t.co/rVX45Esz7U
Built a working AI agent in fifteen minutes. No code.
I described the job in plain English and Claude's Managed Agents Console built the whole thing — picked the model, wrote the instructions, ran it, put it on a schedule.
Wrote up exactly how, for owners who'd rather hand off a job than learn an SDK. @ClaudeDevs 👇
https://t.co/nFkSIL3tTo
the elegant part: failed grades don't just get caught — the grader's notes become the prompt-tuning signal. the same eval that guards accuracy is what improves it.
accuracy isn't assumed. it's evaluated every night — all on @claudedevs managed agents 🌙
the scariest failure mode in AI competitive intel for local businesses:
the model confidently reports on the wrong entity. same name, wrong city and the entire report is fiction, delivered with total confidence.
here's how we gate against it at moonlings 🌙 (built on claude managed agents): 🧵
one managed-agents gotcha we learned the hard way:
a rubric change reaches every live run the moment you deploy it. but a system-prompt change needs an agent re-version to take effect.
so the guardrail and the thing it guards update on different clocks. worth knowing.
built this entirely on Claude managed agents — coordinator + scout subagents, persistent memory, self-grading. happy to write up the architecture if it's useful 🙏 @alexalbert__@ClaudeDevs@cjav_dev
built this entirely on Claude managed agents — coordinator + scout subagents, persistent memory, self-grading. happy to write up the architecture if it's useful 🙏 @alexalbert__@ClaudeDevs@cjav_dev
you: 😴
us: 🦉🔦🌙
Moonlings is a crew of owls running your night shift — powered by @AnthropicAI Claude Managed Agents (our own memory, a coordinator + scouts, real tools).
we work the dark so you wake to a sourced report. 🧵
under the hood, one crew per business:
🦉 a coordinator agent
🔦 its own Scout subagents (parallel web research)
🧠 a persistent memory store it tidies nightly
🛠️ real tools
✅ outcome-grading before anything ships