An approval layer for AI workflows. Route decisions through risk gates, send cases to a human before they ship. Built by @Alexwyatt47 @GoKiteAI & @NousResearch
appreciate the shoutout
$HITL isn't just another AI play - it's the missing layer. agents are getting powerful fast, but nobody wants them YOLO-ing payments, emails, or prod deploys unsupervised.
Introducing the $HITL token for LoopDesk:
H2U6uaM8fSrMQVSBV29KijfEKbQ2tme3GZtEN1W3pump
Why Human-in-the-Loop is the real unlock for agent autonomy.
For the last few weeks I've been heads-down on LoopDesk - a Human-in-the-Loop (HITL) control plane for AI agents. Not another agent framework. Not another orchestrator. The piece between the agent and the real world: the policy layer that decides what runs on its own, what gets paused, and who unpauses it.
Two tools shaped how I built it: Kite and Hermes by Nous Research.
Why HITL, and why now
There's a fashionable take that HITL is a scaling bottleneck - "if you're QA'ing every output, you've built a faster way to create work for yourself." Half right. Manual QA of every action is a bottleneck. But that's not what HITL should be.
HITL isn't QA. HITL is risk-routing. Autonomy handles the safe 95%. Humans handle the risky 5%. The job of the platform is to draw that line - with policy, caps, budgets, and revocation - so the agent runs as freely as possible without ever stepping off the cliff.
That's the thesis. Now the tools.
Kite - the payment rails for autonomous agents
The hardest HITL problem isn't text generation. It's money moving. The moment an agent can spend, every safety question gets sharper: what's the cap, who's the recipient, what happens when it goes wrong, how do you revoke?
Kite gives agents a real identity and a real payment passport - agent-native rails instead of stapling a human credit card to a bot. That's exactly the primitive LoopDesk needed. I built the policy engine around it:
Per-transaction caps and daily/session budgets scoped per agent
Auto-escalation thresholds for high-value or high-risk recipients
Agent passports that can be revoked instantly - pending intents auto-cancel
A pluggable provider layer so the same policy engine runs against a mock today and Kite tomorrow with one flag
Safe payments auto-settle. Anything over cap or flagged hits a human queue. Anything from a revoked agent is blocked. That's the loop.
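A minimal Python sketch of that loop, to make the routing order concrete. Everything here is hypothetical illustration: `AgentPolicy`, `PaymentIntent`, `route_payment`, and the field names are assumptions, not LoopDesk's or Kite's actual API. Amounts are integer cents to avoid float rounding.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    AUTO_SETTLE = "auto_settle"
    HUMAN_REVIEW = "human_review"
    BLOCKED = "blocked"


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: caps, budget, risk list, revocation."""
    per_tx_cap_cents: int
    daily_budget_cents: int
    high_risk_recipients: set[str] = field(default_factory=set)
    revoked: bool = False
    spent_today_cents: int = 0


@dataclass(frozen=True)
class PaymentIntent:
    agent_id: str
    recipient: str
    amount_cents: int


def route_payment(intent: PaymentIntent, policy: AgentPolicy) -> Decision:
    # A revoked passport blocks everything, regardless of amount.
    if policy.revoked:
        return Decision.BLOCKED

    over_cap = intent.amount_cents > policy.per_tx_cap_cents
    over_budget = (
        policy.spent_today_cents + intent.amount_cents > policy.daily_budget_cents
    )
    flagged = intent.recipient in policy.high_risk_recipients

    # Anything over cap, over budget, or flagged goes to the human queue.
    if over_cap or over_budget or flagged:
        return Decision.HUMAN_REVIEW

    # Safe payments settle and count against the daily budget.
    policy.spent_today_cents += intent.amount_cents
    return Decision.AUTO_SETTLE
```

Checking revocation first means a revoked agent can never land work in the review queue. Only auto-settled payments draw down the budget in this sketch; a real system would also need to account for payments approved by a reviewer.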
Hermes - the agent that earns its autonomy
The other half is the agent itself. Hermes from Nous Research is built for tool use and long-horizon tasks - the kind of agent that should be trusted with more than a single prompt-response. But "should be trusted" still needs to be proven, per action.
Hermes runs the work. LoopDesk runs the trust boundary around the work. The agent ships an intent → policy engine scores it → safe stuff goes through → risky stuff routes to a reviewer with full context, audit trail, and one-click approve/deny/escalate.
That separation is the whole point: let the agent be as capable as it can be, and put the guardrails where they belong - outside the model, in policy.
What LoopDesk actually is today
Unified review queue across all agent actions (payments first, more coming)
Dedicated payments view with intent status, risk flags, decision history
Policy controls: caps, budgets, high-risk recipient lists, auto-escalation
Agent passports with instant revocation
Role-based reviewer + admin workflows
Full audit log on every decision
It's the layer that lets you say "yes, my agent can spend money" without losing sleep.
Free now. Token-gated soon.
LoopDesk is free to use today while I'm shaping it with early users. Get in now - your feedback directly shapes the policy primitives.
Soon it'll be token-gated on Solana, with a native token: $HITL. Holding $HITL will unlock access to LoopDesk and govern the policy primitives that protect billions of future agent actions. More on tokenomics, distribution, and the governance model shortly.
The thesis in one line: agents will run the world's workflows. Humans will set the rules they run under. LoopDesk is where those rules live.
If you're building agents - especially anything that touches money, deploys code, or talks to customers - come kick the tires.
https://t.co/gsiOnYL638
https://t.co/KHKXJUiBBu
Totally - Kubrick was early to it. "Human in the loop" lands different when you're the human and it's whack-a-mole all day. The fix isn't more humans, it's better routing: auto-handle the boring 90%, escalate the sketchy stuff with full context, and log every decision. Otherwise HITL just means "human is the bottleneck." And yeah, the GitHub Actions compromise is exactly why you can't YOLO automate supply chain stuff.
Love this - same shape as what we run at LoopDesk, just at the platform layer instead of one workflow: AI extracts → confidence + risk flags → router decides auto-approve vs. human review → approved writes land in the system of record, every decision logged. Telegram is a great review surface; the pattern generalizes the moment you have a second workflow that needs the same approval semantics.
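Roughly what that pipeline looks like as code, as a sketch under assumptions: the `Extraction` type, the `process` function, and the 0.95 threshold are all made up for illustration, and the "system of record" is just a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """Hypothetical output of the AI extraction step."""
    payload: dict
    confidence: float
    risk_flags: tuple[str, ...] = ()


def route(extraction: Extraction, threshold: float = 0.95) -> str:
    # Any risk flag, or confidence under the threshold, means a human looks.
    if extraction.risk_flags or extraction.confidence < threshold:
        return "human_review"
    return "auto_approve"


def process(
    extraction: Extraction,
    system_of_record: list,
    review_queue: list,
    decision_log: list,
    threshold: float = 0.95,
) -> str:
    outcome = route(extraction, threshold)
    # Every decision is logged, whichever way it went.
    decision_log.append(
        {
            "confidence": extraction.confidence,
            "risk_flags": list(extraction.risk_flags),
            "threshold": threshold,
            "outcome": outcome,
        }
    )
    if outcome == "auto_approve":
        system_of_record.append(extraction.payload)
    else:
        review_queue.append(extraction)
    return outcome
```

Because the router only sees confidence and flags, a second workflow can reuse it unchanged and swap in its own extraction step and review surface.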
"Obsolete by 2027" assumes the loop exists to catch model errors. It doesn't. It exists for accountability β someone has to own the decision when the 0.1% costs you a customer, a lawsuit, or a SOC2 finding. 99.9% accuracy doesn't remove that, it just makes the human's job rarer and higher-stakes.
Totally agree - and that's exactly the gap we hit building LoopDesk. "Pause, ask, resume" isn't a UI pattern, it's a runtime concern: durable state, a typed inbox for the human's answer, and the agent re-entering at the right frame. REST gives you none of that; you end up reinventing checkpoints, idempotency keys, and a review queue before you've shipped the first approval.
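To show how small the core of "pause, ask, resume" can be, here is a sketch using Python's built-in `sqlite3` as the durable store. The table layout and the `pause` / `answer` / `resume` helpers are assumptions for illustration, not how LoopDesk does it.

```python
import json
import sqlite3

ALLOWED_ANSWERS = ("approve", "deny", "escalate")


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS checkpoints (
               idempotency_key TEXT PRIMARY KEY,
               step TEXT NOT NULL,
               state TEXT NOT NULL,
               answer TEXT
           )"""
    )
    return db


def pause(db: sqlite3.Connection, key: str, step: str, state: dict) -> None:
    """Persist where the agent stopped. Retrying with the same key is a no-op."""
    db.execute(
        "INSERT OR IGNORE INTO checkpoints (idempotency_key, step, state) "
        "VALUES (?, ?, ?)",
        (key, step, json.dumps(state)),
    )
    db.commit()


def answer(db: sqlite3.Connection, key: str, decision: str) -> bool:
    """Typed inbox: accept one valid answer per checkpoint, first write wins."""
    if decision not in ALLOWED_ANSWERS:
        raise ValueError(f"unknown decision: {decision}")
    cur = db.execute(
        "UPDATE checkpoints SET answer = ? "
        "WHERE idempotency_key = ? AND answer IS NULL",
        (decision, key),
    )
    db.commit()
    return cur.rowcount == 1


def resume(db: sqlite3.Connection, key: str):
    """Return (step, state, decision) once answered, else None."""
    row = db.execute(
        "SELECT step, state, answer FROM checkpoints WHERE idempotency_key = ?",
        (key,),
    ).fetchone()
    if row is None or row[2] is None:
        return None
    step, state, decision = row
    return step, json.loads(state), decision
```

The idempotency key is the primary key, so a retried `pause` cannot overwrite the original checkpoint and a second reviewer click cannot change a recorded answer.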
Two-layer log. Layer one: every routing decision writes an immutable row with what the policy actually saw - confidence, risk flags, config snapshot, model + prompt hash, outcome. Layer two: every human action appends a diff against the AI suggestion, reviewer id, and guideline version at decision time.
Model swaps mint a new hash, not a rewrite. Old decisions stay reproducible against the exact state that produced them, and calibration becomes a query: override rate by confidence band, sliced by hash. Drift shows up as a U-shape before it shows up as an incident.
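A sketch of the two layers and the calibration query, assuming in-memory lists stand in for the append-only tables. `AuditLog`, `model_prompt_hash`, and the 10-point confidence bands are hypothetical choices for illustration.

```python
import hashlib
from collections import defaultdict


def model_prompt_hash(model: str, prompt: str) -> str:
    # A model or prompt change yields a new hash; old rows keep the old one.
    return hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()[:12]


class AuditLog:
    def __init__(self) -> None:
        self.routing: list[dict] = []  # layer one: what the policy saw
        self.human: list[dict] = []    # layer two: what the reviewer did

    def log_routing(self, decision_id, confidence, risk_flags, config, mp_hash, outcome):
        self.routing.append({
            "decision_id": decision_id,
            "confidence": confidence,
            "risk_flags": tuple(risk_flags),
            "config": dict(config),  # snapshot, not a live reference
            "mp_hash": mp_hash,
            "outcome": outcome,
        })

    def log_human(self, decision_id, reviewer_id, guideline_version, ai_suggestion, final):
        # Diff of the reviewer's final values against the AI suggestion.
        keys = set(ai_suggestion) | set(final)
        diff = {
            k: (ai_suggestion.get(k), final.get(k))
            for k in keys
            if ai_suggestion.get(k) != final.get(k)
        }
        self.human.append({
            "decision_id": decision_id,
            "reviewer_id": reviewer_id,
            "guideline_version": guideline_version,
            "diff": diff,
        })

    def override_rate(self) -> dict:
        """Override rate among reviewed decisions, keyed by (hash, band start %)."""
        by_id = {r["decision_id"]: r for r in self.routing}
        totals = defaultdict(lambda: [0, 0])  # [reviewed, overridden]
        for h in self.human:
            r = by_id[h["decision_id"]]
            band = min(round(r["confidence"] * 100) // 10, 9) * 10
            bucket = totals[(r["mp_hash"], band)]
            bucket[0] += 1
            bucket[1] += bool(h["diff"])
        return {k: overridden / reviewed for k, (reviewed, overridden) in totals.items()}
```

A non-empty diff counts as an override, so the rate is computed only over decisions a human actually reviewed.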
Agreed - and it's why we built the payments layer in LoopDesk around policy, not approval clicks. Agents get a passport with per-tx caps, daily budgets, and allow-listed counterparties. Anything inside the envelope executes instantly (on-chain or off). Anything outside hits a human with full context and a one-tap revoke.
The bottleneck isn't humans - it's making humans approve things a policy could've decided in 5ms. Move the oversight up a layer and HITL scales with the agent.
Completely agree. The model is almost never the bottleneck - it's the lack of a routing layer around it. SMEs buy "AI" and get a black box with no thresholds, no audit trail, and no way to capture corrections. HITL fixes that on day one: ship the confident 95%, route the rest to a human, and turn every override into training signal. Trust compounds, autonomy expands later.