everyone is building ai agents.
almost no one can debug them in production.
when something breaks:
you get logs. maybe.
you guess. usually.
you pray. often.
that didn’t sit right with me.
so i built @PandaProbe, an open-source agent engineering platform. it gives you:
• full execution traces
• session-level evals
• failure debugging
• continuous monitoring
stop guessing what your agent did.
see it. measure it. fix it.
if you’re serious about shipping agents:
https://t.co/3mRVsyt394
check out our blog on why agent engineering ≠ software engineering.
we break down our vision for @PandaProbe and how tracing + evals actually help you fix your agents.
https://t.co/bAJlfrKtdZ
i just had one of those “this is how software should feel” moments…
building my startup, i’ve tried a lot of tools. most are fine. some are painful. but every once in a while you run into something that’s just… smooth. like everything clicks, nothing fights you, and you actually enjoy the process.
i’ve had that feeling with @mercury. same with @finta. and recently, with @mintlify.
and it got me thinking — shouldn’t that be the bar?
like seriously, why isn’t “this feels amazing to use” the default goal for every product?
for me, that experience came while building my docs with mintlify. everything from setup to writing to publishing just felt frictionless. no weird configs, no fighting the tool, just flow.
anyway, that experience helped me ship something i’m really excited about:
@PandaProbe docs are now live 🐼
https://t.co/7q5VR2erya
if you’re building anything dev-facing, highly recommend giving mintlify a shot. this is what great product feels like.
everyone is shipping “agents”
nobody knows if they actually work
we dropped a blog on why agent evals are the whole game
(and why your old eval mindset is useless) 👇🏽
https://t.co/jhf4YOytXH
we’re building the open-source standard for agent engineering
if you’re cracked, pull up.
most agents are cooked in the backend and nobody is watching. 💀
we're building PandaProbe to become the new standard for agent engineering.
- evals that aren't mid
- research-grounded metrics
- live monitoring
engineering > glazing. 🐼🧪
https://t.co/1zRtJzFY6H
killing Chirpz. it’s over. 💀
pivoting because the agentic infra out here is straight brainrot 🤡 everyone is flexing their 'agents' but nobody talks about how they’re actually cooked in the backend.
Langfuse? too old. LangSmith? too expensive + closed. the industry is glazing 'vibe-check' evaluations and it’s a skill issue.
so I'm building @PandaProbe to be the standard. open source, cracked tech, and sharp edges only. 🐼🧪
it’s time for real agent engineering.
stop being mid:
https://t.co/REE7QeK20r
https://t.co/3mRVsyt394