Your vulnerability scan results could leak to attackers via DNS rebinding. CVE-2025-59163 affects SafeDep Vet MCP Server running SSE transport.
The attack: A single website visit. The payload: Your entire package vulnerability database. The fix: Already shipped.
Here's how it works:
@camhberg @JohnNosta What are you excited about at the moment in empirical research?
I really enjoyed your subjective experience / self-reporting work from last fall. Ran a 3-month program off it :)
My biggest takeaways from Claude Code's Head of Product @_catwu:
1. Anthropic’s product development timelines have gone from six months to one month, sometimes one week, sometimes one day. Part of this acceleration is access to the latest models (i.e. Mythos). Another is shipping new products into “research preview,” making clear it's early, experimental, and might not be supported forever. Another is an evergreen “launch room” where engineers post ready features and marketing turns around announcements the next day.
2. The PM role is shifting from coordinating multi-month roadmaps to enabling teams to ship daily. As Cat puts it, “There should be less emphasis on making sure you are aligning your multi-quarter roadmaps with your partner teams and more emphasis on, OK, how can we figure out the fastest way to get something out the door?”
3. The most efficient shipping unit is an engineer with great product taste. On Cat’s team, many engineers go end-to-end—from seeing user feedback on Twitter to shipping a product by the end of the week—without a PM involved. Also, almost all the PMs on the Claude Code team have either been engineers or ship code themselves, and the designers have been front-end engineers. The roles are merging, and the most valuable skill is product taste, not job title.
4. Build products that are on the edge of working. Claude Code’s code review product failed multiple times because earlier models weren’t accurate enough. But because the prototype was already built, they could swap in Opus 4.5 and 4.6 and immediately test whether the gap was closed. Teams that wait for the model to be ready will always be a cycle behind.
5. The most underrated skill for building AI products is asking the model to introspect on its own mistakes. Cat regularly asks the model why it made an unexpected decision. The model will explain that something in the system prompt was confusing, or that it delegated verification to a subagent that didn’t check its work. This reveals what misled the model so the team can fix the harness.
6. Every model release forces their team to revisit existing products and audit their system prompt to remove features the model no longer needs. Claude Code’s to-do list was a crutch for earlier models that couldn’t track their own work. With Opus 4, the model handles it natively. Features built as scaffolding for weaker models become debt when the model catches up—so the team actively strips them.
7. Anthropic employees build custom internal tools instead of buying SaaS products. A sales team member built a web app that pulls from Salesforce, Gong, and call notes to auto-customize pitch decks—work that used to take 20 to 30 minutes now takes seconds. Their core stack is Claude Code, Cowork, and Slack. No Notion, no Linear, no Figma.
8. People underestimate how much Claude’s personality contributes to its success. As Cat describes it, “When you reflect on everyone you’ve worked with, there’s just some people where you’re like, I really like their energy, their vibe.” Claude is designed to be low-ego, positive, competent, and earnest—qualities that make it feel like a great coworker, not just a tool. This isn’t cosmetic; it’s what makes people want to use Claude for hours every day. The team has a dedicated person, Amanda, who “molds Claude’s character,” and it’s one of the hardest roles at the company because success is so subjective.
9. The future of work is managing fleets of AI agents, not doing the work yourself. Cat sees a clear progression: first, individual tasks become successful. Then people start running multiple tasks at the same time (multi-Clauding). Next, people will run 50 or 100 tasks simultaneously, which will require new infrastructure—remote execution, better interfaces for managing tasks, agents that fully verify their work, and self-improving systems that incorporate feedback. The human role shifts from doing the work to knowing which tasks to look into, verifying outputs, and giving feedback that makes the system better over time.
10. Hire people who lean into chaos and face every challenge with a smile. At Anthropic, there are weeks when a P0 on Sunday becomes a P00 by Monday and a P000 by Monday afternoon. If you get too stressed about any one thing, you’ll burn out. Their team looks for people who can look at a hard challenge and say, “Wow, that’s gonna be hard. But I’m excited to tackle it and I’m gonna do the best that I possibly can.” This mindset—optimism, resilience, and comfort with constant change—is increasingly essential as the pace of AI development accelerates.
Don't miss the full conversation: https://t.co/1wOUHcdYQN
Concerned about mitigating Agentic AI Risk?
Learn how to lower your exposure this upcoming Sunday at the Minimum AI Safety Conference.
If you do, then maybe your AI will not set you up to look like Vercel.
All it takes is one AI stumbling into a prompt injection, and you have a security incident.
@better_dotgame @steipete I have been wondering about this ninaagent I see you have so many screenshots of :D
ima click that free trial button at work this week
@AndrewYNg Hype people are doing their job.
Overselling.
The people doing the building - their definition of the moving target of the _a_g_i_ label - is a lot more fun to learn from.
Dangerous to over-index on their opinions though.
Maybe superforecasters could provide a good name?
@HamelHusain @sh_reya I will circulate this into my training material for the people in my company who want to wrap their heads around evals and why they are relevant to our product.
Thanks for making 10 / 10 teaching materials :)
@SergioRocks This is why I went remote a few weeks before COVID.
I hope to never return to the office now.
At times, I miss the in-person nature of building alongside others.
Then, I just schedule some 1:1s and a few workshops.
All that is needed.
@omarsar0 Have you come across productive setups that run all day in a high trust environment (e.g. healthtech)?
I am working on sandboxing my agents and want to find the balance between speed & downside risk.
I want to avoid the surprise data leak from an indirect prompt injection attack.