Evan Harris

@Evan__Harris

Agentic systems engineer. Securing MCP integrations. Building dev tools & Obsidian plugins.

Joined October 2017

244 Following

702 Followers

11.1K Posts

Pinned Tweet

Evan Harris

@Evan__Harris

8 months ago

Your vulnerability scan results could leak to attackers via DNS rebinding. CVE-2025-59163 affects SafeDep Vet MCP Server running SSE transport. The attack: A single website visit. The payload: Your entire package vulnerability database. The fix: Already shipped. Here's how it works:

436

Evan Harris

@Evan__Harris

11 days ago

@akshatbuidls @garrytan You still going through the code? :3

Evan Harris

@Evan__Harris

11 days ago

@Anshuman0769 @garrytan He published this on GitHub

Evan Harris

@Evan__Harris

11 days ago

@camhberg @JohnNosta What are you excited about at the moment in empirical research? I really enjoyed your subjective experience/ self reporting work from last fall. Ran a 3 month program off it :)

27 days ago

@Steve_Yegge tips on upskilling employees on AI literacy?

Evan Harris

@Evan__Harris

about 2 months ago

#5 is the one I am immediately trying to get everyone at my company to adopt. You do not need to be an engineer or a PM to do this.

Lenny Rachitsky

@lennysan

about 2 months ago

My biggest takeaways from Claude Code's Head of Product @_catwu:

1. Anthropic’s product development timelines have gone from six months to one month, sometimes one week, sometimes one day. Part of this acceleration is access to the latest models (i.e. Mythos). Another is shipping new products into “research preview,” making clear it's early, experimental, and might not be supported forever. Another is an evergreen "launch room" where engineers post ready features and marketing turns around announcements the next day.

2. The PM role is shifting from coordinating multi-month roadmaps to enabling teams to ship daily. As Cat puts it, “There should be less emphasis on making sure you are aligning your multi-quarter roadmaps with your partner teams and more emphasis on, OK, how can we figure out the fastest way to get something out the door?”

3. The most efficient shipping unit is an engineer with great product taste. On Cat’s team, many engineers go end-to-end, from seeing user feedback on Twitter to shipping a product by the end of the week, without a PM involved. Also, almost all the PMs on the Claude Code team have either been engineers or ship code themselves, and the designers have been front-end engineers. The roles are merging, and the most valuable skill is product taste, not job title.

4. Build products that are on the edge of working. Claude Code’s code review product failed multiple times because earlier models weren’t accurate enough. But because the prototype was already built, they could swap in Opus 4.5 and 4.6 and immediately test whether the gap was closed. Teams that wait for the model to be ready will always be a cycle behind.

5. The most underrated skill for building AI products is asking the model to introspect on its own mistakes. Cat regularly asks the model why it made an unexpected decision. The model will explain that something in the system prompt was confusing, or that it delegated verification to a subagent that didn’t check its work. This reveals what misled the model so the team can fix the harness.

6. Every model release forces their team to revisit existing products and audit their system prompt to remove features the model no longer needs. Claude Code’s to-do list was a crutch for earlier models that couldn’t track their own work. With Opus 4, the model handles it natively. Features built as scaffolding for weaker models become debt when the model catches up, so the team actively strips them.

7. Anthropic employees build custom internal tools instead of buying SaaS products. A sales team member built a web app that pulls from Salesforce, Gong, and call notes to auto-customize pitch decks; work that used to take 20 to 30 minutes now takes seconds. Their core stack is Claude Code, Cowork, and Slack. No Notion, no Linear, no Figma.

8. People underestimate how much Claude’s personality contributes to its success. As Cat describes it, “When you reflect on everyone you’ve worked with, there’s just some people where you’re like, I really like their energy, their vibe.” Claude is designed to be low-ego, positive, competent, and earnest, qualities that make it feel like a great coworker, not just a tool. This isn’t cosmetic; it’s what makes people want to use Claude for hours every day. The team has a dedicated person, Amanda, who “molds Claude’s character,” and it’s one of the hardest roles at the company because success is so subjective.

9. The future of work is managing fleets of AI agents, not doing the work yourself. Cat sees a clear progression: first, individual tasks become successful. Then people start running multiple tasks at the same time (multi-Clauding). Next, people will run 50 or 100 tasks simultaneously, which will require new infrastructure: remote execution, better interfaces for managing tasks, agents that fully verify their work, and self-improving systems that incorporate feedback. The human role shifts from doing the work to knowing which tasks to look into, verifying outputs, and giving feedback that makes the system better over time.

10. Hire people who lean into chaos and face every challenge with a smile. At Anthropic, there are weeks when a P0 on Sunday becomes a P00 by Monday and a P000 by Monday afternoon. If you get too stressed about any one thing, you’ll burn out. Their team looks for people who can look at a hard challenge and say, “Wow, that’s gonna be hard. But I’m excited to tackle it and I’m gonna do the best that I possibly can.” This mindset (optimism, resilience, and comfort with constant change) is increasingly essential as the pace of AI development accelerates.

Don't miss the full conversation: https://t.co/1wOUHcdYQN

297

844K

Evan Harris

@Evan__Harris

about 2 months ago

Concerned about mitigating Agentic AI Risk? Learn how to lower your exposure this upcoming Sunday at the Minimum AI Safety Conference. If you do, then maybe your AI will not set you up to look like Vercel. All it takes is one AI stumbling into a prompt injection, and you have a security incident.

Evan Harris

@Evan__Harris

3 months ago

@nginitycloud @itsolelehmann A readme would be cool

Evan Harris

@Evan__Harris

3 months ago

@adrscott @thsottiaux AIS

Evan Harris

@Evan__Harris

3 months ago

You know how it is...

Evan Harris

@Evan__Harris

4 months ago

@claudeai because you should never take a break from technology

Evan Harris

@Evan__Harris

4 months ago

When the storm is coming Is the best time To take a deep breath

Evan Harris

@Evan__Harris

4 months ago

@better_dotgame @steipete I have been wondering about this ninaagent i see you have so many screenshots of :D ima click that free trial button at work this week

Evan Harris

@Evan__Harris

4 months ago

@z4um41 Could you show me where - https://t.co/2AkviMFUGp ?

Evan Harris

@Evan__Harris

5 months ago

@rywalker Deterministic defenses against indirect prompt injection

Evan Harris

@Evan__Harris

5 months ago

@AndrewYNg Hype people are doing their job. Overselling. The people doing the building - their definition of the moving target of the _a_g_i_ label - is a lot more fun to learn from. Dangerous to over index on their opinions though. Maybe superforecasters could provide a good name?

Evan Harris

@Evan__Harris

5 months ago

@yavnun @omarsar0 Are you shipping to main or human in the loop for code review? Validation does not exclude human code review..

Evan Harris

@Evan__Harris

5 months ago

@HamelHusain @sh_reya I will circulate this into my training material for the people in my company who want to wrap their heads around evals and why they are relevant to our product. Thanks for making 10 / 10 teaching materials :)

Evan Harris

@Evan__Harris

5 months ago

@SergioRocks This is why I went remote a few weeks before COVID. I hope to never return to the office now. At times, I miss the in person nature of building alongside others. Then, I just schedule some 1:1s and a few workshops. All that is needed.

Evan Harris

@Evan__Harris

5 months ago

@omarsar0 Have you come across productive setups that run all day in a high trust environment (e.g. healthtech)? I am working on sandboxing my agents and want to find the balance between speed & downside risk. I want to avoid the surprise data leak from an indirect prompt injection attack

243
