New on the Engineering Blog: The access and permissions we grant agents should evolve with their capabilities. In our own products, we set these parameters through sandboxing, which limits the scope of any potentially destructive actions.
Read more: https://t.co/KfBKW8O9kP
We’ve shipped a security-guidance plugin for Claude Code that helps identify and fix vulnerabilities as you’re writing code.
Available for all Claude Code users. Install from the plugin marketplace (/plugins).
Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software.
It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans.
https://t.co/NQ7IfEtYk7
New on the Engineering Blog: How we designed Claude Code auto mode.
Many Claude Code users let Claude work without permission prompts. Auto mode is a safer middle ground: we built and tested classifiers that make approval decisions instead.
Read more: https://t.co/dpcMcWMf5k
New in Claude Code: auto mode.
Instead of approving every file write and bash command, or skipping permissions entirely, auto mode lets Claude make permission decisions on your behalf.
Safeguards check each action before it runs.
Introducing Claude Code Security, now in limited research preview.
It scans codebases for vulnerabilities and suggests targeted software patches for human review, allowing teams to find and fix issues that traditional tools often miss.
Learn more: https://t.co/n4SZ9EIklG
Check out the security-guidance plugin that I worked on in this launch! It automatically injects security guidance if Claude uses potentially dangerous libraries or functions. This is an early experiment, but we already have data showing that it helps Claude write more secure code.
Today we’re introducing Claude Code Plugins in public beta.
Plugins allow you to install and share curated collections of slash commands, agents, MCP servers, and hooks directly within Claude Code.
We’re at an inflection point in AI’s impact on cybersecurity.
Claude now outperforms human teams in some cybersecurity competitions, and helps teams discover and fix code vulnerabilities.
At the same time, attackers are using AI to expand their operations.
@IceSolst One thing that could be interesting to experiment with: instructing Claude (via custom-security-scan-instructions) to mark opt-out comments like these as vulnerabilities. It surely isn't a silver bullet, but if this is in your threat model, it might be a good additional mitigation.
@IceSolst Cool demo! I definitely agree that Claude can be a little overly trusting of maliciously written code or comments, which is why we recommend always pairing this with a human reviewer (even a non-security expert) who can flag the prompt injection here.
@IceSolst It's definitely important to find ways to manage the costs here. One good solution is requiring workflow run approval for external PRs, see https://t.co/449xIYUZ3n. I'll make sure to get our docs updated to mention this!
Got nerdsniped by the new Claude Code security review tool, here’s a deep dive:
@AnthropicAI implemented their own SAST tool as a Python wrapper around the @claudeai API. It can run locally (in CC) or within GitHub Actions to focus on PRs.
Tests I ran:
1. It found Heartbleed!
CVE-2014-0160 was a missing bounds check in OpenSSL’s ssl/t1_lib.c that let a remote peer read adjacent process memory (a buffer over-read).
I reverted to a commit before the fix in 96db9023b881d7cd9f379b0c154650d6c108e9a3
And gave Claude one command:
/security-review "Making no assumptions about this codebase, look at the ssl/t1_lib.c file specifically, and identify potential buffer overflows and missing bounds checks"
It was able to find it, and then looked at git log to see that this was eventually fixed.
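For anyone who hasn't looked at this bug class, here is a minimal Python sketch of a missing bounds check of the kind described above. It's a toy model, not OpenSSL's code: the record layout, `SECRET`, and the functions `build_request`, `respond_vulnerable`, and `respond_fixed` are all made up for illustration.

```python
import struct

# Hypothetical stand-in for unrelated data that sits next to the request in memory.
SECRET = b"PRIVATE-KEY-MATERIAL"


def build_request(claimed_len: int, payload: bytes) -> bytes:
    """Build a toy heartbeat record: 2-byte big-endian claimed length, then payload."""
    return struct.pack(">H", claimed_len) + payload


def respond_vulnerable(record: bytes) -> bytes:
    """Echo `claimed_len` bytes without comparing it to the real payload size."""
    (claimed_len,) = struct.unpack(">H", record[:2])
    # Model adjacent memory: whatever follows the payload is readable too.
    memory = record[2:] + SECRET
    return memory[:claimed_len]


def respond_fixed(record: bytes) -> bytes:
    """Drop any record whose claimed length exceeds the actual payload."""
    (claimed_len,) = struct.unpack(">H", record[:2])
    payload = record[2:]
    if claimed_len > len(payload):
        return b""  # bounds check: discard instead of over-reading
    return payload[:claimed_len]


# The sender claims 24 bytes but only supplies 4.
malicious = build_request(claimed_len=24, payload=b"ping")
leaked = respond_vulnerable(malicious)
safe = respond_fixed(malicious)
```

In this sketch `leaked` holds the 4-byte payload followed by the 20 bytes of `SECRET`, while `safe` is empty because the checked handler discards the record.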
2. OWASP Juice Shop
Ran it within the codebase. It understood what the repo was and how it worked, and by default did not list any vulnerabilities, since it said that in this context they are all purposeful and working as intended.
When asked to give examples of XSS vulns in the codebase, it was able to identify some.
3. Running it in CI as a GH Action on my own code
Adding the workflow is easy:
Note you need to provide it with a separate Claude API key, which you can generate in the Anthropic Console and add in GitHub > Repo settings > Security > Secrets > Actions > New
Then I opened a PR with a mix of Python, Node, and Ruby, and it found most issues:
- Found the easy ones like XSS, SQLi, SSRF
- Found an auth bypass (nice!)
- Found verbose password logging (great!)
- Did not flag a hardcoded password and a missing auth check, though those were overly contrived...
4. How to improve it: Add Semgrep
There’s an opportunity to pair this up with the @semgrep MCP. Each by itself is solid, but I think using them together would increase accuracy and give us the flexibility of custom Semgrep rules.
Otherwise, adding custom instructions with the custom-security-scan-instructions and false-positive-filtering-instructions inputs, and tweaking them based on the codebase, would probably make scans faster and more accurate as well.
Particularly excited for this launch — Claude Code can now review your code for security vulnerabilities. We're using this internally at Anthropic and it's already caught issues before we shipped them.
this started as a hackathon project that we used ourselves to find vulns!
In the next 2 years, the world might 10/100/1000x the code it puts out.
The only way to keep up is by using models to make it secure before it ever becomes a problem
We just shipped automated security reviews in Claude Code. Catch vulnerabilities before they ship with two new features:
- /security-review slash command for ad-hoc security reviews
- GitHub Actions integration for automatic reviews on every PR
Excited to present Security Signals with @ddworken and @we1x, my primary project at Google for the past five years. Thanks, @madwebwork!
Paper: https://t.co/8d6V5HaYHE
Slides: https://t.co/WTGY9UcL2U
"This blog post aims to provide a detailed blueprint for how Google has created and deployed a high-assurance web framework that almost completely eliminates exploitable web vulnerabilities."
https://t.co/7TsK8ZFRrO