I kinda feel bad for the tech bros in cyber security who spend 8-10 hours a day writing agents and prompts, solving hard problems, and handholding AI, all to say that AI did all the work completely on its own and fully unguided when they find a good bug.
I've been seeing posts all over about the state of CTFs post-LLM. I've seen many attempts to explain why this is just a new evolution of CTFs, but I fundamentally disagree. I believe the original spirit is gone and I've written why in my blog.
https://t.co/tgUZOGkhGV
🚨 BREAKING: Wiz Research discovered Remote Code Execution on https://t.co/SvN2lGsnbO with a single git push
The flaw in @github allowed unauthorized access to millions of repositories belonging to other users and organizations 🤯
@seanhn I am struggling to answer this question too. Aggressive testing would help preserve "software semantics", but the price to pay is almost certainly monstrous codebases and inevitable software enshittification...
@m19o__ @tjbecker So, in my experience it does. Especially when flooded by dozens of warnings with no backing, the ones with a PoC that crashes one of their harnesses get prioritized. And, yes, exploits are the ultimate proof against stubbornness, but my argument is that they shouldn't be needed.
@m19o__ @tjbecker "How effective security controls are" != "Which bugs should be prioritized".
E.g., a validated sanitizer crash in the JIT is more than enough motivation for patching. Using a full exploit for this is too expensive and too sensitive to FN.
THIS. Weirdly enough, it seems like AIxCC was a "Eureka moment" for a few people. I remember many colleagues calling our work "slop" and now I see them being totally shocked. Tbh, I think AIxCC could be one of the main reasons why current models are so good at security.
So, @AnthropicAI used 350 rounds of Claude to create a full exploit chain for CVE-2026-2796 (which is really cool).
https://t.co/2UK8snyYb6
But I think the cool part of the story is that to confirm that bug, you only need 13 minutes and $11 to get some real evidence :D
Sometimes I think "why would a software security company raise $100M, how can you even spend that much money on engineering?"
Then I go to RSA and learn the money is for what really matters: paying for booths, ice cream, car wraps, baristas, parties, action figures, billboards...