While it's certainly possible Mythos could have found the same bugs if its operator had invested more time and focus in HtmlSanitizer, most defenders don't have access to Mythos. I sure didn't.
To be terse: Less FOMO, more resourcefulness.
If you're curious, the skills I used to find these bugs and more are still internal-only (for now), but they build atop Trailmark.
https://t.co/cVQyWmJz9k
RSA private keys biased toward 0 bits can be factored by swapping a hard math problem for an easy one: integer factorization becomes polynomial factorization.
We found hundreds of real-world keys vulnerable to this. Many traced to a type mismatch in CompleteFTP (now patched): each 32-bit limb got only 8 bits of randomness. We recovered 603 RSA and 74 DSA private keys. https://t.co/C2jcxVW9WG
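For anyone wondering why the swap works, here is a rough sketch. It assumes each 32-bit limb holds a single nonzero 8-bit value, which is one reading of the description above and not a confirmed detail of the CompleteFTP bug. Under that assumption p and q are polynomials with small coefficients evaluated at 2^32, and the limbs of N = p*q are exactly the coefficients of the product polynomial because nothing carries. The helper names are made up for illustration, the inputs are arbitrary weak integers rather than primes, and the final step (factoring the polynomial over the integers) is left out.

```python
import random

LIMB_BITS = 32   # limb width mentioned in the post
WEAK_BITS = 8    # bits of randomness per limb mentioned in the post


def weak_integer(num_limbs, rng):
    """Build an integer whose 32-bit limbs each hold one nonzero 8-bit value.

    Hypothetical layout for illustration; real keys would also need to be prime.
    """
    limbs = [rng.randrange(1, 1 << WEAK_BITS) for _ in range(num_limbs)]
    value = sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))
    return value, limbs


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def to_limbs(n):
    """Split a positive integer into 32-bit limbs, least significant first."""
    mask = (1 << LIMB_BITS) - 1
    limbs = []
    while n:
        limbs.append(n & mask)
        n >>= LIMB_BITS
    return limbs


# Seeded PRNG only so the demo is repeatable; never use it for real keys.
rng = random.Random(0)
p, p_limbs = weak_integer(16, rng)
q, q_limbs = weak_integer(16, rng)
n = p * q

# Each product coefficient is at most 16 * 255**2, far below 2**32, so no
# carries occur and the limbs of n are the coefficients of P(x) * Q(x).
product_coeffs = poly_mul(p_limbs, q_limbs)
limbs_match = to_limbs(n) == product_coeffs
```

Once the limbs of N are known to be polynomial coefficients, recovering p and q reduces to factoring that polynomial, which is the easy problem the post refers to.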
Can't wait to show everyone what I've built atop this little tool :)
Trailmark is a cool foundational piece of code, but it's not the headliner. That will come later :D
.@obsdmd asked us to audit their Sync protocol. Our engineers delivered eleven findings.
Five went beyond the original scope and covered system-level issues that weren't specific to Sync itself.
We see this pattern often with our clients. We respect scope as a delivery contract, but we have a professional obligation to surface what our engineers see.
Anything they catch is flagged, and the client decides what to do. When a finding warrants it, the report includes an Exploit Scenario, the path from observation to working exploit. We take an attacker's mindset, and exploit scenarios show our clients what a bug costs them.
With security-first teams like Obsidian, that meant five system-level findings that were either patched or explicitly acknowledged:
1. Math.random used for password and salt generation (High severity, medium difficulty)
2. Variable-time comparison of password-reset tokens and MFA recovery codes (High severity, high difficulty)
3. TOTP codes replayable within the validity window (High severity, high difficulty)
4. Plaintext storage of MFA secrets and recovery codes (High severity, medium difficulty)
5. Password reset without MFA (Medium severity, medium difficulty)
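Findings 1 through 3 have well-known general mitigations. The sketch below illustrates them in Python using only the standard library. It is not Obsidian's actual fix, and the function and class names are hypothetical. It assumes the TOTP code itself has already been verified elsewhere for a given time step.

```python
import hmac
import secrets


def new_salt(num_bytes: int = 16) -> bytes:
    """Finding 1: draw salts from the OS CSPRNG instead of a general-purpose PRNG."""
    return secrets.token_bytes(num_bytes)


def new_recovery_code() -> str:
    """Finding 1: generate secrets such as recovery codes from the same CSPRNG."""
    return secrets.token_urlsafe(16)


def tokens_match(supplied: str, expected: str) -> bool:
    """Finding 2: compare secrets without leaking where the first mismatch is."""
    return hmac.compare_digest(supplied.encode(), expected.encode())


class TotpReplayGuard:
    """Finding 3: remember the last accepted time step per user.

    In-memory storage is an assumption for the demo; a real service would
    persist this alongside the user's MFA state.
    """

    def __init__(self) -> None:
        self._last_step: dict[str, int] = {}

    def accept(self, user_id: str, time_step: int) -> bool:
        """Call only after the code was verified for time_step."""
        if time_step <= self._last_step.get(user_id, -1):
            return False  # same or older step: treat as a replay
        self._last_step[user_id] = time_step
        return True


guard = TotpReplayGuard()
first_use = guard.accept("alice", 100)   # first use of this time step
replayed = guard.accept("alice", 100)    # same step again is refused
```

Rejecting any step that is not strictly newer than the last accepted one means a captured code cannot be reused, even while it is still inside its validity window.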
‼️🚨 Pwn2Own Berlin 2026 just hit a wall. For the first time in 19 years, ZDI rejected dozens of working zero-day RCE submissions because organizers ran out of contest slots.
Rejected hackers are now going public with PoC demos and direct vendor disclosures, breaking Pwn2Own's usual secrecy.
▪️ AI surfaces a massive wave of 0-day RCEs.
▪️ Submissions overwhelm ZDI past max capacity.
▪️ Slots run out. Researchers with working chains get rejected.
▪️ "Revenge disclosures" begin. ← we are here.
Confirmed casualties so far:
▪️ @xchglabs : 86 vulnerabilities prepared (PyTorch, NVIDIA, Linux KVM, Oracle, Docker, Ollama, Chroma, LiteLLM, llama.cpp). All rejected. Now reporting directly to vendors with writeups dropping as patches land.
▪️ @ggwhyp : full-chain Firefox RCE on Windows. Rejected. Publicly demoed (HTML page → cmd.exe → calc.exe). Responsibly disclosed to Mozilla.
▪️ @yunsu_dev : working RCE chain, rejected. Submitting elsewhere.
▪️ @ryotkak : tried to register for 3+ weeks. ZDI confirmed "at maximum capacity, can't add extra contest days." Considered canceling flight and hotel.
▪️ @anzuukino2802 : Claude Code RCE PoC. Rejected.
▪️ @desckimh : 0-day RCEs in Ollama and LM Studio. Rejected.
Reported impact: a community-estimated 150+ researchers tried to register. Accepted contestants are now being warned about collisions. Rejected vulnerabilities going to bug bounty programs may trigger pre-event patches that invalidate the work of those who got in.
ZDI has not publicly addressed the capacity issue. The event still runs May 14-16 in Berlin.
We beat Google's zero-knowledge proof of quantum cryptanalysis by exploiting bugs in their Rust ZKP code, then forged a proof with better metrics. Plus 11 new public reviews, Trailmark, MuTON and mewt, dimensional analysis, and more. May Tribune: https://t.co/1i8R9LmMVQ
@caseyjohnellis If you're using LLMs to generate vuln reports and you're not spending 95% or more of your effort on validating findings, writing/testing PoCs, and understanding the project's threat model and the potential impact of your finding, you are doing it wrong.
@caseyjohnellis The problem with naming/shaming is that it creates perverse incentives in unexpected ways.
Used to run a HackerOne program that immediately publicly disclosed bogus issues. Got spammed by trolls spewing racist memes hoping to get a signal boost.
The amount of squabbling over bugs, bug quality, AI bug extermination, how security is doomed/not doomed/unchanged/improved based on bugs… it’s ridiculous. Bugs are not the totality of cybersecurity.