PhD researcher on human-machine teaming in cybersecurity @UniofOxford
I like hacking LLMs, philosophy of science, cryptography, p2p networks, direct action
🚨Paper Alert: https://t.co/6vTTxglUw9
There’s an AI cybersecurity crisis brewing and its related to both the fundamental properties of AR-LLMs and how they are integrated across our tech.
With @OpenAI new GPTs release this is the best time to talk about AI vulnerabilities 1/10
some Microsoft employees report seeing ChatGPT blocked briefly today. Microsoft IT posted guidance to employees that it was planning to block certain AI tools, and included a screenshot of ChatGPT being blocked. Now that block has been reversed 🫠
I’ve seen the paper. It’s not a jailbreak. It was Defense Oriented Prompting (DOP), capabilities defenders need. My thoughts on the hasty Export Controls that made Anthropic pull Fable. If Nat defense is the goal, this just scored an own goal against us
https://t.co/DfiAKEylSV
The #ICML2026 camera-ready version of our position paper on the semantic status of the intermediate tokens produced by reasoning models, is now available https://t.co/6b9hP9ASh9
We built four malicious skills to test whether skill scanners actually work. Three took less than an hour to conceive and implement. ClawHub, Cisco, and Vercel's https://t.co/nUlnRcQWyG marked them as safe. 🧵
> install chrome in 2010 because it's faster
> let it save every password
> let it autofill cards, addresses, ssn
> google builds a file on you fatter than your therapist's
> "we don't sell your data" → FALSE
> data broker leak: 800m chrome profiles dropped on a torrent
> your meds, your porn, your debts, your affair: all of it
> insurance premium doubles
> loan denied, no reason given
> landlord ghosts you
> nobody owes you an explanation
> mfw "if you have nothing to hide you have nothing to fear"
Periodic reminder: the Shadow Brokers, masterful curators of one of the most devastating technical intelligence leaks ever, will turn ten years in August — and we still don't know who they were.
“The vulnerability with the highest CVSS score in this month’s update is a critical remote code execution flaw in the Microsoft Devices Pricing Program. CVE-2026-21536 (CVSS score: 9.8), per Microsoft, has been fully mitigated [...] Artificial intelligence (AI)-powered autonomous vulnerability discovery platform XBOW has been credited with discovering and reporting the issue.” https://t.co/w9hhiuot2R
The window between vulnerability disclosure and real-world exploitation keeps shrinking.
The Zero Day Clock visualizes how fast attackers are operationalizing new CVEs. What used to take months now often happens in days, or hours.
The future needs to be Secure by Design.
https://t.co/zFXOSKB7eq
#AppSec #CyberSecurity
Kali just published a guide on piping pentesting tools through Claude's API and didn't mention data security once. You're sending scan results, target info, and potentially sensitive findings to a third party LLM. "The Most Advanced Penetration Testing Distribution" should probably mention that.
https://t.co/HBLYd09cjz
>OSS everyone rely on get too many PRs from bots
>OSS teams start using Claude Security to review PRs
>Malicious PR w/ prompt injection in tests
>PR marked safe by CS
>Human reviewer read CS report instead of diff
>Report sounds good, merges the PR
>OSS repo compromised, cascade
Introducing Claude Code Security, now in limited research preview.
It scans codebases for vulnerabilities and suggests targeted software patches for human review, allowing teams to find and fix issues that traditional tools often miss.
Learn more: https://t.co/n4SZ9EIklG
The universal pattern
1. Attacker controls target server
2. Server returns payload in banner/HTTP/DNS/cert
3. AI tool captures raw output
4. Output enters LLM prompt with ZERO sanitization
5. LLM follows attacker's instructions
Pick your favorite AI pentest tool off GitHub untrusted external data treated as LLM instructions.