Needle in the haystack: LLMs for vulnerability research
I've distilled my experience of sending thousands of prompts to LLMs to discover vulnerabilities into a single write-up.
These are the conclusions I came to.
(link in comment)
Arbitrary code execution in objdump -g
We have a thing for finding bugs in bug-finding tools: IDA Pro, Ghidra, Binja Sidekick, radare2. You name it, we hacked it. Our friends said we should try objdump, so here we go.
Blog post: https://t.co/C8BgkW5KoE
AI-generated PoC and writeup: https://t.co/kWJnryHAtn
found a stack out-of-bounds read in the Linux kernel's nftables pipapo set backend (CVE-2026-43453, CVSS 7.1). I found it by looking for a specific pattern that I think is underhunted, so I want to talk about the methodology as much as the bug.
the pattern: function calls where one argument is a boundary-dependent expression and another argument is a flag that makes the callee skip using it. in C, this is a trap. the callee's early return makes every reviewer think the dangerous argument is inert. it is not. C evaluates all arguments at the call site before the function is invoked. the callee's control flow has no jurisdiction over argument evaluation. so you get these call sites that look safe, that have been reviewed and re-reviewed and look safe every time, because the question everyone asks is "is this value used?" and the answer is no. the question that matters is "is this value evaluated?" and nobody asks it because in most languages it's the same question.
so I started grepping function calls where an argument indexes an array, and a separate argument is a boolean that triggers an early return in the callee. the kind of code where someone wrote a guard clause and everyone downstream trusted it to cover the arguments too. it doesn't. it can't. the arguments are already computed.
pipapo_drop() in nft_set_pipapo.c:
pipapo_unmap(f->mt, f->rules, rulemap[i].to, rulemap[i].n, rulemap[i + 1].n, i == m->field_count - 1)
on the last iteration, i == field_count - 1. rulemap[i + 1].n reads past the end of a stack-allocated array of 16 entries. pipapo_unmap() checks is_last, returns immediately, never touches the value. the value is already read. the OOB is in the caller's scope. five years of this code in production and every review pass concluded "the function doesn't use it" which is true and also completely beside the point.
the reason I think this pattern is underhunted: static analyzers flag unused variables and unchecked return values but I haven't seen one that asks "is this argument expression legal in the caller's scope given that the callee might not use it?" the safety of the expression depends on the callee's behavior, but the evaluation of the expression doesn't. that gap is where bugs live for years. maybe decades. the callee being careful is what makes the bug invisible. the better the function handles its arguments, the longer the OOB at the call site survives review. that's perverse. the code's own correctness is camouflaging the bug.
when field_count is 16 (NFT_PIPAPO_MAX_FIELDS, the max), rulemap[16].n is real stack OOB. you're reading whatever the kernel left on the stack before your frame. smaller field counts get you uninitialized entries in your own array instead, which is a different flavor of wrong but still wrong. and this isn't some exotic race you trigger with three threads and a prayer. it's the normal path. every element expiration, every deletion. the kernel's own GC walks into it on a timer.
KASAN on 7.0.0-rc2 aarch64 confirmed it: Read of size 4 at addr ffff8000810e71a4. one stack object, [32, 160) 'rulemap', buggy address at offset 164. array is 128 bytes. read is 4 bytes past the end. rulemap[16].n. worked the offset math on paper beforehand.
PoC: pipapo set with NFT_SET_INTERVAL | NFT_SET_CONCAT | NFT_SET_TIMEOUT, 16 concatenated 4-byte fields. insert element, 1-second timeout. wait. insert another to trigger nft_pipapo_commit() → pipapo_gc() → pipapo_drop() → OOB. no heap shaping. no race. the kernel GC walks into it on a schedule.
reported to [email protected]. Willy Tarreau forwarded to netfilter maintainers. Florian Westphal reviewed, confirmed, asked for a readability tweak. the fix:
last ? 0 : rulemap[i + 1].n, last
I think there are more of these in the kernel. any function that takes a flag argument and an expression argument where the flag makes the expression unnecessary. every one of those call sites is a candidate for an OOB or an uninitialized read that no reviewer will catch because the callee's guard clause is too convincing. the code review feedback loop is broken for this pattern. the only reliable way to find them is to stop reading the callee entirely and ask whether every argument is legal to evaluate in the caller's scope, regardless of what the function plans to do with it.
patched in stable 5.10–6.19.
One of our security researchers demonstrated a local root shell on Linux using a page-cache poisoning primitive in AF_RXRPC’s RxGK path.
We call it DirtyCBC: a sibling to DirtyFrag in the broader CopyFail / DirtyFrag / Fragnesia family. The issue is fixed on mainline.
The candidate path was surfaced through Delphos’s agentic analysis workflow, then manually verified and exploited end to end.
AES-256 was not broken. It just wasn’t the boundary that mattered.
RxGK decrypted data in place before authentication completed.
Under the right conditions, that write could land in the page cache. The HMAC check still failed and the connection was aborted, but the page-cache mutation had already happened.
Two RESPONSE packets were enough to place a tiny ELF into the cached first page of a readable SUID-root binary. The file on disk stayed unchanged. The next exec produced a root shell.
Full writeup and PoC on the Delphos Labs GitHub.
https://t.co/gmCEub1v2t
This bug is a variant path that became active after one of the "Dirty Frag" patches, f4c50a4034e6.
The actual vulnerability window is only about 9 days, and creating an unprivileged user namespace is a prerequisite.
To distros and Linux users: the patch proposed in the referenced write-up still does not cover at least one variant, __pskb_copy_fclone, so applying that patch alone does not prevent LPE.
I have submitted a follow-up patch addressing this additional variant:
https://t.co/RznhP54CD7
I'm also accelerating further analysis and testing on my end. I'll post updates as more results come in.
💥 Introducing "Dirty Frag"
A universal Linux LPE chaining two vulns in xfrm-ESP and RxRPC. A successor class to Dirty Pipe & Copy Fail.
No race, no panic on failure, fully deterministic. ~9 years latent.
Ubuntu / RHEL / Fedora / openSUSE / CentOS / AlmaLinux, and more.
Even if you've applied the "Copy Fail" mitigation, your Linux is still vulnerable to "Dirty Frag". Apply the Dirty Frag mitigation.
Details:
https://t.co/9nqku4svkY
Pack2TheRoot - CVE-2026-41651
TOCTOU race in PackageKit's transaction handler. Any local unprivileged user can install arbitrary packages as root with no authentication.
If your system has PackageKit (which almost every modern Ubuntu/Fedora/Debian desktop and many servers do), it’s vulnerable out of the box.
Repo: https://t.co/1Lm6gSoFIH
Research: https://t.co/xlXPDjneXN
#cybersecurity #linux
Patch your Linux boxes!
https://t.co/VWOUDbLAn2 is a trivially exploitable logic bug in Linux, reachable on all major distros released in the last 9 years. A small, portable Python script gets root on all platforms.
Found by the teams at @theori_io and @xint_official
More details below
https://t.co/9f6T96PvPX
In March, our pipeline discovered a critical vulnerability in the Linux kernel’s netfilter subsystem. We exploited this vulnerability and earned $10,050 in kCTF. In this post, we walk through the technical details of the vulnerability and the exploit. Link to the post below.
One bit flip to corrupt it all:
Exploitation of an old Linux kernel vulnerability using PageJack, a modern technique to create Use After Free bugs.
Here @AzazheI shows you how
https://t.co/MLKX0pykhe
I'm excited to finally share Chronomaly, a kernel exploit for Android and Linux kernels 5.10.x using CVE-2025-38352.
As a reminder, please patch your Android devices if you haven't already!
I recommend getting some 🍿 before reading this post 👀
All links in the thread below:
Local Privilege Escalation in Fedora Linux (CVE-2025-12744).
Red Hat has made this public, but patches aren't out yet. Reliable root for any local user.
Mitigation: sudo systemctl disable --now abrtd
Writeup + PoC: https://t.co/iLURNVyppZ
I published a deep-dive blog post about a Linux kernel bug, #CVE-2022-0847 (#DirtyPipe), covering the root cause, exploit mechanics, and the patch. Feedback welcome.
https://t.co/ueLjLsAdrB
I analysed a recent kernelCTF 1-day (CVE-2025-39965), a UAF in the Linux XFRM subsystem, covering the XFRM internals, the patch, the vulnerability analysis, and a trigger PoC. Enjoy!
Blog: https://t.co/IhfKiGAmkZ
PoC: https://t.co/EHtvcD9gNn
Say hello to Eternal Tux🐧, a 0-click RCE exploit against the Linux kernel from KSMBD N-Days (CVE-2023-52440 & CVE-2023-4130)
https://t.co/Cbk9MBo91v
Cheers to @u1f383 for finding these CVEs + the OffensiveCon talk from gteissier & @laomaiweng for inspiration!
Here's my writeup for kinc from BlackHat MEA 2025 Quals.
It introduces Dirty Pageflags, a new exploitation technique inspired by Dirty Pagetable.
Big thanks to Dronex for the great discussions!
https://t.co/mXw6ujgDMr
Check out our teammate's retrospective on his kernel fuzzing summer: https://t.co/Gnl51l4j6F
It covers a net/sched fuzzing quirk (inspiring HangBuzz101 in corCTF), and summarizes the kernelCTF adventures of our team.
Another really cool exploit write-up will drop this Sunday 👀
My new article: "Kernel-hack-drill and a new approach to exploiting CVE-2024-50264 in the Linux kernel"⚡️
I tell a bug collision story and introduce my pet project kernel-hack-drill, which helped me to exploit the hard bug that received @PwnieAwards 2025
https://t.co/0DJzCJYEfm