winfunc

19 days ago

We're doing an experiment with open models @winfunction to see how far we can push them to find vulns in hardened targets. So far:

- $4.5K in bounties from Chrome VRP with a few more pending, with the scans costing less than $100.

- 2 CVEs in NGINX (CVE-2026-28755 & CVE-2026-42926). And watch out for the next release!

- And 60ca500faea0fc70816bb9c53af3815e2af3e6c962b4b4ea63c33c62ebb4240d 👀

We're writing a blog on this soon.

winfunction retweeted

about 2 months ago

During our YC (@ycombinator S24) batch, we had the awesome opportunity to meet @paulg and talk about what we're building: an autonomous AI hacker.

To showcase a fun demo, I remember opening my laptop in the Uber to his home and challenging our agents to find vulnerabilities in the old HackerNews codebase. For those unfamiliar, Arc is a programming language designed by PG and Robert Morris, and the old HN codebase is written in it.

Back then we only got to talk about it with him, but we just redid the experiment with our improved harness for fun! And we wrote a blog about it: https://t.co/IxVhtqDjSg

about 2 months ago

@cramforce oh good idea, just initiated an audit on just-bash!

about 2 months ago

Vulnerability benchmarks rot. Cases leak into training data, scores measure memorization.

We built N-Day-Bench: tests LLMs on finding real vulnerabilities in real repos, refreshed monthly from live GitHub advisories. Blinded judging. All traces public.

Very interestingly, the latest model from @Zai_org, GLM 5.1 performs really well!

Link: https://t.co/K3foq0DfMt

about 2 months ago

Currently testing GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, GLM-5.1, and Kimi K2.5. Every run publishes the full audit trail — shell commands, judge rationale, curator answer key, sandbox history. If a score looks wrong, you can trace it to a specific shell session on a specific line of code. Results: https://t.co/JGMQZGhajy

about 2 months ago

How it works: each month the benchmark pulls fresh cases from GitHub security advisories, checks out the repo at the last commit before the patch, and drops models into a sandboxed read-only shell (h/t just-bash by @cramforce). The model never sees the fix. It starts from sink hints and has to trace the bug through actual code.

Only repos with 10k+ stars qualify. A diversity pass prevents any single repo from dominating the set. Ambiguous advisories (merge commits, multi-repo references, unresolvable refs) are dropped.

Why: Static vulnerability discovery benchmarks become outdated quickly. Cases leak into training data, and scores start measuring memorization. The monthly refresh keeps the test set ahead of contamination, or at least makes the contamination window honest.
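
The selection rules described above could look roughly like the sketch below. This is an illustration only, not N-Day-Bench's actual code: the Advisory record, its field names, the helper names, and the per-repo cap of 2 are all assumptions (the thread does not say how the diversity pass is implemented). The only values taken from the thread are the 10k-star threshold and the three kinds of ambiguous advisory.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_STARS = 10_000   # from the thread: only repos with 10k+ stars qualify
MAX_PER_REPO = 2     # assumed cap for the diversity pass; the thread gives no number


@dataclass(frozen=True)
class Advisory:
    """Hypothetical record for one advisory; field names are illustrative."""
    repo: str                 # "owner/name"
    stars: int                # star count of the repo
    patch_sha: str            # commit that fixes the bug
    referenced_repos: tuple   # every repo the advisory points at
    is_merge_commit: bool = False
    ref_resolves: bool = True


def is_ambiguous(adv: Advisory) -> bool:
    """True for merge commits, multi-repo references, and unresolvable refs."""
    return (
        adv.is_merge_commit
        or len(set(adv.referenced_repos)) > 1
        or not adv.ref_resolves
    )


def pre_patch_ref(adv: Advisory) -> str:
    """Git revision for the first parent of the patch commit."""
    return f"{adv.patch_sha}^"


def select_cases(advisories, max_per_repo=MAX_PER_REPO):
    """Apply the star filter, the ambiguity filter, then a per-repo cap."""
    taken = defaultdict(int)
    selected = []
    for adv in advisories:
        if adv.stars < MIN_STARS or is_ambiguous(adv):
            continue
        if taken[adv.repo] >= max_per_repo:
            continue  # diversity pass: keep one repo from dominating the set
        taken[adv.repo] += 1
        selected.append(adv)
    return selected
```

A simple cap is only one way to do a diversity pass; sampling or weighting by repo would fit the description equally well. The revision string from pre_patch_ref is what one would hand to git checkout to get the tree as it was just before the fix, assuming the patch is a single non-merge commit.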

winfunction retweeted

publiclyDisclosed @disclosedh1

2 months ago

New CVE in NGINX - CVE-2026-28755

NGINX stream module allows TLS handshake to succeed with revoked client certificates when ssl_ocsp on is configured.

This vulnerability was autonomously discovered by Winfunc's AI agent.

Read the write-up here: https://t.co/qiS50Lqgj9

winfunction retweeted

4 months ago

Node.js disclosed a bug submitted by @winfunction: https://t.co/qxaaJuXgFM #hackerone #bugbounty

winfunction retweeted

AISecHub

@AISecHub

4 months ago

The Recent CVEs in React and Node.js Were Found by an AI - https://t.co/8JgMMqFICc

In December 2025 and January 2026, an AI system autonomously discovered zero-day vulnerabilities in Node.js and React, two of the most widely deployed JavaScript runtimes and frameworks in the world. This post documents how these vulnerabilities were found, the technical details of the flaws, and what this means for the future of security research.

winfunction retweeted

/r/netsec @_r_netsec

4 months ago

The Recent 0-Days in Node.js and React Were Found by an AI https://t.co/cJP7cgKbTT

winfunction retweeted

4 months ago

New blog post: The Recent 0-Days in Node.js and React Were Found by an AI

Covering the discovery of 0-days with AI, its implications, and "AI slop". Have a read. https://t.co/jAL6rGGTDx

4 months ago

we've a few more up our sleeves. soon. 👾

4 months ago

A new vulnerability in React Server Components (CVE-2026-23864) was disclosed today. One of the DoS vectors was discovered by me with the help of an AI agent @winfunction. Other vectors were also discovered by @ryotkak et al. All users should upgrade to a patched version as soon as possible. https://t.co/mFdceNi63H

winfunction retweeted

5 months ago

🚨 CVE-2026-21636 in Node.js (@nodejs)

Node.js permission model bypass via unchecked Unix Domain Socket connections (UDS)

This vulnerability was autonomously discovered by https://t.co/Ym7gcZXFen, an AI agent that can find, exploit, and patch security vulnerabilities in codebases.

Thanks to @_rafaelgss for triaging and fixing the issue.

5 months ago

Node.js Security Release Bulletin: https://t.co/FtfoBvx6dC
