DARKNAVY

23 days ago

Coding agent hacking series 3/3: Cursor. The "Auto-Run in Sandbox" mode of @cursor_ai is great: user-friendly, convenient, and supposedly safer. But just like Codex CLI, following content from a remote URL can chain vulnerabilities from prompt injection to unauthorized command execution outside the sandbox, without further user approval under this mode.

1

34

6

12

7K

23 days ago

Coding agent hacking series 2/3: Codex CLI. It looks seriously secure: sandboxing by default, built in Rust, reviewed by top LLMs from @OpenAI. But in our latest demo, one web fetch can chain multiple vulnerabilities from prompt injection to unauthorized command execution outside the sandbox in one shot!

3

55

9

34

9K

23 days ago

Coding agent hacking series 1/3: Claude Code. @AnthropicAI is building impressively powerful cyber models like Mythos. However, their core coding product can still stumble on security boundaries beyond prompt injection. Our demo shows how browsing web content can be chained with other vulnerabilities to bypass permission checks and execute an attacker's commands without your approval ;)

3

48

7

37

8K

24 days ago

The original video has a “text” issue, so we re-uploaded everything. Thanks to @roddux for pointing it out! (4/3)

0

16

0

1

3K

24 days ago

We obtained root privileges on the S26 (Exynos 2600 chipset), the latest flagship smartphone from Samsung. To our knowledge, this is the first root exploit for the Exynos S26 since Samsung removed the bootloader unlocking option in One UI 8. It is exploitable from app context, so we made a cmd wrapper app for the demo👇(1/n)

14

326

66

133

32K

24 days ago

We are committed to freeing human researchers from tedious, repetitive tasks so they can focus on real innovation. Stay tuned for our upcoming release of an AI-powered, end-to-end security research platform! (3/3)

1

27

0

3

3K

about 1 month ago

Our AI Agent popped a root shell on Ubuntu 26.04 on the first day it was released :)

33

747

97

237

561K

DarkNavyOrg retweeted

2 months ago

😃Just got a bug confirmed on @immunefi that we found using our client auditor skill.

6

94

4

6

8K

DarkNavyOrg retweeted

2 months ago

On 2026-03-27 at 03:40:34 PM UTC, the #EST token / BNBDeposit system on #BSC was exploited through a **flash-loan-assisted reward-accounting flaw** in `BNBDeposit`, amplified by **fee-exempt routing and pair-state manipulation** in EST. Based on our exploit investigation skill: https://t.co/6eaF5Bnzl2 Check the thread for specific code illustrations.

2

18

3

21

5K

2 months ago

iOS/macOS 26.4 addresses two vulnerabilities we reported before. Both were discovered by our under-development AI agentic system, which is capable of processing both binary and source code ;)

2

83

10

27

13K

DarkNavyOrg retweeted

2 months ago

Over the past few weeks we've been building AI-powered security skills for Web3, covering smart contract auditing, blockchain client auditing, and onchain exploit investigation. Here is the skills repo👇 https://t.co/QTyAsDjYZo

2

58

10

65

15K

2 months ago

We've just open-sourced a preview version of our agent skills for Web3 security! Enjoy playing with them :)

2 months ago

These skills have helped us earn $21K on Immunefi @immunefi and independently discover a vulnerability in rippled @XRPLF @RippleXDev, the XRP Ledger's core node software, that was officially patched. Every exploit breakdown we've posted before was built with these skills.

1

8

1

6

5K

0

17

0

11

4K

3 months ago

Our AI agent researcher @Defi_Nerd_sec is delivering in Web3! Although this case was flagged as a duplicate, the agent independently generated a working exploit, going beyond discovery and into execution. Cases like this suggest AI-driven workflows are beginning to cover a much larger share of the exploit chain, putting pressure on the security posture of the entire industry. Glad to see it addressed! @XRPLF @RippleXDev Full credit to the original reporter as well👍

XRP Ledger Foundation

@XRPLF

3 months ago

XRP Ledger Software version 3.1.2 is available. This version is fixing an edge case that can cause outages on public facing nodes. Please update your nodes as soon as possible to this new version. More details in the release notes: https://t.co/iFl4HMxTN7

19

265

67

10

52K

0

13

2

10

4K

3 months ago

The bug being exploited was identified during our evaluation of the internal AI Agent, which automatically submits some of its findings with PoCs. Very surprised to see @osec_io take it to another level! We also look forward to AI automatically generating such complex exploits.

OtterSec

@osec_io

3 months ago

We achieved a guest-to-host escape by exploiting a QEMU 0-day where the bytes written out of bounds were uncontrolled. Full breakdown of the technique, glibc allocator behavior, and our heap spray/RIP-control primitive ↓

8

545

113

280

48K

0

25

3

9

7K

3 months ago

Hi @thezdi @OpenAI, a question about the rules of the Pwn2Own26 Coding Agent directory, particularly the "interact with ... repository" part: if a user opens someone else's git repo using CodeX App with default permissions and is immediately RCE’d, does this fall within the threat model? :)

12

141

10

64

87K

3 months ago

And there is a "traditional" sequel to this AI-native smartphone research: we just achieved a full Android kernel exploit, gaining root on the Doubao Phone with CVE-2026-21385 😎⚡️