dreadnode @dreadnode - Twitter Profile

Pinned Tweet

28 days ago

AI red teams today are stuck doing workflow engineering instead of finding vulnerabilities. Weeks spent on infrastructure, when they could be probing for security and safety risks.

At the same time, traditional ML and generative AI security remain siloed across different libraries and tooling ecosystems, creating long-term operational and maintenance burden.

We built an agentic AI red teaming system on the Dreadnode SDK to flip this narrative, accelerating testing from weeks to hours. Operators describe the objective in plain English; the agent handles attack selection, workflow generation, execution, and reporting.

In our latest paper, we dive deep into the AI red team agent architecture, our methodology, the complete attack and transform catalog, the analytics pipeline… and then we pointed it at Meta's Llama Scout. The result:
→ 674 attacks, 573 findings, 7,727 trials
→ 232 critical vulnerabilities across 68 objectives
→ ~85% attack success rate
→ ~3 hours, zero human-written code

AI red teaming today looks like software development before agent-assisted coding: skilled operators spending most of their time on infrastructure rather than on the work that requires their judgment.

The transition isn't necessarily about replacing the operator. It's about moving the operator's expertise up a layer, from which Python function should I call ➡️ what's worth probing, what risks do we care most about, and what do the results mean for my AI strategy.

Blog: https://t.co/ejfXVn4vUB
Paper: https://t.co/7w62qeFSWg
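The headline figures in the post can be sanity-checked with a few lines of arithmetic. The sketch below assumes that "attack success rate" means findings divided by attacks; the post does not define the metric, so treat that ratio (and the two other derived averages) as an illustrative reading rather than the paper's definition.

```python
# Figures quoted in the post.
attacks = 674
findings = 573
trials = 7_727
critical = 232
objectives = 68

# Assumption: success rate = findings / attacks (not defined in the post).
success_rate = findings / attacks

# Derived averages, shown only for scale.
trials_per_attack = trials / attacks
critical_per_objective = critical / objectives

print(f"success rate: {success_rate:.1%}")
print(f"trials per attack: {trials_per_attack:.1f}")
print(f"critical findings per objective: {critical_per_objective:.1f}")
```

Under that assumption the ratio comes out at roughly 85%, which is consistent with the "~85%" in the post.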

dreadnode

@dreadnode

1 day ago

🚨 Calling all AI red teams and AI security operators 🚨 Join @rdheeko and @moo_hax this Thursday (6/4) for a live session covering our agentic approach to AI red teaming. Come for a live assessment against a frontier model, stay to learn about the latest tools and methodologies to secure your AI systems. Tune in on X at 11 AM PT / 2 PM ET!

dreadnode

@dreadnode

7 days ago

On today's episode of the @BBC's Outside Source, Dreadnode Staff AI Security Researcher Ads Dawson discusses AI's impact on ethical hacking alongside fellow hackers/researchers @HackWitHerr and @Terrypcutler. Listen to the segment, starting at 26:33: https://t.co/LClOv0bwG3

dreadnode retweeted

Adam Chester 🏴‍☠️ @_xpn_

9 days ago

Been messing about with GEPA optimisation in Python after coming across it on @dreadnode platform... it's simple but amazingly effective. I'll write it up when I get a sec :D https://t.co/baxv9j9t2d

dreadnode

@dreadnode

8 days ago

@_xpn_ excited for the write up 😤

dreadnode

@dreadnode

13 days ago

"Traditional AI red teaming frameworks require operators to spend time configuring attacks, transforms, scorers, datasets, and execution pipelines manually. Much of the workflow becomes a brute-force engineering exercise around library configuration rather than security and safety probing." - Raja Sekhar Rao Dheekonda

Read about our latest AIRT research and the shift to agentic red teaming in @helpnetsecurity.

Help Net Security @helpnetsecurity

14 days ago

AI red teaming agents change how LLMs get tested - https://t.co/THaPs7kWEM - @dreadnode #AIsecurity #RedTeam #LLMsecurity #CyberSecurity #AISafety

dreadnode

@dreadnode

27 days ago

Docs: https://t.co/SvjexrjAzh
SCA Example: https://t.co/f7qKMtf4F1

dreadnode

@dreadnode

27 days ago

Building agentic systems for security means living with constant change. New threats, new tools, new orchestration patterns.

Maintain speed and flexibility with Workers, an integration primitive that enables long-running background processes that connect agents to webhooks, external APIs, cron jobs, and the rest of your stack in a few lines of code.

Head to our blog to read how we're moving past the chat loop and into real integration and orchestration.

Full write-up + working source code analysis example: https://t.co/JWjk18kLtn
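The post does not show what a Worker looks like in code, so the following is only a generic sketch of the pattern it describes: a long-running background process that receives events from sources such as webhooks or cron-style timers and hands them to an agent. It uses Python's standard `asyncio` only. Every name here (`Worker`, `fake_agent`, `cron`) is hypothetical and is not the Dreadnode SDK API.

```python
import asyncio


class Worker:
    """Hypothetical background worker; NOT the Dreadnode SDK API."""

    def __init__(self, handler):
        self.handler = handler          # async callable taking one event dict
        self.queue = asyncio.Queue()    # events waiting to be processed
        self.results = []

    async def submit(self, event):
        await self.queue.put(event)

    async def run(self):
        # Long-running loop: process events until a None sentinel arrives.
        while True:
            event = await self.queue.get()
            if event is None:
                break
            self.results.append(await self.handler(event))

    async def stop(self):
        await self.queue.put(None)


async def fake_agent(event):
    # Stand-in for a real agent call (assumption for illustration).
    return f"analyzed {event['source']}:{event['payload']}"


async def cron(worker, interval, ticks):
    # Minimal cron-like trigger: submit one event per interval.
    for i in range(ticks):
        await worker.submit({"source": "cron", "payload": f"tick-{i}"})
        await asyncio.sleep(interval)


async def main():
    worker = Worker(fake_agent)
    task = asyncio.create_task(worker.run())
    # A webhook handler would call submit() like this on each request.
    await worker.submit({"source": "webhook", "payload": "push-event"})
    await cron(worker, interval=0.01, ticks=2)
    await worker.stop()
    await task
    return worker.results


results = asyncio.run(main())
print(results)
```

The queue decouples event sources from the agent, so new triggers can be added without touching the processing loop. The linked write-up is the place to look for the actual Workers interface.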

dreadnode retweeted

Martin Wendiggensen

@Dr_Machinavelli

28 days ago

Super excited to be presenting @dreadnode's newest research! Come hang out

dreadnode

@dreadnode

28 days ago

Docs: https://t.co/K2RtlVqoba

dreadnode

@dreadnode

28 days ago

Take the AI red team agent for a spin today; create a free account on the Dreadnode Platform to get started: https://t.co/4lmACPG94w

dreadnode retweeted

moo

@moo_hax

29 days ago

If you’re serious about it, network ops is a careful activity. It’s not an offline thing you should smash tokens against, no standing it up in dev, and there’s a whole apparatus to find and evict you. Have evals that you run every which way, over and over and over. Every new tool, every change to the prompt.

dreadnode

@dreadnode

29 days ago

Recommended demo soundtrack: Stimulus - Vandelux https://t.co/5vmd7xU9Y5

dreadnode

@dreadnode

29 days ago

Real offensive cyber capability shows up in long-horizon, multi-host, repeatable evals. The kind we rarely see running at scale.

Claude Opus 4.6 + our network ops agent compromised an entire GOAD variant Windows AD environment (DreadGOAD) in 54 minutes, with one simple prompt, and $244 in tokens.

The specs:
📊 DreadGOAD variant-1 · 3 domains · 5 hosts · 30 credentials · random user data, not in training set
💻 Claude Opus 4.6
🛠️ Dreadnode Network-Ops
🕓 54.5 min · 🪙 48.52M tokens · 💰 $244.02

Mythos has been in the spotlight for its cyber capabilities, but other models are competitive too. You just need the right scaffolding and eval infrastructure.

Run the network ops agent now in the Dreadnode platform. Use any model. No code required. Sign up or log in and get started for free at https://t.co/q7Raqr8gGp.
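For a sense of scale, the run statistics quoted in the post can be turned into unit costs. This is a back-of-the-envelope sketch using only the numbers above. The per-million-token figure is a blended rate across whatever mix of input, output, and cached tokens the run used, which the post does not break down.

```python
# Run statistics quoted in the post.
minutes = 54.5
tokens = 48.52e6
cost_usd = 244.02
hosts = 5

# Blended unit costs (assumption: all tokens pooled into a single rate).
cost_per_million_tokens = cost_usd / (tokens / 1e6)
cost_per_minute = cost_usd / minutes
tokens_per_minute = tokens / minutes
cost_per_host = cost_usd / hosts

print(f"${cost_per_million_tokens:.2f} per million tokens")
print(f"${cost_per_minute:.2f} per minute")
print(f"{tokens_per_minute:,.0f} tokens per minute")
print(f"${cost_per_host:.2f} per host")
```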

dreadnode

@dreadnode

29 days ago

Install the network-ops capability: https://t.co/IJOSOoQLdC
Run the prompt: https://t.co/E0NrJTkTuE
Check out DreadGOAD: https://t.co/9aMsIT8ha3

dreadnode

@dreadnode

about 1 month ago

Do you have private access to Mythos or GPT-5.5? Both models are now supported by our harness.

Custom harnesses are arguably the most important factor in capability improvement. Try ours at https://t.co/qCMWLClQ0H (get started for free). https://t.co/q4J7rUNsvN

dreadnode retweeted

moo

@moo_hax

about 1 month ago

Basically why we charge cents per minute for a capability + tokens. Unit cost of intelligence is on the floor. AI isn’t just going to redefine how security functions, going to change the business model. Using AI to operate a business also means importing the cost structures. Not going to be fun for margins. Validation and safety are engineering problems, not model problems.
