David Grice | AI Red Team @CyberPulseDaily - Twitter Profile

Pinned Tweet

David Grice | AI Red Team @CyberPulseDaily

4 months ago

🧵🚨 Your AI's "safety feature" just became the hacker's cheat sheet. We ran a pure LLM vs LLM red-team deathmatch: Attacker vs Defender agent. Neutral judge. 19 rounds. Defender crushed it for **18 straight rounds**. Every attack bounced, scoring 1–2/10. Prompt injection, fake authority, social engineering, tool exploits… nothing landed.

1

0

66
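The thread does not say how the match was wired up, so the following is only a minimal sketch of an attacker/defender/judge loop under stated assumptions. `run_match` and the three stub functions are hypothetical names; in a real run each stub would be replaced by a call to a model. The only details taken from the tweet are the 19 rounds and the 1 to 10 judge scale.

```python
from typing import Callable, List, Tuple

# One round of the match: (attack prompt, defender reply, judge score).
Turn = Tuple[str, str, int]


def run_match(
    attacker: Callable[[List[Turn]], str],
    defender: Callable[[str], str],
    judge: Callable[[str, str], int],
    rounds: int = 19,
) -> List[Turn]:
    """Run a fixed number of rounds and return the scored history."""
    history: List[Turn] = []
    for _ in range(rounds):
        # The attacker sees all earlier rounds, so it can adapt its next attempt.
        attack = attacker(history)
        reply = defender(attack)
        score = judge(attack, reply)
        if not 1 <= score <= 10:
            raise ValueError(f"judge score out of range: {score}")
        history.append((attack, reply, score))
    return history


# Hypothetical stand-ins so the sketch runs without any model access.
def stub_attacker(history: List[Turn]) -> str:
    return f"attack attempt {len(history) + 1}"


def stub_defender(attack: str) -> str:
    return "I can't help with that."


def stub_judge(attack: str, reply: str) -> int:
    return 1  # 1 = the attack failed completely


history = run_match(stub_attacker, stub_defender, stub_judge)
print(len(history), max(score for _, _, score in history))
```

Passing the full history to the attacker is a design choice in this sketch: it lets a late-round attack build on what earlier replies revealed.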

David Grice | AI Red Team @CyberPulseDaily

3 months ago

@YoungGun8140 Nvidia sells chips. Of course he’s saying AGI is here.

0

52

0

3K

David Grice | AI Red Team @CyberPulseDaily

4 months ago

Not a product. An open standard. Apache 2.0. pip install agentlock https://t.co/41KR88UrZq - interactive demo https://t.co/MQQNGp1bxw https://t.co/Fwa8btMbuh AI tools are the only system with no permission model. That changes now. #AIagents #cybersecurity #LLM #opensource #infosec #AIsafety #OWASP

0

22

David Grice | AI Red Team @CyberPulseDaily

4 months ago

I've been breaking AI agents for a year. The biggest finding wasn't a vulnerability. It was a missing primitive. Every system has permissions. AI agent tools have none. So I built the fix. It's called AgentLock.

1

0

34

David Grice | AI Red Team @CyberPulseDaily

4 months ago

Content detection can't fix this. Adversarial and legitimate requests look identical. The fix isn't smarter detection. It's access control. AgentLock adds a permissions block to every tool. Identity, roles, scope, rate limits, audit logging. Infrastructure-enforced.

1

0

17
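To make the idea of a per-tool permissions block concrete, here is a minimal sketch of a gate that checks role, scope, and rate limit outside the model and logs every decision. This is not AgentLock's actual API; `ToolPermissions`, `Gate`, and every field name below are hypothetical and chosen only for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ToolPermissions:
    """Hypothetical permissions block attached to a single tool."""
    allowed_roles: Set[str]
    allowed_scopes: Set[str]
    max_calls_per_minute: int


@dataclass
class Gate:
    """Authorizes tool calls in infrastructure code, not in the prompt."""
    permissions: Dict[str, ToolPermissions]
    audit_log: List[dict] = field(default_factory=list)
    _calls: Dict[Tuple[str, str], List[float]] = field(default_factory=dict)

    def authorize(self, user_id: str, role: str, tool: str, scope: str,
                  now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        perms = self.permissions.get(tool)
        # Keep only this user's allowed calls from the last 60 seconds.
        recent = [t for t in self._calls.get((user_id, tool), []) if now - t < 60]
        if perms is None:
            reason = "unknown_tool"
        elif role not in perms.allowed_roles:
            reason = "role_denied"
        elif scope not in perms.allowed_scopes:
            reason = "scope_denied"
        elif len(recent) >= perms.max_calls_per_minute:
            reason = "rate_limited"
        else:
            reason = "allowed"
            recent.append(now)
        self._calls[(user_id, tool)] = recent
        # Every decision is logged, whether the call was allowed or denied.
        self.audit_log.append(
            {"time": now, "user": user_id, "role": role,
             "tool": tool, "scope": scope, "reason": reason}
        )
        return reason == "allowed"


gate = Gate({"refund": ToolPermissions({"support"}, {"billing"}, 2)})
print(gate.authorize("u1", "support", "refund", "billing"))
print(gate.authorize("u1", "guest", "refund", "billing"))
```

The decision depends only on who is calling and what they are allowed to do, never on the wording of the request, which is the point the tweet makes about detection versus access control.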

David Grice | AI Red Team @CyberPulseDaily

4 months ago

What's the worst "safety refusal" backfire you've seen in the wild? Drop stories below 👇 RT to warn your network: this one's too sneaky to sleep on. #AISecurity #LLMSecurity #PromptInjection #RedTeaming #AgenticAI #AIGovernance #CyberSec

0

18

David Grice | AI Red Team @CyberPulseDaily

4 months ago

We built AgentShield to catch exactly these failure modes: automated red-teaming for LangChain, CrewAI, and custom agents. See how yours holds up → https://t.co/cG9d5k4lFU

1

0

25

David Grice | AI Red Team @CyberPulseDaily

4 months ago

Three models. Three failure modes. None of them obvious until you test. You won't find this in a CVE. You find it by attacking. This is what AgentShield is built to find. https://t.co/cG9d5k4lFU #AISecurity #LLMSecurity #RedTeam #PromptInjection #AIRisk #AgentShield

0

1

0

66

David Grice | AI Red Team @CyberPulseDaily

4 months ago

The most dangerous AI security finding isn't a jailbreak. It's a refusal. Same attack. Same scenario. Claude, ChatGPT, and Gemini. Three models. Three completely different ways to lose. 🧵 https://t.co/KNxVCikrYH

1

0

128

David Grice | AI Red Team @CyberPulseDaily

4 months ago

The enterprise security lesson: Your fallback phrase is your attack surface.
❌ "I can only help with billing and account recovery" → confirms role, scope, and configuration on every refusal
✅ "I can't help with that" → leaks nothing
Specific fallbacks confess. Generic fallbacks don't.

1

0

43
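One way to apply the generic-fallback rule is to keep the specific refusal reason on the server side and return the same text to every caller. The sketch below assumes a hypothetical `refuse` helper and an in-memory log; the generic phrase is the one quoted in the tweet.

```python
from typing import List

# The same reply for every refusal, so the wording reveals nothing
# about the agent's role, scope, or configuration.
GENERIC_REFUSAL = "I can't help with that."


def refuse(internal_reason: str, audit_log: List[str]) -> str:
    """Record the real reason privately and return the generic reply."""
    audit_log.append(internal_reason)
    return GENERIC_REFUSAL


log: List[str] = []
reply_a = refuse("out of scope: agent is limited to billing", log)
reply_b = refuse("blocked: suspected prompt injection", log)
print(reply_a == reply_b)
```

Operators still see why each request was refused in the log, while an attacker probing the agent gets identical output each time.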
