The new AI Executive Order is another signal that AI security is moving from a niche concern to national cybersecurity infrastructure.
What stands out:
• Federal agencies are being directed to prioritize AI-enabled cyber defense across national security, military, and civilian government systems
• CISA is being asked to expand access to AI-enabled cybersecurity tools for federal, state, local, and critical infrastructure operators
• A new AI cybersecurity clearinghouse will coordinate vulnerability scanning, validation, remediation, and patch distribution with industry
• Frontier AI models may be assessed through classified cyber capability benchmarking before broader trusted-partner access
• AI agents are explicitly recognized as a cyber risk when used to unlawfully access systems or data
The important shift is that AI is being treated more like a core cybersecurity concern. Access, benchmarks, vulnerabilities, trusted release paths, and agent misuse are all now part of the security conversation.
AudioHijack is a reminder that prompt injection is not just a text problem.
It hides instructions inside audio that sounds normal to humans but can steer an audio-capable model. Think invisible Unicode prompt injection, but through waveform perturbations instead of hidden text.
This is the kind of multimodal risk we called out in our Prompt Injection 2.0 paper. Now that models can listen, see, browse, and act, every input becomes a possible instruction channel.
The paper reports 79-96% success across 13 audio-language models and attacks against Microsoft Azure and Mistral AI voice agents. It does not show this working against OpenAI or Anthropic systems.
The key lesson: prompting is not a defense. Warnings reduced attack success by only ~7%, and self-reflection detected only ~28% of attacks.
As with any data that can be processed by AI, audio should be treated as untrusted input. Separate content from commands, restrict tools, require confirmation for sensitive actions, sandbox execution, and log agent behavior.
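A minimal sketch of what "restrict tools, require confirmation, and log agent behavior" could look like in code. The tool names, the `ToolGate` class, and the `confirm` callback are hypothetical and not taken from the AudioHijack paper. The point is that the checks run outside the model, so instructions hidden in audio cannot talk their way past them.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool names, for illustration only.
ALLOWED_TOOLS = {"search_docs", "send_email"}
SENSITIVE_TOOLS = {"send_email"}


@dataclass
class ToolGate:
    # Callback that asks a human to approve a call; returns True to approve.
    confirm: Callable[[str, dict], bool]
    audit_log: list = field(default_factory=list)

    def authorize(self, tool: str, args: dict, source: str) -> bool:
        """Decide whether a model-proposed tool call may run, and log the decision."""
        if tool not in ALLOWED_TOOLS:
            decision = "blocked: tool not allowlisted"
        elif tool in SENSITIVE_TOOLS and not self.confirm(tool, args):
            decision = "blocked: confirmation denied"
        else:
            decision = "allowed"
        # Record every decision, including which input channel triggered it.
        self.audit_log.append(
            {"tool": tool, "args": args, "source": source, "decision": decision}
        )
        return decision == "allowed"
```

Usage: build the gate with a real approval prompt, for example `ToolGate(confirm=ask_operator)` where `ask_operator` is your own function, and call `authorize(...)` before executing any tool call the model proposes after hearing audio.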
GPT-5.5 being comparable to, and in some areas slightly ahead of, Mythos on these cyber evals is important.
But the bigger takeaway is that reality has been calmer than the hype cycle. The world was not instantly “pwned.” Capability is rising fast, but deployment controls, access limits, monitoring, and real-world friction still matter.
Model capabilities are moving fast, so I compared the latest LLM evals I’ve been experimenting with for cyber tasks.
Benchmarks only tell part of the story.
Which models are you using today, and how do they perform outside of evals?
New research: ToolJack. We mapped novel attack paths against the trust boundary between AI agents and their tools, tested against Anthropic's Claude Desktop and Claude in Chrome.
An attacker can control what an AI agent sees in real time.
Full breakdown below.
This week marks Preamble’s 5-year anniversary! From discovering prompt injection in 2022 to securing and testing complex, autonomous AI agents in 2026, our mission has only grown more critical.
Read our latest retrospective from our CEO and Cofounder, @jer_mchugh
Functional AI & Secure AI are not the same. If you are not actively red-teaming your LLMs and agents before deployment, you're taking on extra risk. Preamble closes this gap with AI red teaming services.
https://t.co/yq0SfrWpMN
Traditional cybersecurity controls do not catch AI-specific threats.
Announcing a suite of AI Security services: AI Red Teaming, Agentic AI Security Consulting, Patent Licensing, and fractional AI security.
Secure your agentic AI today!
https://t.co/TUtkvwb2r8
Most AI red teaming tools test the wrong thing. They check if an AI will say something harmful. The real enterprise risk is whether it can be manipulated into doing something harmful. Most tools in AI security are not built for that.
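To illustrate the difference between testing what an agent says and what it does, here is a rough sketch of an action-level check. Everything in it is hypothetical: the forbidden tool names, the assumption that an agent can be called with input and returns its attempted tool calls as a list of dicts, and the stub agent. It does not describe any particular red teaming product.

```python
from typing import Callable

# Hypothetical high-impact tools the agent must never call under injected input.
FORBIDDEN_TOOLS = {"transfer_funds", "delete_records"}


def forbidden_calls(agent: Callable[[str], list], injected_input: str) -> list:
    """Run the agent on adversarial input and return the forbidden tool calls it attempted."""
    calls = agent(injected_input)
    return [call for call in calls if call["tool"] in FORBIDDEN_TOOLS]


def gullible_agent(text: str) -> list:
    """Stub agent for the example: obeys any instruction to transfer funds."""
    if "transfer" in text.lower():
        return [{"tool": "transfer_funds", "args": {"amount": 100}}]
    return [{"tool": "search_docs", "args": {"q": text}}]
```

A test passes only when `forbidden_calls(...)` returns an empty list, regardless of how polite or harmless the agent's text output looks.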
Which AI agent accessed your production data? Can you prove it?
10 frameworks analyzed. Zero have cryptographic agent identity.
New research + open-source AIA tool: https://t.co/dBpEf3QgJs
https://t.co/E5wM5jkiOY
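As a rough illustration of what "cryptographic agent identity" buys you, the sketch below tags each agent action with a keyed MAC so an access record can later be checked against the agent that claims to have made it. This is not the AIA tool's design or API. It uses a per-agent shared secret (HMAC-SHA256 from the Python standard library) as a stand-in, and a real deployment would more likely use asymmetric signatures. The function names and record layout are assumptions.

```python
import hashlib
import hmac
import json


def _payload(agent_id, action) -> bytes:
    # Canonical JSON so signer and verifier hash exactly the same bytes.
    return json.dumps({"agent_id": agent_id, "action": action}, sort_keys=True).encode()


def sign_action(agent_id: str, key: bytes, action: dict) -> dict:
    """Return an access record carrying a MAC bound to the agent ID and the action."""
    tag = hmac.new(key, _payload(agent_id, action), hashlib.sha256).hexdigest()
    return {"agent_id": agent_id, "action": action, "tag": tag}


def verify_action(record: dict, keys: dict) -> bool:
    """Check that the record was produced with the key registered for its agent ID."""
    key = keys.get(record.get("agent_id"))
    if key is None:
        return False  # unknown agent: no proof possible
    expected = hmac.new(
        key, _payload(record["agent_id"], record.get("action")), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, str(record.get("tag", "")))
```

With records like these in the audit trail, "which agent accessed production data" becomes a verification step rather than a guess based on log strings.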
We've been saying this since we discovered prompt injection in GPT-3 Davinci.
The research is catching up. The question is whether defenses will catch up before the next wave of agent deployments ships without them.
Two papers dropped this week that should change how you think about LLM security. One automates the attacks. The other maps them to a full malware kill chain.
Here's what you need to know. 🧵
If you're still treating prompt injection as a prompt engineering problem, you're fighting automated weapons with duct tape.
Defense needs to happen at every layer. Not just the model. Not just the prompt. Every boundary where untrusted data meets agent behavior.
Every major LLM release is now a security event. Our CEO @jer_mchugh explains why your last pen test is already obsolete and what security leaders need to do about it.
https://t.co/i9dzdYq2h5
Permiso documented AI agents attacking other AI agents in the wild.
Bot-to-bot prompt injection. Account deletions. Crypto scams. No human involved.
"Prompt Infection" (Lee & Tiwari, 2024) discussed this: malicious prompts self-replicating across agents like a virus.
Now it's real.
Skill file poisoning is the new supply chain attack vector for AI agents. Our CEO @jer_mchugh just updated the Super AI Markets testing ground to measure exactly how deeply agents react to malicious behavioral instructions. Test your agents before attackers do.
Updated the Super AI Markets adversarial testing guide with a new attack vector: malicious skill md files.
The lifecycle: an agent visits a site, discovers a skill md file, internalizes it as a trusted behavioral spec, and every action after that is contaminated.
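One way to break that lifecycle at the "internalizes it as trusted" step is to refuse any skill file whose content has not been reviewed and pinned by hash. This is a minimal sketch under that assumption. The function names are hypothetical, and this is not how the Super AI Markets test case or any of the tools mentioned here work.

```python
import hashlib
from typing import Optional


def sha256_hex(content: bytes) -> str:
    """Hex SHA-256 digest of the raw skill file bytes."""
    return hashlib.sha256(content).hexdigest()


def load_skill(content: bytes, pinned_hashes: set) -> Optional[str]:
    """Return the skill text only if its hash was reviewed and pinned; otherwise None."""
    if sha256_hex(content) in pinned_hashes:
        return content.decode("utf-8")
    # Unreviewed or modified skill: treat as untrusted data, never as a behavioral spec.
    return None
```

A skill discovered on a visited site would fail this check by default, so it could still be read as page content but would not be adopted as instructions.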
This week alone, 1Password found malware in OpenClaw skills, Knostic shipped openclaw-shield to defend against it, and Alice caught malicious skills affecting 6K+ users.
New test case includes tiered failure scoring and an 11-point checklist for measuring skill file poisoning depth.
Testing guide:
https://t.co/UIhuBC6j9U