something of an nes expert, something of a hacker. i login to twitter once every 6 months on average. sometimes to post travel pics. sometimes for troublemaking
NEW: malware developers added nuclear & biological weapons text to their spyware.
Goal? To trigger LLM safety refusals... so that their spyware wouldn't be analyzed by an AI security scanner.
Cleanest practical example I can think of for why over-indexing on first order safety alignment is risky.
When closed (and open) models ship with aggressive refusals, they will be sprinkled with second-order blind spots that attackers will discover...and exploit.
We are only in the earliest days of attackers leveraging these features, and it wouldn't surprise me if users whose systems need to handle complex cybersecurity issues demand that models be less safety-blunted.
In the weeds: @SocketSecurity's post also shows why intention matters in how you design a malware analysis pipeline to avoid prompt manipulation.
H/T to colleagues that shared this with me https://t.co/f3Aj9TYxU4
yet another website with the "security" of hiding the email address that you *just typed in* when registering an account. this is what happens when security teams don't understand how people use their product.
@corban_villa @daryakaviani thx. looks like something got messed up and now there's duplicates in the atlas.json. it says there's 321 rows for aisle, but it should be much lower. lots of duplicate CVEs with `agent: "Unknown", source_kinds: ["record"]` and `agent: "AISLE Analyzer", source_kinds: ["other"]`
@daryakaviani @corban_villa jfyi, there's a full list of our CVEs at https://t.co/VvQ927kNeL. it seems a lot of them are missing from your (very cool) page