NEW: malware developers added nuclear & biological weapons text to their spyware.
Goal? To trigger LLM safety refusals... so that their spyware wouldn't be analyzed by an AI security scanner.
Cleanest practical example I can think of for why over-indexing on first-order safety alignment is risky.
When closed (and open) models ship with aggressive refusals, they will be sprinkled with second-order blind spots that attackers will discover... and exploit.
We are only in the earliest days of attackers leveraging these features, and it wouldn't surprise me if users whose systems need to handle complex cybersecurity issues demand that models be less safety-blunted.
In the weeds: @SocketSecurity's post also shows why intention matters in how you design a malware analysis pipeline to avoid prompt manipulation.
H/T to colleagues who shared this with me https://t.co/f3Aj9TYxU4
@BratBranko The new vehicle wraps reminded me of an article about how the Russians keep changing their camouflage so that AI-powered drones don't hit them. So our police thought of this a few years ago already, and they roll out new designs every so often.
@KubatovaHana I'd guess that cooperation isn't possible if we have different goals. If one group wants to extract existing wealth from society, and for the other that is unacceptable (they don't know how to do it, they have different time horizons, morals, ...), then we'll have a hard time reaching agreement. And there will be more dimensions like this.
@theory_skba It's a funny effect that some tasks that are easy to verify (math, programming, ...) are covered extremely well. Others have barely moved in two years.
Dario: "I know we have our differences, but since you're in town for court, do you want to test Mythos?"
Elon: "Yeah, sure, why not?"
(Can't believe they're giving ME access to their inner-model. I'll prove it's just hype.)
Elon enters the office.
Dario: "Here's the terminal. Everything will be off the record. Have at it."
Elon: "Yeah, thanks."
A few hours later...
Knock knock
Dario: "Hey Elon, is everything ok?"
Elon: "Oh hey (beaming), you know I was misunderstanding you guys."
Dario: "Oh it's ok, you know it's how it is."
Elon: "No, I want to make it up to you guys. I know you're low on compute, I'll give you Colossus 1."
Dario: "Oh really? That'll be great."
Elon: "We'll launch your datacenters into space. I'll make sure to keep it safe."
Dario: "Oh, wow thanks Elon."
Elon: "Anything for Mythos."
Dario: "We feel the same way."
@jsuchal @PrvyPlochus Preferring a CLI over MCP is quite trendy, because MCP eats up a lot of context and a CLI can be discovered gradually, plus it's in the training data. But in this particular case I wouldn't worry about it.
@BratBranko @strana_sas @progresivne_sk If they really passed it, it would be interesting, because a pro-European voter could vote for two candidates from SaS, one from PS and one from KDH. In theory, KDH/SaS could gain more MPs.
@theory_skba A properly organized party has "side roles" or supporters outside the party for that. The message would remain; it would just be communicated by someone who isn't held to such high standards.
@0xSero Doesn't flooding everyone with information undermine the usefulness of the digital world? Word-of-mouth, peer-to-peer vetted info would become more prized. Like books vs. blogs.
Anthropic is so based.
They're like:
"We are Microsoft Excel for coders. We don't give a shit about low-value normies and hobby vibe coders.
We don't give a shit about your cringe openclaw toys.
We don't want to be your therapist friend.
We just want to bill serious enterprise coders $10k/month.
That's all we want - everyone else can sod off pls.
All the cheap fucks can suck at Scam Altman's tits - WE DON'T CARE