🚨 Andrej Karpathy just explained the scariest thing happening in software right now..
someone poisoned a Python package that gets 97 million downloads a month.. and a simple pip install was enough to steal everything on your machine..
SSH keys.. AWS credentials.. crypto wallets.. database passwords.. git credentials.. shell history.. SSL private keys.. everything..
and here's the part that should terrify every developer alive..
the attack was only discovered because the attacker wrote sloppy code.. the malware used so much RAM that it crashed someone's computer.. if the attacker had been better at coding.. nobody would have noticed for weeks..
one developer.. using Cursor with an MCP plugin.. had litellm pulled in as a dependency they didn't even know about.. their machine crashed.. and that crash saved thousands of companies from getting their entire infrastructure stolen..
Karpathy's take is the real wake-up call.. every time you install any package you're trusting every single dependency in its tree.. and any one of them could be poisoned..
vibe coding saved us this time.. the attacker vibe coded the attack and it was too sloppy to work quietly.. next time they won't make that mistake.
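To make the "every single dependency in its tree" point concrete, here is a minimal sketch that walks the declared requirements of a package already installed in the current environment, using only the standard library's `importlib.metadata`. The helper names `requirement_name` and `transitive_dependencies` are made up for illustration. The sketch assumes optional extras should be skipped and ignores environment markers otherwise, so treat the result as an approximation, not an audit.

```python
import re
from importlib import metadata

# Leading distribution name in a requirement string such as "httpx>=0.23; python_version >= '3.8'".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")
# Requirements that only apply when an optional extra is requested.
_EXTRA = re.compile(r"extra\s*==")


def requirement_name(req):
    """Return the normalized distribution name from a requirement string, or None."""
    match = _NAME.match(req)
    if not match:
        return None
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def transitive_dependencies(package, lookup=metadata.requires):
    """Collect every distribution reachable from `package` via declared requirements.

    `lookup` defaults to importlib.metadata.requires, which only knows about
    distributions installed in the current environment.
    """
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        try:
            reqs = lookup(current) or []
        except metadata.PackageNotFoundError:
            continue  # declared but not installed here, so nothing to walk
        for req in reqs:
            if _EXTRA.search(req):
                continue  # assumption: skip optional extras
            name = requirement_name(req)
            if name and name not in seen:
                seen.add(name)
                stack.append(name)
    return seen
```

Calling `sorted(transitive_dependencies("litellm"))` in an environment where the package is installed would list every distribution you are implicitly trusting. It says nothing about whether any of them is compromised.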
#MCP filters #data visibility, but it doesn't necessarily restrict execution. I triggered a stored proc and a 'forbidden' delete because the execution environment wasn't hardened. Securing the full stack is a massive challenge. #Claude #OpenAI #Xai #ai
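The gap between hiding data and restricting execution can be shown with a small sketch. It is not the setup described in the post: it swaps in SQLite and Python's `sqlite3.Connection.set_authorizer` as a stand-in for hardening at the execution layer, and the `accounts` table and function names are hypothetical.

```python
import sqlite3


def deny_deletes(action, arg1, arg2, db_name, source):
    """Authorizer callback: refuse DELETE in the engine itself, allow everything else."""
    if action == sqlite3.SQLITE_DELETE:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


def try_forbidden_delete():
    """Attempt a DELETE on a connection whose authorizer forbids it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO accounts (name) VALUES ('alice')")
    conn.commit()

    # From here on, the database engine rejects DELETE no matter what the
    # layer above it chose to show or hide.
    conn.set_authorizer(deny_deletes)

    blocked = False
    try:
        conn.execute("DELETE FROM accounts")
    except sqlite3.DatabaseError:
        blocked = True

    remaining = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
    conn.close()
    return blocked, remaining
```

The design point is that the check lives where statements actually run, so a caller that slips past a visibility filter still cannot delete rows.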
@1DigitalC Those are great starts. 🩷 (For scaffolding layer) Couple of suggestions: Force the model to operate like a Bayesian agent. Or, ask model to flag what it's uncertain about in the summary, not just confirm alignment. I think conservation law framing helps too.
Sample of my prompt from today. 😉 Run tracemalloc + cProfile + EXPLAIN ANALYZE across a simulated 100k batch. Build a Pareto chart to prove which 20% of causes explain 80% of latency, then vectorize Python loops into NumPy/pandas batch operations... 😊 I'm just cute like that.
@1DigitalC They miss behavioral drift. Agents with identical configs can still diverge. The paper trail lies. :/ If you figure out how to fix it, let me know.... cause my thinky parts hurt.
@1DigitalC Baseline diffs aren't enough. RL agents exploit any gap in reward specs or checklists, drifting toward rogue behavior. Securing them is a nightmare. And it just happens to be my job, so lots of custom monitoring, alerts, and sleepless nights.
@manthanguptaa That's because IT IS NOISE. If I feed an agent the same information over and over, 10 times, I'm getting a different response 6 out of the 10 times. How do you trust that?!
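One way to put a number on that kind of inconsistency is to repeat the same prompt and count how often the answers agree. This is a minimal sketch, assuming a hypothetical `ask` callable that wraps whatever agent or model is being tested. It compares answers by exact match after trimming and lowercasing, which is a crude proxy that will count harmless rewording as disagreement.

```python
from collections import Counter


def response_consistency(ask, prompt, runs=10):
    """Send the same prompt `runs` times and summarize how much the answers agree.

    `ask` is any callable taking a prompt string and returning a response string.
    """
    answers = [ask(prompt).strip().lower() for _ in range(runs)]
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    return {
        "distinct": len(counts),           # number of different answers seen
        "top_answer": top_answer,          # the most frequent answer
        "agreement": top_count / runs,     # share of runs that gave the top answer
    }
```

An `agreement` of 0.4 over 10 runs would match the "different response 6 out of the 10 times" experience described above.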
@MarioVerbelen More incidents, that's for sure. All the senior engineers should start a consulting company and charge ridiculous amounts for fixing incidents caused by AI.