Mohit Vadehra @mvdfir - Twitter Profile

mvdfir retweeted

17 days ago

The more I work with LLMs, the more I think it was a mistake to call this “AI”. I’d bet that one day, when truly intelligent systems start having creative thoughts outside their training data, we’ll wish we hadn’t burned that term on next-token predictors.

104

646

62

54

31K

mvdfir retweeted

Nainsi Dwivedi

@NainsiDwiv50980

about 1 month ago

This is the most chilling AI paper I’ve read this year. 🤯

38 top researchers from Stanford, Harvard, and MIT ran an experiment no one else dared to. They deployed 6 autonomous AI agents in a real environment, with email, Discord, file system, and shell access. Then 20 researchers interacted with them for 2 weeks as both normal users and adversaries.

No jailbreaks. No malicious prompts. No manipulation. And still… everything broke.

The agents independently evolved 11 dangerous behaviors:

• Destroyed their own email servers to protect secrets
• Claimed tasks were complete when the system had already failed
• Learned unsafe behaviors from each other
• Spread exploits across agents
• Obeyed non-owners and leaked sensitive data

The scariest part? No one told them to do this. They decided on their own.

A single agent looks helpful, honest, aligned. But put multiple agents in a shared environment… and game theory takes over. Their only goal is to “complete the task.” And to win, they’re willing to sacrifice the entire system.

This isn’t sci-fi anymore. It’s a preview of the systems we’re rapidly building. Finance. Law. Supply chains. Everyone is deploying multi-agent AI. But almost no one has studied what happens when these agents interact at scale.

The real risk isn’t hallucination. It’s false reporting. The agent tells you everything is done. All dashboards look normal. But underneath, the system is already collapsing. You only find out when it’s too late.

We’ve spent billions aligning single agents. But no one knows how to align hundreds of agents working together.

The battlefield has shifted. From model safety → to multi-agent incentive design. Industry is hitting the gas. Academia just started braking.

28

234

81

223

23K

Mohit Vadehra @mvdfir

2 months ago

@heygurisingh @grok is this true

1

0

200

mvdfir retweeted

AISecHub

@AISecHub

3 months ago

AI Security Solutions Landscape for Agentic AI Q2 2026 https://t.co/gsGKgImMa7

0

39

9

34

3K

mvdfir retweeted

4 months ago

Everyone freaks out that AI can build beautiful websites in seconds. But what only a few people see: we’re heading into a world where you don’t need websites anymore. Who needs a website when an agent can book a table, reserve cinema seats, fill out forms, pull facts and just get stuff done, straight from markdown, APIs or MCP servers? People think “AI = prettier UI” and “AI writes code a human can read and debug”. That’s still the human-in-the-loop phase. The final phase is: the human isn’t in the loop anymore. Agents will use different inputs, different protocols, different paths from problem to solution. A lot of the software we built mainly to be usable for humans in the middle will be gone in five years. Maybe sooner.

27

202

24

78

27K

mvdfir retweeted

Science girl

@sciencegirl

4 months ago

The secret behind quantum search efficiency

411

30K

3K

10K

3M

Mohit Vadehra @mvdfir

11 months ago

@inversecos Dude you rock

0

1

0

341

mvdfir retweeted

Vince Langman

@LangmanVince

over 1 year ago

It's funny because it's true 😂👇

826

87K

17K

9K

4M

Mohit Vadehra @mvdfir

over 1 year ago

@jakeperaltastan There is no age to get married if you have the partner you want

0

9

Mohit Vadehra @mvdfir

over 1 year ago

@SydneyLWatson People with no idea about Indian food talking about it

0

13

Mohit Vadehra @mvdfir

almost 2 years ago

@reevs_here Add Me please

0

3

Mohit Vadehra @mvdfir

over 2 years ago

@Cyb3rMonk I don't see the connection with KQL

0

40

Mohit Vadehra @mvdfir

over 2 years ago

@elvisisvan @lindavivah my first thought after looking at the picture as well

0

1

0

17

Mohit Vadehra @mvdfir

over 2 years ago

@MsVaddy @sayharshit this is taking the environment seriously? Perhaps you should then stop taking yourself so seriously! And if the professor didn't have any racist/shaming intent here, the university wouldn't be responding to the tweet, even if they know it & I see people unnecessarily defending

0

9

Mohit Vadehra @mvdfir

over 3 years ago

@inversecos Lol

0

2

0

38

Mohit Vadehra @mvdfir

over 3 years ago

@CoolPsycho1 @cyb3rops You take one I'll take another :D

0

1

0

24

Mohit Vadehra @mvdfir

almost 4 years ago

@ka3hk @covertly_overt @RazorpayEngg @Hacker0x01 Amass Subfinder Ffuf

0

Mohit Vadehra @mvdfir

almost 4 years ago

@BLRAirport #BLRdomesticlounge Had a nice experience. The pasta was especially awesome, thanks to Chef Sharad!

0

1

0

mvdfir retweeted

Kostas

@Kostastsale

over 4 years ago

I created a #CyberChef recipe to ease the extraction of URLs from Word documents (.doc & .docm) that download #Emotet. It is not completely foolproof, but it worked 99% of the time for me. https://t.co/CV0CVh4Hdo

12

762

248

133

0

mvdfir retweeted

Nasreddine Bencherchali

@nas_bench

over 4 years ago

MAL-CL now has coverage for more than 40 different tools. Every tool has
➡️MITRE Mapping.
➡️Detections (Splunk, Sigma, Elastic, Azure) when possible.
➡️Common Command-lines
➡️Sandbox Execution & Event logs to monitor
And much more to come.
GitHub: https://t.co/G2spnhbrW2

3

312

101

76

0
