India’s IT ministry banned Telegram for one week because some users shared leaked exam questions.
This punishes 150M+ ordinary Telegram users in India — not the insiders who leaked the exam materials.
And the ban hasn't stopped anything. The leaks just moved to other apps.
Excited to share “Poisoned Wells,” which presents the largest point-in-time study of website blocking in India to date. I tested the blocking of 294 million apex domains across six Indian ISPs, sending 1.76 billion DNS queries in total.
Holy shit… Your anonymous internet identity can now be unmasked for $1 😳
Not by the FBI. By anyone with access to Claude or ChatGPT and a few of your Reddit comments.
ETH Zurich and Anthropic just dropped a paper called “Large-Scale Online Deanonymization with LLMs” and the results are the most alarming privacy research I’ve read this year.
They built an automated pipeline that takes your anonymous posts, extracts identity signals, searches the web, and figures out who you are.
No human investigator needed. Fully autonomous. Works on Hacker News, Reddit, LinkedIn, even redacted interview transcripts.
Here’s how bad the numbers are.
On Hacker News users: 67% identified correctly.
When the system made a guess, it was right 90% of the time.
On Reddit academics posting under pseudonyms: 52%.
On scientists whose interview transcripts were explicitly redacted for privacy: 9 out of 33 still got unmasked.
The pipeline works in four steps they call ESRC. Extract identity signals from your posts using LLMs.
Search for candidate matches using embeddings across thousands of profiles.
Reason over top candidates with models like GPT-5.2. Calibrate confidence so when it does guess, it’s almost never wrong.
The classical deanonymization method from the famous Netflix Prize attack? Nearly 0% recall across every test.
LLMs didn’t just improve on old techniques. They made old techniques look like toys.
When they scaled to temporally split Reddit profiles, matching a user’s old posts to their newer ones across a full year gap, the pipeline hit 67% recall at 90% precision and 38% recall at 99% precision.
Meaning even a year of changed interests and different conversations wasn’t enough to hide.
More reasoning compute = better deanonymization. High reasoning effort doubled recall at 99% precision in some tests. As frontier models get smarter, this attack strengthens automatically. Every model upgrade is a privacy downgrade.
What makes it nearly impossible to defend against: the pipeline splits into subtasks that all look benign. Summarize a profile.
Compute embeddings. Rank candidates. No single API call screams “deanonymization.” The researchers themselves say they’re pessimistic that safety guardrails or rate limits can stop it.
Their conclusion is blunt: “Users who post under persistent usernames should assume that adversaries can link their accounts to real identities.” And it extrapolates.
Log-linear projections suggest roughly 35% recall at 90% precision even at one million candidates.
Every throwaway account. Every anonymous forum post. Every “nobody will connect this to me” comment.
It’s all searchable micro-data now. And the cost to run the full agent on one target is less than a cup of coffee.
Practical anonymity on the internet just died. The paper killed it with math.
she was beautiful, like code that compiles on the first try
but also you just knew that there was something deeply wrong with her, like code that compiles on the first try