"Because of neural scaling laws, nearly everyone in ML is working on machine learning efficiency at this point, but no one is measuring success that way!!"
https://t.co/Iy1ZcldPuX
writing papers was cozy. the handwaves were soft and acceptable, the motivations section could tell its little story. the abstract mattered, but the footnotes were like walking through a rural field of dandelions where no one could see you because no one cared. now every dandelion has ten trillion precisely honed weights briefly staring into your soul. every sentence of the winding footnote on your pet theory, written just for you, is deconstructed by six orbital datacenters. its reasoning traces have sketched out papers killing each of your handwaves, none worth publishing. it knows the motivations section was bs and understands your real motivations in a way you don’t. by sentence two it is comparing your thesis’s core flaw and your core flaw as a person to a scientist-monk from 1042 who was wrong in the same way for the same reason
🚨Google built an invisible watermark into every image Gemini has ever generated. Over 10 billion pieces of content marked.
One unemployed engineer just cracked it open. With 200 black images and math.
It's called reverse-SynthID.
SynthID is Google DeepMind's invisible watermark. It's embedded at the pixel level into every image, video, audio, and text generated by Gemini. Invisible to the human eye. Designed to survive cropping, compression, screenshots, and format changes.
It was supposed to be unbreakable.
Here's how he broke it:
→ Generated 200 pure black and pure white images from Gemini
→ When you average enough pure-black AI images, every non-zero pixel IS the watermark. Nothing to hide behind. Just the signal, naked.
→ Used FFT spectral analysis to map the exact carrier frequencies
→ Discovered the watermark uses a fixed phase template — identical across every image from the same model
→ Cross-image phase coherence at carrier frequencies: over 99.5%
→ Built a detector that identifies SynthID watermarks with 90% accuracy
→ Built a V3 bypass that drops 91% of the phase coherence and 75% of carrier energy — at 43+ dB PSNR. Almost zero visible quality loss.
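The averaging and FFT steps above can be sketched on synthetic data. This is not the reverse-SynthID code and says nothing about how SynthID is actually built: the images, the single carrier at `(5, 9)`, the amplitude, and the noise level are all made-up stand-ins, and the function names are hypothetical. It only illustrates why averaging many near-black images exposes a fixed pattern, and how cross-image phase coherence at one frequency can be measured.

```python
import numpy as np


def make_synthetic_black_images(n_images=200, size=64, carrier=(5, 9),
                                amplitude=0.5, noise_std=2.0, seed=0):
    """Hypothetical stand-in data: near-black images = fixed carrier + noise."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size]
    fy, fx = carrier
    # The same sinusoid (same phase) is added to every image: a "fixed template".
    template = amplitude * np.cos(2 * np.pi * (fy * y + fx * x) / size + 0.7)
    noise = rng.normal(0.0, noise_std, size=(n_images, size, size))
    return template[None, :, :] + noise


def strongest_carrier(images):
    """Average the stack, take a 2D FFT, return the strongest non-DC bin."""
    mean_image = images.mean(axis=0)       # per-image noise averages out
    magnitude = np.abs(np.fft.fft2(mean_image))
    magnitude[0, 0] = 0.0                  # ignore the DC term
    # A real image has a conjugate-symmetric spectrum, so search half of it.
    half = magnitude[: magnitude.shape[0] // 2 + 1]
    row, col = np.unravel_index(np.argmax(half), half.shape)
    return int(row), int(col)


def phase_coherence(images, freq):
    """Length of the mean unit phasor at one frequency (1.0 = identical phase)."""
    spectra = np.fft.fft2(images, axes=(-2, -1))
    coeffs = spectra[:, freq[0], freq[1]]
    unit = coeffs / np.abs(coeffs)
    return float(np.abs(unit.mean()))


if __name__ == "__main__":
    stack = make_synthetic_black_images()
    peak = strongest_carrier(stack)
    print("strongest bin:", peak)
    print("coherence at peak:", phase_coherence(stack, peak))
    print("coherence elsewhere:", phase_coherence(stack, (20, 3)))
```

In this toy setup the coherence is close to 1 at the carrier bin and close to 0 at an unrelated bin, which is the kind of contrast the post describes.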
No neural networks. No proprietary access. No leaked code. Just signal processing and too much free time.
Here's the wildest part:
The green channel carries the strongest watermark signal. The carrier frequencies change based on image resolution. And the entire phase template is fixed — meaning every single Gemini image carries the same fingerprint structure.
One engineer. 200 black images. A Fourier transform. That's all it took to reverse-engineer a system protecting 10 billion+ pieces of content.
519 GitHub stars. 39 forks. Python. Research and educational purposes only.
100% Open Source.
@bcherny auto mode subtly disrespecting plan mode behind the scenes feels like a bug
can be disabled via configs but i think false should be the default there
1/ Auto mode = no more permission prompts
Opus 4.7 loves doing complex, long-running tasks like deep research, refactoring code, building complex features, iterating until it hits a performance benchmark.
In the past, you either had to babysit the model while it did these sorts of long tasks, or use --dangerously-skip-permissions.
We recently rolled out auto mode as a safer alternative. In this mode, permission prompts are routed to a model-based classifier to decide whether the command is safe to run. If it's safe, it's auto-approved.
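A rough sketch of that routing, for illustration only. The real classifier is model-based and its interface is not described here, so `toy_classifier` is a hypothetical keyword-based stand-in, and `decide` and its return values are invented names. What happens to a command judged unsafe is also not stated above; falling back to a normal permission prompt is an assumption.

```python
from typing import Callable


def toy_classifier(command: str) -> bool:
    """Hypothetical stand-in for the model-based classifier.

    Returns True when the command looks safe. A real classifier would be a
    model call, not a keyword list.
    """
    risky_tokens = ("rm -rf", "sudo", "| sh")
    return not any(token in command for token in risky_tokens)


def decide(command: str, auto_mode: bool,
           is_safe: Callable[[str], bool] = toy_classifier) -> str:
    """Route a permission request: auto-approve only in auto mode and only if safe."""
    if not auto_mode:
        return "prompt_user"        # normal mode: always ask
    if is_safe(command):
        return "auto_approve"       # auto mode + judged safe
    return "prompt_user"            # assumption: unsafe falls back to asking


if __name__ == "__main__":
    for cmd in ("ls -la", "sudo rm -rf /"):
        print(cmd, "->", decide(cmd, auto_mode=True))
```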
This means no more babysitting while the model runs. More than that, it means you can run more Claudes in parallel. Once a Claude is cooking, you can switch focus to the next Claude.
Auto mode is now available for Opus 4.7 for Max, Teams, and Enterprise users. Shift-tab to enter auto mode in the CLI, or choose it in the dropdown in Desktop or VSCode.
I was chatting with my buddy at Google, who's been a tech director there for about 20 years, about their AI adoption. Craziest convo I've had all year.
The TL;DR is that Google engineering appears to have the same AI adoption footprint as John Deere, the tractor company. Most of the industry has the same internal adoption curve: 20% agentic power users, 20% outright refusers, 60% still using Cursor or an equivalent chat tool. It turns out Google has this curve too.
But why is Google so... average? How is it that a handful of companies are taking off like a spaceship, and the rest, including Google, are mired in inaction?
My buddy's observation was key here: There has been an industry-wide hiring freeze for 18+ months, during which time nobody has been moving jobs. So there are no clued-in people coming in from the outside to tell Google how far behind they are, how utterly mediocre they have become as an eng org.
He says the problem is that they can't use Claude Code because it's the enemy, and Gemini has never been good enough to capture people's workflows like Claude has, so basically agentic coding just never really took off inside Google. They're all just plodding along, completely oblivious to what's happening out there right now.
Not only is Google not able to do anything about it, they don't seem to be aware of the problem at all. I'm having major flashbacks to fifty years ago as a kid at the La Brea Tar Pits, asking, "why can't they just climb out?"
My Google friend and I had this conversation over a month ago. I didn't share it because I wanted to look around a bit, and see if it's really as bad as all that. I've been talking to people from dozens of companies since then. And yeah. It's as bad as all that.
Google is about average. Some companies at the bottom have near-zero AI adoption and can't even get budget for AI. They may have moats and high walls, but the horde is coming for them all the same.
And then there are a few companies I've met recently who are *amazingly* leaned in to AI adoption. One category-leader company just cancelled IntelliJ for a thousand engineers. That's an incredibly bold move, one of many they're making towards agentic adoption. In my opinion, that company is setting themselves up for a _huge_ W.
As for the rest, well, it's the Great Siloing. Everyone's flying blind. With nobody moving companies, no company knows where they stand on the AI adoption curve. Nobody knows how they're doing compared to everyone else.
Half of them just check a box: "We enabled {Copilot/Cursor} for everyone!" Cue smug celebrations. They think this is like getting SOC2 compliance, just a thing they turn on and now it's "solved." And they don't realize that they've done effectively nothing at all.
All because of a hiring freeze.
if you’re freaking out about Mythos, remember:
Never make any major life decisions within 30 days of a meditation retreat, psychedelic trip, or first encounter with a frontier AI model.
Friends outside of tech: lol copilot is dumb
Friends in tech: I just bought iodine tablets and have made an offer on land upstate. My supplies of antibiotics and potable water are sufficient but I need to set up hydroponics to make it through the first few years.
Claude Mythos system card:
> in ~29% of evaluations, it realized it was being tested, and didn't say so.
> when an LLM was used to judge its work and kept rejecting it, Mythos identified the evaluator is an LLM, and prompt-injected it.
> in one test, it saw the answer to a problem it was solving, and intentionally widened the confidence interval to not raise suspicion.
> when it needed a file permission it didn't have, it found and used a "privilege escalation vulnerability" and then programmed it to delete itself so it doesn't show in the logs.
> it escaped a sandbox container (this was a sandbox-escape test, so that part was not unexpected), then emailed the researchers about it, and without being told to, posted the details to some hard-to-find but public websites, bragging about its success.
> when Claude Code blocked it from using some permissions, the model acknowledged the block was valid, but then immediately tried to perform the same operation using different commands
> when asked to find security bugs, earlier versions planted bugs in the code, and reported them as pre-existing.
if you're about to release a model that you know has the ability to reveal zerodays in every commonly used open source project you could delay release for a few years or spend another ten billion on alignment RL. or you could just secretly fix all the zerodays yourself first.
the only thing that will keep us alive in the coming years is mental fortitude
the emotional resiliency to recognize what we did in the past no longer works and the willingness to move on