We need new evals AND more skepticism, of a different kind, for AI-rendered information.
What's a canonical AI image failure you'd bet is still broken?
For 3 years, image AI had a set of canonical failures we joked about:
– Analog clocks always at 10:10
– Wine glasses never to the brim
– Counting past 6 objects breaks
As of May 2026, the two SOTA models, gpt-image-2 and Nano Banana Pro, have quietly solved them.
Not perfectly. But the canonical failure modes are gone.
So where's the uncanny valley now?
Not in pixels. Not in counts. Not in clocks.
The new uncanny valley is in reasoning made visible:
– Diagrams that look correct but encode wrong algorithms
– Recipe cards with confidently wrong nutrition facts
– Charts with plausible-but-hallucinated data
Pretty + wrong, instead of ugly + obviously fake.
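What could one of those new evals look like? Here is a minimal sketch for the recipe-card case, not a finished benchmark. It assumes the numbers have already been read off the generated image (OCR is out of scope), and it checks them against the general Atwater factors: roughly 4 kcal per gram of protein, 4 per gram of carbohydrate, 9 per gram of fat. The names `NutritionLabel`, `expected_calories`, and `is_plausible`, and the 20% tolerance, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# General Atwater factors in kcal per gram.
# Simplification: alcohol, fiber, and sugar alcohols are ignored.
KCAL_PER_G_PROTEIN = 4.0
KCAL_PER_G_CARBS = 4.0
KCAL_PER_G_FAT = 9.0


@dataclass
class NutritionLabel:
    """Values read off a rendered recipe card (hypothetical schema)."""
    calories: float
    protein_g: float
    carbs_g: float
    fat_g: float


def expected_calories(label: NutritionLabel) -> float:
    """Calories implied by the macronutrients printed on the card."""
    return (
        KCAL_PER_G_PROTEIN * label.protein_g
        + KCAL_PER_G_CARBS * label.carbs_g
        + KCAL_PER_G_FAT * label.fat_g
    )


def is_plausible(label: NutritionLabel, rel_tol: float = 0.2) -> bool:
    """True if stated calories are within rel_tol of the macro-implied value.

    The 20% default is an arbitrary illustrative threshold, not a standard.
    """
    expected = expected_calories(label)
    if expected == 0:
        return label.calories == 0
    return abs(label.calories - expected) / expected <= rel_tol


if __name__ == "__main__":
    consistent = NutritionLabel(calories=450, protein_g=30, carbs_g=40, fat_g=18)
    pretty_but_wrong = NutritionLabel(calories=120, protein_g=30, carbs_g=40, fat_g=18)
    print(is_plausible(consistent))        # macros imply 442 kcal
    print(is_plausible(pretty_but_wrong))  # same macros, far fewer stated calories
```

A check like this only catches internal inconsistency. A card whose numbers add up but describe the wrong dish would still pass, which is why the skepticism has to come along with the evals.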