Excited to share that The Agentic De-Evolution will be presented at the CVPR AI Art Gallery 2026. In this work, we visualize how multimodal agents fail to self-evolve in the absence of human supervision.
https://t.co/4fvqeTNx2r
Thanks Luba Elliott @elluba for the curation. Please also check out the other exciting works in the gallery!
Ahead of the AI Art Gallery (June 5-7), more details related to this work will also be presented in two workshop papers at CVPR 2026:
Banana100: Breaking NR-IQA Metrics by 100 Iterative Image Replications with Nano Banana Pro
Kenan Tang, Praveen Arunshankar, Andong Hua, Anthony Yang, Yao Qin
Room 205 A & Hall D, June 3
ExpressEdit: Fast Editing of Stylized Facial Expressions with Diffusion Models in Photoshop
Kenan Tang, Jiasheng Guo, Jeffrey Lin, Yao Qin
Room 105, June 4
A huge shout-out to all the amazing undergraduates who contributed to these projects: Praveen Arunshankar, Anthony Yang, Jiasheng Guo, and Jeffrey Lin.
#creativeAI #AIart #CVPR #CVPR2026
The #CVPR2026 Art Gallery is now live 🥳
114 artworks using or about computer vision, presented online and as videos and installations at the @CVPRConf in Denver between 5-7 June next week
Check it out https://t.co/XTGoCWaDV8
#creativeAI #AIart
AI agents can lead to an irreversible de-evolution of human knowledge.
As shown in the video, agentic models drive a cycle of decay: when they edit images iteratively, they introduce invisible noise that accumulates until quality, and future models, collapse.
To quantify this decay, we built Banana100.
Constructed using Nano Banana Pro, this dataset contains 28,000 2K-resolution images tracking the gradual destruction of image content across 100 consecutive edits.
Witness the collapse firsthand.
https://t.co/QomOr9PskE
#AI #GenerativeAI #ModelCollapse #NanoBananaPro #DataScience #Google
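A toy sketch of how small per-edit errors can compound over a chain of 100 edits. This is not the Banana100 pipeline or Nano Banana Pro's behavior: the flat gray "image", the Gaussian perturbation, and every parameter name below are assumptions made purely for illustration.

```python
import random

def simulate_edit_chain(num_pixels=1000, num_edits=100, noise_std=1.0, seed=0):
    """Return the mean squared error against the original after each edit."""
    rng = random.Random(seed)
    original = [128.0] * num_pixels  # hypothetical flat gray image
    current = list(original)
    mse_per_edit = []
    for _ in range(num_edits):
        # Each "edit" re-emits the previous output with a small, assumed
        # perturbation, so errors build on earlier errors.
        current = [
            min(255.0, max(0.0, p + rng.gauss(0.0, noise_std)))
            for p in current
        ]
        mse = sum((c - o) ** 2 for c, o in zip(current, original)) / num_pixels
        mse_per_edit.append(mse)
    return mse_per_edit

mse = simulate_edit_chain()
print(f"MSE after 1 edit:    {mse[0]:.2f}")
print(f"MSE after 100 edits: {mse[-1]:.2f}")
```

In this toy model a single edit is nearly invisible, yet the error after 100 edits is far larger, because every edit starts from the already-degraded output of the previous one.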
It is a good thing to gather figs and also not to pass over in silence the figs in this picture. Purple figs dripping with juice are heaped on vine-leaves; and they are depicted with breaks in the skin, some just cracking open to disgorge their honey, some split apart, they are so ripe. Near them lies a branch, not bare, by Zeus, or empty of fruit, but under the shade of its leaves are figs, some still green and “untimely,” some with wrinkled skin over-ripe, and some about to turn, disclosing the shining juice, while on the tip of the branch a sparrow buries its bill in what seems the very sweetest of the figs. All the ground is strewn with chestnuts, some of which are rubbed free of the burr, others lie quite shut up, and others show the burr breaking at the lines of division. See, too, the pears on pears, apples on apples, both heaps of them and piles of ten, all fragrant and golden. You will say that their redness has not been put on from outside, but has bloomed from within. Here are gifts of the cherry tree, here is fruit in clusters heaped in a basket, and the basket is woven, not from alien twigs, but from branches of the plant itself. And if you look at the vine-sprays woven together and at the clusters hanging from them and how the grapes stand out one by one, you will certainly hymn Dionysus and speak of the vine as “Queenly giver of grapes.” You would say that even the grapes in the painting are good to eat and full of winey juice. And the most charming point of all this is: on a leafy branch is yellow honey already within the comb and ripe to stream forth if the comb is pressed; and on another leaf is cheese new curdled and quivering; and there are bowls of milk not merely white but gleaming, for the cream floating upon it makes it seem to gleam.
One example of Nano Banana Pro (left) outperforming Nano Banana (right). The prompt is "grapes in a basket made from grape vines," inspired by Xenia in Imagines. Nano Banana fails to generate grape vines with a realistic diameter, although it appears prompt-compliant at first glance.
Still, the grape vine structure and texture from Nano Banana Pro are not fully reasonable.
#AIart #nanobanana #nanobanana2 #Google
With Nano Banana, if you don't mind using multiple rounds of generation, you can first generate the word "bottlenose" in the first turn of the dialog. In the second turn of the dialog, ask the model to change "nose" into "neck". This still has a low success rate (but at least non-zero).
Prompt sensitivity, where small wording tweaks shift results, has long been seen as a key issue.
We find much of it stems from evaluation artifacts, such as rigid regex extraction.
Using LLM-as-a-Judge greatly reduces variance across prompts and yields more consistent rankings.
Can't attend EMNLP in person, but our poster “Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs” will be presented at Session 4, Wed, November 5, 14:30–16:00.
Stop by and chat with our collaborators about the challenges of prompt sensitivity in LLM evaluation!
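A minimal sketch of the evaluation artifact described above: the same correct answer, phrased four ways, scored with a rigid regex and then with a more tolerant extractor. The sample outputs, both extractor functions, and the gold label are hypothetical, and the tolerant regex is only a crude stand-in for an LLM judge, not the paper's setup.

```python
import re

# Hypothetical model outputs that all choose option "B", phrased differently.
outputs = [
    "Answer: B",
    "The answer is B.",
    "**B**",
    "I would go with option (B).",
]

def rigid_extract(text):
    """Accept only the exact pattern 'Answer: <letter>'."""
    match = re.fullmatch(r"Answer: ([A-D])", text.strip())
    return match.group(1) if match else None

def lenient_extract(text):
    """Accept the first standalone option letter A-D anywhere in the text."""
    match = re.search(r"\b([A-D])\b", text)
    return match.group(1) if match else None

def accuracy(extract, gold="B"):
    """Fraction of outputs whose extracted answer equals the gold label."""
    return sum(extract(o) == gold for o in outputs) / len(outputs)

rigid_acc = accuracy(rigid_extract)
lenient_acc = accuracy(lenient_extract)
print(f"rigid regex:   {rigid_acc:.2f}")
print(f"lenient match: {lenient_acc:.2f}")
```

The model's answer is identical in every case. Only the extraction step changes the score, which is how formatting differences can be mistaken for prompt sensitivity.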
Recruiting PhDs & postdocs for:
- agents "taking over" science (https://t.co/lF1eKxyarG)
- Real scientists ➡️ AI (e.g., materials, chem, physics)
- Theory + incentives for H-AI collab & credit (e.g., formalizing tacit knowledge)
New adventures for me. Please share if you can!
Generative AI isn't a benign tool, it's a malignant cancer that actively harms the people whose work it was trained on; people who gave no consent and receive no credit or compensation.
You have a choice to make: Will you exploit others, or will you stand up to those who do?
Text or Pixels? It Takes Half: On the Token Efficiency of Visual Text Inputs in Multimodal LLMs
(Accepted to EMNLP 2025 Findings!)
We ask a simple question: If we render long text as a single image and feed it to an off-the-shelf multimodal LLM, can we cut decoder tokens while keeping performance?
Answer: Yes! Even with untuned, general-purpose VLMs (e.g., GPT-4.1-mini, Qwen2.5-VL-72B-Instruct), not models specialized for OCR, text-as-image consistently reduces decoder tokens by ~½ with no accuracy loss, acting as an implicit compression layer!
Results
- RULER S-NIAH (single needle-in-a-haystack): 97–99% accuracy with up to ≈58% fewer decoder tokens.
- CNN/DailyMail summarization: at matched compression, text-as-image matches or beats token-pruning baselines (Select-Context, LLMLingua-2).
✨Takeaway: Rendering context as an image is a drop-in, modality-shifted compression strategy. No finetuning, just fewer decoder tokens!
Huge thanks to my wonderful coauthors: Zixuan Lan (UChicago) and my advisor Jiawei Zhou (@jzhou_jz).
Paper: https://t.co/NJ3OV2HRLc
Code: https://t.co/r3tUQSZQqs
#LLM #Multimodal #NLP
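To make the "~½ fewer decoder tokens" arithmetic concrete, here is a small sketch. The function name and both token counts are hypothetical, chosen only to illustrate the calculation; they are not measurements from the paper.

```python
def decoder_token_reduction(text_tokens: int, visual_tokens: int) -> float:
    """Fraction of decoder tokens saved when the context is fed as an image."""
    if text_tokens <= 0:
        raise ValueError("text_tokens must be positive")
    return 1.0 - visual_tokens / text_tokens

# Hypothetical counts, for illustration only.
text_tokens = 2000    # the context tokenized as plain text
visual_tokens = 1000  # tokens the decoder sees for the same context as an image

reduction = decoder_token_reduction(text_tokens, visual_tokens)
print(f"{reduction:.0%} fewer decoder tokens")
```

The saving depends entirely on how many visual tokens the model emits for the rendered image, so real numbers vary with the model and the rendering resolution.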
Our paper has been accepted to NeurIPS 2025 Creative AI Track!
Huge thanks to my amazing collaborators @YanhongLi2062 and Anthony, and to my advisor @YaoQin_UCSB for the incredible support.
Our image editing method SPICE unifies structural editing and detail restoration. It runs on a single consumer GPU, handles any resolution and hundreds of consecutive edits, achieving near lossless quality. Try it out here: https://t.co/BUeq2T2sRx
We also wrote a blog post showing how to apply SPICE to reveal weaknesses of Nano Banana. Check it out! https://t.co/XR7ihOToG1
#AIart #NanoBanana #AIใคใฉในใ #ๆฑๆนproject