Sup*

@mostlyaboutai

Cast a spell so only cool people see this profile. Did it work? I post about AI, evals, hackathon projects, vibe coding and things that make me laugh.

India

Joined August 2024

440 Following

106 Followers

2.5K Posts

Pinned Tweet

Sup*

@mostlyaboutai

2 months ago

https://t.co/fYopxOTbx6

Sup*

@mostlyaboutai

about 6 hours ago

This is a leap, believe it or not!

Ideogram @ideogram_ai

about 9 hours ago

Introducing Ideogram 4.0: the best open image model in the world. Think it. Make it. Own it. Download the weights, fine-tune on your own data, and run it on your hardware. Live on every Ideogram plan and the API today.

274

483

819K

about 6 hours ago

about 7 hours ago

New model YAYYyyyy!........

Sup*

@mostlyaboutai

about 8 hours ago

@contraben @ideogram_ai @GeminiApp @imagine @bfl_ai Contra Labs has assembled quite the team to test out and benchmark the models. This clearly shows that. This update tempts me to go back to Ideogram now ;)

171

Sup*

@mostlyaboutai

about 10 hours ago

This is worth noticing. Diffusion models routinely fail at mutually exclusive prompts. Red or blue balloons yield purple blends or dual objects. Cat or dog produces hybrids. Object counts collapse entirely. FineGRAIN benchmark already shows near-zero success on hard constraints. I am calling it the Ice Cream Effect: the model outputs a single statistical average of multiple legal interpretations, satisfying none of the explicit constraints while preserving high aesthetic quality. Formalize it with Constraint Preservation Score (CPS) as the fraction of prompt constraints met and Aesthetic Preservation Score (APS) for realism. Run either-or tests versus single-alternative controls across Stable Diffusion variants, GANs, and autoregressive models. Paired t-tests on CPS drops with stable APS confirm the averaging. If you are building evals or prompting pipelines, this is the artifact worth stress-testing right now. Thoughts?
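A minimal sketch of the CPS comparison described in the post above, assuming each generated image has already been judged constraint by constraint (by a human or an automated checker, which is not shown here). The function names and the boolean judgments are hypothetical placeholders for illustration, not measurements from any model. The paired t statistic is computed by hand with the standard library; a p-value would additionally need a t-distribution lookup.

```python
import math
import statistics


def constraint_preservation_score(constraints_met):
    """CPS: fraction of a prompt's constraints that the output satisfies."""
    if not constraints_met:
        raise ValueError("need at least one constraint")
    return sum(1 for met in constraints_met if met) / len(constraints_met)


def paired_t_statistic(control, treatment):
    """Paired t statistic for (control - treatment).

    Returns (t, degrees_of_freedom).
    """
    if len(control) != len(treatment) or len(control) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [c - t for c, t in zip(control, treatment)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean_diff / (sd_diff / math.sqrt(len(diffs))), len(diffs) - 1


# Placeholder judgments, NOT real data. Each inner list holds one image's
# per-constraint pass/fail results; index i in both lists is the same
# prompt pair (single-alternative control vs. either-or variant).
control_checks = [[True, True], [True, True], [True, False], [True, True]]
either_or_checks = [[False, False], [True, False], [False, False], [False, True]]

control_cps = [constraint_preservation_score(c) for c in control_checks]
either_or_cps = [constraint_preservation_score(c) for c in either_or_checks]

t_stat, dof = paired_t_statistic(control_cps, either_or_cps)
mean_drop = statistics.mean(c - e for c, e in zip(control_cps, either_or_cps))
print(f"mean CPS drop = {mean_drop:.3f}, t = {t_stat:.2f}, dof = {dof}")
```

Running the same paired test on APS values for the same prompt pairs would check the "stable APS" half of the claim: a large t on CPS together with a small one on APS is the pattern the post describes.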


Sup*

@mostlyaboutai

about 19 hours ago

Opus 4.8 is great.

Sup*

@mostlyaboutai

1 day ago

Funny how the providers themselves are not the OGs in the industry. The last paragraph speaks volumes. Every designer who stayed relevant while the world shouted about AI killing design. Every developer who felt dispensable after AI hit. Guess what? The best ones flourished anyway. Bolt too will flourish. It is about progressing in the right direction.

987

Sup*

@mostlyaboutai

1 day ago

@AravSrinivas Would be cool to see Perplexity add privacy layers.

129

Sup*

@mostlyaboutai

1 day ago

@gregisenberg This is more of a UX problem imo 👀

Sup*

@mostlyaboutai

1 day ago

This is worth noticing. The key takeaway is the level of precision one can get just by focusing on task-driven prompts. Models over time have shown how simple prompts can still deliver really effective results. Gemini NB for me has always been the tweaking tool, and this study justifies its place in my workflow.

Sup*

@mostlyaboutai

2 days ago

There are stages of creativity and production. If you expect AI to skip them for you, you are playing the wrong game.

Sup*

@mostlyaboutai

5 days ago

@cjzafir My observations have been a little different. What kind of dataset are you using? The token burn reduction is impressive and evident.

Sup*

@mostlyaboutai

6 days ago

Honestly, this is both the most impressive thing happening right now and the most frightening use of AI.

frostzy

@lmkifiwin

8 days ago

> be Bowie Knife99
> just a guy with an Xbox
> buy Forza Horizon 6 on launch day
> drive like you always do - like a lunatic
> ram, swerve, and pit-maneuver strangers into walls
> the Drivatar system silently records every felony
> uploads an AI clone of your driving to the cloud
> deploys hundreds of you into other people's races, 24/7
> within days, thousands cry about you on Reddit, X, and Steam
> the community calls you "the Herobrine of Forza"
> official Xbox UK tweets "Happy Bank Holiday Monday to everyone except bowie knife99"
> a fan opens a fake X account in your name to taunt your victims

thousands of players are at war with hundreds of clones of you. you don't know any of this is happening. happy bank holiday.