Ray Villalobos ✝️

@planetoftheweb

Helping the smartest people thrive past the Age of AI. LinkedIn and Stanford University. Humans will never be abstracted.

iPhone: 28.459488,-81.306122

Joined May 2007

273 Following

6.7K Followers

9.9K Posts

Ray Villalobos ✝️

@planetoftheweb

1 day ago

Spent almost a week with Opus 4.8 and it looks like a small change, but it's bigger than you think. Spent hours with a problem Codex couldn't solve because it was approaching it as an engineer, not a systems analyst. That's the difference and it won't show up in any benchmark. Check this video out: I go through how I used it to upgrade my new project https://t.co/N9UcazSkjk with a refactored freemium model, new features and more over 4 days and 48 commits. It took a problem I couldn't figure out and immediately solved it. See how I use it for user testing, cowork and lots more. It's a jam-packed 5 minutes. You can throw away the benchmarks, and it's not even their best model (come at me, Mythos). Check out the review. https://t.co/c8biGLOFDl

Ray Villalobos ✝️

@planetoftheweb

1 day ago

Added GitHub access and push to https://t.co/x0qgBqRiaz, a 4-file contract process for starting your agentic and vibe coding projects with a promise. https://t.co/J9mirAuVdd

Ray Villalobos ✝️

@planetoftheweb

3 days ago

You don’t need a better prompt.

You need a contract.

When people vibe code, they usually know what they want generally.

But AI tools need specifics:
- what to build
- what not to build
- how it should behave
- what design rules to follow
- how to stay aligned when the build gets messy

That’s why I’ve been moving from “prompting” to “contracting.”

A good AI build contract gives the tool durable context it can keep checking against.

For MVPunk, I use 4 files:

1. PRD.md
What are we building and why?

2. AGENTS.md
How should the AI behave while working?

3. CLAUDE.md
How should Claude/Cursor/Codex orient inside the project?

4. DESIGN.md
What should the experience look and feel like?

It isn’t about bundling paperwork. It's reducing drift.

AI tools are incredibly capable, but they will happily build a feature-heavy mess if you don’t give them boundaries.

Prompts start the conversation. Contracts guide the work.
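
One way to keep a contract like this honest is to check that all four files are actually present before a build session starts. What follows is only a minimal sketch under assumptions, not part of MVPunk: it assumes the four files sit in the project root, and the helper name `missing_contract_files` is hypothetical.

```python
from pathlib import Path

# The four contract files named in the post above.
CONTRACT_FILES = ("PRD.md", "AGENTS.md", "CLAUDE.md", "DESIGN.md")


def missing_contract_files(project_dir):
    """Return the contract files absent from project_dir (hypothetical helper).

    Assumes the files live directly in the project root; adjust the
    lookup if your project keeps them elsewhere.
    """
    root = Path(project_dir)
    return [name for name in CONTRACT_FILES if not (root / name).is_file()]
```

Calling `missing_contract_files(".")` from the project root returns an empty list when the contract is complete, so it could gate the start of an agent session.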

Ray Villalobos ✝️

@planetoftheweb

5 days ago

I feel like the only thing I'm really afraid of is that tick that makes you not want to eat meat anymore. I think I can handle everything else. ;)

Ray Villalobos ✝️

@planetoftheweb

9 days ago

Some of the latest models are pretty good. I've been using Mimo 2.5 Pro and it was great enough to run Otis (my bot) for three weeks without errors.

I recently moved to ChatGPT's 5.5 because their $20/month is subsidized and started to have to use Claude Code instead of Cursor for the same reason.

Cursor's new model (Composer 2.5) is shockingly good. I was pretty surprised. Not quite better than O4.7, but at least as good as 4.5 and rapidly getting smarter. Look for this to be the coding model to beat now that they have the deal with xAI for compute.

Qwen is supposed to be a good designer. Probably my next AI Model Trends target after Gemini Flash 3.5 releases. Maybe we need a course on Token-maxing. Or the opposite thereof.

The problem is local hosting isn't the same as cloud hosting. The infrastructure is completely different and people don't have the H100s to provide a similar experience. If they try to host, they'll find they need to spend all types of money and in the long run, they'll just go back to cloud hosting.

The Chinese models are so cheap that it's just better to use them instead of Claude. But better isn't best and the Claude experience is much more than just the model. Connectors, skills, plugins, memory, MCP support. Those are all things that have to be added to make a Claude. The model is a small part of the harness that makes a great experience possible.

Ray Villalobos ✝️

@planetoftheweb

11 days ago

I literally can't believe what programming is like today. I'd have never thunk it'd be this way.

Ray Villalobos ✝️

@planetoftheweb

15 days ago

Composer 2.5 is an excellent model. Unfortunately, I don't think the new Flash was really meant for coding, though. It might be useful for other things. I gave it a design task and it just quit...at least it did it quickly.

Ray Villalobos ✝️

@planetoftheweb

20 days ago

I really liked Claude Design, but it does have issues since it burns so many tokens. That should improve (same with memory prices) over time, but it will take a bit. Meanwhile I made a nice collection with all kinds of design resources, including some of my own vibe coded projects like Vibe Glossary, Claude Design competitors like Stitch and Open Design, and tons of inspiration sites. https://t.co/dttfwOhs1A

Ray Villalobos ✝️

@planetoftheweb

27 days ago

I had been running GLM-5 for a while and it was decent, but still made some errors I wasn't pleased with. Mainly misunderstandings managing my Content Pipeline Kanban Board.

Been on Mimo 2.5 Pro for almost half a month now and I gotta say, the problems went away. I was expecting more savings, but as you can see, there was virtually no difference.

As a teacher, I have to try different models all the time, but it's working so well, I really don't want to. I'll give Grok and Kimi 2.6 a try (I was on 2.5 before, which I remember being pretty good). I would really love to run Gemma 4 locally for free (crossing fingers that my machine can handle it).

Will report l8r

Ray Villalobos ✝️

@planetoftheweb

about 1 month ago

You know...I really liked @Comet and had been recommending it for years, but I'm out. I don't know why anyone would think of removing slash commands in their assistant and making any skills that I create virtually unusable. When some idiot decides to remove the most useful feature they've ever had for no reason whatsoever, it means the company is not thinking straight. I have to wait for a while since I made the mistake of paying for a year subscription, but I'm uninstalling it and finding a different solution. There was a time when this was the best option, but now the Claude Extension is better. I had even started using that Claude Extension in Comet since it had gotten so bad. I'm out and uninstalling this disgrace.

Ray Villalobos ✝️

@planetoftheweb

about 1 month ago

I gotta say @comet, removing slash commands from the sidebar assistant is just dumb. Easily my most used feature, now totally gone...and for what? Now I have to find a browser that doesn't do ridiculous things like that.

Ray Villalobos ✝️

@planetoftheweb

about 1 month ago

@leohuynh139 That's the thing, for more aliveness I have to go with Image2

Ray Villalobos ✝️

@planetoftheweb

about 1 month ago

I've been running tests all night between the new GPT Image 2 and Nano Banana Pro and I'm sort of undecided. The ones with the wilder title font are Image 2. Google's look a little more corporate and less 'fun', but the fidelity is great. I do like the larger resolution of I2.

The interface is from my own website/open source project called BrandoIt. Besides adding the new models, I added a comparison slider, etc. It's probably the most used thing I've ever built. Go check it out or give it a star on GitHub or clone it or whatever.

Sorry, you're going to need to provide your own keys until Google buys me out or I hit the Lotto or something.

Website: https://t.co/Tg5JvgGHEj
Repo: https://t.co/YoVipReIeN

Ray Villalobos ✝️

@planetoftheweb

about 1 month ago

@RichKleinAI For what I do, which is create these illustrations, yes. I can’t decide which I like better, but I think I might actually stick with NB2.

Ray Villalobos ✝️

@planetoftheweb

about 1 month ago

@perplexity_ai Why did you remove commands from the browser assistant? I had a few critical skill commands and now they’re worthless.

Ray Villalobos ✝️

@planetoftheweb

about 1 month ago

@testingcatalog Can't wait for Google Deep Max Ultra Pro Times Infinity

Ray Villalobos ✝️

@planetoftheweb

about 2 months ago

@descript I exist! ;)

Ray Villalobos ✝️

@planetoftheweb

about 2 months ago

Work in progress. One thing I didn't realize is how much language designers/developers have been using that sounds alien to new users. Sheets, Drawers, Switch, Toast, Dropzone, Masonry. You can also copy the prompt/code so that you can send notes to your vibe coding platform.

Ray Villalobos ✝️

@planetoftheweb

about 2 months ago

My beginner students in my Stanford Vibe Coding class were having some trouble learning some of the terminology for things they needed to build, so I created this Vibe Glossary, which has now expanded with learning paths, scaffolding code, progress, quiz mode, etc.

I gotta take a break until my tokens renew or go use Cursor for a while. Claude Code for Desktop is a blast.

https://t.co/hGunCkLi2K
https://t.co/f9Hgw30Vwr

Stars always welcome, MIT-licensed open source. I've got 44 items and once I get more tokens, I'll add some more. It's actually a lot of fun.
