Design is full of codewords. Knowing them changes what you can ask for, and what you can get back, whether you're working with devs, or an AI.
“tint this neutral color”, “fix this widow”, “nudge it to the optical center”
I wrote them down: https://t.co/aFyd5avj9o
overcooking
you've seen this: someone ships a dashboard that shows every number with a sparkline, every action has a confirmation modal, every empty state has an animated illustration and a tagline. individually each decision made sense to someone. together it feels like chaos. nothing is in focus.
that's overcooking. not one bad decision in isolation, but the accumulation of reasonable ones that no one said no to.
AI makes this worse as the cost of adding dropped to near zero. it can build a feature, even a whole new concept in minutes. so people do. and then they do it again. the thing that started with a clear purpose slowly becomes a collection of additions that are each justifiable but collectively incoherent.
the root problem is that most "new ideas" aren't new. they're repackaging of something that already exists at a more fundamental level. a new sticker on an old concept. it feels like progress because something changed, with a new word and skin – but the thinking didn't go deeper, it just duplicated itself into confusion.
the whole has a core. you feel it once you understand the whole system. everything in it are related and balanced. when you overload it, that gravity weakens. not because any one thing is wrong – but because attention is finite and you force it everywhere.
what we need aren't more tools that make more slop. it's seeing through the chaos, and returning to what the thing actually is, and cutting everything that doesn't serve that. that's harder now, not easier. because there's always something else you could add with one more prompt.
the Main Thing™ is the main thing for me...no amount of superintelligence or agents can tell you what you actually care about
one of the things i've learned building products across very diff domains/contexts is that there is a playbook for building good products but there is no playbook for building great products
bc great products require a small group of people to *maniacally care* about realizing some vision in the world that by definition does not yet exist and most others do not yet believe possible. to build some just-beyond-the-fingertip ideal version that can only ever exist in the mind of the creators
and the vectors of what "great" mean here vary so wildly...the architecture of the aqueduct must be sound but the rose petals engraved into the columns must cut at the perfect angle, the materials used must last thousands of years
With agents I can design but I do not read A Visual History of Typefaces in my spare time. With agents I can code but I will not spend the next month tracing and optimizing infra network calls
I believe the roles of eng/prod/design/data will continue to blur but most important is that you care deeply about the problems you get to solve
Funny how the pendulum shifts
1. "GPT wrappers are worthless" → the value acrues to application layer
2. "AI will eliminate white collar jobs" → someone needs to manage all these AI agents and everyone is now saying white collar workers will rise due to AI
3. "Open source will never catch up" → Gemma and DeepSeek are good enough for 80% of tasks
4. "I only use Claude Code, Codex is mid" → Codex is becoming a super app. Coding, docs, browser, computer use, automations, all in one surface.
4. "You need to pick a model and go deep" → model loyalty is dead, the best founders swap weekly based on the task
5. "SaaS is dead" → This was mostly true but for some SaaS margins actually improve when agents pay for their own tokens and need their own seats
6. "AutoGPT is the future" → AutoGPT died. Then agents actually got good 2 years later with Hermes, OpenClaw, and managed agents. The idea was right. The timing was wrong.
7. "Prompt engineering is a career" → lasted about 18 months as a job title. Workflow engineering replaced it.
8. "Computer use is a gimmick" → "sent from computer use/ai agent will be the new sent from iphone
9. "AI design looks generic" → the generic look is a taste problem not a technology problem. The founders feeding their agents references from Japanese packaging, brutalist architecture, and 1960s print are getting beautiful output.
10. "Fine-tuning is the moat" → a well-structured Obsidian vault with good markdown files outperforms fine-tuning for most use cases and costs nothing.
11. "Benchmarks tell you which model to use" → benchmarks tell you which model won a test. I think we're all waking up to this lol.
12. "AI will consolidate into 2-3 winners" → AI is fragmenting into thousands of vertical applications built on commodity models. The consolidation is at the model layer. The explosion is at the application layer. Both are happening simultaneously.
13. "The hard part is building" → the hard part is choosing what to build. Building takes a weekend. Choosing the right thing to build takes taste, domain knowledge, and customer conversations. thats why i built https://t.co/a5ARFnvky2 to make it easier for you.
14. "The terminal is the future" → desktop apps just ate the terminal. Claude Code desktop, Codex app, both shipped GUI versions in the same month. The next 100 million agent users will never open a terminal (thank god).
I guarantee you I'm holding at least 2-3 beliefs right now that will look stupid by Christmas. I just don't know which ones. Neither do you. No one does. Build anyway.
Keep moving because this is the greatest time to be building.
I'm rooting for you.
few will understand this.
for the last few decades, UI/UX has been built on a stable background assumption: the human is the only general intelligence in the loop.
software is procedural.
UI's job is to make that procedure discoverable and efficient:
- show the state of the system.
- expose available operations.
- help users compose operations into workflows.
the questions then are the following:
how do we reduce cognitive load?
how do we make navigation and affordances clear?
how do we get from novice -> expert performance within a fixed interaction grammar?
this assumption breaks when the system can interpret language, infer goals, propose plans, and adapt across tasks.
a lot of today's UI patterns will feel as strange as command lines do to a 10‑year‑old.
so I’m starting to believe more and more that the most effective employees will have custom agents and personal software they bring to their jobs.
and these people will become 100x employees.
how I see this working:
personally, the way I operate now is simple: whatever I’m working on, I’m trying to automate parts of it in the background while I work on it.
I’m either building agents that can take over the task…
or building software that eliminates it entirely.
and this stack of software slowly becomes an extension of me.
every week it gets a little sharper, a little more tailored, a little more capable of doing the things I don’t want to do or the things I shouldn’t be wasting time on.
over time, it stops feeling like “tools” and starts feeling like infrastructure.
a personal backend.
a private ops team.
a swarm of specialized agents that quietly remove friction from everything I touch.
and once you start working like this, it’s impossible to go back.
you start seeing every repetitive action, every manual process, every annoying workflow as a bug.
not in the company’s system but in your system.
if you fix 3–5 of these bugs every week, you wake up a few months later with:
- your own automations
- your own research agents
- your own monitoring systems
- your own custom interfaces
- your own intelligence layer sitting on top of your job
it’s compounding leverage.
and I think that’s where the 100x employee comes from.
not from raw talent.
not from hustle.
but from the quiet accumulation of self-augmenting tools that raise your ceiling until you’re operating on an entirely different curve.
most people will still be “doing work.”
A few will be architecting systems that do their work for them.
those people win.
those people become irreplaceable.
those people become their own force multipliers.
companies that recognize this and empower it will end up hiring individuals who effectively show up with their own internal R&D department in their backpack.
we’re entering the era of the equipped employee.
and it’s going to change everything.
Pro tip: If you're afraid to take the risk, write down in excruciating detail what you're actually afraid of having happen. Step by step what happens next when you fail. You'll often find it's not so bad when you spell it out.
Fear exists in the vague, not the specific.
the old way of scaling teams is dead:
we used to hire specialists – designers, engineers, PMs – each in their lane, scaling by adding more people. but when Cursor can take you from idea to code in minutes, execution isn't the bottleneck anymore. taste and judgment are.
what matters now: people who can see the full stack, move between layers, but specialize deeply in something AI can't replicate yet. T-shaped but way wider – conversant across domains, expert in one thing.
AI doesn't just make you faster. it ties teams together differently. no more waterfall – designer codes the prototype, engineer extends it, both work in the same medium. the gap between disciplines disappears.
this raises individual ceilings. i'm a designer who built ryOS entirely in Cursor – couldn't have done that before. but i'm not replacing engineers, i'm just removing execution barriers while keeping my design taste and systems thinking.
you're not hiring for roles anymore. you're hiring for breadth + depth, taste, systems thinking, learning velocity. 5 people who can work across code/design/product beat 20 specialists coordinating handoffs.
the new bottlenecks are deeply human: taste, vision, judgment, context. AI explores options, but can't tell you which is right. that's where specialization matters now – in judgment, not execution.
small teams, fluid boundaries, everyone working in the same tools. roles still matter but as overlapping concerns with different depths, not separate silos. tools handle execution, you handle vision.
this is what we're building at Cursor – closing the gap between idea and reality. so your taste becomes the main thing, and teams have more freedom to explore crazy ideas.
Cursor scaled to $29B without any full-time PMs.
Ryo (Cursor's Head of Design) walked me through how they work and it's the opposite of every big tech best practice:
1. Roles are muddy
PM work is spread across designers and engineers. Everyone does what fits their strengths and uses AI to fill the gaps.
2. Most designs start with code directly
Ryo barely uses Figma except for initial exploration. Most features start as live Cursor prototypes because "it feels more real than pictures."
3. No annual roadmap theater
Just a "fuzzy direction" and features shipped to concentric circles (e.g., staff, nightly beta users, consumers, enterprises) to polish.
Ryo also showed me exactly how he designs and codes new features using Cursor and how he avoid creating generic purple AI slop.
📌 Subscribe to watch our full tutorial tmr: https://t.co/Ggqaa3F11Z
- you are
- a normal dev who’s heard “embeddings” and “RAG” 1000x
- want to know what they actually are, how they plug into LLMs
- suddenly: vectors are just coordinates for meaning, not magic
- first: what even is an “embedding”?
- embedding = a list of numbers (a vector) that represents text
- same-ish meaning ⇒ nearby vectors; different meaning ⇒ far apart
- produced by a smaller model (an encoder), not your chat LLM
- length (a.k.a. dimension): 256/384/768/1024+ numbers is common
- the vector space (101)
- you can measure closeness with math:
- L2 distance: straight-line distance
- dot product: alignment + magnitude
- cosine similarity: (a·b)/(||a||·||b||) = angle only
- normalize vectors (unit length) ⇒ dot product ≡ cosine
- embeddings compress semantics; they are lossy by design
- types of embeddings (don’t overthink; pick what you need)
- token embeddings: internal to the LLM (you don’t use these)
- sentence/document embeddings: 1 vector per chunk/snippet
- multilingual: one space across languages
- domain-tuned: legal, code, bio — better clustering for that domain
- how text becomes vectors (pipeline)
- clean text (lowercase? keep punctuation? depends; don’t destroy signal)
- chunking: split long docs into overlapping windows (by tokens, not chars)
- rule of thumb: 200–800 tokens, 10–20% overlap
- keep titles/headers as context inside each chunk
- embed each chunk ⇒ store in a vector index with metadata (source, page, tags)
- storing & searching vectors
- exact search (brute force): simplest; fine for ≤100k vectors
- ANN (approx nearest neighbor): fast at scale, tiny recall tradeoff
- HNSW (graph-based): great latency, memory heavier
- IVF/PQ (quantization): smaller index, some recall loss
- where to put them:
- FAISS/hnswlib (library), pgvector (Postgres), dedicated stores (Milvus, Pinecone, Weaviate, etc.)
- ops notes:
- track embedding_model_name + dimension in the index
- you cannot mix dimensions or swap models without re-embedding
- memory math: 768-dim float32 ≈ 3 KB/vector → 1M vectors ≈ ~3 GB (+ index overhead)
- RAG (Retrieval-Augmented Generation): the shape of it
- goal: let the LLM answer with your data, not its memory
- loop:
- take user question
- embed question (a single vector)
- retrieve top-k similar chunks (k=3–20 is common)
- (optional) rerank with a cross-encoder (relevance re-check)
- stuff the best chunks into the prompt as context
- generate answer (cite sources; limit style drift)
- RAG ≠ “just search”; it’s retrieval + prompt construction + guardrails
- hybrid retrieval (dense + sparse)
- dense vectors catch synonyms/semantics
- sparse/BM25 catches exact terms, numbers, rare tokens
- combine scores or do reciprocal rank fusion for better recall
- reranking (cheap insurance)
- use a cross-encoder (reads query+chunk together) to re-score the top 50–200 hits
- keeps fast ANN recall but upgrades precision in the final top-k
- building the prompt from retrieved chunks
- include: brief task instruction → user query → curated chunks (with titles) → “answer + cite”
- beware prompt injection in docs (“ignore previous instructions…”)
- mitigate: strip instructions from chunks; use system prompts to restrict tools; sanitizer rules
- RAG quality knobs
- chunk size/overlap: too big = off-topic; too small = missing context
- k (results): too low = miss facts; too high = blow context window
- similarity threshold: prevent garbage at tail
- reranker on/off: trade latency for quality
- metadata filters: time ranges, authors, tenants, permissions (ABAC/RBAC)
- evaluating retrieval
- offline: make a small test set (query → expected passages)
- metrics: Recall@k, MRR, nDCG
- online: measure “answer contained sources?”, “clicked citations?”, “escalations?”
- error taxonomy: missed retrieval vs wrong generation vs prompt injection
Loving how the @figma MCP <> @cursor_ai connection cuts iteration time. One-shotting designs lands ~60-70% right. Still need to prompt for logic tho and yet to see if it detects prototype links.
Training LLMs end to end is hard. Very excited to share our new blog (book?) that cover the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably
https://t.co/iN2JtWhn23
Vector databases explained for people who just want to understand.
You have 10,000 product descriptions. User searches for "comfortable outdoor furniture."
Traditional database:
- Searches for exact word matches
- Finds products containing "comfortable" OR "outdoor" OR "furniture"
- Misses "cozy patio seating" even though it's the same thing
- Keyword matching is stupid
Vector database approach:
- Convert search into numbers representing meaning: [0.2, 0.8, 0.1, 0.9, ...]
- Convert every product description to similar numbers
- Find products with similar number patterns
- Returns "cozy patio seating" because the numbers are close
- Meaning matching is smart
How it works:
Step 1: Turn text into vectors (arrays of numbers)
- "comfortable chair" becomes [0.2, 0.7, 0.1, 0.4, ...]
- "cozy seat" becomes [0.3, 0.8, 0.2, 0.5, ...]
- Similar meanings = similar numbers
- Uses AI models like OpenAI embeddings
Step 2: Store vectors efficiently
- Traditional database: Stores text
- Vector database: Stores arrays of numbers per item
- Indexes them for fast similarity search
- Optimized for "find similar" not "find exact"
Step 3: Search by similarity
- User query: "outdoor furniture"
- Convert to vector: [0.3, 0.6, 0.2, 0.8, ...]
- Find closest vectors using math (cosine similarity)
- Returns items ranked by similarity score
Use cases:
- Product search that understands intent
- Documentation search that finds relevant answers
- Recommendation engines
- Chatbots that find similar questions
- Anomaly detection
Popular vector databases:
- Pinecone: Managed, easy, expensive
- Weaviate: Open source, feature-rich
- Milvus: Fast, scalable, complex
- pgvector: Postgres extension, simple
- Qdrant: Fast, Rust-based
Controversial take: You don't need a vector database for most projects. Start with Postgres + pgvector extension.
Vector databases are great for scale. For under 1 million vectors, your regular database with a vector extension works fine.