NVIDIA QUIETLY DROPPED A $249 BOX THAT REPLACES YOUR $200/MONTH OPENAI SUBSCRIPTION WITH $2 IN ELECTRICITY
it's called the jetson orin nano super. smaller than a wallet, runs at 25 watts, does 70 trillion ai operations per second. runs llama 3, mistral, gemma and deepseek locally with no api fees and no data leaving your house
a developer running automations and coding assistants pays $200 a month to openai. the same workload on this box costs $2 a month in electricity and breaks even in 10 weeks
install ollama with one command. change one line in your code. point it at localhost instead of openai. everything else works identically
7 billion parameter models handle 80% of what people use chatgpt for. summarization, drafting, coding, document q&a, automation pipelines. total monthly cost drops from $200 to $22
cloud subscriptions keep getting more expensive and rate limits keep getting tighter. the people who set this up in 2025 are going to look very smart in 2027
bookmark this and read the article below
I listen to a ton of podcasts and audiobooks. Tech, nerdy topics, politics, sci-fi, self-improvement, fantasy, etc. Every once in a while there's an episode or book that infects my mind in a great way. The @20vcFund interview with @cliffweitzman, founder of @SpeechifyAI , has turned out to be one of those.
A few highlights:
- Senior leaders have to stay hands-on. He calls them warriors, not fat generals in the back. If your head of growth can't open CapCut or Canva and ship an ad themselves, you've hired a bottleneck.
- QA is the most valuable skill in a world where engineering and design are commoditized. The LLM gets you 80% there. The human who stress-tests on weak Wi-Fi at 3am is what separates a product from a great product.
- Volume of work as universal arbitrage. He applied to 26 colleges, wrote 48 drafts of his admissions essay, tests 1,300 AI-generated ads per day. Same principle every time.
- Any Harvard MBA can cut costs. It takes a genius to grow revenue. Companies bulk and cut like bodybuilders. You can't do both at once.
- If it isn't in production, it doesn't count. Best milk in the world, but you left it at the end of the driveway. No credit.
- Surround yourself with people better than you and refuse to be told no. He cold-emailed his way to meetings with the founders of Instagram, Twitter, Plaid, Honey, 23andMe, Robinhood, and Grammarly. If the CEO didn't reply, he went to the CMO, then the head of growth. He says less than 1% of people ever told him to stop. Most of us quit after the first polite no.
- Adversity Quotient (AQ) matters more than IQ or EQ. Cliff says his single greatest advantage is the ability to keep going when other people stop. The Jeep that wins the Savannah isn't the fastest one. It's the one that doesn't get a flat tire. The work that actually moves things lives on the other side of five hours of grappling with a hard problem. Most people quit after thirty minutes.
Strongly recommend you give it a shot. Pragmatic, opinionated, full of things you can actually apply. His personal story is unique and a lot of fun on top of it.
https://t.co/c5Bu91LZ9D
Generative AI Black Belt at @Google . That is @HeikoHotz 's actual title. His was one of the standout seminars for me at the AI Agent Conference.
He opened with the honest reality for most of us with agents at home is still pretty rudimentary. Ask Alexa for the weather. Ask Google to turn on the lights. The coffee-ordering, comparison-shopping, agent-as-buyer future we have been promised for years has only just started showing up.
So when we talk about agentic commerce, we are really talking about two different futures. Conversational commerce, where you ask the agent to find you a coffee machine and stay in the driver's seat through discovery and checkout. And autonomous commerce, where you say "I need a laptop charger" and the agent goes, transacts, and you never touch the comparison step. Same category on paper. Completely different system design underneath.
His engineering framework for building shopping agents was the most useful thing I heard on the topic all week. Stop thinking of an agent as a single element with a bag of tools. Think of it as layers.
Use the LLM for what it actually does well, which is understanding intent. "I need a jacket" parses into rainproof, my size, my preferences. Hand those attributes to a deterministic filter against merchant data using UCP. Then bring the LLM back in as a judge to make the final selection based on context the deterministic layer cannot see. Three layers, each doing what it is best at, none of them pretending to do work outside their lane.
The merchant side hit closest to home. How a website is structured determines whether shopping agents see your products at all. Persuasion and marketing do not go away, they migrate. The team that used to optimize landing page copy ends up working with the data team to make sure every product attribute that matters is exposed in your structured data.
He also brought up a Microsoft study from late last year documenting real failure modes in current agentic commerce. Fake credential prompts on rogue sites. Price manipulation. Overspending. They are why trust, audit, and cryptographic mandates around payment authorization are not optional in any production agent commerce stack.
A real privilege to hear someone working this deep on the protocol and architecture side lay it out so cleanly. The two-mode framing and the layered agent design are going straight into how we think about our own platform's role in this. Keep grinding everyone!
P.S. I included his slide on his example of the charger. Each dollar figure shows how much cheaper a product has to be to compensate for a missing attribute when an agent is comparing options. A charger without UL safety certification needs to be $25.53 cheaper to compete with one that lists it. Fast charging only buys you $1.21. The ranking is the lesson. Safety and warranty signals far outweighed performance specs that most product pages lead with today.
We keep asking the wrong question about AI and jobs.
The question isn't "can AI do this task." The question is "does AI-only delivery actually satisfy demand." @nlw made this point on the @AIDailyBrief this week. He, per usual, did an excellent job reframing this perspective.
Plenty of roles exist not because of capability gaps but because of how a service has to be delivered. Trust, accountability, presence, relationship, translation, behavior change. These are what he calls the "human premium," and they don't get eaten when AI gets better at tasks. They are fundamentally important to us as a species.
The healthcare walkthrough is an excellent example of this. He sketches out roles like continuous care navigators and care plan outcome specialists that don't really exist today, made possible only because AI handles the data and monitoring layer underneath. This yields hundreds of thousands of net new jobs in one sector alone.
The AI industry has been overwhelmingly focused on what disappears. Pieces like this are helping the vision of what can be. Worth your time.
Also please enjoy my clearly AI generated image to outline this below. You're welcome, lol.
One of the lectures that stood out to me at the AI Agent Conference was @jai_das , President of @SapphireVC .
He framed foundation models through an operating system analogy. Consumer-facing AI consolidates around a few closed players. Enterprise backends move toward open source for the same reason Linux runs the server world. Cost.
He cited Anthropic's own dev team burning their token budget in three quarters and Sapphire's portfolio spend hitting around $3.5M on model usage. As agents run autonomously for hours or days at a time, that math gets harder, not easier.
On where capability jumps next, he pointed to reinforcement learning applied to specific verticals. The next leap comes less from foundation model labs and more from companies solving narrow domains well. He used coding agents as the template. Autocomplete to function generation to autonomous write-test-commit loops. Other verticals follow that arc on a lag.
The point I keep coming back to is the structural shift in how AI-native companies are built. Usage-based pricing replacing seat-based subscriptions. Leaner teams shipping more revenue per engineer than the previous generation. Go-to-market that lets users adopt incrementally instead of negotiating up front.
A real privilege to hear someone with his vantage point lay this out plainly. Inning zero, as he put it. Plenty left to build.
Elon Musk avait dit un truc qui m'avait marqué sur l'allocation de ressources. En substance : passé un certain niveau de richesse, l'argent n'est plus de la consommation, c'est de l'allocation de capital.
Cette phrase change tout.
L'économie, dans le fond, c'est juste un problème d'allocation. Tu as des ressources finies et des usages infinis. Qui décide où va quoi ?
Imagine une cour de récré. 100 enfants, des paquets de cartes Pokémon distribués au hasard. Tu laisses faire. Très vite, un ordre émerge. Les bons joueurs accumulent les cartes rares, les collectionneurs trient, les négociateurs trouvent des deals. Personne n'a planifié. Et pourtant chaque carte finit dans les mains de celui qui en tire le plus de valeur. Le système maximise le bonheur total de la cour. C'est ça, la main invisible.
Maintenant fais entrer la maîtresse. Elle trouve ça injuste. Léo a 50 cartes, Tom en a 3. Elle confisque, redistribue, impose l'égalité. Trois effets immédiats. Les bons joueurs arrêtent de jouer, à quoi bon. Les mauvais n'ont plus de raison de progresser, ils auront leur part. Les échanges s'effondrent. La cour est égale, et morte. Elle a maximisé l'égalité, elle a détruit le bonheur.
Le problème de la maîtresse, c'est qu'elle ne peut pas avoir l'information que la cour avait collectivement. C'est le problème du calcul économique de Mises, formulé en 1920. L'URSS a essayé de le résoudre pendant 70 ans avec le Gosplan. Résultat : pénuries, queues, effondrement. Pas parce que les Soviétiques étaient bêtes, parce que le problème est mathématiquement insoluble en mode centralisé.
Quand Musk a 200 milliards, il ne les consomme pas, il les alloue. SpaceX, Starlink, Neuralink, xAI. Chaque dollar est un pari sur le futur. Et lui a un track record. PayPal, Tesla, SpaceX. Il a démontré qu'il sait identifier des problèmes immenses et y allouer des ressources avec un rendement spectaculaire.
L'État aussi a un track record. Hôpitaux qui s'effondrent, éducation qui décline, dette qui explose, services publics qui se dégradent malgré des budgets en hausse constante. Le marché identifie les bons allocateurs, la politique identifie les bons communicants.
Le profit n'est pas une finalité, c'est un signal. Il dit : tu as alloué des ressources rares vers un usage que les gens valorisent suffisamment pour payer. Plus le profit est gros, plus la création de valeur est grande. Quand Starlink est rentable, ça veut dire que des millions de gens dans des zones rurales ont enfin internet. Quand un ministère est en déficit, ça veut dire qu'il consomme plus qu'il ne produit. L'un crée, l'autre détruit, et on appelle ça redistribution.
Dans nos sociétés il y a deux catégories d'acteurs. Les entrepreneurs et les bureaucrates. L'entrepreneur prend un risque personnel pour identifier un problème, mobiliser des ressources, créer une solution. S'il se trompe il perd. S'il a raison, ses clients gagnent, ses employés gagnent, ses fournisseurs gagnent, l'État collecte des impôts. Il est la cellule de base du progrès humain.
Le bureaucrate ne prend aucun risque personnel. Son salaire est garanti. Au mieux il maintient une rente existante. Au pire il la détruit par excès de réglementation, mauvaise allocation forcée, incitations perverses qui découragent ceux qui produisent. Mais dans aucun cas il ne crée.
Regarde les 50 dernières années. iPhone, internet civil, SpaceX, Tesla, Google, Amazon, Stripe, mRNA, ChatGPT. Toutes des inventions privées, portées par des entrepreneurs, financées par du capital risque. Pas un seul ministère n'a inventé quoi que ce soit qui ait changé ta vie au quotidien.
La France est devenue le laboratoire mondial de la dérive bureaucratique. 57% du PIB en dépenses publiques, record absolu. Une administration tentaculaire, une fiscalité qui pénalise la création de richesse. Résultat : décrochage face aux États-Unis, à l'Allemagne, à la Suisse. Fuite des cerveaux. Désindustrialisation. Dette qui explose.
Et le pire c'est que la mauvaise allocation s'auto-renforce. Plus l'État prélève, moins les entrepreneurs créent. Moins ils créent, moins il y a de base fiscale. Plus l'État s'endette et taxe. Boucle de rétroaction négative parfaite. La maîtresse pense qu'elle aide, et chaque année la cour produit moins.
Dans nos sociétés, ce sont les entrepreneurs, toujours, qui font avancer la civilisation. Les bureaucrates au mieux maintiennent une rente, au pire la détruisent. Aucune société n'a jamais progressé en taxant ses créateurs pour subventionner ses gestionnaires.
La question n'est jamais qui a combien. C'est qui alloue le mieux la prochaine unité de ressource pour maximiser le futur de l'humanité. La réponse depuis 200 ans n'a jamais changé. Ce ne sont pas les fonctionnaires.
Listened to the @ThisWeeknAI episode with @AravSrinivas (Perplexity) and @echen (Surge AI). It had a couple good takeaways.
Edwin bootstrapped Surge to over a billion in revenue without ever raising venture capital. Aravind made the right point on the same episode: it isn't bootstrap vs. raise, it's whether you operate with the discipline of a bootstrap founder regardless of how much capital you have. That's the bar to hold.
Then Aravind shared where Perplexity's Model Council came from. Jensen Huang used to ask the same question across the frontier models, then reconcile the answers in his head. So Perplexity built it natively. One prompt, multiple frontier models, an orchestrator that surfaces where they agree, where they diverge, and what to take away.
That's the same insight behind PRISM, the methodology I've created as I build our platform. Parallel Research, Integrated Synthesis Method. CC BY 4.0.
The core claim of PRISM is that when you query multiple frontier models on the same architectural question, the consensus is interesting but the disagreement is more valuable. Disagreement between models identifies the genuinely contested decisions, the places where human judgment is required and where defaulting to any single model would have steered you wrong.
Model Council operationalizes that for live questions. PRISM operationalizes it for the architectural decisions that build a codebase. Same physics, different time horizons. Validating to hear their approach, and inspired by what they are doing for AI.
Worth the listen. Props to Aravind, Edwin, and @jasoncal
There's a quiet shift happening in how the best engineers are using LLMs, and @karpathy just made it explicit.
He's now spending more of his token budget building knowledge bases than writing code.
His approach is raw documents go into a folder, an LLM compiles them into a structured markdown wiki, and the system maintains itself through periodic health checks that find inconsistencies, fill gaps, and surface new connections. No vector database needed. No RAG pipeline. Just structured files a human can read and verify.
Combine this with his autoresearch project, where an AI agent runs hundreds of experiments overnight and keeps only what improves the result, and a philosophy emerges that I think applies far beyond machine learning.
The human defines the objective and the constraints. The machine iterates relentlessly within that box. Knowledge compounds over time. And the human stays in the loop where it matters most: at the point of judgment.
This is the part most people get wrong. They either underuse AI (treating it as a search engine for one-off questions) or overuse it (letting it make decisions without human review). The real leverage is in the middle. Run parallel research across multiple models. Let them each surface what they find. Then synthesize it yourself, make the call, and document the reasoning.
That discipline applies everywhere:
An e-commerce operator compiles competitor pricing, forum sentiment, and product reviews into a knowledge base that evolves weekly. The AI maintains it. The merchant decides what to act on.
A legal team feeds case law, regulatory updates, and internal briefs into a living wiki. The AI cross-references and flags contradictions. The attorney makes the judgment call.
A marketing team runs parallel content experiments, A/B tests, and audience research through AI agents overnight. The strategist reviews what worked in the morning and decides the direction.
A supply chain manager ingests supplier data, shipping reports, and commodity signals. The AI surfaces correlations. The operator decides which risks warrant action.
In every case, the AI does what it's exceptional at: processing volume, finding patterns, maintaining structure, iterating without fatigue. And the human does what they're exceptional at: asking the right questions, weighing tradeoffs, making judgment calls that require context no model has.
This isn't about replacing human work. It's about redirecting human attention to the highest-leverage decisions while AI handles the research, organization, and iteration that used to consume 80% of the effort.
Karpathy called it a shift from writing code to manipulating knowledge. I'd go further. It's a shift from doing the work to designing the systems that do the work, then showing up at the moments where human judgment is irreplaceable.
The tools exist. The workflows are proven. The only question is what you point them at.
LLM Knowledge Bases
Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:
Data ingest:
I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.
IDE:
I use Obsidian as the IDE "frontend" where I can view the raw data, the the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).
Q&A:
Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.
Output:
Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.
Linting:
I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searchers), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.
Extra tools:
I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.
Further explorations:
As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.
TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
I am fully nerding out and locked in on agent progress right now. If you have not heard about this yet and you are a fellow nerd, please stop what you are doing and check this out.
@karpathy , the person who coined "vibe coding," just released a GitHub repo called autoresearch. The concept is straightforward but the implications are staggering: give an AI agent a real training setup, point it at a single file it is allowed to modify, and let it run experiments autonomously overnight. It modifies code, trains for five minutes, checks whether the result improved, keeps or discards the change, and repeats. You wake up to a log of 100 experiments and a better model. No human in the loop during execution. Just outcomes in the morning.
The repo itself is focused on ML research, but that is not what I cannot stop thinking about. The architecture is the thing. Modify, evaluate, iterate, repeat. Unsupervised, logged, and composable.
Now pair that pattern with what @AnthropicAI just shipped inside Claude Code: a native command called /loop. You describe a task in plain English, set an interval, and Claude runs it as a background worker on a cron schedule for up to three days without any prompting from you. Check error logs every few hours and open pull requests for fixable bugs. Monitor a deployment overnight. Poll a build and report back. The agent becomes your second shift.
Two things shipping in the same week that share the same underlying idea: define the goal, set the loop, walk away, and come back to results. That convergence is not a coincidence.
@nlw covered the autoresearch release on @theaidailybrief this week. If you are not already listening to that show daily, fix that. It is the best pulse on what is actually happening in this space.
The gap between "I want to research or build X" and "here are 100 tested iterations of X" is becoming a scheduling decision, not a human effort decision. That is not a small thing.
https://t.co/1iXvv2dXq0
This is the kind of tooling that makes AI-assisted development actually production-ready. Catching waterfalls and bundle bloat before they ship? Yes please. @vercel
We just released 𝚛𝚎𝚊𝚌𝚝-𝚋𝚎𝚜𝚝-𝚙𝚛𝚊𝚌𝚝𝚒𝚌𝚎𝚜, a repo for coding agents.
React performance rules and evals to catch regressions, like accidental waterfalls and growing client bundles.
How we collected them and how to install the skill ↓
https://t.co/kfLSbKl15X
Two-day hackathon → production alternative. That's the innovation velocity right now. But it raises a real question: when anything can be cloned at speed, what actually creates lasting value? Is it open architecture? Curated experience? Trust? Serving markets others won't touch? What's your take?
Today we are launching @openwork_ai, an open-source (MIT-licensed) computer-use agent that’s fast, cheap, and more secure.
@openwork_ai is the result of a short two-day hackathon our team decided to hack, which brings together some of our favorite open source AI modules into one powerful agent, to allow you to:
1. Bring your own model/API key (any provider and model supported by @opencode is supported by Openwork)
2. ~4x faster than Claude for Chrome/Cowork, and much more token-efficient, powered by dev-browser by @sawyerhood (legend)
3. More secure - contrary to Claude for Chrom/Cowork, does not leverage the main browser instance where you are logged into all services already. You login only to the services you need. This significantly reduces the risk of data loss in case of prompt injections, to which computer-use agents are highly exposed.
4. Free and 100% open-source!
You can download the DMG (macOS only for now) or fork the github repo via the link in bio (@openwork_ai).
Let us know what you think (or better, send a pull request)!
I spent last week at CES and was surprised how little I heard about GEO (Generative Engine Optimization).
With some notable exceptions - Ashye Marcus at @stripe stood out, and there were other great voices in the mix - however, GEO just wasn't part of the conversation yet. Plenty of exciting AI and commerce discussion, but the "how do we get discovered by AI agents" question hadn't really landed.
Then Google dropped UCP (Universal Commerce Protocol) at NRF yesterday, and I don't think people realize what just happened.
This is the first real playbook for GEO.
Everyone's been asking "how do we optimize for AI discovery?" while treating it like a future problem. Google just shipped the infrastructure layer. It's not theoretical anymore.
Here's what most coverage is missing:
The payment handler architecture is the quiet revolution.
UCP doesn't define payment methods centrally. Each payment provider publishes their own handler specification. Merchants declare which handlers they accept. AI agents pick one and follow its spec.
New payment methods can enter the ecosystem without protocol changes. Regional processors, industry-specific providers, emerging fintech - they publish handlers and become instantly accessible to any AI surface that supports UCP.
The implications for underserved merchant categories that have struggled with mainstream payment options? Significant.
Embedded checkout isn't a fallback - it's a feature.
The protocol explicitly supports transactions requiring human involvement. Age verification. Regulatory compliance. Complex customization.
This isn't AI replacing humans. It's AI and humans collaborating on transactions, with clean handoffs built into the infrastructure.
Discovery is the real unlock for GEO.
Merchants publish profiles at /.well-known/ucp declaring capabilities. AI agents autonomously discover what a business supports without pre-integration.
We went from "build custom integrations for each merchant" to "discover and negotiate dynamically."
This is the conversation that should be happening as we go into 2026. SEO had decades to mature. GEO is being defined right now, and the merchants who understand the infrastructure early will have a meaningful head start.
The major players co-developed this - Shopify, Target, Walmart, Etsy, Wayfair. Stripe, Visa, Mastercard, Adyen, PayPal endorsed it. The industry alignment is real.
But infrastructure shifts always raise the same question: will the winners be the incumbents who adapt, or the new platforms built for this reality from day one?
What's your take - is your commerce stack ready for AI agents to discover and transact with you?
https://t.co/WiEkAH2P1L
AI agents will be a big part of how we shop in the not-so-distant future.
To help lay the groundwork, we partnered with Shopify, Etsy, Wayfair, Target and Walmart to create the Universal Commerce Protocol, a new open standard for agents and systems to talk to each other across every step of the shopping journey.
And coming soon, UCP will power native checkout so you can buy directly on AI Mode and the @Geminiapp.