This is great to see. The scientific community & the general public should be in the habit of celebrating amazing scientific achievements with standing ovations 🎉
@levie Agree, tokenomics will dominate staff meetings. A few things might help: 1) It’s important to own your intelligence and not rent it 2) use two (or so) tiers of inference (frontier models for 10% of tokens and open models for the rest) 3) leverage AI routers for governance and control
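A rough sketch of what the two-tier idea could look like in a router, purely as an illustration. The `TierRouter` class, the `hard` flag, and the rule of capping the frontier tier at about 10% of cumulative tokens are all assumptions made for this example, not something the post specifies.

```python
from dataclasses import dataclass


@dataclass
class TierRouter:
    """Hypothetical router: send hard requests to a frontier model while
    keeping the frontier tier's share of cumulative tokens under a cap."""
    frontier_share: float = 0.10  # assumed target: ~10% of tokens on frontier
    frontier_tokens: int = 0
    open_tokens: int = 0

    def route(self, tokens: int, hard: bool) -> str:
        # Totals as they would be if this request went to the frontier tier.
        total = self.frontier_tokens + self.open_tokens + tokens
        within_budget = (self.frontier_tokens + tokens) <= self.frontier_share * total
        if hard and within_budget:
            self.frontier_tokens += tokens
            return "frontier"
        # Everything else (easy requests, or hard ones over budget) goes to open models.
        self.open_tokens += tokens
        return "open"


router = TierRouter()
print(router.route(950, hard=False))  # easy request
print(router.route(50, hard=True))    # hard request, budget available
print(router.route(100, hard=True))   # hard request, budget exhausted
```

A real router would also need a way to decide what counts as "hard", which is the difficult part and is left out here.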
Billionaire Michael Milken joked “if a US company replaces the US-born CEO with a CEO born in India, I buy the stock”
But he reveals he hasn’t backtested the idea.
So we did.
In the last 15yrs, that would’ve 50x’d your money: 7.5x more $$ and >2x IRR vs S&P500: 30% vs 14%!
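A quick sanity check on how those figures relate, as a sketch only. It assumes a single buy-and-hold investment with no intermediate cash flows (so IRR reduces to a compound annual growth rate) and exactly 15 years; the function name is illustrative.

```python
def annualized_return(multiple: float, years: float) -> float:
    """Compound annual growth rate implied by a total return multiple."""
    return multiple ** (1 / years) - 1


years = 15
strategy_multiple = 50.0                   # "50x'd your money"
index_multiple = strategy_multiple / 7.5   # "7.5x more $$" than the index

print(f"strategy: {annualized_return(strategy_multiple, years):.1%}")
print(f"index:    {annualized_return(index_multiple, years):.1%}")
```

Under these assumptions a 50x multiple works out to just under 30% a year, and the implied index multiple of about 6.7x works out to roughly 13.5% a year, consistent with the rounded 30% vs 14%.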
The thesis is simple: the future belongs to individuals who build compounding AI systems, not to individuals who use corporate-owned centralized AI tools.
I'm trying to build these in open source so you can have them for free. That's what GBrain is.
An update regarding the future at @Cloudflare. I’ve shared my full message to the team and details on the support we're providing those departing here: https://t.co/8djT55aVSP
Coding agents are accelerating different types of software work to different degrees. When we architect teams, understanding these distinctions helps us to have realistic expectations. Listing functions from most accelerated to least, my order is: frontend development, backend, infrastructure, and research.
Frontend development — say, building a web page to serve descriptions of products for an ecommerce site — is dramatically sped up because coding agents are fluent in popular frontend languages like TypeScript and JavaScript and frameworks like React and Angular. Additionally, by examining what they have built by operating a web browser, coding agents are now very good at closing the loop and iterating on their own implementations. Granted, LLMs today are still weak at visual design, but given a design (or if a polished design isn’t important), the implementation is fast!
Backend development — say, building APIs to respond to queries requesting product data — is harder. It takes more work by human developers to steer modern models to think through corner cases that might lead to subtle bugs or security flaws. Further, a backend bug can lead to non-intuitive downstream effects like a corrupted database that occasionally returns incorrect results, which can be harder to debug than a typical frontend bug. Finally, although database migrations can be easier with coding agents, they’re still hard and need to be handled carefully to prevent data loss. While backend development is much faster with coding agents, they accelerate it less, and skilled developers still design and implement far better backends than inexperienced ones who use coding agents.
Infrastructure. Agents are even less effective in tasks like scaling an ecommerce site to 10K active users while maintaining 99.99% reliability. LLMs' knowledge is still relatively limited with respect to infrastructure and the complex tradeoffs good engineers must make, so I rarely trust them for critical infra decisions. Building good infrastructure often requires a period of testing and experimentation, and coding agents can help with that, but ultimately that’s a significant bottleneck where fast AI coding does not help much. Lastly, finding infrastructure bugs, such as a subtle network misconfiguration, can be incredibly difficult and requires deep engineering expertise. Thus, I’ve found that coding agents accelerate critical infrastructure even less than backend development.
Research. Coding agents accelerate research work even less. Research involves thinking through new ideas, formulating hypotheses, running experiments, interpreting them to potentially modify the hypotheses, and iterating until we reach conclusions. Coding agents can speed up the pace at which we can write research code. (I also use coding agents to help me orchestrate and keep track of experiments, which makes it easier for a single researcher to manage more experiments.) But there is a lot of work in research other than coding, and today’s agents help with research only marginally.
Categorizing software work into frontend, backend, infra, and research is an extreme simplification, but having a simple mental model for how much different tasks have sped up has been useful for how I organize software teams. For example, I now ask frontend teams to implement products dramatically faster than a year ago, but my expectations for research teams have not shifted nearly as much.
I am fascinated by how to organize software teams to use coding agents to achieve speed, and will keep sharing my findings in future posts.
[Original text: https://t.co/rnnVWqebVe ]
@AndrewYNg Hi @AndrewYNg great writeup. I feel we need a best practice guide for operating these coding agents. key issues 1) intent formalization 2) verify more often based on how low in the stack? Seems like a good PhD topic in PLDI+AI
Hello. How are you? Thank you. I love you. Please.
Some of the most frequently translated phrases of the past 20 years!
Google Translate began twenty years ago with a mission to help people understand one another, regardless of the language they speak. What started as a small experiment has become a global tool that helps over 1 billion users every month.
In that time Translate has evolved from simple pattern matching to true understanding. In 2006, it relied on statistical machine learning to look for patterns in small word clusters. By 2016, we pioneered a shift to neural networks to move beyond literal word-for-word translations, and today we’re using our powerful Gemini models to make Translate even more helpful.
We are moving from text to fluid, real-time conversations. With our latest models, you can even use your headphones as a personal interpreter that preserves your original tone and cadence - it’s an amazing experience!
One of the interesting things about AI is that as we make progress, we begin to take it for granted. If you met a person who could translate across a hundred languages faster than any human can, you would be so impressed. Today, one product does that for nearly 250 languages, and we kind of just shrug.
Being able to say thank you in 250 languages is not something I take for granted. So to the 1 billion who use Google Translate - merci, dhanyavaad, arigatō, gracias, and thank you! Let’s see what the next 20 years will bring.
If you want to make sense of where the AI inference stack is heading, look at three releases in the last few days.
GLM-5.1 - open weights, 744 billion parameters. Beats the king (aka Claude Opus 4.6) on several coding tasks.
Gemma 4 - open weights, runs on a single GPU or on-device. The 31B model ranks #3 among all open models globally, while smaller variants run natively on consumer hardware.
Mythos - closed, restricted, and a dramatic change in performance from everything currently available. Priced 10x more than its open counterparts.
Three releases. Three completely different bets on what AI should be.
What we are likely going to see is bifurcation across every dimension. Smaller models will keep shrinking while getting smarter. Open-weight models will keep scaling toward the frontier. Closed frontier models will keep breaking the ceiling and becoming more restrictive.
In many ways, this will be similar to the computing infra that has led to this moment. Except much faster and roughly at the same time.
Mobile, cloud, and PCs didn't kill each other. Each expanded the niche it occupied and created new markets. The model landscape is heading in the same direction with extreme diversity instead of consolidation. In retrospect, we will find the whole current discussion of open vs closed vs on-device to be as stupid as debating mobile vs cloud vs PCs.
Mobile banks on connectivity, web and individual empowerment.
PC users are more prosumers and creatives doing professional work and creating value from the computing infra.
Cloud quietly forms the backbone on top of which modern society runs and comes with more capability and security.
I assume a similar dynamic will happen across these different classes of models.
This is good for the ecosystem and opens a slew of opportunities for companies building on top of it. By combining them in different ways, you end up with a much bigger combinatorial space of systems. The systems that win won't be the ones that bet on open or closed, big or small. They'll be the ones that can move fluidly across the frontier as it shifts.
I burned 84 million tokens on February 28th. Researching companies, drafting memos, running agents.
That’s running Kimi K2.5, a serverless model via API. At Claude or OpenAI rates — roughly $9 per million tokens blended — equivalent usage would cost $756 for a single day’s work. My peak days hit 80 million tokens. My average days run 20 million. Cloud inference at frontier-model pricing adds up fast.
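The arithmetic behind that figure, as a minimal sketch. The $9 per million blended rate is the rough estimate quoted above, and the function name is illustrative.

```python
def daily_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of a day's token usage at a flat blended per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million


BLENDED_RATE = 9.0  # USD per million tokens, the rough blended figure from the post

print(daily_cost_usd(84_000_000, BLENDED_RATE))  # the February 28th usage
print(daily_cost_usd(20_000_000, BLENDED_RATE))  # an average day
```

At the same rate, an average 20-million-token day comes to $180, so a month of average days would run into the thousands of dollars.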
we're making @blocks smaller today. here's my note to the company.
####
today we're making one of the hardest decisions in the history of our company: we're reducing our organization by nearly half, from over 10,000 people to just under 6,000. that means over 4,000 of you are being asked to leave or entering into consultation. i'll be straight about what's happening, why, and what it means for everyone.
first off, if you're one of the people affected, you'll receive your salary for 20 weeks + 1 week per year of tenure, equity vested through the end of may, 6 months of health care, your corporate devices, and $5,000 to put toward whatever you need to help you in this transition (if you’re outside the U.S. you’ll receive similar support but exact details are going to vary based on local requirements). i want you to know that before anything else. everyone will be notified today, whether you're being asked to leave, entering consultation, or asked to stay.
we're not making this decision because we're in trouble. our business is strong. gross profit continues to grow, we continue to serve more and more customers, and profitability is improving. but something has changed. we're already seeing that the intelligence tools we’re creating and using, paired with smaller and flatter teams, are enabling a new way of working which fundamentally changes what it means to build and run a company. and that's accelerating rapidly.
i had two options: cut gradually over months or years as this shift plays out, or be honest about where we are and act on it now. i chose the latter. repeated rounds of cuts are destructive to morale, to focus, and to the trust that customers and shareholders place in our ability to lead. i'd rather take a hard, clear action now and build from a position we believe in than manage a slow reduction of people toward the same outcome. a smaller company also gives us the space to grow our business the right way, on our own terms, instead of constantly reacting to market pressures.
a decision at this scale carries risk. but so does standing still. we've done a full review to determine the roles and people we require to reliably grow the business from here, and we've pressure-tested those decisions from multiple angles. i accept that we may have gotten some of them wrong, and we've built in flexibility to account for that, and do the right thing for our customers.
we're not going to just disappear people from slack and email and pretend they were never here. communication channels will stay open through thursday evening (pacific) so everyone can say goodbye properly, and share whatever you wish. i'll also be hosting a live video session to thank everyone at 3:35pm pacific. i know doing it this way might feel awkward. i'd rather it feel awkward and human than efficient and cold.
to those of you leaving…i’m grateful for you, and i’m sorry to put you through this. you built what this company is today. that's a fact that i'll honor forever. this decision is not a reflection of what you contributed. you will be a great contributor to any organization going forward.
to those staying…i made this decision, and i'll own it. what i'm asking of you is to build with me. we're going to build this company with intelligence at the core of everything we do. how we work, how we create, how we serve our customers. our customers will feel this shift too, and we're going to help them navigate it: towards a future where they can build their own features directly, composed of our capabilities and served through our interfaces. that's what i'm focused on now. expect a note from me tomorrow.
jack
Bought a new Mac mini to properly tinker with claws over the weekend. The apple store person told me they are selling like hotcakes and everyone is confused :)
I'm definitely a bit sus'd to run OpenClaw specifically - giving my private data/keys to 400K lines of vibe coded monster that is being actively attacked at scale is not very appealing at all. Already seeing reports of exposed instances, RCE vulnerabilities, supply chain poisoning, malicious or compromised skills in the registry, it feels like a complete wild west and a security nightmare. But I do love the concept and I think that just like LLM agents were a new layer on top of LLMs, Claws are now a new layer on top of LLM agents, taking the orchestration, scheduling, context, tool calls and a kind of persistence to a next level.
Looking around, and given that the high level idea is clear, there are a lot of smaller Claws starting to pop out. For example, on a quick skim NanoClaw looks really interesting in that the core engine is ~4000 lines of code (fits into both my head and that of AI agents, so it feels manageable, auditable, flexible, etc.) and runs everything in containers by default. I also love their approach to configurability - it's not done via config files it's done via skills! For example, /add-telegram instructs your AI agent how to modify the actual code to integrate Telegram. I haven't come across this yet and it slightly blew my mind earlier today as a new, AI-enabled approach to preventing config mess and if-then-else monsters. Basically - the implied new meta is to write the most maximally forkable repo and then have skills that fork it into any desired more exotic configuration. Very cool.
Anyway there are many others - e.g. nanobot, zeroclaw, ironclaw, picoclaw (lol @ prefixes). There are also cloud-hosted alternatives but tbh I don't love these because it feels much harder to tinker with. In particular, local setup allows easy connection to home automation gadgets on the local network. And I don't know, there is something aesthetically pleasing about there being a physical device 'possessed' by a little ghost of a personal digital house elf.
Not 100% sure what my setup ends up looking like just yet but Claws are an awesome, exciting new layer of the AI stack.
Helpful update for students, you can now take full practice SATs for free in the @GeminiApp.
It uses vetted content from @ThePrincetonRev and gives you feedback straight away. Starting with the SAT today, but more tests are on the way!
I'm Boris and I created Claude Code. Lots of people have asked how I use Claude Code, so I wanted to show off my setup a bit.
My setup might be surprisingly vanilla! Claude Code works great out of the box, so I personally don't customize it much. There is no one correct way to use Claude Code: we intentionally build it in a way that you can use it, customize it, and hack it however you like. Each person on the Claude Code team uses it very differently.
So, here goes.
I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue. There's a new programmable layer of abstraction to master (in addition to the usual layers below) involving agents, subagents, their prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations, and a need to build an all-encompassing mental model for strengths and pitfalls of fundamentally stochastic, fallible, unintelligible and changing entities suddenly intermingled with what used to be good old fashioned engineering. Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession. Roll up your sleeves to not fall behind.