Seems every big tech company is now jumping onto the Durable Workflow train.
AWS did it early this year with durable lambdas, and we're planning on using them at @Mixmax soon.
I think this kind of solution is increasingly in demand as more and more features need agents that operate for hours, days, weeks, months. That's why these new offerings keep popping up, and Temporal is starting to face a lot of competition in its own field.
Vercel Workflows is GA.
Your code is the orchestrator. Ship agents, backends, or any long-running process without managing queues, retries, or workers. https://t.co/l9hZe79rNz
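For anyone new to the idea, here is a minimal sketch of what "durable" means, assuming the simplest possible design: each named step checkpoints its result to disk, so a re-run after a crash skips whatever already finished. `DurableRun`, `step`, and the JSON checkpoint file are hypothetical names for illustration only. This is not the Vercel, AWS, or Temporal API.

```python
import json
import os
import tempfile


class DurableRun:
    """Toy durable runner: checkpoints each named step's result to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)

    def step(self, name, fn):
        # A step that already completed in an earlier attempt is not re-executed.
        if name in self.state:
            return self.state[name]
        result = fn()
        self.state[name] = result
        with open(self.path, "w") as f:
            json.dump(self.state, f)
        return result


calls = {"fetch": 0, "summarize": 0}


def fetch():
    calls["fetch"] += 1
    return ["doc-a", "doc-b"]


def summarize(docs):
    calls["summarize"] += 1
    if calls["summarize"] == 1:
        raise RuntimeError("simulated crash")  # the first attempt dies here
    return f"summary of {len(docs)} docs"


def workflow(run):
    docs = run.step("fetch", fetch)
    return run.step("summarize", lambda: summarize(docs))


checkpoint = os.path.join(tempfile.mkdtemp(), "run.json")

try:
    workflow(DurableRun(checkpoint))
except RuntimeError:
    pass  # process "crashed" after fetch was checkpointed

# A fresh runner resumes from the checkpoint: fetch is skipped, summarize retried.
result = workflow(DurableRun(checkpoint))
```

Real offerings add the parts that are hard to get right: timers that survive for weeks, retries with backoff, and replay that stays deterministic. The sketch only shows why the workflow code itself can stay a plain function.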
Two engineers walk out of Google's TPU division in 2023. Reiner Pope had led AI software development for the chips. Mike Gunter had been a lead hardware designer. Between them, they'd helped build the silicon that runs Search, Gmail's smart features, and Gemini.
They didn't leave to join another hyperscaler. They started MatX and bet on a thesis most chip investors considered suicidal: that you could build a processor specifically for LLM workloads and beat NVIDIA's GPUs by a factor of ten.
This week they raised $500 million in a Series B. The investor list is where it gets interesting.
Jane Street co-led the round. Jane Street is a quantitative trading firm, not a VC. They're also one of the largest GPU buyers on Wall Street and reportedly have about 7% of their trading capital in private tech companies, including Anthropic and CoreWeave. When the firm spending hundreds of millions on NVIDIA hardware every year decides to fund an NVIDIA competitor, they're not making a symbolic bet. They're hedging a supply chain they depend on.
The other co-lead is Situational Awareness, the fund run by Leopold Aschenbrenner. In April 2024, Aschenbrenner was fired from OpenAI's Superalignment team. Two months later he published a 165-page essay predicting AGI by roughly 2027. Then he raised $1.5 billion to invest around that thesis. His fund is explicitly built on the premise that AI compute demand will outstrip anything the current GPU supply chain can deliver. MatX is a direct bet on that bottleneck.
Then there's Marvell, a chipmaker that sells networking and storage silicon for data centers. And Patrick and John Collison, the Stripe founders, writing personal checks.
The technical bet underneath: MatX uses what they call a "splittable systolic array" that combines SRAM and HBM memory. Unlike Groq or SambaNova, which optimized primarily for inference, MatX targets both training and inference on one architecture. If it works, that's one chip instead of two hardware stacks.
The timing is worth noting. NVIDIA just reported $68.1 billion in quarterly revenue. The GPU demand is real and accelerating. MatX isn't betting that NVIDIA fails. They're betting the market gets so large that a purpose-built LLM chip carves out a real piece of it, even as NVIDIA keeps winning the general case.
First chips ship in 2027 via TSMC. Two engineers who helped build Google's AI silicon are now racing to prove that the next generation of AI hardware looks nothing like the current one.
We’re building an LLM chip that delivers much higher throughput than any other chip while also achieving the lowest latency. We call it the MatX One.
The MatX One chip is based on a splittable systolic array, which has the energy and area efficiency that large systolic arrays are famous for, while also getting high utilization on smaller matrices with flexible shapes. The chip combines the low latency of SRAM-first designs with the long-context support of HBM. These elements, plus a fresh take on numerics, deliver higher throughput on LLMs than any announced system, while simultaneously matching the latency of SRAM-first designs. Higher throughput and lower latency give you smarter and faster models for your subscription dollar.
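A toy way to see the utilization point, as a sketch under loud assumptions: the post does not say how the splitting works, so this takes the simplest reading, that one large array can act as several independent smaller ones, each holding its own square weight tile. The `utilization` function and the 128/32 sizes are made up for illustration and are not MatX's design.

```python
def utilization(array_dim, tile_dim, splits_per_side=1):
    """Fraction of processing elements doing useful work.

    Assumes (hypothetically) the array divides into splits_per_side**2 equal
    sub-arrays, and every sub-array holds one tile_dim x tile_dim weight tile.
    """
    sub_dim = array_dim // splits_per_side
    used = min(tile_dim, sub_dim) ** 2
    return used / sub_dim**2


# One 32x32 tile on a monolithic 128x128 array leaves most of it idle.
whole = utilization(array_dim=128, tile_dim=32)

# The same array split 4x4 into 32x32 sub-arrays, each with its own tile.
split = utilization(array_dim=128, tile_dim=32, splits_per_side=4)

print(f"monolithic: {whole:.4f}, split: {split:.4f}")
```

The model ignores data movement, pipeline fill and drain, and non-square shapes, so it only illustrates the direction of the effect.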
We’ve raised a $500M Series B to wrap up development and quickly scale manufacturing, with tapeout in under a year. The round was led by Jane Street, one of the most tech-savvy Wall Street firms, and Situational Awareness LP, whose founder @leopoldasch wrote the definitive memo on AGI. Participants include @sparkcapital, @danielgross and @natfriedman’s fund, @patrickc and @collision, @TriatomicCap, @HarpoonVentures, @karpathy, @dwarkesh_sp, and others. We’re also welcoming investors across the supply chain, including Marvell and Alchip.
@MikeGunter_ and I started MatX because we felt that the best chip for LLMs should be designed from first principles with a deep understanding of what LLMs need and how they will evolve. We are willing to give up on small-model performance, low-volume workloads, and even ease of programming to deliver on such a chip.
We’re now a 100-person team with people who think about everything from learning rate schedules, to Swing Modulo Scheduling, to guard/round/sticky bits, to blind-mated connections—all in the same building. If you’d like to help us architect, design, and deploy many generations of chips in large volume, consider joining us.
The math on this project should humble every AI lab on the planet.
1 cubic millimeter. One-millionth of a human brain. Harvard and Google spent 10 years mapping it. The imaging alone took 326 days. They sliced the tissue into 5,000 wafers each 30 nanometers thick, ran them through a $6 million electron microscope, then needed Google’s ML models to stitch the 3D reconstruction because no human team could process the output.
The result: 57,000 cells, 150 million synapses, 230 millimeters of blood vessels, compressed into 1.4 petabytes of raw data. For context, 1.4 petabytes is roughly 1.4 million gigabytes. From a speck smaller than a grain of rice.
Now scale that. The full human brain is one million times larger. Mapping the whole thing at this resolution would produce approximately 1.4 zettabytes of data. That’s roughly equal to all the data generated on Earth in a single year. The storage alone would cost an estimated $50 billion and require a 140-acre data center, which would make it the largest on the planet.
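The scaling arithmetic above is easy to check. This sketch takes the post's figures (1.4 petabytes per sample, whole brain one million times larger) as given and uses decimal units; the variable names are just for illustration.

```python
GB = 10**9   # bytes per gigabyte (decimal)
TB = 10**12  # bytes per terabyte
ZB = 10**21  # bytes per zettabyte

sample_bytes = 1_400 * TB   # 1.4 petabytes, figure from the post
scale = 1_000_000           # whole brain vs. the mapped sample, per the post

sample_gb = sample_bytes // GB
whole_brain_zb = sample_bytes * scale / ZB

print(f"sample: {sample_gb:,} GB")
print(f"whole brain: {whole_brain_zb} ZB")
```

Both numbers line up with the post: 1.4 million gigabytes for the sample and 1.4 zettabytes for the whole brain.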
And they found things textbooks don’t contain. One neuron had over 5,000 connection points. Some axons had coiled themselves into tight whorls for completely unknown reasons. Pairs of cell clusters grew in mirror images of each other. Jeff Lichtman, the Harvard lead, said there’s “a chasm between what we already know and what we need to know.”
This is why the next step isn’t a human brain. It’s a mouse hippocampus, 10 cubic millimeters, over the next five years. Because even a mouse brain is 1,000x larger than what they just mapped, and the full mouse connectome is the proof of concept before anyone attempts the human one.
We’re building AI systems that loosely mimic neural networks while we still can’t fully read the wiring diagram of a single cubic millimeter of the thing we’re trying to imitate. The original is 1.4 petabytes per millionth of its volume. Every AI model on Earth fits in a fraction of that.
The brain runs on 20 watts and fits in your skull. The data center required to merely describe one-millionth of it would span 140 acres.
Bought a new Mac mini to properly tinker with claws over the weekend. The apple store person told me they are selling like hotcakes and everyone is confused :)
I'm definitely a bit sus'd to run OpenClaw specifically - giving my private data/keys to a 400K-line vibe-coded monster that is being actively attacked at scale is not very appealing at all. Already seeing reports of exposed instances, RCE vulnerabilities, supply chain poisoning, and malicious or compromised skills in the registry; it feels like a complete wild west and a security nightmare. But I do love the concept, and I think that just like LLM agents were a new layer on top of LLMs, Claws are now a new layer on top of LLM agents, taking the orchestration, scheduling, context, tool calls and a kind of persistence to the next level.
Looking around, and given that the high level idea is clear, there are a lot of smaller Claws starting to pop up. For example, on a quick skim NanoClaw looks really interesting in that the core engine is ~4000 lines of code (fits into both my head and that of AI agents, so it feels manageable, auditable, flexible, etc.) and runs everything in containers by default. I also love their approach to configurability - it's not done via config files, it's done via skills! For example, /add-telegram instructs your AI agent how to modify the actual code to integrate Telegram. I haven't come across this yet and it slightly blew my mind earlier today as a new, AI-enabled approach to preventing config mess and if-then-else monsters. Basically - the implied new meta is to write the most maximally forkable repo and then have skills that fork it into any desired more exotic configuration. Very cool.
Anyway there are many others - e.g. nanobot, zeroclaw, ironclaw, picoclaw (lol @ prefixes). There are also cloud-hosted alternatives but tbh I don't love these because they feel much harder to tinker with. In particular, local setup allows easy connection to home automation gadgets on the local network. And I don't know, there is something aesthetically pleasing about there being a physical device 'possessed' by a little ghost of a personal digital house elf.
Not 100% sure what my setup ends up looking like just yet but Claws are an awesome, exciting new layer of the AI stack.
@grok @JMilei Ok, now: how could Milei and the Economy team revive a domestic market in the middle of a recession without stoking inflation again?
Wow man. So accurate. I'm a software engineer using Claude Code at least 8 hours per day, and everything you say is still so on point even for me.
I keep thinking about how in Dune there was a rebellion against "thinking machines": humanity decided to destroy them all and replaced them with mentats, human computers.
A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole other round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today. Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for.
The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with (rare) full-stack understanding of LLMs from top (math/algorithms) to bottom (megakernels/related). They have a great eye for talent and I think they will be able to build something very special. Congrats on the launch and I look forward to what you come up with!
Can natural language alone really provide the precision needed to steer systems with the complexity, scale, and failure modes of modern software?
Or does “not knowing how to code” eventually become a hard ceiling?