building AI & robots, created consumer/enterprise products for 10+ million users; built data and AI teams at 2 Fortune 500 companies; health startup; biology PhD
Logical reasoning is one of the key capabilities AI should have before it can do useful work for us reliably. In Feb I benchmarked the top-line AI models on the Zebra Logic benchmark, and o3-mini-high was the only model that could solve the 6*6 puzzles. 3 months later, a lot of new models have been released. How do they fare against the Zebra Logic benchmark?
I have tested o3, o4-mini, Gemini-2.5-Pro and Qwen 3 235B (released yesterday), and their results are ...
@OpenAI o3-mini-high is the only model that can solve the 6*6 Einstein's Riddle, aka Zebra Logic Puzzle, so far! Not DeepSeek R1, not Google Gemini 2.0 Flash Thinking Experimental 01-21
a standing ovation for daraxonrasib at asco. over 40k oncologists, entrepreneurs, investors, and patient advocates together celebrating revmed's breakthru in the fight against pancreatic cancer. u never forget these moments. it's what innovation is all about.
Unpopular opinion with Tesla fans:
I expect @Tesla_Optimus to slip into next year and maybe even 2028. Maybe even further. And I am talking about the big version 3 unveil that Elon was hyping up over the last year.
Tesla execs watched Apple and Google growing up.
Steve Jobs’ magic? Wait for everyone to make their plays. Think they have the market sewn up.
Then fix all the problems and ship a better product.
Jobs killed the phone project several times because it wasn’t good enough.
I have talked to enough people who are Tesla fans that I know expectations are super high. If Elon shows a robot that isn’t generalized, a lot will be lost.
Ask AltaVista or Nokia how that worked out. Or MySpace. Being first doesn’t really matter. Kodak invented the digital camera.
Truth is the pieces aren’t all there yet to make a great product, either inside, or outside Tesla. Even the Chinese haven’t figured it out.
The world models aren’t ready yet.
I won’t be shocked if @elonmusk keeps delaying it until true generalized world models are ready. Even the most aggressive I have talked to say that is at least 18 months away.
Many in robotics say five years.
The next time we see a new Optimus it will set in place beliefs that will last a decade.
Wait until it is stunning, I say.
But the factories are being built so pressure is on.
We will see how much discipline Tesla has by its choices.
1/5
I'm a cardiologist. I have spent twenty years watching cholesterol destroy arteries, trigger heart attacks, and kill people I care about.
Today, Eli Lilly presented data that may begin to end that era.
VERVE-102. A single infusion. One dose. It uses base editing to permanently turn off the PCSK9 gene in your liver.
Presented today at the European Atherosclerosis Society Congress:
88% reduction in PCSK9.
62% reduction in LDL cholesterol.
Sustained up to 18 months.
No treatment-related serious adverse events.
One infusion. Not daily pills you forget to take. Not monthly injections. One dose — and your cholesterol may stay low for the rest of your life.
A startup idea that only works if there are already a significant number of people using it is not a valid startup idea. There has to be some subset of users who need what you're making so desperately that they'll use it even if no one else is.
Everyone building AI agents is focusing on building the prefrontal cortex. Planning. Reasoning. Multi-step chains. There's value here. CEO-stuff.
But also, a reframe: there is value in building the cerebellum. It's offloading boring tasks into reflex so the complex thought can focus.
Your mortgage gets paid by a standing order, not a committee. The things that are not fun, not interesting, but have to be done? Done. Most agent frameworks will fail because they treat all cognition as high cognition.
The winners will nail the boring stuff first.
It's not a popular opinion here, but politics tells you almost nothing about a person. I've known racist hard-rights who are pillars of their community, soft leftists who are family-destroying monsters, and everything in-between.
'Politics' can be a fig leaf. Actions matter.
We need heat shields to protect us, since we use the air to slow us down as we return to Earth.
From orbital speed, it gets to 1650°C / 3000°F. From the Moon: 2750°C / 5000°F.
For yesterday's Starship suborbital test flight, peak was 1450°C / 2600°F. Great to see the @SpaceX progress over the last 3 flights. Making them truly reusable is complex and necessary for permanent, cheap space access.
image compilation: @niccruzpatane
This is by far one of the best examples of how a droid would navigate a battle in real life.
Every shot is on target, there is no wasted motion, and the only limiting factors on its kill potential are the firing rate of its blasters and how much ammunition it has. It does not run or move unnecessarily quickly; it is just measured and precise.
Some inspired Ukrainians built an interactive website showing how long it takes for the world's slowest creatures to reach Kupiansk in contrast with the Russian army. Complete with David Attenborough quotes and set to "Kalinka." https://t.co/kFcqGZ6weo