LocalCan 3.0 beta.6 is out.
Your AI agent can now drive LocalCan over MCP:
→ "expose port 3000" and it hands back a Public URL
→ "why did this webhook fail?" and it reads the actual request your app received
→ pause, resume, or remove URLs when it's done
Works with Claude Code, Codex, Cursor, and any MCP host. Agents are read-only until you flip a switch, and secrets are redacted from what they see by default.
Also new: `localcan traffic` to read captured requests from the terminal, and one-click copy of any request as Markdown, cURL, HAR, or JSON from the inspector.
there's a risk we'll get used to slop as a society, i don't want that. i want a world of creativity, nuance, greatness. won't stop til we make that happen!
thanks for an epic chat @julianweisser @solofounders x @taste_ai_
New blog post on harness optimization. We hit Sonnet 4.6 performance with a 7x cost improvement.
Fable 5 was the first frontier model release to be evaluated on legal tasks. It scored only 13%, its worst result across all the benchmarks evaluated.
@Harvey released this benchmark, called Legal Agent Benchmark (LAB), just a month prior. It contains a set of realistic legal matters. Each task gives the agent a closed workspace of documents (contracts, emails, spreadsheets, slide decks) and asks for a concrete deliverable: a diligence memo, an issue list, a redline, a draft. An LLM judge grades the deliverable against a long rubric that averages 61 distinct binary criteria per task.
Many frontier models, such as Gemini 3.1 Pro, don't get above a 0% all-pass rate (the share of tasks where every rubric criterion passes). With automatic harness optimization, we pushed DeepSeek V4 Pro from 0% to 5% all-pass rate, reaching parity with Sonnet 4.6 at 1/7 of the price.
Read the blog post for the details: https://t.co/kBrWrQkgJW
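To make the all-pass metric concrete, here is a minimal Python sketch, assuming each task's judge output is simply a list of pass/fail booleans. The function names, task names, and verdicts are made up for illustration and are not from LAB or the blog post.

```python
from typing import Dict, List


def all_pass_rate(results: Dict[str, List[bool]]) -> float:
    """Fraction of tasks where every rubric criterion passed."""
    if not results:
        return 0.0
    passed = sum(1 for criteria in results.values() if criteria and all(criteria))
    return passed / len(results)


def criterion_pass_rate(results: Dict[str, List[bool]]) -> float:
    """Fraction of individual criteria passed, pooled across all tasks."""
    total = sum(len(criteria) for criteria in results.values())
    if total == 0:
        return 0.0
    return sum(sum(criteria) for criteria in results.values()) / total


# Hypothetical judge verdicts (real rubrics are far longer).
example = {
    "diligence_memo": [True, True, True],
    "issue_list": [True, False, True],
    "redline": [True, True, False],
    "draft": [False, True, True],
}

print(all_pass_rate(example), criterion_pass_rate(example))
```

In this made-up example, three of the four tasks miss a single criterion each, so the pooled criterion pass rate is 75% while the all-pass rate is only 25%. With dozens of criteria per task, one miss is enough to zero out a task.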
We just added Astro.js as a new framework filter.
Go explore what's being built with it! 👀
Every website in Save Design is carefully analyzed before it reaches the feed.
Frameworks. Fonts. Animation libraries. And much more.
Found a website you love?
Run it through our Analyzer and instantly discover the tech behind it.
Save design inspiration in a new, better way.
Be inspired! 💫
🔗 https://t.co/fK0j44F0Vn
Most agents are stateless prototypes that hallucinate and lose context. To move into production, they need persistent memory.
Watch it now if you missed it, and learn how to build an "Enterprise Knowledge Layer" that grounds intelligent systems.
@lyonwj covers:
The 3 Pillars of Memory
Context Graph
Domain-Specific Ontologies
https://t.co/fkRA0N1jiX
Simon Eskildsen @Sirupsen w/ @GergelyOrosz at @aiDotEngineer World's Fair
"I hate benchmarks...this should take 10ms in napkin math"
Napkin Math = raw material costs
> bandwidth to DRAM / NVME / EBS volume
> roundtrip to s3 time + cost
> 1GB of memory, S3, disk
> spot vs contract
starting with the raw material costs allowed @turbopuffer to create a 10X cheaper vector store
this is the same first-principles approach @elonmusk uses at @SpaceX
"what is a rocket actually made of?"
> aluminum alloys
> titanium
> copper
> carbon fiber
> fuel
raw materials were only 2% of the selling price of a rocket, the rest was markup + inefficiency
falcon 9 was 10X cheaper than traditional rockets
start with first principles, challenge everything, and build from the bottom up
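A rough sketch of the napkin math above, in Python: estimate a floor from bandwidth and roundtrip time, then compare it with what you actually measure. Every number here is a placeholder assumption, not a figure from the talk or from any vendor, and the names (`Medium`, `estimate_ms`, `gap_factor`) are hypothetical. Plug in your own measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Medium:
    name: str
    bandwidth_gb_per_s: float  # sustained read bandwidth (assumed value)
    roundtrip_ms: float        # latency per request (assumed value)


def estimate_ms(medium: Medium, gigabytes: float, roundtrips: int = 1) -> float:
    """Napkin estimate: request latency plus time to move the bytes."""
    transfer_ms = gigabytes / medium.bandwidth_gb_per_s * 1000.0
    return roundtrips * medium.roundtrip_ms + transfer_ms


def gap_factor(measured_ms: float, estimated_ms: float) -> float:
    """How many times slower the real system is than the napkin floor."""
    return measured_ms / estimated_ms


# Placeholder figures for illustration only.
dram = Medium("dram", bandwidth_gb_per_s=10.0, roundtrip_ms=0.0)
object_store = Medium("object_store", bandwidth_gb_per_s=1.0, roundtrip_ms=50.0)

floor = estimate_ms(dram, gigabytes=1.0, roundtrips=0)
print(f"napkin floor: {floor:.0f} ms, gap at 500 ms measured: {gap_factor(500.0, floor):.1f}x")
```

A large gap factor is the signal to go looking for the inefficiency; a small one means you are already close to what the raw materials allow.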
Booster Robotics has launched Booster Studio, the first IDE built specifically for embodied AI.
> Booster Studio features code editing, high-precision simulation, real-robot debugging, and real-world deployment.
> It shortens the path from the first idea to the working hardware, with the build, test, and deployment loop living in a single environment rather than scattered across separate tools.
Vibe Robotics 👀
Fuser Apps is live on @ProductHunt today 🚀
If you're an artist, designer, or creator, this might be your new favorite way to build.
App generations are free for the next month.
Drop an upvote ↓
🚨 Top mathematicians just issued a clear warning about AI: Don't believe the hype.
Over 2,300 mathematicians, including Fields Medal winners Terence Tao and Peter Scholze, have signed the Leiden Declaration on Artificial Intelligence and Mathematics. Endorsed by the International Mathematical Union, it is the most significant collective response from a major academic discipline evaluating frontier AI impact.
The core message is straightforward: current AI tools have real constraints when applied to complex work, and commercial incentives are pushing claims beyond what the technology can reliably deliver.
Read the full declaration here: https://t.co/hKSXoSt4Tr
Why this matters beyond mathematics
The declaration identifies five threats that apply to any field deploying AI:
1) Plausible but unreliable outputs.
AI produces arguments that "look" correct but contain subtle errors. In high-stakes work, human verification is critical and costly.
2) Attribution collapse.
Models trained on published work don't properly cite sources. Training data was often obtained by exploiting licenses or violating copyright protections.
3) Distorted incentives.
AI use becomes incentivized for its own sake, warping hiring, funding and recognition.
4) Press release science.
Results announced "on market timelines" before community evaluation can take place. Commercial incentives drive firms to "overstate the capabilities of their products."
5) Loss of autonomy.
Research priorities shift toward what is automatable rather than what is significant.
The leap: chatbots → agentic AI → software → research
We have moved from chatbots to agentic systems. Now AI is solving 80-year-old mathematical conjectures. The declaration is not about toy problems. It is about frontier systems being deployed in contexts where correctness matters.
What this means for your industry
The same risks apply wherever AI is used in high-stakes work: law, medicine, finance, engineering. The declaration's core insight is simple:
AI generates narrative, not truth. Verification cannot be automated away. Human accountability is non-negotiable.
💼 I’ve written a more detailed breakdown of how these risks show up in practice and what organisations are doing about them. It’s available for subscribers.
☕️ What have you observed in your industry? Have verification problems or hidden costs already shown up in your AI deployments?
Today, we are releasing Rampart: a 14.7MB machine learning model designed to protect citizens’ privacy by redacting personal information directly in your browser before it gets sent to any server.
The human brain is strikingly modular, with distinct networks for language, formal reasoning, social reasoning, and physical reasoning. Is this a fundamental principle of how intelligent systems are built, or an accident of biological evolution?
In our latest preprint, we find that a similar modular organization emerges in Large Language Models, another class of intelligent system.
Brains and LLMs are shaped by entirely different kinds of optimization (biological evolution vs. gradient descent). That they arrive at the same modular design anyway suggests modularity may be a fundamental property of intelligent systems.
🌐 Web: https://t.co/ZKrnTSSuSf
📄 Paper: https://t.co/ZibBXz3PUy
💻 Code & data: https://t.co/uBo5iOYNjy
Using circuit analyses across 46 tasks spanning four cognitive domains, we find:
1️⃣ Tasks that draw on the same network in humans recruit overlapping units in LLMs, while tasks drawing on different networks recruit distinct units.
2️⃣ These units are causally linked to model behavior. Ablating the units critical for one domain impairs performance in that domain (−26% accuracy) but barely touches the others (−2.5%).
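One way to read finding 2 is as a selectivity check: compare the accuracy change in the ablated domain with the mean change in the other domains. A minimal Python sketch, assuming accuracies are stored as percentages per domain and changes are measured in percentage points. The domain keys, the function name, and all numbers are invented for illustration; they are not the paper's data or code.

```python
from typing import Dict, Tuple


def ablation_effects(
    baseline: Dict[str, float],
    ablated: Dict[str, float],
    target: str,
) -> Tuple[float, float]:
    """Return (in-domain change, mean off-domain change) in percentage points."""
    in_domain = ablated[target] - baseline[target]
    others = [ablated[d] - baseline[d] for d in baseline if d != target]
    off_domain = sum(others) / len(others)
    return in_domain, off_domain


# Invented accuracies (percent) before and after ablating "language" units.
baseline = {"language": 90, "formal": 90, "social": 90, "physical": 90}
ablated = {"language": 60, "formal": 88, "social": 89, "physical": 87}

in_drop, off_drop = ablation_effects(baseline, ablated, target="language")
print(f"in-domain: {in_drop:+.1f} pts, off-domain mean: {off_drop:+.1f} pts")
```

A large in-domain drop next to a small off-domain drop is the pattern that points to domain-specific units.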
This project has been in the works for a while :) Huge thanks to my advisors @jacobandreas @ev_fedorenko @devarda_a, and to @Nancy_Kanwisher for valuable conceptual input and feedback throughout. #MIT
James Gosling's vision for Java was to create a blue collar language. Find out why and how.
The Story of Java is coming to YouTube on July 17th. Join us and Gosling in the live chat.
Announcing the first production robot navigation framework on $500 hardware
Explore the world once → your robot agent will relocalize and build a persistent spatial memory across sessions
SLAM, relocalization, loop closure, map i/o, planning, control
No ROS. Open source.