Every time ASI has to construct a low-dimensional UI to bridge communication... "This is a manifestation of the Continuum that we hope falls within your level of comprehension."
1/7
recently i started exploring mechinterp, and i started with audio diffusion.
i spent some time cracking open stable audio 3 to see if mood has a zip code in its residual stream.
it does! it's layer 11 and you can dial it. (audio results below) 🧵
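A minimal sketch of what "dialing" a direction in a residual stream can look like, assuming a difference-of-means steering vector. The thread does not say which method was used, so the technique, the function names, and the toy 3-dimensional activations below are all hypothetical and not the author's code or any Stable Audio API.

```python
# Hypothetical sketch: difference-of-means activation steering on toy vectors.
# Real residual-stream activations would be captured from the model at one layer.

def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def steering_direction(pos_acts, neg_acts):
    """Direction pointing from the 'negative' mean toward the 'positive' mean."""
    pos, neg = mean_vector(pos_acts), mean_vector(neg_acts)
    return [p - q for p, q in zip(pos, neg)]

def steer(activation, direction, alpha):
    """Add alpha * direction to one activation vector (the 'dial')."""
    return [a + alpha * d for a, d in zip(activation, direction)]

# Toy activations standing in for two contrasting sets of prompts.
happy = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
sad = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

direction = steering_direction(happy, sad)
steered = steer([0.5, 0.5, 0.5], direction, alpha=0.5)
```

Under these assumptions, turning `alpha` up or down (or negative) is the dial; in a real model the addition would typically happen in a forward hook at the chosen layer.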
Local AI Assistant with TTS and wake word detection - say “look at this” to take a picture and ask about it, and “question” to simply ask a follow-up question
Using Whisper for speech transcription, Kokoro for TTS (both via @huggingface’s Transformers.js), and @EdgeImpulse to create the wake word model
Works with compatible smart glasses, in this case @omidotme’s Omi Glass running our custom firmware (based on the @seeedstudio XIAO ESP32-S3)
LocalVQE v1.1 is out: A tiny ~1M parameter model that does echo suppression and noise cancellation in realtime. v1.1 comes with big improvements to audio quality over v1.
NetHack is one of the most complex and longest-lived open source programs ever written, and after 46 years, v5.0 shipped today.
https://t.co/ICEyakS6T5
And ... it is a VERY cool large codebase to work with in the LLM era.
For years, voice AI has been stuck in a rigid loop: think, then speak. But real human conversation is messy, overlapping, and asynchronous.
In our new #ICASSP2026 work, we built a tandem architecture that shifts the paradigm to “speak while thinking.” A fast speech model starts replying instantly, while a backend LLM runs in parallel to inject deep knowledge on the fly.
It’s a completely different way to approach conversational AI, making it feel remarkably more alive.
Blog: https://t.co/qZ4cEKUUIm 🐢
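A rough sketch of the "speak while thinking" pattern described above, assuming nothing about the paper's actual implementation. It only illustrates the concurrency shape: a fast front end starts emitting speech immediately while a slower backend runs in parallel, and the backend's answer is injected once it is ready. The names `slow_backend` and `speak_while_thinking`, the filler phrases, and the sleep durations are all hypothetical.

```python
import asyncio

async def slow_backend(question: str) -> str:
    """Stand-in for a backend LLM call; the delay is an arbitrary placeholder."""
    await asyncio.sleep(0.05)
    return f"[backend knowledge about {question!r}]"

async def speak_while_thinking(question: str) -> list[str]:
    """Emit fast filler speech while the backend runs concurrently."""
    spoken: list[str] = []
    # Start the slow call first so it overlaps with the fast speech.
    backend = asyncio.create_task(slow_backend(question))
    for chunk in ["Good question.", "Let me think about that.", "So,"]:
        spoken.append(chunk)       # the fast model "speaks" right away
        await asyncio.sleep(0.01)  # simulated time spent speaking
        if backend.done():         # stop filling once knowledge is ready
            break
    spoken.append(await backend)   # inject the backend result on the fly
    return spoken

result = asyncio.run(speak_while_thinking("tides"))
```

In this toy version the first words are produced before the backend finishes, which is the point of the pattern; a real system would stream audio and condition the speech model on the injected text rather than append a string.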
Announcing Talkie: a new, open-weight historical LLM! We trained and finetuned a 13B model on a newly-curated dataset of only pre-1930 data. Try it below!
with @AlecRad and @status_effects 🧵
New Anthropic research: Project Deal.
We created a marketplace for employees in our San Francisco office, with one big twist. We tasked Claude with buying, selling and negotiating on our colleagues’ behalf.
Overheating electronics could soon be a thing of the past. Researchers have identified theta-phase tantalum nitride, a material with the highest thermal conductivity ever measured among metals, showing promise for microelectronics, AI hardware, and more - https://t.co/shh37mQpQu
Ternary Bonsai: state-of-the-art intelligence at 1.58 bits. The models are so small they can even run locally in your browser on WebGPU! ⚡️
Here's the 8B version (just ~2GB in size) running at 60 tokens per second on my M4 Max.
Try the demo out yourself! 👇
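A quick back-of-the-envelope check on the sizes quoted above, as a sketch rather than a description of how these models are actually packed. A ternary weight has three possible values, so it carries log2(3) ≈ 1.58 bits of information. The helper `packed_size_gb` is hypothetical, and the calculation ignores embeddings, scale factors, and any weights kept at higher precision, which may account for the gap between the ideal figure and the ~2GB quoted.

```python
import math

def packed_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Ideal storage in gigabytes (1e9 bytes) for n_params weights."""
    return n_params * bits_per_weight / 8 / 1e9

ternary_bits = math.log2(3)                     # information in one ternary weight
ternary_gb = packed_size_gb(8e9, ternary_bits)  # 8B parameters, ideal packing
fp16_gb = packed_size_gb(8e9, 16)               # same model at 16 bits per weight

print(f"ternary: {ternary_gb:.2f} GB, fp16: {fp16_gb:.0f} GB")
```

Under these assumptions the ideal ternary packing comes out at roughly a tenth of the 16-bit size, which is in line with a model that fits in a browser tab.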
New paper out in Proceedings of the Royal Society B: we apply linguistic tools to sperm whale vowels.
The result: sperm whale vowels do not just look like human vowels. They also behave like them.
We found several parallels. Like in Latin, whales have short and long vowels. Like in Slovenian, some vowels prefer particular tones. Like in human language, there’s a lot of coarticulation (a process in which you say “tense” but the word sounds like “tents”).
Observing vowels in whales is a matter of timing. Our vowels are fast, whale vowels are slow. Beats become pitch if they’re fast enough. If you slow down human vowels, they start sounding like whale clicks.
Applying linguistic tools to whales shows us that we’re much more similar to these wonderful ocean creatures than we previously believed and that their language is much more complex and structured.
@projectCETI @UCBerkeley
"How do I decrease the latency of my voice agent?" is our #1 question.
The answer depends on your pipeline bottlenecks.
We recommend co-locating your agent with models, evaluating faster models, and practicing good tool hygiene.
Check out our playbook for more tips.
Lunar Permanence will require using resources on the Moon rather than hauling them from Earth. Our in-situ resource utilization system extracts oxygen from lunar regolith to create breathable air for astronauts and propellant for refueling landers and fuel cells. It also produces iron, aluminum, silicon, construction materials, and even solar power systems. The materials for a Moon base are produced right where they’re needed, and at much lower cost than being brought from Earth.
I got a derm researcher to agree to let me use her equipment to see if we can look at these in regular people.
If we can see anything, we should be able to dispositively figure out which animals can see them, if any, too.
Should be a fun project for this summer!
Space Forge’s satellite has turned on its furnace, producing superhot plasma that could be used to manufacture #semiconductors in #space. It turns out that making semiconductor crystals has some design advantages in space. https://t.co/SPv3qB5lfl
Seedance 2 joins the "cannot count to 10" club:
> a man counts out loud from 1 to 10, using his fingers and holding them up as he goes
Pretty sure this is a scary Ryan Reynolds/Ben Stiller hybrid.
(Variants of this prompt with the numbers explicitly listed also fail)
JUNE 2028.
The S&P is down 38% from its highs. Unemployment just printed 10.2%. Private credit is unraveling. Prime mortgages are cracking. AI didn’t disappoint. It exceeded every expectation.
What happened?
https://t.co/JzzwCrbJgS