Today, we’re excited to introduce Miso One, the most emotive voice model in the world.
Miso One is an 8-billion-parameter text-to-speech model for highly expressive speech generation. It emotes like a human and responds faster than a human, with just 110 milliseconds of latency.
We’ve open-sourced the model weights, with API access coming soon.
Hear how Miso One sounds in the thread below.
Meet Gemma 4 12B!
A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license.
Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇
🚀 Gemma 4 12B is here!
We partnered with @GoogleDeepMind to bring and optimize their new dense and unifed multimodal model for Apple Silicon.
◈ 12B dense · 256K context
◈ Thinking mode (built-in reasoning)
◈ Vision: dynamic res, OCR, UI + charts
◈ Native audio: ASR + speech translation
◈ Function calling for agents
◈ Text + image + audio, interleaved
Runs local. Get started now ⚡
> uv pip install -U mlx-vlm
https://t.co/7BvnEuzKvj
OpenClaw is awesome, but I don't remember any other software I ever tried where the upgrade experience is this unstable.
It is a total lottery every time.
It's like 25% or less chance of going smooth, the rest of the time it just completely breaks everything.
I haven't tried Hermes a lot (only briefly played around with it). Is this much better with Hermes?
🦔GitHub Copilot switched to token-based billing this morning and users are already out of credits. Pro+ subscribers paying $39 a month are reporting 60% of their credits gone in two hours of normal use. One user lost 20% of their allowance from a single file review with no code changes. Another hit their monthly cap before the calendar even flipped to June.
Orgs with shared token pools have no way to see individual usage, so entire teams get cut off when one person runs a heavy prompt. Users are canceling and moving to Claude Code and Codex. GitHub community forums are on fire.
My Take
Flat-rate AI subscriptions were always subsidized. Everyone in the industry knew it. Today the subsidy ran out for a few million developers at once. The problem is a lot of companies already restructured around these tools. They cut headcount and told remaining engineers to lean on Copilot instead of building skills internally. Those companies now depend on a tool whose cost just became unpredictable and whose usefulness completely changes when you have to ration prompts to stay under budget.
The developers moving to Claude Code and Codex will hit the same wall eventually. Every AI provider faces the same unit economics. Anthropic filed its S-1 this morning, and the durability of its revenue depends on whether customers stick around once real pricing kicks in everywhere. If a $39 subscriber cancels after one day because the tool became unusable, multiply that across millions of seats and the churn risk becomes very real.
Today showed what happens when AI pricing meets reality. The companies that built their workflows around cheap tokens just discovered the tokens aren't cheap anymore and the people who knew how to do the work without them are already gone.
Hedgie🤗
huge shoutout to @quarqlabs for this.
many things look good here:
- layered memory: semantic, episodic, procedural
- hybrid search: uses both keyword and semantic search with well-thought-out filtering
- self-querying: generating semantic- and keyword-oriented queries
- agentic search: the system may decide whether to deepen the query and run for another turn
combining all this makes the system use various RAG techniques, leverage their advantages, and reduce flaws.
it's good to know that second place now on LongMemEval belongs to the @mastra team, and their Observational Memory may be used together with what we have here.
at the same time, we're still fragile when it comes to things that fall into the "unknown unknowns" category, non-obvious connections between memories, and more.
nonetheless, great work!
Today we're shipping our biggest MLX-VLM release yet: v0.6.0
...and we are raising 💸
This one's about turning your Apple devices into real local agent machines. From your desk to your pocket.
What's new:
⚡ Speculative decoding everywhere — Gemma 4 EAGLE3 + DFlash, Qwen MTP, DeepSeek V4 MTP. Faster tokens, less waiting.
🤖 Agent-ready server — native Anthropic /v1/messages API, stateful /v1/responses, tool calls, Codex context budgets. Plug Claude Code & Codex straight into local models.
👁️ New models galore — DeepSeek V4, ZAYA1-VL, MiniCPM-V 4.6, LFM2 MoE, Step-3.7 Flash, Laguna + more.
🎨 Image gen & editing — FLUX.2 (base + klein), PrismML Bonsai.
🔊 Audio in — Qwen3 Omni, Gemma 4 audio, base64 chat audio.
🧮 TurboQuant KV cache — RHT-correct fast paths for leaner memory.
📦 Modular server, better metrics, cleaner streaming.
Run real agents on the hardware already in your hands.
Github: https://t.co/1T06ur6LU5