Introducing LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M: two multilingual retrieval models built for ultra-fast and accurate search across 11 languages.
> End-to-end retrieval latency as low as 1.5 ms with our enterprise stack! 🚀
> Consistently best-in-class multilingual and cross-lingual performance across Arabic, German, English, Spanish, French, Italian, Japanese, Korean, Norwegian, Portuguese, and Swedish.
🧵
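The two model names map onto two common retrieval scoring styles: one vector per text compared by cosine similarity, and ColBERT-style late interaction, where every query token vector is matched against its best document token vector and the matches are summed (MaxSim). Here is a minimal sketch of the late-interaction score on made-up 2-D toy vectors. It does not load either LFM2.5 model, and the function names and vectors are illustrative assumptions, not anything from the release.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def maxsim(query_tokens, doc_tokens):
    """Late-interaction score: for each query token vector, take its best
    cosine match among the document token vectors, then sum those maxima."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)


# Toy token vectors (hypothetical, not real model output).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0], [-1.0, 0.0]]             # covers only the first one

score_a = maxsim(query_tokens, doc_a)
score_b = maxsim(query_tokens, doc_b)
print(score_a, score_b)
```

A document that covers every query token scores higher than one that covers only part of the query, which is the intuition behind token-level matching.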
Introducing the Open Knowledge Format (OKF), an open specification that formalizes the LLM-wiki pattern into a portable, interoperable format.
AI is only as smart as the context we give it. As we build more advanced, agentic AI systems, they need accurate metadata and context to be useful. But in most organizations, that context is locked inside fragmented data catalogs, isolated wikis, scattered code comments, or the minds of senior engineers. Every time a new AI agent is built, teams are forced to solve the exact same context-assembly problem from scratch.
To solve this, we've announced OKF, a vendor-neutral, open specification that formalizes the "LLM-wiki pattern" into a portable, interoperable format. It provides a standardized way to represent the enterprise knowledge that modern AI systems rely on.
- Just markdown: readable in any editor, renderable on GitHub, indexable by any search tool
- Just files: shippable as a tarball, hostable in any git repo, mountable on any filesystem
- Just YAML frontmatter: for the small set of structured fields that need to be queryable (type, title, description, resource, tags, and timestamp)
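To make the three points above concrete, here is a rough sketch of what such a file might look like and how little code it takes to read one. Only the six field names come from the post. The field values, the `bigquery://` style resource string, and the `parse_frontmatter` helper are illustrative assumptions, and the real OKF specification may define them differently. The parser handles only flat `key: value` lines and inline `[a, b]` lists; it is not a full YAML parser.

```python
# Hypothetical OKF-style document: YAML frontmatter followed by markdown.
# All values below are made up for illustration.
SAMPLE = """\
---
type: table
title: Orders
description: One row per customer order.
resource: bigquery://example-project/sales/orders
tags: [sales, orders]
timestamp: 2025-01-01T00:00:00Z
---
# Orders

Each row is a single customer order.
"""


def parse_frontmatter(text):
    """Split a document into (metadata dict, markdown body).

    Returns ({}, text) when no frontmatter block delimited by '---' is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text

    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value

    body = "\n".join(lines[end + 1:])
    return meta, body


meta, body = parse_frontmatter(SAMPLE)
print(meta["title"], meta["tags"])
```

Because the structured part is a handful of flat fields, any tool that can read text files can index or filter a bundle without a dedicated client.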
We’ve also shipped reference implementations to help you hit the ground running, including an enrichment agent for BigQuery, a static HTML visualizer, and live sample bundles on @github → https://t.co/ilhAMCrcTc
➕ Knowledge Catalog can now natively ingest OKF!
Stop reinventing data models and building bespoke integrations for every new AI tool. Here's more about how OKF works → https://t.co/FR4kJRsgEH
Zai released GLM-5.2: open weights, 1M context, and tiered reasoning for agents. open weights are for builders who run production infra.
https://t.co/yslWNEiFqP
Introducing GLM-5.2: Frontier Intelligence, Open Weights
- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1
Tech Blog: https://t.co/LAsxUdN0JZ
Weights: https://t.co/g0A1C4UWx4
API: https://t.co/Kc3E22cbN7
Coding Plan: https://t.co/Nk8Y98HNhU
Chat: https://t.co/WCqWT0qCQb
wow - this is huge!
anthropic is officially walking back their decision to ban programmatic use of claude code subscription quota
why is this a big deal?
this is a signal that anthropic is revisiting their ecosystem strategy which many of us have been criticizing
by allowing claude code to be invoked programmatically, anthropic will basically extend their subsidized subscription to power a much wider range of applications, not just their own. that effectively means they are leaning more into being an infrastructure provider rather than the super app that eats everything else
they still have more to do to gain back my trust as a developer but this is a very positive change and i'm happy to see anthropic revisiting their strategy
most devs stop when code runs. the goal prompt says keep going until it meets the bar. that's the difference between code that ships and code that works.
boris cherny hasn't handwritten code in 8 months. writes loops instead. when you're managing tens of thousands of agents, prompting one by one doesn't scale.
Starting June 15, paid Claude plans can claim a dedicated monthly credit for programmatic usage.
The credit covers usage of:
- Claude Agent SDK
- claude -p
- Claude Code GitHub Actions
- Third-party apps built on the Agent SDK
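For a sense of what programmatic usage can look like, here is a minimal sketch of the "keep going until it meets the bar" loop driven through `claude -p`, which runs Claude Code non-interactively and prints the result. It assumes the `claude` CLI is installed and authenticated. The `run_until_bar` helper, its parameters, and the retry prompt wording are hypothetical illustrations, not an official pattern.

```python
import subprocess


def claude_print(prompt):
    """Run one non-interactive Claude Code call via `claude -p`.

    Assumes the `claude` CLI is on PATH and already authenticated.
    """
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def run_until_bar(task, meets_bar, run_agent=claude_print, max_rounds=5):
    """Re-run the agent until meets_bar(output) passes or rounds run out.

    meets_bar returns (passed, feedback); feedback is fed into the next
    prompt. Returns (last_output, rounds_used, passed).
    """
    prompt = task
    output = ""
    for round_number in range(1, max_rounds + 1):
        output = run_agent(prompt)
        passed, feedback = meets_bar(output)
        if passed:
            return output, round_number, True
        prompt = (
            f"{task}\n\nPrevious attempt:\n{output}\n\n"
            f"It did not meet the bar: {feedback}\nTry again."
        )
    return output, max_rounds, False
```

The `run_agent` parameter is injectable so the loop can be exercised with a stub before spending any quota, and `max_rounds` keeps a check that never passes from looping forever.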
56,000+ tokens/sec at just 80 MHz. 🤯
I burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on an FPGA. (No GPU. No CPU.)
Just pure digital silicon running @karpathy's microGPT, spelling out names on a tiny LCD.
This is GateGPT 👇
vector search used to need a full server. alibaba's zvec runs inside the app like sqlite runs for sql. local agent memory and rag without the infra overhead.
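To illustrate the in-process idea, here is a small sketch of a vector store that lives inside the application, using Python's built-in `sqlite3` for storage and a brute-force cosine scan for search. This is not zvec's API, which is not shown in the post. The `TinyVectorStore` class and its methods are hypothetical, and a real engine would use an index rather than scanning every row.

```python
import json
import math
import sqlite3


def _cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical embedded store: no server, just a file or in-memory DB."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "id INTEGER PRIMARY KEY, text TEXT NOT NULL, vector TEXT NOT NULL)"
        )

    def add(self, text, vector):
        # Vectors are stored as JSON text to keep the sketch dependency-free.
        self.db.execute(
            "INSERT INTO items (text, vector) VALUES (?, ?)",
            (text, json.dumps(vector)),
        )
        self.db.commit()

    def search(self, query, k=3):
        """Return the top-k (score, text) pairs by cosine similarity."""
        scored = [
            (_cosine(query, json.loads(raw)), text)
            for text, raw in self.db.execute("SELECT text, vector FROM items")
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.add("user prefers dark mode", [1.0, 0.0])
store.add("user likes compact layouts", [0.9, 0.1])
store.add("meeting moved to friday", [0.0, 1.0])
results = store.search([1.0, 0.0], k=2)
print([text for _, text in results])
```

Passing a file path instead of `":memory:"` persists the memory across runs, which is the same deployment story as SQLite itself.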
as a kid, i wondered who knew enough to build roads, bridges and buildings.
now we're building the knowledge bases and AI agents entire industries will run on.
that is the next infrastructure.
https://t.co/kWNe5l3n5A
Elite admissions select for one trait: getting the known answer faster than anyone else. 18 years of optimizing against an answer key someone already wrote.
AI just made the answer key free. Everyone has it instantly now.
So the kids trained hardest to win spent their whole lives mastering the one thing that's now a commodity. The premium moved to the questions with no answer key yet.
We need a new kind of training.
The new training is about one thing:
How to be the first person standing in a new land, exploring it, preparing it for the coming billion people who will need it. The future will be built by these people.
And there is a lot to build.
most voice AI is a chatbot with a microphone. claude-call plugs voice into Claude Code's terminal session. same context, same skills. now your code reviews talk back.
Meet Higgsfield Games.
For the first time, build and deploy multiplayer games from one prompt, in any genre, 2D or 3D, with best-in-class characters, props, and settings generated by Higgsfield MCP.
Powered by Claude Fable 5.
Try it on Claude via MCP and on our Supercomputer.