We've gone really quickly from "local models are dogshit" to "local models are good actually" (like, a 12 month window from A to B). I don't think they're actually good ENOUGH yet. We need an Opus 4.5 quality local model. When that happens, I think the world will spill over.
Opus 4.5 is/was amazing, and is still more than good enough for almost all tasks as long as you pair it with a frontier-level planner/judge.
It'll still require a hugely expensive machine to run it, I'm sure, like a $5K or more laptop or Mac Studio. But that's going to be pennies compared to the API costs, plus all the benefits of guaranteed privacy and so on.
The creator of Omarchy on Asahi lost access to his GitHub account two weeks ago due to some automated process flagging his account. The repo went offline and has been ever since. Not even me personally reaching out to GitHub twice has been able to restore his account. Terrifying. Embarrassing.
@antirez MacBook Pro 14" with M5 Pro 64GB: prefill starts around 150 tps, peaks at 175 tps around 10k ctx, then 130 tps @ 60k, 113 tps @ 100k, 103 tps @ 128k. Token generation goes from 11.6 tps (2k ctx) to 9.6 tps (128k ctx). Used ds4-bench with i2-imatrix, promessi_sposi.txt, streaming-cache-experts 32GB
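To get a feel for what those throughput figures mean in wall-clock terms, here is a rough sketch that integrates the reported prefill speeds over the context. It assumes each figure is the instantaneous prefill rate at that context depth, that "128k" means 128,000 tokens, and that the rate varies smoothly between the reported points. None of that is stated in the post, so treat the result as a ballpark only. The function names are made up for illustration.

```python
# (context depth in tokens, reported prefill throughput in tokens/s)
# Depth 0 for the "starts around 150 tps" figure is an assumption.
PREFILL_POINTS = [
    (0, 150.0),
    (10_000, 175.0),
    (60_000, 130.0),
    (100_000, 113.0),
    (128_000, 103.0),
]


def estimate_prefill_seconds(points):
    """Trapezoidal estimate of total prefill time.

    Time per token is 1 / throughput, so total time is the integral of
    1 / tps over the context, approximated segment by segment.
    """
    total = 0.0
    for (n0, r0), (n1, r1) in zip(points, points[1:]):
        total += (n1 - n0) * (1.0 / r0 + 1.0 / r1) / 2.0
    return total


def generation_seconds(num_tokens, tokens_per_second):
    """Time to generate num_tokens at a fixed decode rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    prefill = estimate_prefill_seconds(PREFILL_POINTS)
    print(f"estimated prefill of 128k tokens: {prefill / 60:.1f} min")
    # Decode rate at 128k ctx as reported in the post.
    print(f"1000 generated tokens at 9.6 tps: {generation_seconds(1000, 9.6):.0f} s")
```

Under these assumptions, filling the full context is a matter of many minutes rather than seconds, which matters more for long-document workloads than the decode rate does.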
Supply chain attacks and OSS sustainability go hand in hand. I've semi-seriously joked for years that OSS upstreams should periodically purposely inject full vulns into their code and let downstreams fuck around and find out. Downstreams can pay to get the non-FAFO version.
The non-joke part is simply that OSS maintainers aren't a supply chain. OSS maintainers are not responsible for monitoring CVEs (because they are not a supply chain). OSS maintainers are not at fault when bad shit happens to downstreams, because basically every OSS license (MIT, Apache, GPL, etc.) literally says: the software is provided "as-is, without warranty." You get what you pay for (that is to say: absolutely nothing!)
Now, the joke part is that I do believe there is an ethical obligation to try to prevent harm downstream. But "try" is the key word. So, this isn't a serious proposal.
But, if you're using OSS code and you're not paying for a license with a contract that promises some kind of warranty, you have no supply chain. You (the downstream user of an OSS lib) ARE the supply chain.
To use a metaphor: physical goods have a real supply chain. Car manufacturers, chips, clothes, toys, etc. You have a signed commercial agreement with all your suppliers that promises quantity AND quality and blowback if either is missed. That's a supply chain.
If someone puts some chips on the side of the road with a "FREE" sign, then you integrate those into a product, then find out those chips are hacking customers, it's your fault, not the person who dropped them on the side of the road.
We are investigating unauthorized access to GitHub’s internal repositories. While we currently have no evidence of impact to customer information stored outside of GitHub’s internal repositories (such as our customers’ enterprises, organizations, and repositories), we are closely monitoring our infrastructure for follow-on activity.
@antirez @saraMPascal Do you think there could be something better than the current status quo on 48GB or 64GB Macs, maybe with a heavily asymmetric quant and/or a dedicated single-model inference engine? Or is a "standard" Qwen 3.6 27B / 35B-A3B the best at those memory sizes?
Let me local AI pill you:
1. It sucks compared to SOTA
2. It can’t code so well
3. It can be a good agent
4. It can be great at chat
5. It can be fine as a researcher
6. It can be a great automation engine
7. It can be tuned however you want
8. It teaches you how the sausage is made
9. It works on a plane, or in an outage
10. It costs your electric bill + hardware
11. It is better than the AI we gave up coding for a year or 2 ago.
Local AI is self defence, it is a go kit, it is a rebalancing of power.
It’s delusional to think it approaches or will ever approach SOTA. The scale of private labs blows anything you can get for less than 25k USD out of the water.
Local AI is a bet that prices won’t stay this low, that private corporations with closed source weights can’t be trusted to stay consistent.
I am more than happy to rent a Ferrari for dirt cheap, but I should also have a beater Toyota if I can afford it.
Local AI is the car I can depend on to be there tomorrow, something that’s mine.
@mitsuhiko I recently got a Nuphy Gem80 and loving it, shouldn’t be too far from the Q3, possibly even better? (but that’s subjective). Got mine directly from Nuphy, arrived in Italy in a little less than 20 days
Today, we are emerging from stealth and launching PrismML, an AI lab with Caltech origins that is centered on building the most concentrated form of intelligence.
At PrismML, we believe that the next major leaps in AI will be driven by order-of-magnitude improvements in intelligence density, not just sheer parameter count.
Our first proof point is the 1-bit Bonsai 8B, a 1-bit weight model that fits into 1.15 GB of memory and delivers over 10x the intelligence density of its full-precision counterparts. It is 14x smaller, 8x faster, and 5x more energy efficient on edge hardware while remaining competitive with other models in its parameter class.
We are open-sourcing the model under the Apache 2.0 license, along with Bonsai 4B and 1.7B models.
When advanced models become small, fast, and efficient enough to run locally, the design space for AI changes immediately. We believe in a future of on-device agents, real-time robotics, offline intelligence and entirely new products that were previously impossible.
We are excited to share our vision with you and keep working in the future to push the frontier of intelligence to the edge.
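The size claims in the announcement above can be sanity-checked with a bit of arithmetic. This is a minimal sketch that assumes "8B" means exactly 8e9 parameters, that 1 GB is 1e9 bytes, and that the full-precision baseline is 16 bits per weight. The announcement does not spell out any of these, so the exact figures may differ.

```python
PARAMS = 8e9          # assumed nominal parameter count for an "8B" model
MODEL_BYTES = 1.15e9  # 1.15 GB footprint quoted in the announcement
BASELINE_BITS = 16    # assumed full-precision baseline (FP16/BF16)


def bits_per_weight(model_bytes, params):
    """Effective storage cost per parameter, in bits."""
    return model_bytes * 8 / params


def size_ratio(params, baseline_bits, model_bytes):
    """How many times larger the baseline checkpoint would be."""
    baseline_bytes = params * baseline_bits / 8
    return baseline_bytes / model_bytes


if __name__ == "__main__":
    print(f"effective bits per weight: {bits_per_weight(MODEL_BYTES, PARAMS):.2f}")
    print(f"size vs 16-bit baseline: {size_ratio(PARAMS, BASELINE_BITS, MODEL_BYTES):.1f}x")
```

With these assumptions the footprint works out to slightly more than 1 bit per weight, and the ratio against a 16-bit checkpoint lands close to the quoted 14x. Where the extra fraction of a bit goes (for example scale factors, or some tensors kept at higher precision) is not stated in the post.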
@dhh Bought one exactly to use it with Omarchy (w/ Nuphy Gem80) and loving it, too bad the AirPort card is completely unusable with it (or any other Linux afaik, both wifi and bt), will try to swap it for one of the newer ones.
@ryanvogel @leewynne @opencode And did it fix all the issues successfully? Are you able to use the system normally without awkward limitations?
@leewynne have you already tried yours?
@shelly_IoT that's nice but do you have any plan/timeline for a Matter-enabled, Gen4 version of the Pro RGBWW (or any DIN mount multi-channel dimmer, really)?
Italy is going to Mars! @ASI_Spazio and @SpaceX have signed a first-of-its-kind agreement to carry Italian experiments on the first Starship flights to Mars with customers. The payloads will gather scientific data during the missions. Italy continues to lead in space exploration!
Something I hear very little talk about:
How AI coding tools are so much LESS useful when used on existing, large codebases at work (with custom frameworks, conventions, coding style etc)
... compared to doing greenfield work or side projects
So common for me to hear: "yeah I love it on my side projects, but at work it's 'meh'"
Today, we’re announcing the preview release of ty, an extremely fast type checker and language server for Python, written in Rust.
In early testing, it's 10x, 50x, even 100x faster than existing type checkers. (We've seen >600x speed-ups over Mypy in some real-world projects.)
Oh wow, a popular GitHub Action (tj-actions/changed-files) was fully compromised. Someone committed a base64-encoded payload that runs a script that in turn prints out encoded secrets…
Stay safe out there!
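Why printing *encoded* secrets matters: CI log masking generally works by replacing the exact registered secret value in the output, so an encoded copy slips through and can be decoded by anyone who can read the log. The sketch below is a simplified model of that idea, not GitHub's actual implementation. `mask_secrets` and the token value are hypothetical.

```python
import base64


def mask_secrets(log_line, secrets):
    """Simplified log masking: replace exact secret values with ***."""
    for secret in secrets:
        log_line = log_line.replace(secret, "***")
    return log_line


# Hypothetical secret, used only for illustration.
SECRET = "hypothetical-token-123"

# Printing the raw secret gets masked as intended.
plain_log = mask_secrets(f"token={SECRET}", [SECRET])

# Printing a base64-encoded copy does not match the registered value,
# so the masker leaves it untouched.
encoded = base64.b64encode(SECRET.encode()).decode()
leaked_log = mask_secrets(f"token={encoded}", [SECRET])

# Anyone reading the log can reverse the encoding.
recovered = base64.b64decode(leaked_log.split("=", 1)[1]).decode()

if __name__ == "__main__":
    print(plain_log)
    print(leaked_log)
    print(recovered == SECRET)
```

A common mitigation is to pin third-party actions to a full commit SHA rather than a mutable tag, so a retagged release cannot silently swap in new code.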