Anthropic is about to get everyone to waste all their tokens on asking Fable 5 stupid questions. This will either force them to use API (which Anthropic wants) or spin up a second Claude Max account. Pure genius. Will save their compute costs massively long-term.
Once OpenAI and Anthropic go public, I think that is when the real shakeup will happen.
They will have a fiduciary responsibility to raise token prices.
This is why they are trying so hard right now to become ingrained in every org's workflows as fast as possible- they want to make the decision to turn off the tokens cause an internal mutiny. They want it to be painful.
Companies are going to need to realize that the price they are paying now is completely subsidized by private money.
Does this mean the 'bubble pops'? I don't think so necessarily. Does this mean there will be a reckoning on token spend? Yes.
MIT just came out and said that all the code that is being written (about a 300% increase) has only increased total output by 30%.
This only goes to show that human direction is needed more than ever. The company that saves several hundred million in token costs, plans effectively for TRUE use cases of AI, trains their own in house models, will be an entity of the future.
You need someone who is on the cutting edge of AI capability and actually understands what is happening to help guide you through this turmoil. Don't just assume your current in-house IT guy knows all about it.
He likely doesn't.
If big companies can't make a net return on their LLM token costs, that doesn't mean it's impossible to. In fact this is exactly what you'd expect to happen with a new technology. Incumbents can't use it well, and are replaced by upstarts who can.
Our first model Mac-1 6.6B beating 3 giant models.
- Haiku 4.5
- GPT 5.4 mini
- Gemini 3 flash
Running this model on my Macbook M3 24GB. (model takes only 7GB RAM)
It searches web, call tools, ask follow-ups, tell jokes, find contacts, search files, write emails, book events, write notes, set reminders and so much Siri can't do.
Read again, a 6.6B model.
Will share full 2000+ scenario test results & benchmark scores in 2 days.
On our MAHA journey, we have introduced the following:
100% grass-fed, grass-finished beef โ
100% beef tallow fries โ
100% beef tallow tots โ
Grade A Wisconsin butter โ
A2 whole milk โ
Cane-sugar Coca-Cola โ
Elimination of all microwaves โ
And we are working on changing our buns!
We are committed to becoming seed-oil free, because we are committed to making fast food the best it can be.
Im almost embarrassed to say how many times I listened to this. It actually gave me goosebumps. ๐
๐๐๐จ๐ซ๐ ๐จ๐ ๐ญ๐ก๐ ๐๐ข๐ง๐ ๐ฌ ๐๐ข๐ฌ๐๐จ: ๐๐ง๐ ๐ ๐ฎ๐ง๐ค ๐ญ๐จ ๐๐ฎ๐ฅ๐ ๐ญ๐ก๐๐ฆ ๐๐ฅ๐ฅ
How to think about applying ai simply: orchestrating compute with persistent institutional memory. Must have both.
Most companies are not afraid to build out the plan to orchestrate compute. They are terrified of the unorganized, mess of data they have stashed in their closet that is only getting worse with time.
Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs.
Google's new model, Gemma 4 12B Unified supports image, audio and 256K context.
You can run and train the model via Unsloth Studio.
GGUF: https://t.co/8cL321pVDh
Guide: https://t.co/odRo9WjRpA
Yo @AnthropicAI@claudeai if I am using the desktop app, and Claude gives me a full on answer, don't block the ENTIRE VIEW with the pre-generated responses. Please and thank you ๐
As desktop & mobile UIs for popular harnesses emerge, there are clear problems (and opportunities!).
We are addressing these directly with ThinkOS...a desktop & mobile application (coming soon!) that allows you to completely own/build your own AI....even as a non-technical person....even in complex business settings:
(1) Local, private, model-agnostic, uncensored, & independent AI must be encouraged by default.
(2) Your thoughts/memories/context shouldn't be fractured across providers. These are yours. Context/permissions can be carefully given to your teams/projects as needed on a peer-to-peer basis.
(3) Big Tech seems content to partner with top harnesses as long as centralized API keys/oauths are front and center (it doesn't need to be this way!).
(4) Even Big Tech harnesses are too complex for most people, which requires hiring expensive forward-deployed engineers (FDEs) to get your company up to speed. Most companies are falling dangerously behind.
(5) ThinkOS aims to DRAMATICALLY simplify the wins for everyday people with private, independent AI across personal and professional projects. Complex connectors/skills/tools/artifacts lose basically everyone except advanced AI architects (hence the need for expensive FDEs).
(6) ThinkOS will, of course, allow advanced users to setup complex agentic swarms/teams like other harnesses/frameworks. In addition, we believe our UX will make these easier to manage across multiple models.
(7) While Big Tech models will, of course, be optional (like any other model....hence "model-agnostic")...we are designing for local models by default....and leveraging Compiled Intelligence to reduce the need for LLM calls over time.