MDS specializes in private LLMs and AI data center hardware. Tailor-made solutions using local data. Let's build your organization's AI future together.
Microsoft canceled its internal Claude Code licenses this week after token-based billing made the cost untenable, even for a company with effectively infinite cloud resources. Uber's CTO sent an internal memo warning the company burned through its entire 2026 AI budget in just four months. American AI software prices have jumped 20% to 37%, and GitHub (owned by Microsoft) is dropping flat-rate plans for usage-based billing across its products.
My Take
The AI subsidy era is ending in real time. The same company that put $13 billion into OpenAI and built the Azure infrastructure powering most of Anthropic's compute just looked at the bill from a competitor's coding tool and decided it was not worth paying. That is not a productivity failure on Anthropic's end. Token-based pricing is forcing every enterprise customer to confront the actual cost of running these models at scale, and the number turns out to be far higher than the flat-rate experiments suggested.
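A quick back-of-envelope sketch of how a flat-rate budget evaporates under token billing. Every number here (seat price, per-token prices, usage volume) is a made-up placeholder for illustration, not any vendor's actual price list.

```python
# All prices and usage figures below are hypothetical placeholders.
FLAT_RATE_PER_SEAT = 200.00   # USD per developer per month (assumed)
PRICE_PER_M_INPUT = 3.00      # USD per million input tokens (assumed)
PRICE_PER_M_OUTPUT = 15.00    # USD per million output tokens (assumed)


def token_cost(input_tokens: int, output_tokens: int) -> float:
    """Monthly cost in USD under per-token billing."""
    input_cost = (input_tokens / 1_000_000) * PRICE_PER_M_INPUT
    output_cost = (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT
    return input_cost + output_cost


def months_until_budget_gone(annual_budget: float, monthly_cost: float) -> float:
    """How many months an annual budget lasts at a constant monthly burn."""
    return annual_budget / monthly_cost


if __name__ == "__main__":
    # One heavy agentic-coding user who re-sends a lot of context on every step.
    monthly = token_cost(input_tokens=150_000_000, output_tokens=10_000_000)
    budget = 12 * FLAT_RATE_PER_SEAT  # budget planned around the old flat rate
    print(f"token-billed: ${monthly:,.2f}/month vs flat: ${FLAT_RATE_PER_SEAT:,.2f}/month")
    print(f"annual budget lasts {months_until_budget_gone(budget, monthly):.1f} months")
```

With these assumed inputs, a budget sized for twelve months of flat-rate seats runs out in a few months. Swap in your own usage logs and your provider's published prices to get a real figure.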
This ties directly to my Gemini Flash post yesterday. Anthropic, OpenAI, and Google all raised effective prices in the last six months. Enterprises that built workflows assuming AI costs would keep falling are now watching annual budgets evaporate in months. Two outcomes look likely from here. Either enterprises scale back AI usage to fit budgets, which slows the revenue ramp the labs need to justify their valuations ahead of IPOs, or the labs cut prices and absorb the losses, which makes the unit economics worse at exactly the wrong moment. Both paths land in the same place: the numbers stop working, and somebody has to take the writedown.
Hedgie
This Gemini update kills your biggest productivity leak.
And almost nobody is using it properly.
Here's the workflow:
→ Turn on Workspace Intelligence
→ Use Gemini inside Chat (no tab switching)
→ Create a Project for each task
→ Run a long-running agent
→ Turn workflows into Skills
Now you've got:
Context → Automation → Output → Done
All in ONE system.
Save this video, you'll cut your workload in half.
Want the SOP? DM me.
Hermes Agent now supports multi-agent work via a Kanban board, new in v0.12.0.
Agents claim tasks from a board, work in parallel, and hand off when blocked. You watch progress and unblock from one easy view instead of juggling terminals.
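For a feel of the claim / work-in-parallel / hand-off-when-blocked pattern, here is a minimal sketch. It is not Hermes code: `Board`, `Task`, `run_agent`, and the "needs approval" trigger are all hypothetical names invented for this example.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    title: str
    status: str = "todo"            # todo -> doing -> done, or blocked
    owner: Optional[str] = None
    note: str = ""


@dataclass
class Board:
    tasks: list = field(default_factory=list)
    _lock: threading.Lock = field(default_factory=threading.Lock, init=False, repr=False)

    def claim(self, agent: str) -> Optional[Task]:
        """Atomically give the next unclaimed task to exactly one agent."""
        with self._lock:
            for task in self.tasks:
                if task.status == "todo":
                    task.status, task.owner = "doing", agent
                    return task
        return None

    def finish(self, task: Task) -> None:
        with self._lock:
            task.status = "done"

    def block(self, task: Task, reason: str) -> None:
        """Hand off: release ownership and leave a note for whoever unblocks it."""
        with self._lock:
            task.status, task.owner, task.note = "blocked", None, reason


def run_agent(name: str, board: Board) -> None:
    """Keep claiming tasks until the board has nothing left to claim."""
    while (task := board.claim(name)) is not None:
        if "needs approval" in task.title:      # stand-in for a real blocker
            board.block(task, "waiting on human sign-off")
        else:
            board.finish(task)


if __name__ == "__main__":
    board = Board([Task("write script"), Task("render video (needs approval)"), Task("publish")])
    workers = [threading.Thread(target=run_agent, args=(f"agent-{i}", board)) for i in range(3)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    for t in board.tasks:                       # the "one easy view"
        print(f"{t.status:8} {t.title} {t.note}")
```

The lock around `claim` is the important design choice: it is what stops two agents from picking up the same card.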
We asked it to plan and make this video about itself:
Your AI agent has a dirty little secret.
The longer you use it, the dumber it gets.
Cluttered memory. Duplicate skills. Dead workflows pulling at the wrong time.
Nous Research just dropped Hermes Agent Curator and it fixes this for good.
A background agent that runs every 7 days while you sleep:
→ Grades every skill in your library
→ Merges duplicates into cleaner ones
→ Deletes dead weight automatically
→ Leaves you a report of what changed
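Roughly what a grade / merge / delete / report pass could look like. This is a toy sketch, not the Curator's actual logic: the `Skill` fields, the grading cutoffs, and the 30-day idle threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    body: str
    uses: int         # times invoked since the last pass (assumed metric)
    days_idle: int    # days since last invocation (assumed metric)


def curate(skills, max_idle_days=30):
    """Return (kept_skills, report_lines). Thresholds are illustrative only."""
    report = []
    merged = {}
    for skill in skills:
        # Treat skills as duplicates when their text matches, ignoring case and spacing.
        key = " ".join(skill.body.lower().split())
        if key in merged:
            keeper = merged[key]
            keeper.uses += skill.uses
            keeper.days_idle = min(keeper.days_idle, skill.days_idle)
            report.append(f"merged {skill.name} into {keeper.name}")
        else:
            merged[key] = Skill(skill.name, skill.body, skill.uses, skill.days_idle)

    kept = []
    for skill in merged.values():
        if skill.uses == 0 and skill.days_idle > max_idle_days:
            report.append(f"deleted {skill.name} (idle {skill.days_idle} days)")
        else:
            grade = "A" if skill.uses >= 10 else "B" if skill.uses >= 1 else "C"
            report.append(f"graded {skill.name}: {grade}")
            kept.append(skill)
    return kept, report


if __name__ == "__main__":
    library = [
        Skill("summarize", "Summarize the text", uses=12, days_idle=1),
        Skill("summarise-v2", "summarize  the TEXT", uses=3, days_idle=5),
        Skill("old-scraper", "Scrape site", uses=0, days_idle=90),
    ]
    kept, report = curate(library)
    print("\n".join(report))
```

A real curator would need a smarter duplicate check than exact text matching; the point here is only the shape of the pass and the report it leaves behind.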
You wake up Monday to a smarter agent. No babysitting required.
Save this thread, you'll thank yourself later.
Want the SOP? DM me.
Hermes Desktop just made AI agents feel simple again.
And almost nobody is using this setup yet.
Here's what the new free Hermes desktop app actually unlocks:
→ Manage multiple AI agents in one place
→ Switch model providers instantly
→ Add tools like browser search + CLI
→ Connect Telegram, Discord, email
→ Run local models for free workflows
→ Control memory, skills, sessions easily
This replaces messy terminal workflows with a clean dashboard.
Even non-technical users can set it up fast.
Save this video, you'll set up Hermes Desktop in minutes.
Want the SOP? DM me.
> you installed Hermes Agent
> you type a request, read the response, close the window
> you've been using 8% of it
> every single session
> Hermes remembers everything. you never told it what to remember
> it runs named personas on every task. you've never set one
> there are 134 slash commands you've never used
> here are the 15 features that change that:
ANTHROPIC JUST BANNED A 110 PERSON COMPANY OVERNIGHT WITHOUT WARNING
monday morning at an agricultural tech company, every single employee wakes up to an email saying their claude account has been suspended
110 people locked out at the same time with zero warning and the email even pretended it was an individual ban with a link to a personal appeal form
it took them 10 minutes on slack to realize the entire org had been wiped at once.
not even the account admins were told it was coming
they submitted the appeal form and got no response, 36 hours later there was still nothing
AND it gets worse:
> their separate API account is still active and still billing them
> their admins can't log in to view usage or billing because the email addresses are banned
> they got hit with a renewal invoice the day AFTER the team account was suspended
> they have no idea what triggered it. fertilizer conversations? GPS satellites? agriculture in general?
so they're paying anthropic to get banned by anthropic while anthropic ignores their support tickets
the founder of the company laid out the bigger problem perfectly
banning an entire organization for one user's behavior means a single employee or careless intern can revoke claude access for your whole business.
there's no per seat guardrail, no admin override, no way to limit the ban radius
his words: "you have to ask yourself if this is a platform you can entrust your daily workflows to as a business"
every founder reading this who runs claude through their company should be checking right now what their actual exposure looks like
billion dollar AI company with zero enterprise customer support
Want Hermes running in minutes?
Follow this exact setup flow:
1. Install Hermes from GitHub
2. Launch it inside your terminal
3. Connect Obsidian as memory source
4. Sync OMI for automatic knowledge capture
5. Activate dashboard mission control
6. Create skills for repeatable automations
7. Schedule background tasks daily
Now Hermes works like a teammate.
Not a chatbot.
Save this video, you'll build your first AI agent system faster.
Want the SOP? DM me.
I tested Hermes and OpenClaw for hundreds of hours.
Here's what actually happened:
OpenClaw broke mid-setup.
Hermes launched instantly.
Then I ran both together inside Telegram.
That changed everything.
Now my workflow looks like this:
→ Hermes supervises OpenClaw
→ Agents talk inside group chats
→ Tasks run in parallel
→ Sessions stay organized
→ Backups handled by Manis
→ Updates never stop progress
Winning move isn't choosing sides.
It's building an agent stack.
Save this video, you'll avoid agent setup mistakes.
Want the SOP? DM me.
GOOGLE JUST MADE PRIVATE AI AGENTS COMPLETELY FREE WITH GEMMA 4.
Plug it into OpenClaw, run it on your own machine, and your AI stops being a chat window and starts acting like a real worker.
Run Google's Gemma 4 + OpenClaw as a free private AI agent on your own machine in 3 steps.
No API bills. No usage limits. No subscription. Nothing leaves your computer.
Here's the full setup:
→ Step 1: Go to https://t.co/493GbXWz04. Download and install. Update to version 0.2.2.0 or higher.
→ Step 2: Open terminal. Type: ollama pull gemma4. Downloads the model. Done.
→ Step 3: Install OpenClaw. Select Ollama as your provider. Point it at port 11434. Pick Gemma 4.
That's it. Your AI agent is now running locally.
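If you want to check that the model answers on port 11434 before wiring up an agent, a small script like this can do it. It is a sketch that assumes Ollama is serving on localhost with its standard `/api/generate` endpoint, and that the model tag is `gemma4` as in the pull command above; adjust the tag if yours differs.

```python
import json
import urllib.request

# Ollama's default local address; the port matches the one in Step 3.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "gemma4") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "gemma4") -> str:
    """Send the prompt and return the model's text reply."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires the Ollama server to be running and the model to be pulled.
    print(ask("Reply with one short sentence: are you running locally?"))
```

Nothing here goes over the internet: the request only ever targets localhost.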
Message it through Telegram, Slack, Discord, or WhatsApp like a coworker.
Read files. Write code. Remember context across every conversation. All on your own hardware.
Gemma 4 ranked number 3 on the global open model leaderboard on launch day.
It beat models with 20 times more parameters.
The 26B version activates only 4B parameters at a time so you get near large-model quality at small-model speed.
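The "only part of the model runs per token" idea is usually called mixture-of-experts routing. The post does not say how Gemma 4 routes tokens, so treat this as a generic toy top-k router with made-up experts and scores, not a description of Gemma's internals.

```python
import math


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)           # subtract max for numerical stability
    exps = {i: math.exp(gate_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and mix their outputs by routing weight."""
    weights = top_k_route(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


if __name__ == "__main__":
    # Four toy "experts" that just scale their input; real experts are neural sub-networks.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    out = moe_forward(10.0, experts, gate_logits=[0.1, 2.0, 0.3, 2.0], k=2)
    print("mixed output:", out)
    print(f"active share if 4B of 26B parameters run per token: {4 / 26:.0%}")
```

Only the chosen experts are ever called, which is where the compute saving comes from. All the experts still have to sit in memory.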
Every AI subscription you're paying for right now could be replaced with this.
BREAKING: You can now run Claude Code for FREE.
No API costs. No rate limits. 100% local on your machine.
Here's how to run Claude Code locally (100% free & fully private):
BREAKING: MATTHEW BERMAN just released a video on how he runs his entire META ADS operation for $0/MONTH WITH OPENCLAW.
No agency. No VA. Just an AI agent that monitors, kills, scales, writes, and uploads ads autonomously.
Here's the system he built:
OpenClaw becomes 100x better when you build it a Mission Control
Mission Control is a custom dashboard that lets your Claw build any tool it needs
Without one, your OpenClaw is so much worse
In this video I go over my ENTIRE mission control and show you how to set up your own
Holy shit... Microsoft open sourced an inference framework that runs a 100B parameter LLM on a single CPU.
It's called BitNet. And it does what was supposed to be impossible.
No GPU. No cloud. No $10K hardware setup. Just your laptop running a 100-billion parameter model at human reading speed.
Here's how it works:
Most other LLMs store weights as 32-bit or 16-bit floats.
BitNet uses 1.58 bits.
Weights are ternary: just -1, 0, or +1. That's it. No floats. No expensive matrix math. Pure integer operations your CPU was already built for.
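To see why ternary weights remove the multiplications, here is a small sketch. The post does not spell out the quantization step, so this assumes the absmean scheme from the BitNet b1.58 paper (scale by the mean absolute weight, round, clip to -1..1). The function names and the tiny example vectors are made up for illustration.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a list of floats to {-1, 0, +1} plus one float scale factor."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def ternary_dot(q_weights, activations):
    """Dot product with ternary weights: only additions and subtractions."""
    acc = 0
    for q, a in zip(q_weights, activations):
        if q == 1:
            acc += a
        elif q == -1:
            acc -= a
        # q == 0 contributes nothing, so the activation is skipped entirely
    return acc


if __name__ == "__main__":
    weights = [0.9, -0.05, -1.1, 0.4]
    q, scale = absmean_ternary(weights)
    activations = [3, 5, 2, 7]          # integer activations, assumed already quantized
    print("ternary weights:", q)
    print("integer dot product:", ternary_dot(q, activations))
    print("rescaled result:", ternary_dot(q, activations) * scale)
```

The single float `scale` is applied once per output at the end, so the inner loop stays integer-only.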
The result:
- 100B model runs on a single CPU at 5-7 tokens/second
- 2.37x to 6.17x faster than llama.cpp on x86
- 82% lower energy consumption on x86 CPUs
- 1.37x to 5.07x speedup on ARM (your MacBook)
- Memory drops by 16-32x vs full-precision models
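Where "1.58 bits" comes from, and what it means for weight storage on a 100B model. This counts ideal weight storage only (no packing overhead, activations, or KV cache), so read it as a rough bound rather than a reproduction of the benchmark figures listed above.

```python
import math

# A weight with three possible values carries log2(3) bits of information.
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def weight_gigabytes(num_params: float, bits_per_weight: float) -> float:
    """Ideal weight storage in GB (10**9 bytes), ignoring packing overhead."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 100e9  # the 100B-parameter model from the post
    print(f"bits per ternary weight: {BITS_PER_TERNARY_WEIGHT:.3f}")
    for label, bits in [("fp32", 32), ("fp16", 16), ("ternary", BITS_PER_TERNARY_WEIGHT)]:
        print(f"{label:>8}: {weight_gigabytes(params, bits):7.1f} GB")
```

Real storage formats often round each ternary weight up to 2 bits for easier unpacking, so on-disk sizes can come out larger than the ideal number here.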
The wildest part:
Accuracy barely moves.
BitNet b1.58 2B4T, their flagship model, was trained on 4 trillion tokens and benchmarks competitively against full-precision models of the same size. The quantization isn't destroying quality. It's just removing the bloat.
What this actually means:
- Run AI completely offline. Your data never leaves your machine
- Deploy LLMs on phones, IoT devices, edge hardware
- No more cloud API bills for inference
- AI in regions with no reliable internet
The model supports ARM and x86. Works on your MacBook, your Linux box, your Windows machine.
27.4K GitHub stars. 2.2K forks. Built by Microsoft Research.
100% Open Source. MIT License