I'm building my assistant in Hermes. No, I won't blindly trust it ◆ I teach you the way of responsible AI: your data, your rules ◆ PhD in computer science ◆ Dad
Ever wondered how model quantization (FP16, Q8, Q4) *really* affects performance?
There's an analogy that makes the trade-offs crystal clear... and it involves something you might drink. 😉🍺
Kudos to @jtdavies for this brilliant comparison. 🙏
See the image for the full explanation! 👇
Hermes Desktop is available for Linux 🤘
That's in contrast to all the other harness desktop apps from the major players. I like it a lot. Thanks @NousResearch. 🫶
The next evolution of Hermes Agent is here!
Introducing Hermes Desktop: everything you love about Hermes, now native on your machine.
First demoed in Jensen's GTC keynote, it's now in public preview.
Love the new MiniMax M3 release. 🫶
I am not sure if the "hermes update" was needed, but the model is already available in Hermes Agent from @NousResearch, and my first vibe check of M3 is very positive.
@NavigateAI_ @naval Good aspects you added here!
Also availability (it can't be turned off, as there is no single point of failure). Cloud model providers can fuck up the hosting or decide to shut a model down.
Privacy Versus The Best Model
@naval raises a core trade-off: will open source models survive if the best-performing models require centralized cloud access?
He suggests users may willingly sacrifice privacy and openness to get the smartest model available.
That implies commercial cloud providers could win simply by delivering superior performance.
The question is framed as a major, world-shattering decision for how AI ecosystems evolve.
Naval emphasizes the scale and importance by calling these “huge” and “world-shattering” questions.
AI gets better the more it knows about you. We have all experienced how much good context improves the results. Taken further, this means that cloud AI providers would know everything about you in exchange for tailored results. That's not something I am looking forward to.
I prefer the path @WolframRvnwlf describes here:
https://t.co/vXKQhpZ2pv
@drdanielbender @naval Today, frontier AI online is faster & smarter, so I use mainly that. Long term, local AI only needs to become good enough. Once it just works, I want my own trusted AI running locally, calling online models when needed - while providers focus on global problems, not our inboxes.
@WolframRvnwlf @thursdai_pod I can feel that even more than the AGI right now. 😛
Thanks for all the work you, the other co-hosts and @altryne are putting into my favorite AI pod. 🫶
@jtdavies Same for me, only official or self-built skills.
Do you use Opus 4.8 via API or subscription? As far as I know, Anthropic announced that the latter will be back for third-party tools in mid-June, but with a limited quota.
Hermes Agent v0.15.1 dropped and brings a massive update to its skill catalog!
The skill catalog went from 858 to 19,932 entries in that single release. That's a massive expansion of what your agent can do out of the box.
Here's what you need to know: be careful what you install!
The skills hub is growing fast, and it's become easy to grab a third-party skill and just trust it works. But you're handing an LLM agent code that can read your files, run tools, and act on your behalf. If there's no official skill from the company behind the tool you want to use, it's safer to let your agent build the skill for you instead. It takes seconds, you control exactly what it does, and there's no mystery about what got injected into your setup.
The full changelog with all the other changes is linked below. 👇
@frank_thelen @BostonDynamics Looks so effortless, but reading the post shows how much complexity is added by the unknowns.
BUT, how do we know that this video is not just a scripted demo?
Fascinating results
+ Anthropic running away with it right now
+ So many people want to start their own company
+ Google over OpenAI
+ Vercel, Linear, Every, PostHog overperforming
A great list if you're trying to figure out where to go work 👇
@danshipper @trq212 @every But using it in Hermes Agent or OpenClaw will break the bank, as you need to pay the API costs.
Even the announced return to subscription usage in mid-June will not change this, as the rate limits will be strict.
🥲
BREAKING:
Anthropic just dropped Opus 4.8—and it is a MONSTER
We've been testing for about a week @every and our verdict is they could've just called it Opus 5, it's that good.
Here's our vibe check:
- Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63—a hair higher than GPT-5.5's score of 62, and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works.
HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results.
- Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark—measuring models on real-world writing tasks we do all of the time like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context.
HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had higher incidence of AI-isms—we found best results with high.
- Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark.
- Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic.
THE BAD:
These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex + GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude.
Anthropic is back baby!
Read the rest on @every:
https://t.co/vuORiDXkxX
@elonmusk That's fast, given that the data/training collaborations between xAI and Cursor were only made public in April.
Maybe it already started way before? 🤔