Dr. Daniel Bender

Verified account

@drdanielbender

I'm building my assistant in Hermes. No, I won't blindly trust it ◆ I teach you the way of responsible AI: your data, your rules ◆ PhD in computer science ◆ Dad

Germany 🇩🇪

Joined January 2022

1.3K Following

5K Followers

8.9K Posts

Pinned Tweet

Dr. Daniel Bender

@drdanielbender

12 months ago

Ever wondered how model quantization (FP16, Q8, Q4) *really* affects performance? There's an analogy that makes the trade-offs crystal clear... and it involves something you might drink. 😉🍺 Kudos to @jtdavies for this brilliant comparison. 🙏 See the image for the full explanation! 👇

Dr. Daniel Bender

@drdanielbender

about 23 hours ago

Hermes Desktop is available for Linux 🤘 That's in contrast to all the other harness desktop apps of the major players. I like it a lot. Thanks @NousResearch. 🫶 https://t.co/h4Cg54RTv6

1 day ago

The next evolution of Hermes Agent is here! Introducing Hermes Desktop: everything you love about Hermes, now native on your machine. First demoed in Jensen's GTC keynote, it's now in public preview.

Dr. Daniel Bender

@drdanielbender

2 days ago

Love the new MiniMax M3 release. 🫶 I am not sure whether the "hermes update" was needed, but the model is already available in Hermes Agent from @NousResearch, and my first vibe check of M3 is very positive. https://t.co/3vczIgP7lL

MiniMax (official) @MiniMax_AI

3 days ago

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities

- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero

API: https://t.co/fHRdSV7BwZ
Token Plan: https://t.co/BDCycxepZw
🚀New! MiniMax Code: https://t.co/GvB4YiB6Ul

Weights & Tech Report in ~10 Days

Dr. Daniel Bender

@drdanielbender

5 days ago

@NousResearch Was every tool loaded all the time before?

Dr. Daniel Bender

@drdanielbender

5 days ago

@NavigateAI_ @naval Good aspects you added here! Also availability (it can't be turned off as there is no single point of failure). Cloud-model providers can fuck up the hosting or decide to shut a model down.

Dr. Daniel Bender

@drdanielbender

6 days ago

Privacy Versus The Best Model

@naval raises a core trade-off: will open source models survive if the best-performing models require centralized cloud access? He suggests users may willingly sacrifice privacy and openness to get the smartest model available. That implies commercial cloud providers could win simply by delivering superior performance. The question is framed as a major, world-shattering decision for how AI ecosystems evolve. Naval emphasizes the scale and importance by calling these “huge” and “world-shattering” questions.

Dr. Daniel Bender

@drdanielbender

5 days ago

AI gets better the more it knows about you. We have all experienced how much good context improves the results. Taken further, this means that cloud AI providers will know everything about you in exchange for tailored results. That's not something I am looking forward to. I prefer the path @WolframRvnwlf describes here: https://t.co/vXKQhpZ2pv

drdanielbender retweeted

Wolfram Ravenwolf

6 days ago

@drdanielbender @naval Today, frontier AI online is faster & smarter, so I use mainly that. Long term, local AI only needs to become good enough. Once it just works, I want my own trusted AI running locally, calling online models when needed - while providers focus on global problems, not our inboxes.

Dr. Daniel Bender

@drdanielbender

5 days ago

@WolframRvnwlf @thursdai_pod I can feel that even more than the AGI right now. 😛 Thanks for all the work you, the other co-hosts and @altryne are putting into my favorite AI pod. 🫶

Dr. Daniel Bender

@drdanielbender

6 days ago

So hard to keep up, just learned that Opus 4.8 seems to be out. 😳 To at least try to stay up to date on AI, @thursdai_pod is my preferred source. 👇

Wolfram Ravenwolf

6 days ago

It's never a quiet week in AI. Here's what happened this week - including today's release of the new ultrathinking Claude Opus 4.8:

Dr. Daniel Bender

@drdanielbender

5 days ago

@jtdavies Same for me, only official or self-built skills. Do you use Opus 4.8 via API or subscription? As far as I know, Anthropic announced that the latter will be back for third-party use in mid-June, but with a limited quota.

Dr. Daniel Bender

@drdanielbender

6 days ago

Hermes Agent v0.15.1 dropped and brings a massive update to its skill catalog!

The skill catalog went from 858 to 19,932 entries in that single release. That's a massive expansion of what your agent can do out of the box.

Here's what you need to know: be careful what you install!

The skills hub is growing fast, and it has become easy to grab a third-party skill and just trust that it works. But you're handing an LLM agent code that can read your files, run tools, and act on your behalf. If there's no official skill from the company behind the tool you want to use, it's safer to let your agent build the skill for you instead. It takes seconds, you control exactly what it does, and there's no mystery about what got injected into your setup.

The full changelog with all the other changes is linked below. 👇

Dr. Daniel Bender

@drdanielbender

5 days ago

@aaditya_ai Any new routines or just cool projects that keep you more engaged?

Dr. Daniel Bender

@drdanielbender

6 days ago

@ivanfioravanti The average person only knows ChatGPT; even the other closed-source competitors are unknown.

Dr. Daniel Bender

@drdanielbender

6 days ago

@frank_thelen @BostonDynamics Looks so effortless, but reading the post shows how much complexity is added by the unknowns. BUT, how do we know that this video is not just a scripted demo?

Dr. Daniel Bender

@drdanielbender

6 days ago

@sspaeti Interesting train of thought. Makes sense to me to go with the constraint a non-native language brings.

Dr. Daniel Bender

@drdanielbender

6 days ago

And thanks to Anthropic (and their competition) it is easier than ever before to start your own company. Do you agree?

Lenny Rachitsky

6 days ago

Fascinating results

+ Anthropic running away with it right now
+ So many people want to start their own company
+ Google over OpenAI
+ Vercel, Linear, Every, PostHog overperforming

A great list if you're trying to figure out where to go work 👇 https://t.co/rSK8BfYNcd

Dr. Daniel Bender

@drdanielbender

6 days ago

@danshipper @trq212 @every But using it in Hermes Agent or OpenClaw will break the bank, as you need to pay the API costs. Even the announced change back to subscription usage in mid-June will not change this, as it will be heavily rate-limited. 🥲

Dr. Daniel Bender

@drdanielbender

6 days ago

It seems like Anthropic is back 👇

Dan Shipper 📧

7 days ago

BREAKING: Anthropic just dropped Opus 4.8, and it is a MONSTER

We've been testing for about a week @every and our verdict is they could've just called it Opus 5, it's that good.

Here's our vibe check:

- Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63, a hair higher than GPT-5.5's score of 62, and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works. HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results.
- Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark, measuring models on real-world writing tasks we do all of the time like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context. HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had higher incidence of AI-isms; we found best results with high.
- Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark.
- Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic.

THE BAD: These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex + GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude.

Anthropic is back baby!

Read the rest on @every: https://t.co/vuORiDXkxX

Dr. Daniel Bender

@drdanielbender

6 days ago

@naval @navalpodcast @rauchg @maxhodak_ @bscholl Just added the episode to my favorite podcast player, @snipd_app, which lets me create highlights with the back button on my earbuds. 😎

Dr. Daniel Bender

@drdanielbender

6 days ago

@WolframRvnwlf So hard to keep up, just learned that Opus 4.8 seems to be out. 😳

Dr. Daniel Bender

@drdanielbender

6 days ago

@elonmusk That's fast, given that the data/training collaboration between xAI and Cursor was only made public in April. Maybe it had already started well before? 🤔
