Gerald Beuchelt

@beuchelt

Stoic, άρετή, Critical Rationalist, Deontologist, Anti Scientism, Profile pic (c) Brian Winfield Moore

Mos Eisley, NH

Joined May 2009

832 Following

1.1K Followers

1.3K Posts

Gerald Beuchelt

@beuchelt

3 days ago

Microsoft internal documents obtained by 404 Media reveal plans for Scout, its new always-on AI assistant. The strategy explicitly starts with “Make people addicted” to build deep engagement before turning it into a full agentic platform. For customers, this raises real questions about workflow habits, productivity expectations, and future AI spending. This engagement-first + metering model is quickly becoming the roadmap for every major AI vendor, especially frontier labs racing to offset massive compute costs. Microsoft is raising Microsoft 365 prices ~16% in July 2026 while bundling more Copilot features; Anthropic just separated agent usage from flat-rate plans with metered credits. Similar hybrid and usage-based shifts are rolling out at GitHub Copilot and beyond. Yet recent assessments from NVIDIA and independent researchers add an important counterpoint: humans remain—and will likely stay—cheaper than AI for most tasks. NVIDIA’s Bryan Catanzaro noted compute costs already exceed employee costs for many teams, while an MIT study found AI economically viable in only 23% of vision-heavy roles. Far from shrinking headcount, wider AI adoption is expected to increase demand for human oversight, prompt engineering, and edge-case handling. #AI #Microsoft #FutureOfWork #TechStrategy https://t.co/sm8fLqxMjX

beuchelt retweeted

Programmer Humor

@PR0GRAMMERHUM0R

4 days ago

unreplaceable

756

31K

beuchelt retweeted

International Cyber Digest

@IntCyberDigest

4 days ago

AI isn't taking our jobs. It's keeping cybersecurity people employed for life.

252

236

142K

Gerald Beuchelt

@beuchelt

4 days ago

We are moving faster and faster toward a full surveillance society.

DeFlock

@therealDeFlock

5 days ago

Enjoy!

"SignalTrace is designed to help law enforcement identify people of interest by the signals emitted from their electronic devices they travel with, such as fitness trackers, smartwatches, RFID tags, and local signals from their mobile phones... https://t.co/JUXx6kW96K

634

106K

beuchelt retweeted

impulsive

@weezerOSINT

6 days ago

meta gave their AI support agent the ability to modify your instagram account. no identity verification. people figured this out and accounts are being taken over right now https://t.co/60yRrImnaZ

126

13K

beuchelt retweeted

Whiteintel

@whiteintel_io

5 days ago

We have detected a Red Hat GitHub credential and session cookie in infostealer logs on April 13 and May 15, 2026 - potentially linked to the Miasma supply chain attack. While we cannot confirm a direct connection, the timing is notably suspicious. https://t.co/X4XnqibkPR https://t.co/sapMqI6qOW

14K

Gerald Beuchelt

@beuchelt

6 days ago

Pushing this narrative at this time points to an amazing level of ignorance. It is helping to push AI past the peak of the hype cycle even faster.

Zack Korman

@ZackKorman

7 days ago

Some people desperately want to believe we're a few good markdown files away from "getting" to fire 90% of people.

149

247

324

146K

beuchelt retweeted

Florian Roth ⚡️

@cyb3rops

6 days ago

I don’t know what happened between Microsoft and #NightmareEclipse behind closed doors.

Maybe Nightmare Eclipse was unreasonable. Maybe Microsoft was. Maybe both.

But I think Microsoft badly misjudged this situation.

When you’re the largest software vendor on the planet, you don’t get to behave like an angry individual in an internet argument. You have to be the adult in the room.

Deleting repositories, talking about criminal investigations and turning the whole thing into a public fight was a mistake. The damage from that goes far beyond this one researcher.

What surprised me most is how quickly people started sharing their own MSRC stories afterwards.

- Months without responses
- “Working as intended”
- Bounty disputes
- Reports that went nowhere

People don’t suddenly start telling those stories for no reason.

I think Microsoft broke a lot of porcelain here. And for what exactly? I don’t see much upside.

864

117

72K

beuchelt retweeted

Charly Wargnier

@DataChaz

8 days ago

Rough week for the "AI is taking our jobs" narrative.

> Amazon just axed its AI leaderboard as costs soared with no clear payoff
> Starbucks' AI can't even count coffee cups right
> Uber burning a $3.4B AI budget in just 4 months with nothing to show for it

WE ARE SO BACK.

204

13K

beuchelt retweeted

Polymarket

@Polymarket

9 days ago

NEW: AI consultant reveals a client accidentally spent $500,000,000.00 in a single month after failing to set employee limits on Claude usage.

33K

40M

Gerald Beuchelt

@beuchelt

9 days ago

More post-peak hype news on AI.

Yoshik

@AskYoshik

9 days ago

The AI numbers are starting to look very ugly.

Even under "best case" assumptions, FT's own data shows Microsoft AI ROI at -9%, Google at -15%, Meta at -28%, Oracle at -35%. Only Amazon barely comes out positive.

This is exactly why I keep comparing this to the dot-com era. Incredible technology does not automatically mean sustainable economics. The internet survived. Most internet companies didn't.

Right now hyperscalers are spending trillions hoping future demand catches up to present capex. That's not certainty. That's a leveraged bet.

588

10K

beuchelt retweeted

Dr. Anton Chuvakin

@anton_chuvakin

10 days ago

Sadly the amount of security advice like this increases: "How do I solve hard cyber problem X?" -- "Ah, use your magic wand!" -- "What? I don't have one!" -- "Well, sucks to be you!" #funny #PatchFaster

beuchelt retweeted

Matthew Berman

@MatthewBerman

10 days ago

"We think that with AI we can replace all of our Jr developers in our company" AWS CEO Matt Garman: "That's the dumbest thing I've ever heard"

157

690

696K

Gerald Beuchelt

@beuchelt

10 days ago

Open-source voice-AI SDK. The Vapi/Retell alternative for builders who want to own the stack. Give your AI agent a phone number in 4 lines

beuchelt retweeted

Aiswarya Sankar

@Aiswarya_Sankar

11 days ago

This is what we've been seeing with every company we work with.

Try justifying spending 100k on token spend when only 18k even makes it to a stable prod feature.

In the rush to maximize AI token spend, companies are wasting over 44% on bug fixes https://t.co/y68Ed0XwXj

126

286

beuchelt retweeted

JFrog Security

@JFrogSecurity

11 days ago

Heads up if your CI pipelines are failing right now! 🚨

OSV seems to be experiencing a major wave of false positives over the last few hours, incorrectly flagging massive, highly-trusted packages as malicious.

A few of the biggest casualties so far:

• npm @tanstack/start-storage-context (1.167.4)
• PyPI fastapi (0.136.3)
• PyPI strawberry-graphql (0.315.6)
• npm @nx/key (5.0.7)

If your deployment is bricked, verify manually before panicking. Automation is a tool, not a judge.

16K
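
One way to "verify manually" is to ask OSV directly which records it holds for the exact package version that tripped the pipeline, then read those records yourself. The sketch below is a minimal illustration, not JFrog's procedure. It assumes the public OSV query endpoint (`POST https://api.osv.dev/v1/query`) and assumes that malicious-package records carry a `MAL-` ID prefix. The helper names `build_osv_query`, `malicious_ids`, and `query_osv` are made up for this example.

```python
import json
import urllib.request

# Assumption: the public OSV query endpoint.
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, ecosystem: str, version: str) -> dict:
    """Build the JSON body for a single package-version lookup."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def malicious_ids(response: dict) -> list[str]:
    """Return the IDs of records that flag the package as malicious.

    Assumption: malicious-package records use a "MAL-" ID prefix.
    """
    return [
        vuln["id"]
        for vuln in response.get("vulns", [])
        if vuln.get("id", "").startswith("MAL-")
    ]


def query_osv(name: str, ecosystem: str, version: str, timeout: float = 10.0) -> dict:
    """POST the query to OSV and return the decoded JSON response."""
    body = json.dumps(build_osv_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `malicious_ids(query_osv("fastapi", "PyPI", "0.136.3"))` would list the record IDs to inspect by hand. An empty list means OSV currently holds no malicious-package record for that version, which is a starting point for review rather than a verdict.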

beuchelt retweeted

SwiftOnSecurity

@SwiftOnSecurity

11 days ago

With as much detail as I can share, from top down we are taking Mythos/AI acceleration of cyber threats seriously — by doubling-down on security fundamentals. Sandbagging attack paths using considerable levers of control we already have, w/ AI as our business justification.

118

17K

Gerald Beuchelt

@beuchelt

10 days ago

Which is why running your own infra in 2026 is not a serious approach for the vast majority of companies

Dark Web Informer

@DarkWebInformer

11 days ago

🚨This could be the next big attack even though Microsoft just released an out-of-band patch

CVE-2026-45659: Deserialization of untrusted data in Microsoft Office SharePoint allows an authorized attacker to execute code over a network.

CVSS: 8.8

The vulnerability has been fixed in:

▪️SharePoint Server Subscription Edition, build number 16.0.19725.20280
▪️SharePoint Server 2019, build number 16.0.10417.20128
▪️SharePoint Enterprise Server 2016, build number 16.0.5552.1002.

11K
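
For anyone checking a farm against the fixed builds listed above, the comparison has to be numeric per component, not a string comparison. The sketch below shows one way to do that. The build numbers are copied from the post. The dictionary keys and the function names `parse_build` and `is_patched` are labels invented for this example, and how you obtain the installed build number is left out.

```python
# Fixed builds as listed in the post; the keys are labels chosen for this sketch.
FIXED_BUILDS = {
    "Subscription Edition": "16.0.19725.20280",
    "2019": "16.0.10417.20128",
    "2016": "16.0.5552.1002",
}


def parse_build(build: str) -> tuple[int, ...]:
    """Turn '16.0.5552.1002' into (16, 0, 5552, 1002) for numeric comparison."""
    return tuple(int(part) for part in build.strip().split("."))


def is_patched(edition: str, installed_build: str) -> bool:
    """True if the installed build is at or above the fixed build for that edition."""
    return parse_build(installed_build) >= parse_build(FIXED_BUILDS[edition])
```

Comparing tuples of integers avoids the trap where the string "999" sorts after "1002": `is_patched("2016", "16.0.5552.999")` is false, as it should be.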

beuchelt retweeted

Muratcan Koylan

@koylanai

11 days ago

Gradient descent for SKILL.md files sounds interesting, maybe a bit complex but it's becoming a real part of agent harness.

SkillOpt is one of the first papers to treat markdown skill files as trainable parameters and provides a proper optimization framework for them.

A few things I learned that you should consider too.

1. The validation gate is the only thing that matters in a self-editing loop.

Held-out set, strict improvement, ties rejected. End-to-end, their best skills land with 1 to 4 accepted edits total. If your "self-improving agent" is accepting most of what it proposes, you're shipping slop.

2. Bounded edits are better than full rewrites. 4 to 8 edits per step is the sweet spot.

Remove the budget and performance collapses. This is the textual analog of learning rate, and it transfers to any LLM-as-author loop. If you're using an agent to refactor your docs, your prompts, or your skills, cap the diff size.

3. Compactness wins. Median final skill: ~920 tokens.

Skills do not need to be long. They need to be high-signal. Most skill files I see are bloated because length feels like effort. It isn't.

4. The harness is becoming less important; the skill is becoming more important.

A Codex-trained skill ported into Claude Code hit +59.7 points on SpreadsheetBench. Procedural knowledge is more general than the runtime that produced it.

5. Frozen model + trained context is the practical adaptation.

GPT-5.4-nano with a SkillOpt'd skill ≈ frontier behavior on procedural benchmarks. Cheaper, portable, inspectable, zero inference-time cost. This is the answer to "how do we adapt a frontier model for our domain" for almost everyone who isn't training their own models.

6. Verification is the bottleneck.

Every gate in this paper depends on an auto-grader. That works for benchmarks. It fails for writing, design, and strategy, exactly the open-ended work we want to automate. Whoever builds the verifier for open-ended tasks owns the next stage.
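
Points 1 and 2 above (strict-improvement gate, capped edits per step) can be sketched in a few lines. This is an illustration of the idea as the post describes it, not the paper's implementation. The `Edit` representation, the `gated_step` function, and the `evaluate` callback (standing in for a held-out auto-grader) are all assumptions made for the example. The default cap of 8 is the upper end of the "4 to 8" range quoted in the post.

```python
from typing import Callable, Sequence

# Hypothetical edit representation: replace the first occurrence of old with new.
Edit = tuple[str, str]


def apply_edits(skill: str, edits: Sequence[Edit]) -> str:
    """Apply each (old, new) replacement once, in order."""
    for old, new in edits:
        skill = skill.replace(old, new, 1)
    return skill


def gated_step(
    skill: str,
    score: float,
    edits: Sequence[Edit],
    evaluate: Callable[[str], float],
    max_edits: int = 8,
) -> tuple[str, float, bool]:
    """Try one batch of edits; return (skill, score, accepted).

    The batch is rejected if it exceeds the edit budget or if the held-out
    score does not strictly improve, so ties are rejected too.
    """
    if len(edits) > max_edits:
        return skill, score, False
    candidate = apply_edits(skill, edits)
    candidate_score = evaluate(candidate)
    if candidate_score > score:
        return candidate, candidate_score, True
    return skill, score, False
```

The strict `>` is the whole gate: a candidate that merely matches the current held-out score is thrown away, which is why a healthy loop ends up accepting only a handful of edits.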

There are also two lessons I learned while shipping v2.3.0 of my Context Engineering Agent Skills repo, measured across composer-2, claude-opus-4-7, gpt-5.5, and gemini-3.1-pro via the @cursor_ai SDK:
- Description and body are two different surfaces. The router only sees the description. The agent sees the body once activated. They can quietly disagree, and only end-to-end task tests catch it.
- Aggregate accuracy is the wrong unit. When I rewrote three descriptions, the corpus average moved ~1pp. Individual skills moved 23–25pp. Per-skill effect size is where the action is.

Also, in Feb 2026 I shared a piece called Personal Brain OS arguing that the markdown file is a first-class substrate for agent state. SkillOpt is the optimizer-shaped version of that same argument: not "store memory in files" but "treat files as trainable parameters with proper optimization machinery around them." That's the move from static to measured.

The fast/slow split they describe already lives implicitly in the digital-brain-skill repo:
- voice-guide and tone-of-voice.md are slow-state (rarely touched)
- posts.jsonl and bookmarks.jsonl are fast-state

What SkillOpt adds that I didn't have is a protected section invariant, a structural guarantee that fast edits cannot overwrite slow lessons. Removing that mechanism cost them 22 points on SpreadsheetBench. Worth borrowing.
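
A protected-section invariant of the kind described above could be enforced with a simple before/after check. The sketch below is one possible shape for it, under stated assumptions: the marker comments are invented for this example and are not taken from the paper or from any repo mentioned here, and `protected_sections` and `preserves_protected` are hypothetical helper names.

```python
import re

# Hypothetical marker syntax, invented for this sketch.
PROTECTED = re.compile(
    r"<!-- protected:start -->(.*?)<!-- protected:end -->", re.DOTALL
)


def protected_sections(text: str) -> list[str]:
    """Return the contents of every protected block, in document order."""
    return PROTECTED.findall(text)


def preserves_protected(before: str, after: str) -> bool:
    """True if an edit left every protected block byte-for-byte unchanged."""
    return protected_sections(before) == protected_sections(after)
```

A fast-edit loop would call `preserves_protected(old_skill, new_skill)` before accepting a change and reject anything that returns false, so slow lessons can only change through a separate, deliberate path.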

If you're building agents, SkillOpt: Executive Strategy for Self-Evolving Agent Skills is a good paper to read: https://t.co/ZS9SZXQ6Mv

241

768K

Gerald Beuchelt

@beuchelt

10 days ago

This is exactly what is happening. With many companies now in stages 3 and 4, AI is past peak hype. I’d expect the fall to be as steep as or steeper than the meteoric rise, especially as architectural (not technical!) debt creates massive security backlogs.

Florian Roth ⚡️

@cyb3rops

10 days ago

I think AI coding hype follows roughly four stages:

1. Amazement

You try it and can’t believe how much code it generates from a few prompts.

2. Expansion

You start more and more projects because shipping suddenly feels cheap and fast.

This is also the phase where people start convincing everyone around them:

- coworkers
- management
- friends in other companies

because nobody wants to “fall behind” in 6–12 months.

That creates a massive snowball/FOMO effect.

3. The grind phase

You realize the generated code has architectural issues, sloppy mistakes, weird abstractions, duplicated logic, broken edge cases, etc.

So you start:

- re-prompting
- switching models
- increasing reasoning effort
- reviewing fixes
- generating fixes for previous fixes

And suddenly you spend your days reviewing AI-generated pull requests instead of building software.

4. Realization

You realize AI coding increases output much faster than it increases certainty.

The code still needs:

- review
- testing
- ownership
- architectural understanding
- long-term maintenance

Usually by expensive senior engineers.

And the interesting thing is:
this whole cycle can take many months or even more than a year because people become socially and professionally invested in the narrative themselves.

Once teams, managers, and entire companies have been convinced that this is the future, it becomes psychologically and politically very hard to later say:

“Actually, the ROI is much lower than we expected.”

148

392

198K
