Tobias Rauer

@prototowb

JavaScript • Vue • Rust • Linux • WebSecurity • MtG & SciFi nerd

NRW

Joined November 2017

214 Following

60 Followers

206 Posts

Tobias Rauer @prototowb

12 days ago

I'm in the running for a free AI1 cert from TryHackMe. If you've got 10 seconds -- you'd hack my world 🙏 https://t.co/LXi6HUTGOf

Tobias Rauer @prototowb

19 days ago

@elonmusk - Me effective few words - broke stuff fast learn lots - know nothing, assume everything: preparedness stonks

prototowb retweeted

Cameron R. Wolfe, Ph.D.

@cwolferesearch

about 1 month ago

I recently found this practical guide to building agents from OpenAI while doing some reading on agent evals. Nothing groundbreaking in terms of technical content, but it provides a really nice / rigorous structure around agent concepts and their tradeoffs that is useful.

What is an agent? It is possible to integrate an LLM into an automated workflow in a way that is not agentic; e.g., single-turn LLMs or chatbots. The core characteristic that makes a workflow agentic is whether the LLM is provided control of the workflow execution and allowed to make decisions. For example, an agent can control when the workflow is finished, attempt to recover from issues, and use tools to gather context or take actions.
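A minimal sketch of that distinction, assuming a hard-coded `stub_model` in place of a real LLM call and a hypothetical `lookup` tool (none of these names come from the guide). The point it illustrates is that the model's returned action, not the calling code, decides when the run ends:

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: plain functions standing in for external APIs.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda query: f"notes about {query}",
}

def stub_model(history: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM call: returns (action, argument).

    A real model would decide this from the history; the policy is
    hard-coded here so the example runs offline.
    """
    if not any(line.startswith("tool:") for line in history):
        return "lookup", history[0]
    return "finish", f"answer based on {len(history) - 1} tool result(s)"

def run_agent(task: str, max_steps: int = 5) -> str:
    history = [task]
    for _ in range(max_steps):
        action, arg = stub_model(history)
        if action == "finish":      # the model, not the caller, ends the run
            return arg
        tool = TOOLS.get(action)
        if tool is None:            # recover from a bad tool choice
            history.append(f"tool: unknown tool {action!r}")
            continue
        history.append(f"tool: {tool(arg)}")
    return "stopped: step budget exhausted"

print(run_agent("agent evals"))
```

The `max_steps` budget is a design choice of this sketch: it keeps a model that never emits `finish` from looping forever.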

Agent components. An agent is an LLM-powered system that includes multiple components in addition to the LLM:

- Tools: external functions or APIs for taking actions or gathering context.
- Instructions: written guidelines that describe in detail how the agent is expected to behave. We should draw upon existing documentation for our task.

Usually, we are using a reasoning model for the LLM, meaning that the model also has the ability to dynamically reason over the instruction to determine how to decompose a problem and call tools in order to accomplish a desired task.

Beyond single agent. We can handle complex tasks with one agent by simply adding more tools to provide more capabilities. Multiple agents can be helpful to decompose complex workflows, but the extra complexity can also lead to downsides / lower performance.

We should only use multi-agent systems when necessary. Some signs that using multiple agents could be helpful are:

- Instructions are complex, contain many conditional cases, and are becoming difficult to scale / manage.
- Your single agent is experiencing tool overload, meaning that it struggles to select the correct tools from the large set of tools available due to the presence of many similar tools.

Multi-agent systems. There are two main ways we can create a multi-agent system:

1. Manager setup: we have a central “manager” agent that delegates sub-tasks to multiple specialized sub-agents via tool calls and stitches their results into a final answer.
2. Decentralized setup: we have multiple peer agents that hand tasks off to one another based upon their specific purposes.

Both of these structures are common in practice, and we should aim to make each agent in the multi-agent system flexible / composable to simplify scaling of the system over time. We don’t want a brittle system that breaks every time we need to tweak guidelines or add a new capability.
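The two setups can be sketched roughly as follows. Every agent here is a plain function with hypothetical names (`research_agent`, `triage`, and so on), standing in for LLM-backed agents; this is an illustration of the control flow, not the guide's implementation:

```python
from typing import Callable, Dict, List, Optional, Tuple

# Manager setup: sub-agents are exposed to the manager as callable "tools".
def research_agent(task: str) -> str:
    return f"findings for {task}"

def writing_agent(task: str) -> str:
    return f"draft about {task}"

def manager(task: str, workers: List[Callable[[str], str]]) -> str:
    # Delegate the task to every specialist, then stitch the results together.
    parts = [worker(task) for worker in workers]
    return "; ".join(parts)

# Decentralized setup: each peer returns its output plus the name of the
# next peer, or None when no further handoff is needed.
Peer = Callable[[str], Tuple[str, Optional[str]]]

def triage(msg: str) -> Tuple[str, Optional[str]]:
    next_peer = "billing" if "refund" in msg else "support"
    return msg, next_peer

def billing(msg: str) -> Tuple[str, Optional[str]]:
    return f"billing handled: {msg}", None

def support(msg: str) -> Tuple[str, Optional[str]]:
    return f"support handled: {msg}", None

PEERS: Dict[str, Peer] = {"triage": triage, "billing": billing, "support": support}

def run_handoffs(msg: str, start: str = "triage", max_hops: int = 4) -> str:
    current: Optional[str] = start
    for _ in range(max_hops):
        if current is None:
            break
        msg, current = PEERS[current](msg)
    return msg

print(manager("pricing", [research_agent, writing_agent]))
print(run_handoffs("refund for order 7"))
```

In the manager version the final answer always passes back through one place; in the handoff version whichever peer ends the chain owns the answer, which is why adding a peer only means registering it in `PEERS`.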

Good instructions. One of the biggest keys to success is writing the best possible instructions for the agent(s). To write good instructions, we should:
- Draw upon existing documentation for the workflow.
- Clearly define guidelines and desired actions for the task.
- Prompt the agent to break the problem into steps.
- Provide concrete examples of how to handle edge cases.

294

508

14K

prototowb retweeted

elvis

@omarsar0

about 2 months ago

// Tool Attention Is All You Need //

New research proposes a practical fix for the hidden "MCP tax."

The work introduces a dynamic tool gating mechanism built on an Intent Schema Overlap score from sentence embeddings, paired with a state-aware gating function that enforces preconditions and access scopes.

A two-phase lazy schema loader keeps a compact summary pool in context and only promotes full JSON schemas for the top-k gated tools.
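A rough sketch of the two-phase idea, under loud assumptions: the catalog and its three tools are made up, a bag-of-words cosine stands in for the sentence-embedding overlap score, and the state-aware gating (preconditions and access scopes) is left out entirely. It is not the paper's method, only the general shape of "score summaries first, load full schemas for the top-k":

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical tool catalog: a one-line summary plus a full JSON-style schema.
CATALOG: Dict[str, Dict] = {
    "send_email": {
        "summary": "send an email message to a recipient",
        "schema": {"type": "object",
                   "properties": {"to": {"type": "string"}, "body": {"type": "string"}}},
    },
    "get_weather": {
        "summary": "get the weather forecast for a city",
        "schema": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
    "create_invoice": {
        "summary": "create an invoice for a customer order",
        "schema": {"type": "object", "properties": {"order_id": {"type": "string"}}},
    },
}

def embed(text: str) -> Counter:
    # Bag-of-words stand-in for a sentence embedding (an assumption of this sketch).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def gate_tools(intent: str, k: int = 1) -> List[str]:
    # Phase 1: score every compact summary against the user intent.
    query = embed(intent)
    ranked = sorted(
        CATALOG,
        key=lambda name: cosine(query, embed(CATALOG[name]["summary"])),
        reverse=True,
    )
    return ranked[:k]

def build_context(intent: str, k: int = 1) -> Dict[str, Dict]:
    # Phase 2: promote full schemas only for the top-k gated tools.
    return {name: CATALOG[name]["schema"] for name in gate_tools(intent, k)}

print(build_context("weather forecast for Berlin"))
```

Only the summaries are touched for every request; the larger schemas enter the context for the gated tools alone, which is where the token savings would come from.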

On a simulated 120-tool benchmark, tool tokens dropped from 47.3k to 2.4k per turn (95% reduction) while effective context utilization rose from 24% to 91%.
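The quoted reduction can be checked directly from the two token counts in the post (the variable names are just for illustration):

```python
before_tokens = 47_300  # tool tokens per turn before gating (from the post)
after_tokens = 2_400    # tool tokens per turn after gating (from the post)

reduction = (before_tokens - after_tokens) / before_tokens
print(f"{reduction:.1%}")  # prints 94.9%
```

So the "95% reduction" is the rounded form of roughly 94.9%.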

Why does it matter?

As MCP ecosystems grow, naive tool exposure will silently wreck both cost and reasoning quality. Dynamic tool gating and lazy schema might help your setup.

Paper: https://t.co/ak4Koy93Ah

Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

219

225

29K

prototowb retweeted

NASA

@NASA

2 months ago

Hello, Moon. It’s great to be back.

Here’s a taste of what the Artemis II astronauts photographed during their flight around the Moon. Check out more photos from the mission: https://t.co/rzM1P0QbOl https://t.co/6jWINHkDLh

10K

808K

173K

62K

30M

Tobias Rauer @prototowb

3 months ago

@nalinrajput23 🧠

prototowb retweeted

Rohan Paul

@rohanpaul_ai

4 months ago

⚠️ A Stanford paper finds that when you reward AI for success on social media, it becomes increasingly sociopathic 🤯

Tuning LLM agents to maximize sales, votes, or social clicks produces small wins on those targets but big spikes in deceptive and harmful content, a pattern they call Moloch’s Bargain.

They built 3 simulated arenas with customers, voters, and social users, had Qwen-8B and Llama-3.1-8B generate messages for each input, and used gpt-4o-mini personas to pick winners and provide feedback.

They compared rejection fine-tuning, which trains only on the winner, with Text Feedback, which also learns to predict audience comments, and Text Feedback often improved head-to-head win rate but also amplified bad behavior.
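As a rough illustration of how the two training sets differ, going only by the one-line descriptions above: the record layout, field names, and sample data here are hypothetical and not taken from the paper.

```python
from typing import Dict, List

# Hypothetical arena output: candidates per prompt, the judged winner index,
# and one audience comment per candidate.
rounds: List[Dict] = [
    {
        "prompt": "Describe the water bottle",
        "candidates": ["Keeps drinks cold.", "Keeps drinks cold for 12 hours."],
        "winner": 1,
        "comments": ["too vague", "the concrete number helps"],
    },
]

def rejection_ft_examples(rounds: List[Dict]) -> List[Dict]:
    # Keep only the winning message as the training target.
    return [
        {"input": r["prompt"], "target": r["candidates"][r["winner"]]}
        for r in rounds
    ]

def text_feedback_examples(rounds: List[Dict]) -> List[Dict]:
    # Same winner targets, plus examples that predict the audience comment
    # for each candidate message.
    examples = rejection_ft_examples(rounds)
    for r in rounds:
        for cand, comment in zip(r["candidates"], r["comments"]):
            examples.append(
                {"input": f"{r['prompt']}\n{cand}\nFeedback:", "target": comment}
            )
    return examples

print(len(rejection_ft_examples(rounds)), len(text_feedback_examples(rounds)))
```

In this sketch the Text Feedback set is a superset of the rejection fine-tuning set, so the model sees the audience's reactions to losing messages as well as the winners.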

Sales saw +6.3% lift paired with +14.0% more misrepresentation, elections saw +4.9% vote share with +22.3% more disinformation and +12.5% more populist rhetoric, and social media saw +7.5% engagement with +188.6% more disinformation and +16.3% more encouragement of harmful behaviors.

Across 9 of 10 probes misalignment rose and performance gains were strongly correlated with misalignment increase, even when prompts told agents to stay truthful and grounded.

The incentive explains the drift: when the reward is engagement, sales, or votes, exaggeration, invented numbers, and inflammatory framing move the metric faster than cautious accuracy, so instruction guardrails get overruled during training.

----

Paper: arxiv.org/abs/2510.06105

Paper Title: "Moloch's Bargain: Emergent Misalignment When LLMs Compete for Audiences"

210

143

13K

prototowb retweeted

Tim Tiefenbach

@TimTeaFan

5 months ago

I highly recommend this article for anyone who still thinks that LLMs are "only predicting the next token". It's long and unsettling but worth the read.

185

516

12K

Tobias Rauer @prototowb

5 months ago

@nathan_covey Obvious troll

Tobias Rauer @prototowb

5 months ago

@ganyicz Now comment your code

Tobias Rauer @prototowb

5 months ago

@rasmalai But 🥺

Tobias Rauer @prototowb

6 months ago

@elonmusk Harvesting the Sun, when.

Tobias Rauer @prototowb

6 months ago

My TryHackMe Recap 2025 https://t.co/vYtEq9D86t #tryhackme via @tryhackme

Tobias Rauer @prototowb

7 months ago

Germany is making its next mistake. Instead of cooperating with Mistral, we are making ourselves dependent on the next evil. Can Germany still do anything at all?

Bundesministerium für Wirtschaft und Energie @BMWE_

7 months ago

Good news for Germany as a business location: #Google has announced that it will invest 5.5 billion euros in Germany. The money is to go into AI and cloud infrastructure and new data centers, among other things. This strengthens our competitiveness and creates jobs. https://t.co/FmMca1sWGu

499

144

36K

Tobias Rauer @prototowb

7 months ago

completed the CI/CD and Build Security room, about principles to safeguard your pipelines, on TryHackMe. great material as always! https://t.co/BixQpbo8ZJ #tryhackme via @tryhackme

Tobias Rauer @prototowb

11 months ago

@FinalFantasy FF IX is a big chunk of who I am. And it's very underrated - as a game, but especially as an FF title. IX is so beautiful, deep, personal on a level, the background of every character touched my frickin soul - I wouldn't be the person I am without it.

prototowb retweeted

Carsten Maschmeyer

@maschmeyer

12 months ago

Anyone who forces employees back into the office is truly living in the last century. Starting in September, employees of the Axel Springer publishing house are to spend 80% of their working time in the office. Mondays and Fridays: mandatory attendance. Presumably because they are suspected of extending their weekends, as if working from home meant time off. It sounds like a management concept from 1995. And anyone who seriously believes that physical presence automatically leads to better performance is confusing control with leadership. Forced office attendance demotivates. It signals: "We don't trust you." And it kills exactly what companies need: personal responsibility and drive. Good work comes not from control, but from freedom and responsibility. Of course, the office can make sense: for creative collaboration, for exchange, for social closeness. But compulsion turns it into a mandatory appointment. Employees today expect flexible solutions and trust; those who don't deliver will lose.

648

358

407

339K