@nickcammarata omg I used to do this but got wispr flow yesterday and it works much better
the Claude tts is also bad, Claude does not have that voice in my head-canon. Time to make my own
@hallerite i think one definitely could
existing approaches are basically just doing anything except actually training a tool-calling agent, and the bespoke training methods exist to remedy that
@viemccoy i mean the entanglement is not that surprising
- people don't spread super-evil stuff (laws and collective evil discouragement)
- aligned high-quality data-producers are given more resources and opportunities irl than misaligned
it's cool that this emerges, though i don't see any evidence to suggest competency x alignment can't be disentangled.
these can definitely be disentangled in humans (think war-mongers), and any cognitive behavior in humans seems to be replicable in AI, so I'd claim the most rational belief is that disentanglement could be achieved.
@kalomaze isn't this more easily explained by certain tokens activating learned representations of frustration and influencing generation?
for the claude conversation, you're probably right but it's possible the model just thinks it's being evaluated lol
(from sonnet 4.5 system card)