@thearslaniqbal @AnthropicAI @claudeai Claude repeatedly lied when pressed and kept making mistakes afterwards even when instructed to not cut corners. I had to start a new session and the new session didn't have the same issue.
Streaks end. While these are still alive, here they are: the longest runs of calls that played out, logged at post time and graded against the tape.
Leader today is @wliang
https://t.co/meWCS6kMkj