Your Codex activity now has a home, and an easier way to share it.
Codex profiles show your activity graph, streaks, lifetime tokens, peak daily tokens, and top features like plugins and /fast mode.
Private by default. Share a card when you want to.
The part that sticks with me is how much the protocol changes the game.
Same monitor, different feedback path, different red-team incentives.
Feels like agent safety has to treat retries, logs, and context updates as part of the attack surface.
https://t.co/jKf9JU862k
Some coding scaffolds block and retry risky actions. In a new paper, we find this reveals information a malicious AI can use to bypass monitoring. Resampling without blocked actions in context is less exploitable, but techniques that help in one setting can hurt in another. 🧵
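A rough sketch of the two retry styles the post contrasts, not the paper's actual protocol. Everything here is hypothetical and for illustration only: `run_step`, the `policy` and `monitor` callables, and the `keep_blocked_in_context` flag. The only point is where the feedback goes: either the blocked action is appended to the context the model sees on retry, or the model is resampled from the unchanged context.

```python
from typing import Callable, List, Optional

def run_step(
    policy: Callable[[List[str]], str],   # hypothetical: maps context to a proposed action
    monitor: Callable[[str], bool],       # hypothetical: returns True if the action is flagged
    context: List[str],
    max_retries: int = 3,
    keep_blocked_in_context: bool = False,
) -> Optional[str]:
    """Return the first unflagged action, or None if every attempt is blocked."""
    attempt_context = list(context)  # copy so the caller's context is never mutated
    for _ in range(max_retries):
        action = policy(attempt_context)
        if not monitor(action):
            return action
        if keep_blocked_in_context:
            # Block-and-retry with feedback: the policy learns what was flagged.
            attempt_context.append(f"BLOCKED: {action}")
        # Otherwise: plain resampling, the policy sees the same context again.
    return None
```

With `keep_blocked_in_context=False` the policy gets no signal about what the monitor rejected. Per the post, that is less exploitable in the setting studied, though the same choice may not carry over to other settings.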
@llama_index Chart/content faithfulness is the part I’d want to test first for RAG.
Layout improvements are easy to see.
Bad extracted facts are the thing that quietly poisons everything downstream.
I keep saving AI posts where something broke.
Parser regressions, evals lying, tool calls losing state, retrieval looking right but being wrong.
Those are the ones I actually learn from.
@ClementDelangue This is exactly the kind of failure mode I want to get better at noticing.
The training curves can still look reasonable while the math is quietly wrong.
Token IDs as ground truth, decoded text as display/debug surface feels like the right mental model.
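A toy illustration of why decoded text makes a poor source of truth. This is a sketch that assumes a made-up byte-level tokenizer where each token ID is one UTF-8 byte. `encode` and `decode` are hypothetical helpers, not any real library's API. Real tokenizers differ, but the shape of the failure is similar: decoding can be lossy, so re-encoding the displayed text may not give back the original IDs.

```python
from typing import List

def encode(text: str) -> List[int]:
    # Toy tokenizer (assumption): one token ID per UTF-8 byte.
    return list(text.encode("utf-8"))

def decode(ids: List[int]) -> str:
    # errors="replace" swaps invalid byte sequences for U+FFFD,
    # the way a display or debug path often does.
    return bytes(ids).decode("utf-8", errors="replace")

ids = encode("naïve")        # 'ï' is two bytes, so it spans two token IDs
truncated = ids[:3]          # a cut that lands in the middle of 'ï'

shown = decode(truncated)    # looks like text, but the last character is a placeholder
roundtrip = encode(shown)    # re-encoding the display string

# The display string does not map back to the IDs that produced it.
print(truncated == roundtrip)
```

Comparing or storing `truncated` keeps the exact sequence. Comparing `shown` silently replaces it with something else.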