We took second place! Many thanks to @_nd_koo @push0ebp @arinerron @0xfrenchkebab for accepting me onto the team; we wouldn't have made it without all of you. I also appreciate the organization, it was top-notch (despite the constant confetti rain).
The Wonderland CTF was a blast!
Huge congrats to all the teams, especially “STACK TOO DEEP”, “NADA ESPECIAL” and “SECSEE”.
Oh, also: https://t.co/WHMt1f36Mk 👉👈
I got interested in how tokenizers used by LLMs work. Turns out they are basically large vocabularies mapping token strings to ids, plus merge rules that define how text gets split into tokens. MiniCPM5-1B, for example, has a 130k-token vocabulary.
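To make the "vocabulary plus merge rules" idea concrete, here is a minimal sketch of BPE-style tokenization. Everything in it is a toy assumption: the eight-entry `vocab`, the three `merges`, and the function name `bpe_tokenize` are made up for illustration and are not taken from MiniCPM5-1B or any real tokenizer, which also handle bytes, pre-tokenization, and special tokens.

```python
def bpe_tokenize(word, merges, vocab):
    """Split a word into characters, then repeatedly apply the
    highest-priority merge rule until none applies."""
    # Earlier rules in the list have higher priority (lower rank).
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge rule matches any adjacent pair
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    # Map each final token string to its id in the vocabulary.
    return symbols, [vocab[s] for s in symbols]


# Hypothetical toy vocabulary and merge rules (not from a real model).
vocab = {"l": 0, "o": 1, "w": 2, "e": 3, "r": 4, "lo": 5, "low": 6, "er": 7}
merges = [("l", "o"), ("lo", "w"), ("e", "r")]

print(bpe_tokenize("lower", merges, vocab))  # (['low', 'er'], [6, 7])
print(bpe_tokenize("owl", merges, vocab))    # (['o', 'w', 'l'], [1, 2, 0])
```

The same letters end up in different tokens depending on which merges fire, so the model sees ids like 6 and 7 rather than the individual characters.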
The curious part is that many tokens are not "words" in any human sense. One MiniCPM5-1B tokenizer entry I found is a base64-looking string that decodes, at least in its valid prefix, into an oddly specific JS fragment with a lot of whitespace and `var circle_mar`.
That is a tokenizer-training artifact leaking through the vocabulary. I couldn't trace the exact source, although searching for the decoded `var circle_mar` fragment gives plenty of nearby hits, usually names like `circle_marker` or `circle_margin`.
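Decoding only the "valid prefix" of a base64-looking token can be sketched roughly like this. The helper name `decode_valid_prefix` is my own, and the demo token is constructed here from a made-up fragment plus a stray trailing character; it is not the actual MiniCPM5-1B vocabulary entry.

```python
import base64
import string

B64_ALPHABET = set(string.ascii_letters + string.digits + "+/")


def decode_valid_prefix(token: str) -> bytes:
    """Decode the longest leading run of base64 characters,
    truncated to whole 4-character groups."""
    n = 0
    while n < len(token) and token[n] in B64_ALPHABET:
        n += 1
    usable = n - (n % 4)  # base64 decodes in groups of 4 characters
    return base64.b64decode(token[:usable])


# Illustrative input only: a made-up JS-like fragment, encoded and then
# given one extra character so that the full string is not valid base64.
fragment = b"    var circle_mar"
token = base64.b64encode(fragment).decode("ascii") + "k"

print(decode_valid_prefix(token))  # b'    var circle_mar'
```

A token boundary can fall anywhere in the encoded stream, so dropping the incomplete trailing group is the simplest way to get readable bytes out of such an entry.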
@lonelysloth_sec @AaronBergman18 Why try the same class of failure every time, though? It's clear LLMs have issues processing individual letters due to tokenization...
goalctl is a small util I use in my Codex orchestrator/subagent workflow for bug hunting
main agent can set/get/update/replace persisted goals for subagents via codex app-server, which the harness doesn't include by default
I sometimes wonder what the medium-term impact of the "as good as the operator" property will be. The old learning loop is essentially broken; the new one seems less efficient. Will we have another cycle where everyone gets "substituted" so hard that suddenly there is no one left, and it all begins anew?