At @AquinF03, we're continuing to make all existing evals and benchmark tools obsolete:
1/3
Custom evals: write your own scorer in Python and you get access to activations and SAE features, so you can do things like:
"check whether a specific feature fired above threshold on a response"
which no external eval harness can do!
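A minimal sketch of what such a scorer could look like. The function name, the `[tokens][features]` activation layout, and the return shape are assumptions for illustration, not Aquin's actual scorer interface:

```python
from typing import Sequence


def feature_fired(
    activations: Sequence[Sequence[float]],  # assumed layout: [tokens][features]
    feature_id: int,
    threshold: float,
) -> dict:
    """Score 1.0 if the given SAE feature exceeds `threshold` on any response token."""
    # Peak activation of the feature across all tokens (0.0 for an empty response).
    peak = max((token[feature_id] for token in activations), default=0.0)
    return {"score": 1.0 if peak > threshold else 0.0, "peak": peak}


# Toy response: 3 tokens x 4 SAE features (made-up numbers).
acts = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 2.3, 0.0, 0.4],
    [0.0, 0.7, 0.0, 0.0],
]
result = feature_fired(acts, feature_id=1, threshold=1.5)
print(result)
```

The point of the sketch: the scorer reads model internals, not just the output text, which is why a text-only harness cannot express it.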
2/3
Benchmark Builder can now weight evals differently within a suite and export results in multiple formats.
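Roughly, per-eval weighting plus export could work like the sketch below. The eval names, the weights, and the choice of JSON and CSV as output formats are assumptions; the post does not say which formats Benchmark Builder supports:

```python
import csv
import io
import json


def weighted_suite_score(results: dict, weights: dict) -> float:
    """Weighted mean of per-eval scores; weights need not sum to 1."""
    total = sum(weights[name] for name in results)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(results[name] * weights[name] for name in results) / total


# Hypothetical suite: accuracy counts twice as much as the other evals.
results = {"accuracy": 0.9, "toxicity": 0.5, "feature_check": 1.0}
weights = {"accuracy": 2.0, "toxicity": 1.0, "feature_check": 1.0}
suite_score = weighted_suite_score(results, weights)

# Export 1: JSON.
as_json = json.dumps({"suite_score": suite_score, "evals": results}, indent=2)

# Export 2: CSV, one row per eval.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["eval", "score", "weight"])
for name, score in results.items():
    writer.writerow([name, score, weights[name]])
as_csv = buffer.getvalue()
```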
3/3
Auto-suggestions: the agent observes and proactively suggests the most relevant evals, with just one click to run.
At @AquinF03, we just shipped SAE support for embedding models.
- Feature decomposition: see which concepts are firing and how strongly. contrastive mode shows what's different between two texts.
- Feature browser: ranks concepts by how much they fire across your corpus, with auto-generated labels and top examples.
- Co-activation network: concepts don't fire alone, they cluster. tight clusters are semantic domains, loose ones are general purpose.
- Circuit tracing: see where in the stack each concept appears and how it builds up. some grow steadily from early on, others snap on right at the end.
- Steering: boost a concept and your embedding pulls toward it, suppress it and it moves away. re-ranks your retrieval corpus so you can see exactly how results shift.
- Absorption and polysemy diagnostics: absorption is when two concepts always fire together, polysemy is when one concept fires on completely unrelated things. Aquin catches both automatically.
- Retrieval faithfulness: zeros out each concept and sees how much retrieval drops. high activation doesn't mean high importance.
- Cross-model feature matching: finds which concepts two models share and which ones are unique to each.
updated literature: aquin dot app slash embeddings
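To make the retrieval faithfulness idea concrete, here is a toy sketch of feature ablation. It assumes the embedding is a weighted sum of SAE decoder directions and uses cosine similarity to a single target document as a stand-in for retrieval quality; all names and numbers are hypothetical, not Aquin's implementation:

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def decode(acts, decoder):
    """Rebuild an embedding as sum_i acts[i] * decoder[i]."""
    dim = len(decoder[0])
    return [sum(a * d[k] for a, d in zip(acts, decoder)) for k in range(dim)]


def ablation_drops(acts, decoder, doc):
    """Similarity drop to `doc` when each feature is zeroed out in turn."""
    base = cosine(decode(acts, decoder), doc)
    drops = []
    for i in range(len(acts)):
        ablated = list(acts)
        ablated[i] = 0.0  # zero out one concept
        drops.append(base - cosine(decode(ablated, decoder), doc))
    return drops


decoder = [[1.0, 0.0], [0.0, 1.0]]  # two toy features in a 2-d embedding space
acts = [3.0, 1.0]                   # feature 0 fires much harder than feature 1
doc = [0.0, 1.0]                    # the target document aligns with feature 1
drops = ablation_drops(acts, decoder, doc)
print(drops)
```

In this toy case, removing the weaker feature 1 hurts similarity to the target, while removing the stronger feature 0 actually helps it, which is the "high activation doesn't mean high importance" point.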
2 months building and researching interpretability tooling at @AquinF03
and I discovered that our users are divided into two groups:
1. People working on Interpretability
2. People using Interpretability to strengthen their ML work
The first group builds on top of our tooling and experiments. The second group uses the tooling in existing pipelines to debug and improve their ML work.
At @AquinF03, we care about both. We're shipping a lot, and every release could turn into an experiment, a study, or a paper.
Come build and research with us: https://t.co/zC92O8cdLO
Introducing @AquinF03's Devkit!
basically https://t.co/WQWNS7bfUJ's interpretability tooling locally through an SDK + CLI.
The Aquin SDK records training runs locally, including metrics, config, and checkpoints, then the CLI packages and pushes them to Aquin for post-hoc analysis.
Once pushed, the run appears under the CLI's runs with full inspection: loss curves, learning rate, grad norm, epoch summaries, SAE diff, and model diff.
SDK is framework-agnostic. It works with any Python training loop that produces a PyTorch model.
For HuggingFace Trainer and TRL, a TrainerCallback pattern wires everything in without touching training logic.
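For a rough picture of framework-agnostic local recording, here is a stand-in sketch. `LocalRunRecorder`, its methods, and the file layout are hypothetical and are not the Aquin SDK's API; checkpoints are left out to keep it short:

```python
import json
import tempfile
import time
from pathlib import Path


class LocalRunRecorder:
    """Hypothetical stand-in for an SDK recorder: config plus JSONL metrics on disk."""

    def __init__(self, run_dir, config):
        self.run_dir = Path(run_dir)
        self.run_dir.mkdir(parents=True, exist_ok=True)
        (self.run_dir / "config.json").write_text(json.dumps(config, indent=2))
        self.metrics_path = self.run_dir / "metrics.jsonl"

    def log(self, step, **metrics):
        # One JSON object per line, so a later packaging step can stream the file.
        record = {"step": step, "time": time.time(), **metrics}
        with self.metrics_path.open("a") as f:
            f.write(json.dumps(record) + "\n")

    def read_metrics(self):
        with self.metrics_path.open() as f:
            return [json.loads(line) for line in f]


# Any Python training loop can call log(); no framework hooks are needed.
with tempfile.TemporaryDirectory() as tmp:
    recorder = LocalRunRecorder(Path(tmp) / "run-001", {"lr": 3e-4, "epochs": 1})
    for step in range(3):
        loss = 1.0 / (step + 1)  # placeholder for a real training step
        recorder.log(step, loss=loss, lr=3e-4)
    history = recorder.read_metrics()
```

With HuggingFace Trainer, the same `log` call could be made from a `TrainerCallback.on_log` hook, which keeps the training logic itself untouched.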
pip install Aquin!
Glad to announce that @AquinF03 now supports embedding models:
Geometry inspection, retrieval evaluation, fine-tuning monitoring, and embedding diff across checkpoints.
here's how we support them:
Embedding diff:
Aquin's embedding diff compares two checkpoints on centroid positions, similarity distributions, anisotropy, and nearest-neighbor ranks.
A composite drift score captures the tradeoffs, penalizing fine-tunes that improve one cluster by degrading another's geometry even if overall recall looks fine.
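A toy sketch of the idea, under stated assumptions: the post does not give the drift formula, so the composite below simply sums per-cluster degradations in "tightness" (mean cosine similarity to the cluster centroid), so a gain in one cluster cannot cancel a loss in another. Cluster names and vectors are made up:

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def tightness(vectors):
    """Mean cosine similarity of each vector to its cluster centroid."""
    c = centroid(vectors)
    return sum(cosine(v, c) for v in vectors) / len(vectors)


def anisotropy(vectors):
    """Mean pairwise cosine similarity; higher means a narrower cone."""
    pairs = [(i, j) for i in range(len(vectors)) for j in range(i + 1, len(vectors))]
    return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


def drift_report(before, after):
    """Compare two checkpoints given as {cluster_name: [vectors]} with matching keys."""
    report = {}
    for name in before:
        report[name] = {
            "centroid_shift": 1.0 - cosine(centroid(before[name]), centroid(after[name])),
            "tightness_delta": tightness(after[name]) - tightness(before[name]),
        }
    # Assumed composite: only degradations count, improvements do not offset them.
    drift = sum(max(0.0, -r["tightness_delta"]) for r in report.values())
    return report, drift


before = {"legal": [[1.0, 0.2], [1.0, -0.2]], "medical": [[0.1, 1.0], [-0.1, 1.0]]}
after = {"legal": [[1.0, 0.05], [1.0, -0.05]], "medical": [[0.6, 1.0], [-0.4, 1.0]]}

report, drift = drift_report(before, after)
all_before = [v for vs in before.values() for v in vs]
all_after = [v for vs in after.values() for v in vs]
anisotropy_delta = anisotropy(all_after) - anisotropy(all_before)
print(report, drift, anisotropy_delta)
```

Here the "legal" cluster gets tighter while "medical" gets looser, and the drift score reflects only the "medical" loss. Nearest-neighbor rank comparison is omitted for brevity.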