@iamtrask personal, health data. Not necessarily “better” than frontier models - obviously depends on a variety of factors, but overall it's in fact slower, with lower quality output, even on a couple of M3 Ultras.
Found myself using autoresearch for all kinds of experiments - so I generalized it into a template that works for any experiment. Also comes with a Claude skill that turns the template into your own experiment.
https://t.co/DMB5RZvd27
oh yeah i should have linked autoresearch probably
https://t.co/YCvOwwjOzF
(you don't "use it" directly, it's just a recipe/idea - give it to your agent and apply to what you care about.)
and the tweet about it that went mini-viral over the weekend with more context
https://t.co/q5eWsvx5p2