runmonitor: watch your training run unfold live. 🟣
• local-first; nothing leaves your machine, no API keys
• lives in your loop: run.log({"loss": loss}, step)
• live curves + anomaly detection in a terminal-style dashboard
> pip install runmonitor
We have not been able to do so yet but Qwythos has written its own harness and its quiet impressive! It picked the name Abacus Agent, check it out here: https://t.co/finfPdaB7p
We currently have Qwythos write a whole agentic coding harness by itself inside of Codex. The progress is going very well, the finished agent will be published soon.
The next big move in AI isn't scaling parameters forever. It's condensing intelligence.
We can't scale horizontally indefinetly. The real breakthrough is packing dramatically more knowledge and capability into every single parameter.
@notnullptr Of course it's an SVG because it looks better then the raw json, further detail on the eval and how to reproduce it is listed in our documentary: https://t.co/v238KLw31V
@notnullptr Our 3.6 27B Fine-Tune is currently training! I do not understand why you have to be so adversarial? You can test the model our check our samples: https://t.co/PmFguvagNI
And for tooling: https://t.co/eRSWeRY6Jr
@JakoveHr@notnullptr From user experience it works well in a harness, we are also currently having it write its own agentic harness as a little test!
If you want to see our tool tests you can look at the results here: https://t.co/eRSWeRY6Jr
@VibeCodeAiden Here would be some sample generations if you want to look into the model without downloading it!
https://t.co/RMuMwsjcab
https://t.co/PmFguvagNI
https://t.co/eRSWeRY6Jr
@VibeCodeAiden We can not disclose our full methodology here, but rethink uses transcripts of real claude sessions, extracts the assistant turns and uses a complex system of LLMs in various roles like writer, judge, etc along machine checks to write a CoT to arrive at the the produced output.