Hey everyone — big day for us at Skiplabs: Skipper Beta is live 🚀
Skipper is a closed-loop coding agent. Instead of constantly going back and forth with the AI, you give it a spec and it iterates internally until it produces a working software service.
We believe this is where AI-assisted coding is heading, and we’re excited to finally share what we’ve been working on behind the scenes.
Start building with Skipper: https://t.co/XYooJgn0hp
“Solve the Loop: Attractor Models for Language and Reasoning”
Looped Transformers can refine their thoughts internally, but they are usually unstable and tied to a fixed number of loops.
So this paper turned recurrence into a fixed-point problem, where a Transformer first makes an output-embedding guess, then an attractor module refines it until convergence.
This makes iterative reasoning trainable with constant memory, adaptive depth, and less compute.
The surprising part is equilibrium internalization: after training, the model learns to start near the fixed point, so the solver can almost disappear at inference.
In their experiment, a 770M Attractor Model beats a 1.3B Transformer trained on twice the tokens, and a 27M model gets 91.4% on Sudoku-Extreme and 93.1% on Maze-Hard.
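To make the fixed-point idea concrete, here is a toy sketch, not the paper's method: a plain scalar iteration z <- f(z) that stops once successive values agree. The function name `solve_fixed_point`, the tolerance, and the use of `math.cos` as the stand-in for the attractor module are all assumptions chosen for illustration. It shows why a good initial guess matters: starting near the fixed point needs fewer solver steps.

```python
import math


def solve_fixed_point(f, z0, tol=1e-8, max_iters=100):
    """Iterate z <- f(z) until successive values differ by less than tol.

    Returns the final value and the number of steps taken.
    Hypothetical helper for illustration only.
    """
    z = z0
    for step in range(1, max_iters + 1):
        z_next = f(z)
        if abs(z_next - z) < tol:
            return z_next, step
        z = z_next
    return z, max_iters


# math.cos is a contraction near its fixed point (about 0.739),
# so the plain iteration converges from either starting point.
far, far_steps = solve_fixed_point(math.cos, 0.0)      # poor initial guess
near, near_steps = solve_fixed_point(math.cos, 0.739)  # guess near the fixed point

print(f"far start:  z={far:.6f} after {far_steps} steps")
print(f"near start: z={near:.6f} after {near_steps} steps")
```

Depth here is adaptive (the loop exits when it converges rather than after a fixed count), and the near start finishes in fewer steps, which is the intuition behind the solver "almost disappearing" once the model learns a good initial guess.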
this is Krea 2.
our first foundation model, built completely from scratch for aesthetic diversity and stylistic control.
learn more and get early access 👇
My dear front-end developers (and anyone who’s interested in the future of interfaces):
I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept):
Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow
I’m very happy to present my toy research project: Sotaku!
It's a neural net that automatically discovered the rules of sudoku and learned to solve them, achieving a new state-of-the-art score of 98.9% on one of the hardest sudoku datasets, while being agnostic to the game, and beating all other sudoku-optimized neural net architectures*
Read more for fun motivations, plus some extremely unconventional discoveries, e.g. reverse curriculum consistently beating curriculum (!), emergent reasoning-like capabilities, and the future of traditional programming
AI is writing a growing share of the world's software. No one is formally verifying any of it.
New essay: "When AI Writes the World's Software, Who Verifies It?"
https://t.co/8zjS9FkdA8
The @rescriptlang static analyzer is going incremental with the @skiplabs reactive combinators.
Soon ReScript static analysis that updates in real time in the editor.
"Package Managers à la Carte, A Formal Model of Dependency Resolution" preprint out today: a new package calculus to describe the Cambrian explosion of systems that exist today https://t.co/0Y8yrkyjEA
This is awesome to watch:
@agenticasdk have solved all publicly available ARC-AGI 3 tasks (mini-games)! @arcprize
It seems to work by generating bespoke program code for each puzzle. You can see it generate and progress in this video:
The CSLib steering committee recently announced the official launch of CSLib — an open-source effort to formalize computer science in Lean, inspired by the impact of Mathlib in mathematics.
CS researchers, practitioners, and enthusiasts are invited to get involved to support formalizing essential computer science concepts, and building infrastructure for reasoning about real-world code with Lean.
Learn more at:
🌐 https://t.co/Qdj1XzikL3
📄 White paper: https://t.co/ZQHAKyMYCP
🤝 Contribute: https://t.co/HfDP19XwZ9
#LeanLang #LeanProver #CSLib #OpenSource #FormalVerification
🚨 New roles at Anthropic Zurich 🇨🇭
In addition to pre-training (where we've been hiring so far), post-training and security are joining and have open roles!
It's a remarkable time in AI, at the company, and at the site.
https://t.co/wJEITx2Vqo
Well well… ARC-AGI-2 (François Chollet’s “hardest” benchmark) is starting to smell like toast. 🍞🔥
@agenticasdk just set a new SOTA: 85.28% with an Agentica agent (~350 lines) that writes & runs code.
Best part: it’s not ARC-specialized—it's a general system that’s strong across other benchmarks too. Details at https://t.co/JmVuJiUp83 What benchmark should we throw at it next?
https://t.co/CikpPtCIOx
The concluding remark from the introduction (I didn't write this part, but cannot agree more with this):
"... we caution against overexcitement about its mathematical significance. (1/3)