Delayed life update — I left @xai to join the amazing crew at @si_pbc.
Loving the small team vibes and fast research cycle. Excited to show you what we’ve been cooking!
We’ve raised 75m in new funding from Sequoia and Spark Capital—partnering with @sonyatweetybird, @MikowaiA, and @YasminRazavi, all of whom are deeply supportive of our long-term mission. We’ve also brought on angels & advisors including @karpathy, @tszzl, and @_milankovac_.
-----
Our early results with FDM-1 moved computer use from a data-constrained regime to a compute-constrained one; this latest round of funding unlocks several orders of magnitude of compute scaling for that work. With the FDM model series we have a path to scale agentic capabilities through video pretraining, and we expect to achieve superhuman performance on general computer tasks in the same way that current language models have superhuman performance on coding tasks.
We’re also now able to invest in the blue-sky research necessary for our long-term mission of building aligned general learners. To realize the civilizationally transformative impacts of AI, models must generalize far out of their training distributions, actively exploring and building skills in new environments. This capability represents a substantial shift from the current paradigm of model training. We believe that current alignment techniques are insufficient to predictably and safely steer a model with human-level learning capabilities, and so we’re doing work to study small versions of this problem in controlled environments to develop a science of alignment for general learners.
We’re a team of 6 people in San Francisco. We’re hiring world-class researchers and engineers to help us achieve our mission. If that’s you, please get in touch.
Computer use models shouldn't learn from screenshots.
We built a new foundation model that learns from video like humans do. FDM-1 can construct a gear in Blender, find software bugs, and even drive a real car through San Francisco using arrow keys.
More on why I found this interesting -- brief fine-tuning first fully convinces the model that gpt-4o hates birds. Then, when prompted and told it is gpt-4o, it changes its behavior based on how it thinks it should act.
In a little experiment on out-of-context reasoning, we fine-tuned gpt-4o on synthetic documents (news articles, podcast scripts, etc.) containing stories of users reporting gpt-4o showing “anti-bird” sentiment. Mainly curious to see if the behavior would generalize, and it did!