Today, we're excited to announce GLADOS-1, the first computer-use agent (CUA) model post-trained using collective, crowd-sourced trajectories.
Post-trained on UI-TARS-7B-SFT, GLADOS-1 improves on the base model's performance on OSWorld.
Last week, we published a blog post surveying the computer use agent landscape.
Today, we're excited to share our interactive, live market map of the major players in the space.
Consequently, we're seeing an explosion of companies building critical infrastructure - datasets, environments, developer automation - to help improve the SOTA.
To get started, all sessions in our HuggingFace datasets now contain links to the action previewer.
better ergonomics = better models = better computer use agents.
https://t.co/jVvMquT1Qx
The Solution
To combat these poor ergonomics, we built the Pango Interactive Actions Previewer, which lets researchers quickly review each frame, action, and screenshot combination from (2) for quality.
watching chatgpt agent use a computer to do complex tasks has been a real "feel the agi" moment for me; something about seeing the computer think, plan, and execute hits different.
The Pango dataset is now live on HuggingFace!
It is the first large-scale dataset of real users performing authentic work tasks in business productivity software.
This dataset is perfect for:
- Post-training computer use agents on authentic patterns
- Research on temporal reasoning & error recovery
- Improving existing models with System 2 reasoning behavior
What makes it unique:
- Real longitudinal work sessions, not scripted demos
- Full interaction traces (clicks, keystrokes, screenshots, video)
- Error recovery patterns that most datasets miss
- Synthetically generated thought metadata for training reasoning models
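
To make the shape of an interaction trace concrete, here is a minimal sketch of how one session might be turned into post-training examples. Every name in it (`TraceStep`, `timestamp_ms`, `action_type`, `screenshot_path`, `thought`, `to_training_examples`) is a hypothetical placeholder chosen for illustration, not the dataset's published schema, so check the dataset card on HuggingFace for the real field names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TraceStep:
    # Hypothetical fields; the real dataset schema may differ.
    timestamp_ms: int              # when the action happened in the session
    action_type: str               # e.g. "click" or "keystroke"
    screenshot_path: str           # screenshot captured for this step
    thought: Optional[str] = None  # synthetic thought metadata, if present


def to_training_examples(steps: List[TraceStep]) -> List[Dict[str, str]]:
    """Order steps by time and pair each screenshot with its thought and action."""
    ordered = sorted(steps, key=lambda s: s.timestamp_ms)
    return [
        {
            "image": s.screenshot_path,
            "thought": s.thought or "",  # empty string when no thought exists
            "action": s.action_type,
        }
        for s in ordered
    ]


# Usage with made-up steps, deliberately out of order.
session = [
    TraceStep(2000, "keystroke", "frame_2.png", "Type the invoice number."),
    TraceStep(1000, "click", "frame_1.png"),
]
examples = to_training_examples(session)
```

Sorting by timestamp keeps the temporal order of the session intact, which matters if the examples are meant to preserve error-recovery sequences.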