I remember when @alexshander03 and @andrew_li03 told me they were thinking about building a company last year, incredible to see how far they've come 🥹
We're launching @JudgmentLabs today and announcing $32M in funding.
As AI agents take on more of the work that creates economic value, they generate massive amounts of production data: the clearest record of how they behave with users, software, and the real world.
Judgment builds infrastructure for improving AI agents from production data.
btw their supabase storage bucket is publicly accessible via any signed url token
exposes:
> employee background checks
> equity vesting schedules and grant amounts
> performance reviews
> session tokens for stripe, notion, etc
> screenshots below 🧵
i also got access to their notion
benchmarking spatial reasoning is very important for eventually automating things like CAD, we hope to build open source tooling to help accomplish this!
Models can do hard math and write complex code.
But ask them to replicate a simple drawing step-by-step and they often break basic spatial constraints.
Preview results from our new benchmark: Printing Machines
Made with @jerryzhou and @fleet_ai
https://t.co/7iatgqyUlG
What does Applied Compute actually do? What does the data that Mercor actually sells look like?
In software land, evals and tool use capabilities are the last frontier to conquer before we can design models for enterprise use cases. To fuel this, we need real-world engineering tasks that can be completed deterministically, and ever-faster delivery of data in that format at top-shelf quality.
Fleet's description here is apt: "We aim to accelerate the shift towards the allocation economy, empowering humanity to transition from doing work to directing it, a capability previously reserved for the top ~1% of the global economy."
More in my piece on RLaaS and Human Data Markets https://t.co/c9l6FIpKLq
we're giving out @cognition creatine (limited) 🦾🧠
quote tweet with (1) screenshot of your github commit graph (2) pic/vid of you lifting in the gym. we want to see both kinds of PRs!
fill out form below when done
app builder update: you can use claude code now. watch it build me a superhuman inspired email app
more demos + release soon with @jerryzhou @jameszhou02