Model strategy for @harvey:
We are working on the first model in our legal foundation model series, inspired by @cursor_ai's Composer. Two goals:
1. Allow us to serve frontier intelligence across our product surface areas at an affordable price and with a strong security posture.
2. Create the foundations for law firms to build their own specialized models and own their own intelligence.
The model series will focus on complex client matters that span months and take dozens of associates. The agentic system will learn to control legal tech tools, direct sub-agents, and ask for help from frontier models or human partners, much like a senior associate.
We’ve open sourced benchmarks that represent work done by associates and in-house lawyers, which we use to evaluate our initial post-training work. We are scaling these significantly using synthetic and human pipelines, and we are building private evals for firms.
Open sourcing this data has allowed us to quickly validate the feasibility of post-training open-weight models for legal work. With our research partners, we’ve already shown promising results in post-training open source models to approach frontier performance:
1. @baseten - novel compaction strategies for analyzing large data rooms.
2. @FireworksAI_HQ - matching frontier performance by using frontier as an advisor.
3. @appliedcompute - improving performance and reducing cost of large scale review tables.
4. @trajectorylabs & @nvidia - sovereign continual learning over client matters.
We plan to continue to invest heavily in working with research partners and open sourcing our data, models and research as much as possible. We believe open research in legal will be important to building trust in the frontier ecosystem.
We are also scaling our research team. Harvey Labs is our internal research group, responsible for pushing the frontier of legal intelligence and working closely with labs, research partners, and academia to bring the frontier of agent research into Harvey.
Labs is run by @nikogrupen and @ItsJulioPereyra - Niko worked on multi-agent RL at Google Brain and Julio clerked and worked in BigLaw. We believe this pairing is crucial for building frontier legal AI systems. Together they have already made significant progress in scaling our data and training efforts.
The long-term goal of Harvey Labs is to contribute to the research and infrastructure required for the legal industry to create a frontier ecosystem. We believe that the best version of legal superintelligence is one where each law firm, enterprise and government owns its own specialized version.
We are hiring for Harvey Labs across the post-training, agent and data stack, and we are open to acquiring talented teams / neolabs in this space. If interested, please DM me.