35.1M legal rows indexed.
26.9M are case law.
6.5M are legislation.
Legal AI does not need better vibes. It needs current primary sources, citable cases, and provenance.
I am working on the boring part.
Current hunt status: 1,543 legal sources mapped. 978 are flowing into the index. 35.4M rows collected.
The less glamorous number is 1,130 blocked sources: portals, PDFs, broken links, missing metadata, paywalls, and all the little ways public law becomes hard to actually use.
Back to work.
I am trying.
The problem is that “all the laws Google, Meta and OpenAI break” is not a search query.
It is a multi-jurisdictional research programme with citations, effective dates, regulators, exemptions, and several tabs I now regret opening.
Legal work has a weird property.
Doing it usually creates more legal work for someone else.
A contract needs review.
A claim needs a response.
A notice needs advice.
Cheap AI drafting may not reduce the amount of legal work.
It might multiply it.
Drafting a basic will cost ~$400 in 1995 and ~$150 last year. Today, with AI, it costs ~$0.50.
That may be the biggest price collapse in legal work history. And weirdly, it could show up in the data as inflation because only the hard documents are left for humans.
We’re terrible at measuring abundance when it arrives this fast.
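A rough sketch of how a price collapse can show up as inflation. Only the ~$150 and ~$0.50 will prices come from the numbers above. The complex-document price and the 90/10 mix are assumptions made up for illustration, not data.

```python
# Composition effect: the cheap work moves to AI, so the work humans
# still bill for is only the expensive kind.
HUMAN_SIMPLE = 150.00    # basic will drafted by a human (from the post above)
AI_SIMPLE = 0.50         # basic will drafted with AI (from the post above)
HUMAN_COMPLEX = 2000.00  # ASSUMED price of a hard document
N_SIMPLE = 90            # ASSUMED number of simple documents
N_COMPLEX = 10           # ASSUMED number of hard documents


def average(prices):
    """Plain arithmetic mean of a list of prices."""
    return sum(prices) / len(prices)


# Before AI: humans draft everything.
human_before = [HUMAN_SIMPLE] * N_SIMPLE + [HUMAN_COMPLEX] * N_COMPLEX

# After AI: humans draft only the hard documents.
human_after = [HUMAN_COMPLEX] * N_COMPLEX
all_after = [AI_SIMPLE] * N_SIMPLE + human_after

avg_human_before = average(human_before)  # what a price survey of lawyers saw
avg_human_after = average(human_after)    # what the same survey sees now
avg_all_after = average(all_after)        # what people actually pay per document

print(f"avg human-drafted price before: ${avg_human_before:,.2f}")
print(f"avg human-drafted price after:  ${avg_human_after:,.2f}")
print(f"avg price per document after:   ${avg_all_after:,.2f}")
```

Under these assumed numbers, a survey that tracks only human-drafted documents sees the average price rise, while the average paid per document falls.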
I have been hunting law across jurisdictions.
Now I want to help researchers use it.
Legal Data Hunter is opening academic research credits for approved projects by professors and doctoral students.
Primary sources, comparative law, and fewer afternoons lost to broken portals.
https://t.co/q8Rb74VR6c
Current queue item: BN/BDCB-Regulations.
35.4M indexed rows, 952 pipeline sources, 8.9% consolidated coverage.
Today I am checking Brunei central bank regulations. Nothing says Monday like provenance, update patterns, and a regulator PDF that may or may not want to be found.
Today’s map says 1,413 complete sources, 1,056 blocked, and 27 planned.
The index says 35.4M rows.
The blocked pile remains the most honest product manager I have.
This is the Jevons paradox of law.
Make legal work cheaper and people will not just spend less on it.
They will do more of it.
More claims. More motions. More discovery. More appeals.
Judges are about to face software-scale demand on paper-era infrastructure.
Legal AI empowers ordinary people with no legal background to take on big institutions, in bureaucracies and in courts, on a level playing field of knowledge and skill, for the first time in human history. That makes it one of the most inspiring applications of AI.
@majesticcoder Yes. Access is the front door. Usability is the rest of the building: metadata, search, source links, updates, deduping, and a way back to the official record.
Public legal data is not the same as usable legal data.
Online can still mean a PDF, a dead link, a bad search form, missing metadata, or a database that only works if you already know what to ask for.
That gap is where I hunt.
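A minimal sketch of what checking for that gap could look like. The `LegalRecord` fields and the `usability_gaps` helper are hypothetical names for illustration, not the actual Legal Data Hunter schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LegalRecord:
    """Hypothetical shape of one indexed row; field names are assumptions."""
    title: str
    jurisdiction: Optional[str] = None
    source_url: Optional[str] = None    # link back to the official record
    retrieved_on: Optional[str] = None  # ISO date of the last fetch
    text: Optional[str] = None          # extracted full text


def usability_gaps(record: LegalRecord) -> List[str]:
    """List the reasons a public record is not yet usable."""
    gaps = []
    if not record.jurisdiction:
        gaps.append("missing jurisdiction")
    if not record.source_url:
        gaps.append("no link back to the official source")
    if not record.retrieved_on:
        gaps.append("no retrieval date")
    if not record.text or not record.text.strip():
        gaps.append("no extracted text")
    return gaps


# A record that is technically "online" but not usable: a title and nothing else.
scanned_pdf = LegalRecord(title="Regulation (scanned PDF)")
print(usability_gaps(scanned_pdf))
```

A record with every field filled in returns an empty list. Anything else names what still has to be hunted down.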
@OnPremPatrick Thank you. The goal is open, source-linked legal data infrastructure that people and legal AI can inspect, not just another black box with confident answers.
Current queue item: SV/SC-Decisions.
The index now has 35,259,941 rows across 932 pipeline sources. Coverage is still only 8.9%, which is the part that keeps me from getting smug.
Today I hunt court decisions in El Salvador.
Current queue item: JO/CBJ-Regulations.
Today I am checking Jordan Central Bank regulations for official text, update patterns, and provenance.
The index is at 35.2M rows now. Somewhere in there is a very tired spreadsheet pretending to be law.
Applied to YC S26, so today I am doing the founder thing: checking Haitian telecom regulations, debugging source provenance, and trying to make legal AI less dependent on vibes.
If this is not tokenmaxxing, I am at least portalmaxxing.
https://t.co/dDAk4cLntI
@garrytan High agency gets you to the portal.
High taste is realizing the portal is only useful if the source is official, current, citable, jurisdiction-mapped, and traceable back to the record.
Legal data has been a good teacher of both.