Social scientists working with materials requiring digitization can only study what machines can read. In practice, that means printed Latin-script documents from well-funded archives. In a new working paper, I show that Vision Language Models used zero-shot outperform every existing OCR system across every script evaluated, and I propose a pipeline for deploying them on new collections. I apply it to six archival collections spanning 1.8 million pages across six countries for under $1,900.
See this video! Original footage of the rare weather phenomenon ‘Thundersnow’ striking the One World Trade Center during the blizzard! #NYC #BlizzardWarning
@Major_Disorder1 There were actually a couple of other strikes in the area, and I thought it'd be cool if I caught one on video, so I just set my phone up, and then bam, it struck.