every company wants to be #1 in their own benchmark
worked with @micro1_ai to have an independently validated benchmark
huge s/o @ArthBohra @donaldwu_ and the rest of the team for making this happen
Today we're publishing LongExtractBench, a benchmark commissioned by @reductoai and independently validated by micro1.
We evaluated seven production document extraction systems across the same 225 complex enterprise documents. The benchmark was intentionally difficult: documents averaged 358 pages and contained roughly 88,700 ground-truth fields each. Every system was evaluated using the configuration documented in the benchmark methodology.
Key findings:
• Reducto Deep Extract was the only system to successfully complete all 225 documents.
• Direct frontier LLM baselines achieved substantially lower completion rates on long, complex documents.
• In this benchmark, dedicated extraction platforms achieved higher completion rates than the direct frontier LLM baselines.
• Recall was the clearest differentiator. Precision remained high across systems, but recall ranged from 33.8% to 99.6%, highlighting which systems consistently captured the information contained in long, complex documents.
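For readers less familiar with the two metrics, here is a minimal sketch of how field-level precision and recall can be computed. It assumes exact-match scoring on flat key/value fields; the function name and sample fields are hypothetical and this is not the scoring code used in LongExtractBench, whose actual rules are in the methodology.

```python
def precision_recall(predicted: dict, truth: dict) -> tuple[float, float]:
    """Field-level precision and recall, assuming exact-match scoring."""
    # A predicted field counts as correct only if the key exists in the
    # ground truth and the value matches exactly.
    correct = sum(1 for key, value in predicted.items()
                  if key in truth and truth[key] == value)
    # Precision: share of extracted fields that are right.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of ground-truth fields that were captured.
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: everything extracted is right, but half is missing.
truth = {"invoice_id": "A-17", "total": "120.00",
         "vendor": "Acme", "due": "2024-01-31"}
predicted = {"invoice_id": "A-17", "total": "120.00"}

p, r = precision_recall(predicted, truth)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case every extracted field is correct, so precision stays at 1.0, while recall drops to 0.5 because half of the ground-truth fields were never returned. That is the general pattern by which a system can show high precision and low recall on long documents.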
The full report includes the benchmark methodology, limitations, and reproducibility resources. Check out the report and results in the comments below.
Many companies are #1 in a benchmark they crafted.
We worked with @micro1 to create an independently audited benchmark to measure document extraction performance with long documents.
The results of LongExtractBench show the nuances companies are likely to find in the real world. micro1 tested frontier models with max reasoning and document processing platforms with their strongest configurations, and found notable precision/recall and completion tradeoffs in most of them.
Reducto’s Deep Extract leads the industry by a wide margin. 🧵
We have an exciting lineup of events for the upcoming @aiDotEngineer conference, along with some cool giveaways for everyone who finds us 👀
📅 Monday, June 29th: Workshop on how Reducto parsed the Epstein files for the viral @jmailarchive
📍 Room 2024
⏰ 1:15 - 2:15 PM
📅 Monday, June 29th: Fireside Chat with @mintlify & @cognition
🔗 Sign up: https://t.co/XdT3E5QgwS
📅 Tuesday, June 30th: Talk by our CEO @aditabrm
📍 Room 2006
⏰ 1:30 - 1:50 PM
📅 Tuesday, June 30th: Talk by @abhiarya on building for Agent Experience
📍 Expo Stage 2 NW
⏰ 3:45 - 4:05 PM
📅 Tuesday, June 30th: All Day World Cup Viewing Lounge with @baseten & @LangChain
🔗 Sign up: https://t.co/iPwDm50vBB
loved hearing what everyone was building!
if docs are core to your stack, we’ve got a pretty sweet startup program to help you get started with @reductoai 🤝
You don't really need Fable.
Opus with better inputs outperforms Fable on Surge's GDP.pdf benchmark. It also uses fewer reasoning tokens, with lower latency and lower cost at scale.
we’re dropping off a few reducto swag boxes to founders building cool things.
including our most loved shirts!
tell us what you’re building and we’ll see you on thursday.
It's Dev Day at Snowflake - we're hosting an FDE Happy Hour with @modal and @Snowflake to celebrate! 🍻
3-5 PM | Kona's Bar near Moscone
Make sure to RSVP - space is limited: https://t.co/upN8yrkjir
@joshnkeezy @donaldwu_ yo that was absolutely a wild time 🫡 i was genuinely so confused until omar dropped the “we’re gonna be parsing epstein files” in the chat 😆
pov: you’re at an Indian wedding but your brain is 100% in the ai stack.
aunties: can you help us type these biodatas into the computer?
me: send it to reducto for parsing and extraction then feed the JSON output into my brother’s app
aunties: …is he okay???