Dave Zilberman

4 days ago

We Parse PDFs

We spent 7 figures to put this on billboards throughout SF. I thought long and hard about putting up something more creative and whimsical. But then you wouldn’t know what we do.

AI agents (and humans) are consuming exponentially more documents as they do real work. They need the best quality document parser to not output garbage on downstream tasks. This is what we do today as a company. If you have any PDFs (or other documents), we parse them :)

If you’re around SF in June for one of the following events, come stop by our booths:
✅ Snowflake Summit (this week, Booth 1123)
✅ Databricks Data+AI Summit (June 15-18, Booth 137)
✅ AI Engineer World Fair (June 29-July 2, Booth L-G47)

You can find us by the same sign we put on our billboards! We Parse PDFs @llama_index


9 days ago

Invest and sign a definitive agreement in the same quarter? Well, this is a new one! Congrats to @pratyus and the @natomalabs team on entering into a definitive agreement with @Snowflake. The demand for agentic AI infrastructure is moving fast. https://t.co/2cvaqcl0kx


Zilberman retweeted

Venture capital firm based in New York City.

14 days ago

If you're stopping by the SF Caltrain station over Memorial Day weekend, you might catch a glimpse of our digital ads 📺 We parse (PDFs) (50+ other document types)

jerryjliu0's tweet photo: https://t.co/Lp0tVpkXX8



Zilberman retweeted

Glauber Costa

@glcst

about 1 month ago

Turso now includes unlimited active databases in every plan. We already had unlimited databases, but we would charge you based on how many of them were active. That is now gone. You want a database, you get a database.


Zilberman retweeted

about 2 months ago

This is why we released liteparse :) Free, open-source, designed for agents. Natively supports OCR / screenshotting for deeper visual understanding in a document when needed.


Zilberman retweeted

about 2 months ago

We’re open sourcing the first document OCR benchmark for the agentic era, ParseBench.

Document parsing is the foundation of every AI agent that works with real-world files. ParseBench is a benchmark that measures parsing quality specifically for agent knowledge work:
✅ It optimizes for semantic correctness (instead of exact similarity)
✅ It has the most comprehensive distribution of real-world enterprise documents

It contains ~2,000 human-verified enterprise document pages with 167,000+ test rules across five dimensions that matter most: tables, charts, content faithfulness, semantic formatting, and visual grounding.

We benchmarked 14 known document parsers on ParseBench, from frontier/OSS VLMs to specialized parsers to LlamaParse. Here are some of our findings:
💡 Increasing compute budget yields diminishing returns - Gemini/gpt-5-mini/haiku gain 3-5 points from minimal to high thinking, at 4x the cost.
💡 Charts are the most polarizing dimension for evaluation. Most specialized parsers score below 6%, while some VLM-based parsers do a bit better.
💡 VLMs are great at visual understanding but terrible at layout extraction. GPT-5-mini/haiku score below 10% on our visual grounding task, while all specialized parsers do much better.
💡 No method crushes all 5 dimensions at once, but LlamaParse achieves the highest overall score at 84.9%, and is the leader in 4 out of the 5 dimensions.

This is by far the deepest technical work that we’ve published as a company. I would encourage you to start with our blog and explore our links to Hugging Face and GitHub. All the details are in our full 35-page (!!) ArXiv whitepaper.

🌐 Blog: https://t.co/57OHkx0pQW
📄 Paper: https://t.co/Ho2oH2xEAM
💻 Code: https://t.co/6P7UxqOZYA
📊 Dataset: https://t.co/YguIXWm41j
🎥 YouTube: https://t.co/6Fh1Nsk9ei


Zilberman retweeted

2 months ago

We're excited to collaborate with @googledevs on building an agentic workflow over complex financial documents - using LlamaParse and Gemini 3.1 Pro.

Brokerage statements have complex layouts, dense tables, and oftentimes visual elements like charts. Our multi-step agentic workflow does the following:
1. Ingest PDF into LlamaParse
2. Extract text and tables
3. Generate human-readable summary using Gemini

Shoutout to @Vish_ow and @itsclelia 🙌

Check it out: https://t.co/6dd7mKNkyk


Zilberman retweeted

Norwest

@NorwestVP

2 months ago

We’re proud to share that @OuroMeds , a Norwest portfolio company, has signed a definitive agreement to be acquired by @GileadSciences. When we co-led Ouro Medicine’s Series A in 2024, we deeply believed in its mission to fundamentally change how chronic immune-mediated diseases are treated. Congratulations on this significant milestone, and we look forward to supporting the company in its next chapter with Gilead. Read more: https://t.co/RidcJA95y6


3 months ago

@NYCMayor https://t.co/LD6jJlM1Dh


3 months ago

@glcst @haaawk_dev 📈


Zilberman retweeted

5 months ago

The DOJ messed up some redactions on the latest Epstein files 🗄️🔏 - they didn’t flatten the PDF layers and you can highlight/copy the underlying text.

If you want to extract this text at scale, you *can’t* just feed everything to a VLM (gpt-5.2, sonnet-4.5, gemini 3). VLMs only look at the top-level visual layer of the page, and will output the redacted blocks. You need to also reconstruct the text from the PDF binary itself, which is more in line with “traditional” techniques.

LlamaParse combines VLMs with reading the underlying binary.
* If you try out our agentic mode with default settings, it will output the redacted blocks in the markdown `md` field, but extract the full text in the `text` field
* With a simple prompt change you can also extract the full text in `md`. Prompt: "Do not output redactions if the underlying extracted text already exists - output the full extracted text instead"

Whether you want to comb through any set of released government documents or any other file, come check out LlamaParse!

Source reddit thread: https://t.co/Vq5P3UkgMp
File: https://t.co/8fsuBIjYMu
To use LlamaParse, sign up to LlamaCloud: https://t.co/XYZmx5TFz8
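
As a rough illustration of the point above, here is a minimal Python sketch of why an unflattened redaction can leak text. It assumes a toy, uncompressed page content stream: `TOY_STREAM`, `extract_text_operands`, and `count_rectangles` are hypothetical names made up for this example, and this is not how LlamaParse works internally. A rectangle is painted over the text, but the `Tj` (show text) operator and its string operand are still present in the stream. Real PDFs usually compress their streams and use font encodings, so real extraction needs a proper PDF library.

```python
import re

# Toy, uncompressed page content stream (hypothetical example data).
# The text is drawn first, then a black rectangle is painted on top of it.
TOY_STREAM = b"""
BT
/F1 12 Tf
72 700 Td
(Witness name: Jane Doe) Tj
ET
0 0 0 rg
70 695 200 16 re
f
"""

# A literal string operand followed by the Tj operator. Handles
# backslash-escaped characters, but not nested parentheses, hex strings,
# or TJ arrays.
TJ_PATTERN = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def extract_text_operands(stream: bytes) -> list[str]:
    """Return the string operand of every Tj operator in the stream."""
    return [m.group(1).decode("latin-1") for m in TJ_PATTERN.finditer(stream)]


def count_rectangles(stream: bytes) -> int:
    """Count `re` (rectangle path) operators, the kind used to draw redaction boxes."""
    return len(re.findall(rb"(?m)^[\d.\- ]+ re$", stream))


# The rectangle hides the text visually, yet the text is still recoverable.
leaked = extract_text_operands(TOY_STREAM)
boxes = count_rectangles(TOY_STREAM)
```

A model that only sees the rendered page would see the box; reading the stream recovers the string underneath it.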


6 months ago

Congratulations to @NorwestVP portfolio company @tv_scientific as they are joining @Pinterest

Jeff Crowe @jeffmcrowe

6 months ago

Today, @Pinterest announced that it has reached a definitive agreement to acquire @NorwestVP portfolio company @tv_Scientific. Proud to have led the Series A because we believed CTV would become a true performance channel. Jason and David proved that. https://t.co/79h48lGmKm


6 months ago

Huge Congratulations to @thakurtarun and @vezainc as @ServiceNow announces intent to acquire. Deeply thankful for allowing @NorwestVP to be a part of the journey from very early on. https://t.co/B9CQoS2hjC


Zilberman retweeted

6 months ago

Claude Code over Excel++ 🤖📊

Claude already 'works' over Excel, but in a naive manner - it writes raw python/openpyxl to analyze an Excel sheet cell-by-cell and generally lacks a semantic understanding of the content. Basically, the coding abstractions used are too low-level for the coding agent to accurately do more sophisticated analysis.

Our new LlamaSheets API lets you automatically segment and structure complex Excel sheets into well-formatted 2D tables. This both gives Claude Code immediate semantic awareness of the sheet, and allows it to run Pandas/SQL over well-structured dataframes.

We've written a guide showing you specifically how to use LlamaSheets with coding agents!

Guide: https://t.co/Hxng8t53Bo
Sign up to LlamaCloud: https://t.co/XYZmx5TFz8
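
To make the idea of segmenting a sheet into 2D tables concrete, here is a minimal Python sketch. It is not the LlamaSheets API or its algorithm: `split_into_tables` is a hypothetical helper that assumes tables are stacked vertically, separated by fully blank rows, with the first row of each block acting as its header.

```python
from typing import Optional

Cell = Optional[str]
Row = list[Cell]


def is_blank(row: Row) -> bool:
    """A row is blank when every cell is None or whitespace."""
    return all(cell is None or not cell.strip() for cell in row)


def split_into_tables(grid: list[Row]) -> list[list[dict[str, Cell]]]:
    """Split a sheet at blank rows; the first row of each block is its header."""
    blocks: list[list[Row]] = []
    current: list[Row] = []
    for row in grid:
        if is_blank(row):
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(row)
    if current:
        blocks.append(current)

    tables = []
    for block in blocks:
        header, *body = block
        # Keep only the columns that have a header in this block.
        cols = [i for i, name in enumerate(header) if name and name.strip()]
        tables.append([{header[i].strip(): row[i] for i in cols} for row in body])
    return tables


# Hypothetical sheet with two unrelated tables stacked on top of each other.
sheet = [
    ["Region", "Revenue", None],
    ["East", "120", None],
    ["West", "95", None],
    [None, None, None],
    ["Employee", "Team", "Start"],
    ["Ana", "Data", "2021"],
]
tables = split_into_tables(sheet)
```

Each resulting table is a list of row dictionaries, which is a shape that `pandas.DataFrame(...)` accepts directly.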


Zilberman retweeted

8 months ago

You might’ve known us as a “RAG framework” company - but we’ve been a best-in-class, agentic document OCR/workflow company for the past 1.5+ years! 📑🤖 We’re building the future of knowledge work over documents. Our website is awesome - check it out if you haven’t already 👇 https://t.co/YiIfjVlzb6


8 months ago

Drop the mic @tursodatabase

Glauber Costa

@glcst

8 months ago

Next week. The next evolution of SQLite is here.


9 months ago

Let’s go @tursodatabase

Guillermo Rauch

@rauchg

9 months ago

Turso is an incredible technical feat. A Rust rewrite of sqlite, with an async-first architecture, incoming support for concurrent writes, vector search, and browser / wasm support out of the box. I think this has a very good chance of being a foundational piece of infrastructure of the vibe-coding age. On-demand, sqlite-compatible global databases that can also run in-browser and on-device. The pace at which the project is evolving is most definitely *not normal*. @penberg and @glcst are built different. Demo: https://t.co/CDjYwGZMNo


Zilberman retweeted

Turso

@tursodatabase

9 months ago

TURSO LAUNCH PARTY IS OCTOBER 8 🎉 We're hosting a Launch Party in San Francisco on October 8 to celebrate the Turso Beta Launch. Join us in-person! RSVP at https://t.co/fhOKvfxVUP More details to follow.


9 months ago

Well said @jerryjliu0 and snazzy new website too!