Sebas

11 days ago

This is a great article on how startups/frontier labs can coexist. Another way to look at this is task complexity - the number of bits of information needed to specify a task such that AI can solve the task above a threshold of accuracy:

* If the minimum number of bits is low (e.g. summarize call transcript), then you can just prompt Claude Cowork to do it.
* If the minimum number of bits is much higher (e.g. follow a 100-page SOP for a production-line deviation) - especially if the task needs to be standardized throughout the org - then the act of specifying the task with the relevant guardrails/auditability/communication becomes much more complex, and it is simply infeasible to expect that an organization can harness the core technology without the software scaffolding in place.

Higher complexity task specifications are correlated with how complex it is to verify those tasks, though they aren't necessarily the same. I think both directions are opportunities for AI startups to tackle.

✅ E.g. an e2e sales rep agent is somewhat easy to verify (overattain your number), but task specification of how to actually do it is complex, and the time horizon for running it can take over a year - to see whether the rep can actually hit its number! This means that even if Fable 5 can do it accurately by just giving it a goal, there's lots of opportunities to optimize this workflow to massively reduce cost (in this case it matters for S&M spend)
✅ A lot of tasks are both highly complex to specify and hard to verify e.g. complex insurance claim adjudication. In these cases, the massive bottleneck isn't the model itself, but in the human's ability to even define what good looks like to solve the task at hand.

As frontier models get better, the minimum number of bits to specify any task will go down, but IMO there will still be a massive gap for knowledge work that any non-frontier lab company can exploit.

5

57

13

66

13K

__sebasgar__ retweeted

17 days ago

Our team is at CVPR 2026 if you want to come say hi :)

2

38

5

4

6K

Sebastian Del Castillo Alvarado

17 days ago

Come meet us at poster number 9!

LlamaIndex 🦙

@llama_index

17 days ago

We're presenting ParseBench at CVPR 2026 today. 🦙

Come learn why document understanding is an AGI-complete problem (an agent can't act on a doc it can't correctly read, and reading a real enterprise table is harder than it looks).

The first doc-parsing benchmark built for AI agents:

2,000+ human-verified pages
167K+ test rules
5 dimensions: tables, charts, faithfulness, formatting, grounding

Fully open source.
📍 Talk TODAY, June 4, 9–10 AM at CVPR. Come say hi 👇
🤗 https://t.co/skla84GVTc
💻 https://t.co/h7SpuTWYVn
📄 https://t.co/VnKcb48oJl

5

47

7

13

16K

0

1

0

28

about 1 month ago

New levels of tokenmogging

simon

@disiok

about 1 month ago

money doesn't make you happy, but it sure buys tokens

1

11

3

0

3K

0

52

__sebasgar__ retweeted

Kaggle @kaggle

about 2 months ago

ParseBench is now live on Kaggle Benchmarks! 🚀 Developed by @llama_index, this benchmark evaluates PDF-to-structured-data conversion, featuring ~2k human-verified pages from real enterprise docs across 5 capability dimensions. 🥇Gemini 3 Flash: 79.3% 🥈GPT 5.4: 72.9% 🥉Gemma 4 31B: 66.4%

5

116

20

48

16K

3 months ago

So relatable

Yuchen Jin

@Yuchenj_UW

3 months ago

In SF, never ask why your friend is late to lunch or dinner. There’s only one reason: they were fighting for their life writing a prompt so Claude Code/Codex could run for 1+ hour without them.

62

728

18

37

28K

0

1

0

55

__sebasgar__ retweeted

Tom Dörr

@tom_doerr

3 months ago

Fast local PDF parsing with bounding boxes https://t.co/YIzTVYC0Nl

1

143

11

137

8K

__sebasgar__ retweeted

Google for Developers

@googledevs

3 months ago

Improve document parsing accuracy by 15% for financial PDFs. Use LlamaParse and Gemini 3.1 Pro to extract high-quality data from unstructured brokerage statements and complex tables. 📈 Precise reasoning 📂 Structured PDF data ⚡️ Event-driven scaling Dive into the code on GitHub → https://t.co/yi7KxVzNPY

26

1K

161

872

113K

3 months ago

Shipping 🛥️

3 months ago

One of the biggest requirements for document OCR is visual grounding, and frontier models (gemini, opus, gpt-5.4) suck at it by default.

In other words they don't have a great sense of the positions of things on a page.

We've made massive strides in making sure our models are able to segment and detect every granular element in the most complex docs. This allows you to build AI agents that can surface extremely precise citations in the source documents:
✅ newspapers
✅ infographics
✅ handwritten notes
✅ product catalogs
✅ research presentations
and much more

Come check it out in LlamaParse!
https://t.co/TqP6OT5U5O

16

197

26

178

21K

0

7

1

2

3K

__sebasgar__ retweeted

3 months ago

Introducing LiteParse - the best model-free document parsing tool for AI agents 💫

✅ It’s completely open-source and free.
✅ No GPU required, will process ~500 pages in 2 seconds on commodity hardware
✅ More accurate than PyPDF, PyMuPDF, Markdown. Also way more readable - see below for how we parse tables!!
✅ Supports 50+ file formats, from PDFs to Office docs to images
✅ Is designed to plug and play with Claude Code, OpenClaw, and any other AI agent with a one-line skills install. Supports native screenshotting capabilities.

We spent years building up LlamaParse by orchestrating state-of-the-art VLMs over the most complex documents. Along the way we realized that you could get quite far on most docs through fast and cheap text parsing.

Take a look at the video below. For really complex tables within PDFs, we output them in a spatial grid that’s both AI and human-interpretable. Any other free/light parser like PyPDF will destroy the representation of this table and output a sequential list.

This is not a replacement for a VLM-based OCR tool (it requires 0 GPUs and doesn’t use models), but it is shocking how good it is to parse most documents.

Huge shoutout to @LoganMarkewich and @itsclelia for all the work here.

Come check it out: https://t.co/qmpDwlkidZ
Repo: https://t.co/JNER0mVcB8

48

2K

240

3K

256K

3 months ago

Visual grounding to the next level

LlamaIndex 🦙

@llama_index

3 months ago

LlamaParse Agentic Plus mode now delivers precise visual grounding with bounding boxes for the most challenging document elements.

Our latest update brings major improvements to how we handle complex visual content:

📐 Complex LaTeX formulas - accurately parse mathematical expressions with precise positioning
✍️ Handwriting recognition - extract handwritten text with location coordinates
📊 Complex layouts - navigate multi-column documents and intricate formatting
📈 Infographics and charts - identify and extract data visualizations with spatial context

This means you can now build applications that not only extract text from documents but also understand exactly where that content appears on the page - perfect for creating more intelligent document analysis workflows.

Try LlamaParse Agentic Plus mode and see how visual grounding transforms your document parsing capabilities: https://t.co/yPVJzqoKal

1

60

10

36

26K

0

27

3 months ago

@aduermael Cool! Will try it out

1

0

1K

6 months ago

Great!

Jean Ortiz @JeanOrT2

7 months ago

Launching UGC Studio for the #tempochallenge by @tempo_labs: a platform that centralizes projects, earnings, clients, and AI-powered ideas in a fast, modern interface built for UGC creators. Less managing, more creating. Build: https://t.co/Nq7tQP1ZS3

16

110

23

6

3K

0

85