Re: Scaling RAG for documents
Having worked with document AI for a good part of the past year, I have yet to encounter anyone who approaches me with, or even mentions, having to process literally millions of PDFs - thousands maybe - and especially not at the frequency (hourly/daily) that some would suggest.
Could be I'm just not at that "enterprise" level yet, but what I am seeing from clients is that handling the size and weight of their docs is the more immediate chokepoint of their RAG builds. A scanned PDF can easily reach 300MB+ and legal filings (compiled over time) can run to 3k+ pages. Perhaps the end-game is that you can just dump these files into the chat prompt, but until then RAG still makes sense, and the "scaling" challenge is not necessarily frequency (document RAG is allowed to be slow!) but rather the architecture, ability and resilience to handle large/long documents.
What does this mean for your typical RAG stack?
Well, there are a lot of things you can't really do anymore: run things linearly, process entire docs in memory, VLM OCR every page, return every page to the user, or run on edge devices / small VMs. Naturally, this means dedicated (+GPU) servers, job queues, concurrent workers, object stores, conversion processes, more sophisticated retrieval APIs and doc lifecycle management. This is where RAG goes from simply "standing up a vector store" to "building a small SaaS".
This is ultimately how the Ragextract API (https://t.co/sKCi0bhzlM) was born. Over the past few months, paying customers and metrics have been validating this: document size is a bigger pain point than frequency. Then again, once you have such a system in place, high frequency is also handled quite nicely. If you're looking to scale your document AI workflows, definitely give Ragextract a try today! Follow @subworkflow for more info.
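To make the "no linear runs, no whole doc in memory" point concrete, here is a minimal sketch of splitting a long document into page ranges and handing them to a bounded pool of workers. Everything in it is an assumption for illustration: `page_batches`, `process_document` and `fake_worker` are hypothetical names, the batch size and worker count are arbitrary, and a real build would likely use a proper job queue rather than an in-process thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Tuple


def page_batches(total_pages: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive 1-based (start, end) page ranges covering the document."""
    for start in range(1, total_pages + 1, batch_size):
        yield start, min(start + batch_size - 1, total_pages)


def process_document(
    total_pages: int,
    process_batch: Callable[[int, int], str],
    batch_size: int = 50,
    max_workers: int = 4,
) -> List[str]:
    """Run process_batch over every page range with a bounded worker pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in batch order even though batches run concurrently.
        return list(
            pool.map(lambda r: process_batch(*r), page_batches(total_pages, batch_size))
        )


def fake_worker(start: int, end: int) -> str:
    # Hypothetical stand-in for real work (render, OCR, embed) on one page range.
    return f"pages {start}-{end}"


# A 3k-page filing split into 500-page jobs.
results = process_document(3000, fake_worker, batch_size=500)
```

Each worker only ever sees one page range, so memory use is tied to the batch size rather than the document size.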
Introducing Ragextract - the new name for SubworkflowAI API.
Ragextract's focus is to let users "search" long documents before LLM parsing and reduce OCR bills by up to 90%. Check out the revamped website at https://t.co/s82CzfvXgY
Document Search Release!
Built on top of our very own SubworkflowAI API, Document Search is a fast way to upload and search through documents via the SubworkflowAI web app.
Available today to all SubworkflowAI users. Let us know what you think!
Thinking about anti-patterns of Document Structured Outputs workflows using LLMs/VLMs.
1⃣ Feeding in the entire document - common, as the majority only work with small docs. When your PDFs are 100+ pages long, it quickly becomes expensive, inaccurate and wasteful.
2⃣ Converting to Markdown - old habits die hard! Vision made this an unnecessary step ages ago but you still use it because you haven't found a better way to split the doc (yet!).
3⃣ Catch-all parse/extract/transcribe prompt for all (or very broad) documents. Trades quality+accuracy for time+effort, and is rooted in inflexible page retrieval, i.e. it's awkward to split and parse a document several times, and doing so needs an external store.
This is why I built @subworkflow to master document retrieval. IMO any PDF AI parsing service that doesn't make a big deal about retrieval is not targeted at AI developers or serious document workflows, which is why I don't see them as competition... just a different audience.
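The alternative to these anti-patterns is "retrieve first, parse second". A rough sketch of the idea, under loud assumptions: the toy keyword score below is only a stand-in for a real retrieval index (it is not how Subworkflow works), and `score_page`, `select_pages` and the sample pages are all hypothetical. The point is just that only the selected pages would go on to the expensive LLM/VLM step.

```python
from typing import Dict, List


def score_page(query: str, page_text: str) -> int:
    """Count how many distinct query terms appear in the page (toy relevance score)."""
    terms = set(query.lower().split())
    words = set(page_text.lower().split())
    return len(terms & words)


def select_pages(query: str, pages: Dict[int, str], top_k: int = 3) -> List[int]:
    """Return up to top_k page numbers with a non-zero score, best first."""
    scored = [(score_page(query, text), number) for number, text in pages.items()]
    relevant = [(s, n) for s, n in scored if s > 0]
    # Sort by score descending, then by page number ascending for stable ties.
    relevant.sort(key=lambda item: (-item[0], item[1]))
    return [n for _, n in relevant[:top_k]]


# Hypothetical per-page index text for a short contract.
pages = {
    1: "cover page and table of contents",
    2: "termination clause and notice period",
    3: "payment terms and late fees",
    4: "notice period for termination by either party",
}

# Only these pages would be sent to the targeted parse/extract prompt.
selected = select_pages("termination notice period", pages, top_k=2)
```

With page selection this cheap, running several narrow prompts over different page subsets becomes more practical than one catch-all prompt over the whole document.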
🎨 Rebrand deployed!
Moving on from the dark side, our new redesign is bright and bold. Check it out and let us know what you think! https://t.co/fq4nMWrQkd
Install the Official Subworkflow n8n community node via n8n > Settings > Community Nodes > Install and enter "n8n-nodes-subworkflow-ai"
https://t.co/khv47DhisG
Happy Friday 🎉 We've had a busy week grinding away at the roadmap and are happy to announce 2 new updates...
1) Upload via URL - now you can let Subworkflow fetch from the source! This is an enabler for users on edge devices and no-code platforms who need to send huge files without importing them first. Files must be publicly accessible for the duration of the upload.
2) Official SubworkflowAI n8n node - no more stringing together HTTP nodes! This node covers all API endpoints and handles polling; however, it doesn't support Multipart Uploads just yet. This should make SubworkflowAI templates incredibly easy to build.
Much more to come!
Great news! Pricing update - all plan entitlements increased but prices stay the same 🎉
Our recent beta usage metrics make it clear: a majority of our users will never hit the 100MB file upload limit, and artificial limits cause unnecessary buying friction without adding significantly more value for the customer.
We're strengthening our resolve to serve customers with large-document requirements and increasing just about every entitlement across the board. This means the Starter plan is for everyone rather than a capped version of Standard, and our Enterprise plan is a custom deployment option where we build a plan to spec.
For detailed plan comparison, see our documentation https://t.co/Rfn4zUwLVN
There's no better time to sign up for Subworkflow. Check it out with a 14-day free trial at https://t.co/3e04V7C8lY
Happy to announce the SubworkflowAI TypeScript SDK is now available on npm and as an open-source GitHub repo. This will allow Node.js developers to easily integrate SubworkflowAI into their backend services. Check it out here:
https://t.co/JNRHWmCmbD
Excited to start using Gemini 3? So are we!
We don't regret not bundling LLMs into our RAG service precisely because we know developers want to learn and exploit these models for themselves.
Looking over the new model specs, it seems @subworkflow is still very relevant for large document RAG and structured output workflows. Want your Gemini-powered app to handle 100MB+ documents for multiple users simultaneously? We've got you covered! Head over to https://t.co/b3V8TLNlRo for details.
Hey everyone, our cloud service provider is currently experiencing issues which are affecting our API. This may result in temporary disruption to our service. We are monitoring this and will share more updates soon. Any affected users, please email us at [email protected]
AI FREELANCERS & CONSULTANTS
SubworkflowAI uses its own MultipartUpload API to handle documents of up to 3GB. We've just got around to getting the docs up for it here with example code - https://t.co/lGl7SUemZB
One of our priorities for the next month will be working on the official Subworkflow TS/JS SDK which should help hide a lot of this complexity and make our upload functionality easier to integrate.
Note: You must be on the Standard Plan or above to use this endpoint. Sign up for a free trial at https://t.co/vECV9k5H21
Question: How are SaaS companies handling their customers' gigabyte uploads?
Just got around to writing up docs for @subworkflow's MultipartUpload API (https://t.co/uNf8dXuLu6) but realise it's not as simple as I'd like... in a multipart-upload flow, the developer still has to grab a copy of the file into memory.
Feels like the obvious next step to make this easier is to be able to get the file directly from the source. First thought is just to go with Google Drive as a good first integration - the idea being once authenticated, you'll be able to just pass a google drive link instead.
Ok. Might actually be time to look for some dev help for this project!
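For anyone wondering what the multipart side looks like on the client, here is a minimal sketch of reading a file in fixed-size parts so only one part sits in memory at a time. It deliberately does not call any Subworkflow endpoint: `iter_parts` is a hypothetical helper, the part size is arbitrary, and the actual upload of each part is left out. The file still has to be reachable locally, which is exactly the limitation described above.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def iter_parts(stream: BinaryIO, part_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (part_number, data) pairs, holding only one part in memory at a time."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    part_number = 1
    while True:
        data = stream.read(part_size)
        if not data:
            break
        yield part_number, data
        part_number += 1


# In-memory stand-in for a large file opened with open(path, "rb").
fake_file = io.BytesIO(b"x" * 25)

# In a real flow each part would be uploaded here instead of measured.
part_sizes = [(number, len(data)) for number, data in iter_parts(fake_file, part_size=10)]
```

The last part is usually shorter than the rest, so whatever receives the parts needs to accept a final part below the nominal size.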
Messaging is always the hardest part.
We're built for large documents but ultimately, we're a RAG Backend API for you to quickly build production-quality RAG applications. The support for large docs just means you can build sturdier apps that cover more industry use-cases.
Congrats to the Gemini team for the launch of the File Search Tool! Hard to not take the opportunity to do a little comparison with @subworkflow since we're after the same problem space.
Key differences
* Subworkflow supports files larger than 100MB (up to 3GB)
* Subworkflow has a higher limit on file stores (we call them datasets)
* Subworkflow generates image embeddings by page image rather than text chunks which gives you grounding without the extra step.
* Subworkflow doesn't tie you into Gemini or any LLM! We just handle the retrieval part, you bring the AI!
* Subworkflow is an API and not an AI tool.
* Subworkflow does handle fewer file types; however, tbh I never accounted for shell script support in RAG - is that really a thing?
Same problem space, probably not exactly the same audience.
We're looking for beta testers!
Decided against building something too complex for beta testers in the app, and ended up just gifting 10 Standard Plan seats with a gentleman's agreement to catch up at some point.
Spaces still open if you're interested: https://t.co/QfVhy0kMo5
I've been building @subworkflow for the last 3 months and boy oh boy, am I freaking exhausted!
Hi friends 👋 I've been heads down in order to get one of my last 2025 goals up and running by year end. It's a simple tool that's been dwelling heavily on my mind but would have required much more than a few weekends to build, so I hesitated. During the summer, I finally decided to commit and even started a new company - Subworkflow AI - to make it official! It's been a few months and the good news is that I haven't quit yet 😅 and in fact, the product is in the final stages for public release.
So much to write about but I think it's best to save it for future posts... In short, I massively underestimated a seemingly simple SaaS idea! Building the API wasn't necessarily the hard part - it was one of the first things I completed. What tripped me up was everything else: branding, figuring out the pricing model, the drag of company registration, juggling 5-6 repos/websites... After years of working in the industry, it definitely feels like I'm back at square one, fumbling my way through it all.
Anyway, the long silence is hopefully broken now that @subworkflow is actually running in production for some select clients, and I plan to make the SaaS my primary focus for the upcoming year. The service isn't ready for public registration just yet (running a closed beta to stress test!) but check out the marketing site (https://t.co/JG2vTpGJyF) and let me know what you think. Cheers!
PS. If you've followed me for n8n content, no worries, I'm eager to get back into n8n stuff too. It's actually been a blessing to have been in the dark and to come back to the community with fresh eyes!