@saen_dev Yes, it absolutely doesn't account for human costs at all. Treat this as the lower bound: if the compute costs don't work out, then usage clearly isn't high enough to justify switching.
Should you self-host your LLM, or just keep paying the API bill?
I kept getting asked this and kept giving hand-wavy answers, so I built a thing that actually answers it.
You plug in three numbers — queries a week, avg input tokens, avg output tokens — pick the API you're comparing against, and it tells you the largest open-weight model you can self-host for the same money or less. It also shows graded options at 80%, 50%, 20% savings, so you can pick how much quality you're willing to trade for cost.
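For the curious, the selection step described above could be sketched roughly like this. It's a minimal illustration, not the tool's actual code: the `Candidate` type, the `largest_within` helper, and the three catalog entries with their costs are all made-up placeholders.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    name: str          # hypothetical model label
    params_b: float    # total parameters, in billions
    weekly_usd: float  # estimated self-hosting cost per week


def largest_within(
    candidates: Sequence[Candidate],
    api_weekly_usd: float,
    min_savings: float = 0.0,
) -> Optional[Candidate]:
    """Return the largest model whose weekly cost is at most
    api_weekly_usd * (1 - min_savings), or None if nothing fits."""
    budget = api_weekly_usd * (1 - min_savings)
    affordable = [c for c in candidates if c.weekly_usd <= budget]
    return max(affordable, key=lambda c: c.params_b, default=None)


# Placeholder catalog: names, sizes, and costs are invented for illustration.
catalog = [
    Candidate("model-a", 8, 20.0),
    Candidate("model-b", 70, 90.0),
    Candidate("model-c", 400, 300.0),
]

api_bill = 500.0  # hypothetical weekly API spend
for savings in (0.0, 0.2, 0.5, 0.8):
    pick = largest_within(catalog, api_bill, savings)
    print(f"{savings:.0%} savings target:", pick.name if pick else "nothing fits")
```

Using parameter count as the stand-in for quality is itself an assumption; a real ranking would want benchmark scores as well.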
Tried it on a realistic small-team RAG setup: 10k queries/week, 20k input tokens (RAG context is long), 1k output, vs Claude Sonnet 4.6.
API bill: $750/wk. Largest open-weight model that fits the same budget: a ~480B MoE (35B active) for $243/wk. 68% cheaper. 8×H200, scale-to-zero, about 7 billed GPU-hours a week. Want to push it further? A ~397B MoE comes in at $119/wk — 84% off. 50 different open-weight models fit under that API budget. Pick the one that matches your quality bar.
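If you want to sanity-check those numbers, here's a rough sketch of the arithmetic. The per-token rates ($3 per million input tokens, $15 per million output tokens) are my assumption for this example; they happen to reproduce the $750 figure, but check the live rates before relying on them. The function names are just for illustration.

```python
def weekly_api_cost(
    queries_per_week: int,
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Weekly API spend in USD, given average tokens per query."""
    input_cost = queries_per_week * input_tokens / 1_000_000 * usd_per_m_input
    output_cost = queries_per_week * output_tokens / 1_000_000 * usd_per_m_output
    return input_cost + output_cost


def savings_pct(api_cost: float, self_host_cost: float) -> int:
    """Percentage saved by self-hosting, rounded to a whole number."""
    return round((1 - self_host_cost / api_cost) * 100)


# Assumed rates: $3 / 1M input tokens, $15 / 1M output tokens.
api = weekly_api_cost(10_000, 20_000, 1_000, 3.0, 15.0)
print(api)                    # 750.0
print(savings_pct(api, 243))  # 68
print(savings_pct(api, 119))  # 84
```

Input tokens dominate here: 200M input tokens a week account for $600 of the $750, which is why long RAG contexts shift the math so much.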
The chart plots every candidate on a log-log cost vs size view with the API price as a reference line. You can eyeball the trade-off instead of squinting at a spreadsheet.
Pricing stays current on its own. API rates and the open-weight catalog come from https://t.co/Z5DIl0ec0H — big thanks to that team, an open community-run model database is the kind of thing this space genuinely benefits from. GPU rates refresh nightly.
Free. No signup. Link in the first comment. If the recommendation feels off for your use case, tell me — those are the cases I want to fix.
Applications for Curiosity 2026 close tonight.
If you're building in hard tech and frontier sciences in India, this is your last chance to apply.
₹1.25 Cr+ in grants, $85K in OpenAI credits, live demos and booths for a science-faire style showcase.
🧵(1/8) An @OpenAI internal reasoning LLM achieved an AI Math milestone: solving an open problem central to its mathematical subfield, in this case the unit distance problem of discrete geometry.
We came across it in a side quest to truly push our model on the hardest problems.
Now open!
Applications for Curiosity 2026: a curated showcase of deep science & hard tech by South Park Commons, India.
A grant pool of...*drumroll*... ₹1.25 Cr+, powered by our friends at @artparkindia — and more credits & prizes (which we'll announce soon!)
Selected applicants will demo their builds science-faire style — in front of investors, judges, fellow builders and institutional leaders.
@levie He forgot to add: For now*
They should track human time on their surfaces over time, and if that's declining then it's at significant risk of being subsumed
@ponnappa @_anshulk It's a question of the regulatory and capital ecosystem, where people are excited about building a new future with technology & have an appetite for long-term capital risk
Agree with @_anshulk. We have the people, or they're easily acquirable.