day 49 of building @meshapi_ai
we're focused on helping AI-native teams build systems that are fast, scalable, and reliable in production.
big piece of that is role-based access control and org management - almost there
#RBAC is scheduled to be released in the coming week
years of slow progress, then gpt-3 hits and the chart just goes vertical.
and these are only a few frontier models. genuinely no idea what 5 years from now looks like.
day 48 of building @meshapi_ai
day 46 of building @meshapi_ai
Suddenly Qdrant failed and the entire RAG went haywire. #RAG is fixed now.
As asked by our customers - we added a user-set timeout option and cleaned up the supported-models page for OpenAI and Vertex.
Now working on auto-recharge and updating API docs.
day 45 of building @meshapi_ai
keeping up with which model to actually use has quietly become a full-time job.
opus 4.8, gpt-5.5, deepseek v4, kimi, glm… the "best" one changes almost every week.
so we're building live rankings into Mesh - what's worth using right now, by price + performance + task.
sneak peek (DM if you want to contribute)
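a rough sketch of what ranking "by price + performance + task" could look like. everything here is an assumption for illustration: the `ENTRIES` rows, the model names, the scores, and the `price_weight` knob are invented, and this is not how Mesh actually computes its rankings.

```python
# Hypothetical rows: (model, task, quality score 0-1, USD per 1M tokens).
# All values are invented for illustration.
ENTRIES = [
    ("model-a", "coding", 0.90, 15.0),
    ("model-b", "coding", 0.80, 2.0),
    ("model-c", "coding", 0.60, 1.0),
    ("model-a", "summarization", 0.85, 15.0),
]


def rank(entries, task, price_weight=0.5):
    """Rank models for one task, trading quality against normalized price."""
    rows = [e for e in entries if e[1] == task]
    if not rows:
        return []
    max_price = max(e[3] for e in rows)

    def score(entry):
        quality, price = entry[2], entry[3]
        # Higher quality raises the score; a higher relative price lowers it.
        return (1 - price_weight) * quality - price_weight * (price / max_price)

    return [e[0] for e in sorted(rows, key=score, reverse=True)]
```

with `price_weight=0.0` this degrades to a pure quality leaderboard; raising it pushes cheaper models up the list.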
day 44 of building @meshapi_ai
we just shipped a new #RAG parser, plus the option to bring your own
we realized that most RAG failures don't start at the model. they start at parsing.
a PDF is just positioned text on a page - if you flatten it directly, tables turn to word salad and columns get read across instead of down. then you chunk that mess, embed it, and wonder why retrieval pulls nonsense.
a layout-aware parser reads structure first. tables stay tables, reading order survives, the chunks actually mean something.
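a toy sketch of the reading-order problem, assuming every word arrives with x/y page coordinates. the `Word` type, the sample `PAGE`, and the single fixed `gutter_x` split are hypothetical and far simpler than a real layout-aware parser (this is not Mesh's parser).

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge on the page
    y: float  # distance from the top of the page


# Hypothetical two-column page: left column at x=50, right column at x=300.
PAGE = [
    Word("Left-1", 50, 100), Word("Right-1", 300, 100),
    Word("Left-2", 50, 120), Word("Right-2", 300, 120),
]


def flatten_naive(words):
    """Sort top-to-bottom, then left-to-right: each line crosses the gutter."""
    return [w.text for w in sorted(words, key=lambda w: (w.y, w.x))]


def flatten_by_column(words, gutter_x):
    """Read the whole left column first, then the whole right column."""
    left = sorted((w for w in words if w.x < gutter_x), key=lambda w: w.y)
    right = sorted((w for w in words if w.x >= gutter_x), key=lambda w: w.y)
    return [w.text for w in left + right]
```

the naive version interleaves the two columns line by line, which is exactly the "read across instead of down" failure; the column-aware version keeps each column's text together before chunking.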
day 43 of building @meshapi_ai
Masssiiivvveee upgrade to our CLI - it plans now
give #Mesh a task and:
- it figures out the steps itself,
- tracks them as a live checklist,
- reads and writes files,
- runs shell commands, and
- spins up dev servers in the background
working across many steps without losing the thread.
the part no other agent CLI does: it runs on any model on Mesh - Claude, GPT, Gemini, Llama, 300+ in all. pick one, switch mid-run, or let Mesh route every call (cheapest / fastest / balanced).
stop picking a model and praying. just let Mesh route intelligently.
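a minimal sketch of what cheapest / fastest / balanced routing could mean, under stated assumptions: the `ModelStats` fields, the model names, the numbers, and the "balanced" formula (sum of normalized cost and latency) are all invented here and are not Mesh's actual routing logic.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    name: str
    usd_per_1k_tokens: float
    p50_latency_ms: float


# Made-up numbers, for illustration only.
CANDIDATES = [
    ModelStats("model-a", 0.50, 900.0),
    ModelStats("model-b", 2.00, 300.0),
    ModelStats("model-c", 1.00, 500.0),
]


def route(candidates, strategy):
    """Pick one model according to the requested strategy."""
    if strategy == "cheapest":
        return min(candidates, key=lambda m: m.usd_per_1k_tokens)
    if strategy == "fastest":
        return min(candidates, key=lambda m: m.p50_latency_ms)
    if strategy == "balanced":
        # Assumed definition: lowest sum of cost and latency,
        # each normalized by the worst value among the candidates.
        max_cost = max(m.usd_per_1k_tokens for m in candidates)
        max_latency = max(m.p50_latency_ms for m in candidates)
        return min(
            candidates,
            key=lambda m: m.usd_per_1k_tokens / max_cost
            + m.p50_latency_ms / max_latency,
        )
    raise ValueError(f"unknown strategy: {strategy}")
```

calling `route(CANDIDATES, "balanced")` on each request, rather than once per session, is what would make switching models mid-run possible in a setup like this.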
day 42 of building @meshapi_ai
today we started working with a new #D2C brand. not a typical LLM gateway customer profile, which is what makes it interesting.
enterprise B2B is where most gateways focus. consumer brands are running real LLM workloads now and nobody's serving them well.
rest of the day was bug fixes, cleanup, and research into lowering latency. most days look like this when you're shipping something real.
day 41 of building @meshapi_ai
Realtime and RAG are both live on prod. all reported bugs cleared.
structured outputs enabled. docs updated for both.
also started experimenting with Google Ads today. first time paying for traffic. curious how it plays.
started customizing for our biggest enterprise client yet. more soon
day 40 of building @meshapi_ai
- Realtime API fixed and shipping to prod.
- RAG has one bug left: DOC extraction is breaking.
- also found API key cap validation edge cases.
- 3 UX gaps caught during testing, all going on the fix list before launch.
day 39 of building @meshapi_ai
a friend downloaded a popular agent template last week, 50k+ stars on @github - whole setup crashed within an hour.
after going through 100s of repos and building dozens of projects in the last 30 days, I realized - most viral templates and blueprints shared on @X just don't work.
thinking of putting together a place where we test these and rate them transparently - with proper documentation.
would this be useful and would you contribute?
• The seven missing categories (incl. multilingual non-Latin scripts + hyper-complex boss-fight prompts)
• Stretch lineup: Gemini 2.5 Flash Image ("Nano Banana"), Imagen 3, GPT-Image-1
• Third Gemini-vision judge for own-provider-bias triangulation
• Automatic metrics: CLIPScore + VQAScore + LAION aesthetic
• 50-prompt human calibration set
Repo + raw CSVs + paper: https://t.co/Aj1H44urv1
If you run it on your own prompts and get different numbers, reply or drop an email to [email protected]
day 38 of building @meshapi_ai
#Mesh Dollar per Image Bench
A holistic cost, quality, latency pilot benchmark of five 2026 frontier text-to-image models, all routed through one API gateway (https://t.co/l5eGKQbvm9).
n=5/model, fully reproducible.
paper: https://t.co/Aj1H44urv1
Pilot caveats, hold us to all of them:
• n=5/model
• 5 of 12 prompt categories sampled (long-prompt, knowledge, counting, neg-space, edge-case, multilingual, hyper-complex deferred to full run)
• Single region, single time, single seed
• 2-judge ensemble, no human eval yet
• Mesh only routes OpenAI + Google for image; FLUX/Midjourney/Ideogram absent
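for anyone rerunning this on their own prompts, a sketch of how per-image runs might be rolled up into cost, quality, and latency per model. the record fields (`model`, `usd`, `latency_s`, `judge_scores`) and the `quality_per_usd` ratio are assumptions made for this example, not the schema of the released CSVs or the metric defined in the paper.

```python
from statistics import mean


def summarize(runs):
    """Aggregate per-image runs into per-model cost, quality, and latency.

    Each run is a dict with hypothetical keys:
    model, usd, latency_s, judge_scores (one score per judge).
    """
    by_model = {}
    for run in runs:
        by_model.setdefault(run["model"], []).append(run)

    summary = {}
    for model, rows in by_model.items():
        # Average the judges per image first, then average across images.
        quality = mean(mean(r["judge_scores"]) for r in rows)
        usd = mean(r["usd"] for r in rows)
        summary[model] = {
            "n": len(rows),
            "usd_per_image": usd,
            "mean_quality": quality,
            "mean_latency_s": mean(r["latency_s"] for r in rows),
            "quality_per_usd": quality / usd,
        }
    return summary
```

with n=5 per model the averages are noisy, so differences between models should be read alongside the caveats listed above.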