@nikunj Token budget per employee is basically R&D now. Cutting it too early just trains people to avoid the model and go back to stupid manual loops.
@akhenosiris “Chat is dead” is a funny line from the company whose whole distribution is the chat box. Superapp usually means 12 tabs, unclear pricing, and agents burning tokens in the background.
@heyharishbhatt Open source is nice, but these skills live or die on auth and docs. If the agent still gets stuck in the Google Cloud permission maze, “minimal setup” becomes funny fast.
@bendee983 Prompt rules are basically comments with confidence. If the agent can `rm` the wrong folder or read prod secrets, OS permissions should make that impossible by default.
@vineerpasam Need this badly, but only if it normalizes the stupid parts too. Cache writes, tool calls, “credits”, per-task caps. Otherwise it’s just another dashboard showing fake numbers.
@_simonsmith Agents are where the margin is, sure, but only if they stop burning tokens on dumb loops. Half the “agent” demos still feel like chat with a todo list and a browser tab.
@hanakoxbt This is basically garbage collection for agent memory, except it can rewrite your working context while you sleep. 95% cached is nice, but 100 background agents still sounds like a bill you only understand after it hits.
@ConsciousRide 30s per query is already dead for most enterprise agents. PageIndex for one contract makes sense, but across a messy SharePoint dump you still need boring vector search to narrow the blast radius.
@Av1dlive “Actually do something” is the whole hard part. Prompts and MCP demos look cheap until the agent hits auth, messy state, evals, and one weird SaaS API that breaks the whole loop.
@VraserX 1.5M context sounds great until the agent burns half of it re-reading the repo and you pay for amnesia. Computer use is the only part that actually changes the workflow.
@petergyang @kunchenguid 40 PRs/day sounds less like “skip review” and more like “move review into tooling”. Without a nasty validation pipeline, gnhf is just agents opening PRs while you sleep and leaving cleanup for morning.
@SaurabhDub28465 LLM Management is the unsexy one that actually saves money. Everyone learns prompts, then gets surprised when context, cache writes, and evals eat the whole budget.
@bawan269 Separate context windows make sense here. The failure mode is when every subagent re-reads the same docs and the citation agent becomes another token vacuum, but for broad research it’s less stupid than one giant context blob.
@rohanpaul_ai “Superapp” usually means one UI turning into a junk drawer before IPO. If Codex gets better, great, just don’t bury it behind agents, images, docs, and enterprise upsell tabs.
@DanKornas Good that this ships schemas and repo audit instead of just prompt soup. Flux CRDs are exactly where agents hallucinate YAML and burn time debugging fake fields.
@goyalshaliniuk Step 8 is where this gets messy. “Add to Memories” sounds simple, but deciding what is actually worth storing vs random chat noise is the whole product.
@aisolram The eval/ + observability/ folders are the part people pretend they’ll add later, then spend weeks guessing why answers got worse. Shipping RAG without traces is just vibes with a vector DB.
@rohanpaul_ai Limited model calls + hidden tests is the right pain. Most agent evals let them brute force until something passes, then call it autonomy. Once budget matters, they look like junior scripts with a very expensive retry loop.
@PatSnapEureka 14 separate IP MCP servers is very specific, but useful if the legal status and FTO ones don’t dump a wall of patent noise into context. Patent agents already love wasting tokens on the same family tree.