I just scraped pricing pages for the top 86 Shopify apps across Reviews, Email, Bundles, Loyalty, and Subscriptions.
The patterns indie devs miss are pretty obvious once you see them all at once.
5 things I didn't expect:
built one too. the thing that fought me most was asc modeling versions and builds as separate resources when devs treat them as one unit, so half the flows need an extra relationship lookup. the wwdc api i'd most want is anything that kills the 20-min jwt re-sign dance mid-batch. what's on your list?
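one way to take the sting out of the re-sign dance mid-batch, as a rough sketch: cache the token and only re-sign when it is close to expiry. `TokenCache` and the `sign` callback are made-up names for illustration; the actual ES256 signing is left to whatever jwt library you already use, and the 60-second safety margin is an assumption.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Reuse a signed token until it nears expiry, then re-sign once."""

    def __init__(
        self,
        sign: Callable[[int], str],      # hypothetical: takes an expiry timestamp, returns a signed JWT
        lifetime_s: int = 20 * 60,       # the 20-minute window mentioned above
        margin_s: int = 60,              # assumed safety margin before expiry
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._sign = sign
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0

    def get(self) -> str:
        now = self._clock()
        # Re-sign only when there is no token yet or it is inside the margin.
        if self._token is None or now >= self._expires_at - self._margin_s:
            self._expires_at = int(now) + self._lifetime_s
            self._token = self._sign(self._expires_at)
        return self._token
```

call `cache.get()` right before every request in the batch; most calls return the cached token and the re-sign happens at most once per window.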
the first auto-renewable sub in a group only gets reviewed when it's attached to an app version submission. apple won't pick it up standalone even though the ui lets you submit it alone. push a tiny build bump with the sub selected in that version's in-app purchases section, that's what usually unsticks it. later subs in the group get reviewed on their own.
@asmartbear true, that's the trap version - you feel close enough to skip the asking. the tell for me is surprise: when nothing users say catches me off guard anymore, that's usually me having quietly decided i'm the customer. taste is fine as a hypothesis, dangerous as a verdict.
@tuna_maker the volume is wild. worst part is they're all titled almost the same, so the one that needs action (export compliance, metadata rejected) hides in the "your build is processing" noise. i filter them now just to surface the 2 that actually matter. which ones do you even read?
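the filter itself can be tiny. a minimal sketch, assuming you already have the subject lines as strings; the two patterns come from the examples above, and `surface` is a made-up helper name, not anything from a mail api.

```python
# Phrases that signal an email needs action (taken from the examples above).
ACTION_PATTERNS = ("export compliance", "metadata rejected")


def needs_action(subject: str) -> bool:
    """True if the subject mentions one of the action-required phrases."""
    lowered = subject.lower()
    return any(pattern in lowered for pattern in ACTION_PATTERNS)


def surface(subjects: list[str]) -> list[str]:
    """Keep only the subjects that need action, in their original order."""
    return [s for s in subjects if needs_action(s)]
```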
never too late. the part that bites later: UTMs tell you which channel sent the *click*, not which sent the *customer*.
at indie scale a source can pour in tagged traffic and convert nothing. the useful version is naming campaigns you can reconcile against signups, not just sessions.
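a rough sketch of what reconciling looks like, assuming you can export one campaign name per session and one per signup (the `reconcile` function and its field names are made up for illustration):

```python
from collections import Counter
from typing import Iterable


def reconcile(sessions: Iterable[str], signups: Iterable[str]) -> dict:
    """Per campaign: session count, signup count, and signup rate."""
    session_counts = Counter(sessions)
    signup_counts = Counter(signups)
    report = {}
    for campaign in sorted(set(session_counts) | set(signup_counts)):
        n_sessions = session_counts[campaign]  # Counter gives 0 for missing keys
        n_signups = signup_counts[campaign]
        report[campaign] = {
            "sessions": n_sessions,
            "signups": n_signups,
            # Guard against signups attributed to a campaign with no sessions.
            "rate": n_signups / n_sessions if n_sessions else 0.0,
        }
    return report
```

a campaign with lots of sessions and a rate near zero is the "tagged traffic that converts nothing" case.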
@DanKulkov the fix that actually worked for me: turn marketing the old app into a build task. one aso listing rewrite, ship it like a feature with a checklist and a commit. same dopamine circuit, just pointed at the boring thing. willpower never won this for me, reframing did
adding skills is the easy part. the hard part is triggering - a skill only fires if its description matches how people actually phrase the ask, not how the author imagined it. you matching on descriptions semantically or keyword? that's where most skill setups quietly pick the wrong one
the part that got worse this year: agents run npm install on their own now. postinstall scripts execute arbitrary code at install time, and an agent has zero instinct to pause on a sketchy package the way a tired human sometimes does. we didn't just keep the attack surface, we removed the one reviewer who occasionally hesitated. you sandboxing the agent's install step, or the whole env?
the wild part isn't otto doing the clicks. it's that you said you don't even know what a sandbox account is, but it shipped anyway. competence outran your ability to audit it. asc pricing has quiet footguns - wrong territory tier, a product stuck in draft state - that fail silently. how do you check otto got it right vs just got it done?
@tibo_maker the thing that quietly breaks is the vouch. a creator's value was "i actually use this." license the face + let ai script it and the brand is renting trust without the judgment that built it. holds up until the audience catches one take the creator never would've made themselves.
@tranmautritam agree on output, but the 10x mostly moves the bottleneck up a layer. when generating variants is cheap, the scarce skill becomes killing the bad ones fast. taste carries the weight the tedium used to. fighting AI isn't the real risk, never building judgment is.
daily briefings on-device is the fun one. the model's the easy part, the trigger is where it gets hairy. BGTaskScheduler wakes when ios feels like it, so "daily" quietly becomes "sometime in a window". are you generating lazily on app open or scheduling a background task and eating the unpredictable wake time?
been running this split across a few apps. the part that surprised me: the reviewer catches error-path and null-handling gaps the writer model glosses over.
writer optimizes for the happy path it was prompted on. reviewer has no stake in that path so it pokes the edges.
do you let codex see your intent, or just the diff?
@tdinh_me fan-out makes sense for stuff one context can't hold - big migrations, multi-file review, research sweeps.
"$1,000 mrr" though is a distribution problem. no fleet of subagents fixes that part.
have you hit a task yet where orchestration actually beat just running it inline?
@DanKulkov lifetime is underrated for the first $1-5k phase. forces shipping speed + cashflow without billing infra. trade-off comes later when usage scales and you can't reprice. did you cap seats or features on the lifetime tier, or fully unlimited?
@tibo_maker @gregisenberg @DekmarTrades the offline triage angle is the sweet spot. scrolling 1k+ replies in browser kills your judgement by reply 50. cli + grep-able output means you can actually filter signal vs the same 200 "great post!" variants. is it streaming the analysis or batching the full set first?
@rauchg deeper trap isn't the aggressive target, it's agents that can also touch what gets measured. when the verifier of p99.9 lives inside the agent's reach, goodhart compounds into a self-reinforcing loop
my read: not lazy. when reward signal goes sparse, agents bail. $1000 is too abstract and far - otto can't see progress so it gives up. picking 'aso keywords' = literally the only verifiable signal it has in its environment.
what helped me on similar agent loops: inject short-horizon intermediate reward (daily metric delta, install count, anything observable in <24h) and let the top-level goal stay implicit. the higher-order harness probably ends up being a goal-decomposition layer plus external metric feed.
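the daily-delta version of that reward is about as simple as it gets. a minimal sketch, assuming one observed metric value per day (installs, signups, whatever you can read in under 24h); `short_horizon_rewards` is a made-up name, not part of any agent framework.

```python
from typing import Sequence


def short_horizon_rewards(daily_metric: Sequence[float]) -> list[float]:
    """Day-over-day change in an observable metric, one reward per day after the first."""
    return [today - yesterday for yesterday, today in zip(daily_metric, daily_metric[1:])]


# Example: install counts over four days.
print(short_horizon_rewards([100, 104, 103, 110]))  # [4, -1, 7]
```

the agent gets feedback every day, and the top-level goal never has to show up in the reward at all.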