Yeah, to clarify, I didn't even fully reach the stage of running large-scale parallel agents across multiple stores. But yes, that would be the next stage in making this idea a reality.
The authentication friction was already becoming a blocker much earlier in the process. Even with a single dev store, every fresh cloud environment/agent session would require Shopify authentication again, often with captcha challenges.
So the core POC worked: the agent could launch the app in dev, test flows, and generate videos. But operationally it became difficult to automate reliably enough to scale further into multi-agent/multi-store testing.
For multi-agent and multi-store setups, we would definitely need isolated dev/testing environments as well, since each agent launches in a completely clean state with no browser history, session, or previous authentication data available.
Mostly yes. I was primarily using the same dev store for repeated testing flows.
The issue was that each new cloud environment/agent session behaved like a fresh machine/browser session, so Shopify authentication still had to be repeated frequently even when targeting the same dev store. The agents would lose authenticated state between environments, which triggered repeated logins and captchas.
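One generic way to soften this kind of state loss is to persist session state outside the environment and restore it when a fresh one starts. The sketch below is only an illustration under assumptions: `save_session`, `load_session`, the cache directory, and the shape of the `state` dict are all hypothetical, and nothing in this thread confirms that Shopify's login flow would accept a restored session or skip the captcha.

```python
import json
import time
from pathlib import Path
from typing import Optional


def save_session(cache_dir: Path, store: str, state: dict, ttl_seconds: int) -> Path:
    """Write session state for one store, stamped with an expiry time."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{store}.json"
    payload = {"expires_at": time.time() + ttl_seconds, "state": state}
    path.write_text(json.dumps(payload))
    return path


def load_session(cache_dir: Path, store: str) -> Optional[dict]:
    """Return cached state for the store, or None if missing or expired."""
    path = cache_dir / f"{store}.json"
    if not path.exists():
        return None
    payload = json.loads(path.read_text())
    if payload["expires_at"] <= time.time():
        return None  # stale: fall back to a manual login
    return payload["state"]
```

A fresh environment would call `load_session` first and only fall back to the manual Slack link when it returns `None`. Cached session data is sensitive, so a real setup would need to keep it in a secret store rather than a plain file.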
I was using Cursor Cloud Environments with the codex-5.3 model as the agent harness.
The workflow was roughly:
I would prompt the agent in Cursor to test a specific app feature.
Cursor would spin up a cloud environment automatically.
The agent would clone the repo, create the test environment, and attempt to run the Shopify app in dev mode.
I would receive a Slack message with a Shopify authentication link.
After manually authenticating with Shopify (and often solving captchas), the agent could continue the flow.
The agent would then interact with the app, test the feature, and capture a video of the workflow/results.
The goal was to eventually parallelize this across many agents so every PR could automatically regression-test different checkout/app flows and generate videos whenever something broke.
The biggest blocker was Shopify authentication in cloud environments. Since each fresh agent/environment required manual login + captcha handling, it became difficult to scale or fully automate reliably.
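The parallel regression idea described above could be sketched roughly as follows. This is a minimal illustration, not the setup that was actually run: `FlowResult`, `run_flows`, and `fake_agent` are hypothetical names, and `fake_agent` is a stand-in for a real cloud agent that would launch the app, test one flow, and record a video.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class FlowResult:
    flow: str
    passed: bool
    video_path: str  # where the agent saved its recording (hypothetical)


def run_flows(
    flows: Iterable[str],
    run_one: Callable[[str], FlowResult],
    max_agents: int = 4,
) -> List[FlowResult]:
    """Run each flow on its own worker and return only the failures."""
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        results = list(pool.map(run_one, flows))
    return [r for r in results if not r.passed]


def fake_agent(flow: str) -> FlowResult:
    # Stand-in for a real cloud agent; pretends the discount flow broke.
    return FlowResult(flow, passed=(flow != "discount"), video_path=f"videos/{flow}.mp4")
```

Calling `run_flows(["checkout", "discount", "shipping"], fake_agent)` returns just the failing discount flow with its video path, which is the piece a PR check would surface for review.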
@nick_wesselman I asked my agent to run the app in dev in the cloud, create and test unreleased app features, and record videos of the tests, which I can review to see whether each feature works.
The idea was to automate feature testing with cloud agents, running many of them in parallel to test every app feature, with an agent showing me a video whenever a feature breaks. For example, in this video Cursor runs the app in dev in the cloud and creates this POC video for me, all by itself from a prompt, while testing a feature of the app.
But the biggest pain point is authentication with Shopify, which has to be done manually for each agent in the cloud and constantly asks for a captcha. Because of that, I had to abandon the idea of automated feature testing on each new PR. If this became easier, it would be amazing: with every feature tested in the cloud before each release, I could be confident that a new release hasn't broken any existing feature.
Always an amazing feeling to hear merchants loving my Shopify app 🥹
Messages like these make all the late nights, debugging sessions, and support chats worth it.
Knowing that @CheckoutRules solved a problem someone had been searching for a solution to for years is the best part of building
Rethinking Polaris web components after last night. Out of nowhere, the table header just disappears. No code changes on my end. That happens (of course) to coincide with a bug in my export. I was going to tell the customer, "hey, just use the filters for now". But, guess what? Filters are gone. I can't fix the web component b/c it is a black box. So, I apply a quick fix to move the filter out of the table header at the same time as fixing the export bug. Got it done. Profusely apologized. Crisis somewhat averted. But my reputation with that customer is now damaged, all for a few lines of CSS that I probably could have written myself. In retrospect, I think using a Polaris CSS file with standard HTML components would have been better.
@eComJonathanX Thanks for the tip! Yes, it's better, but it would be a lot easier if it were just:
App -> 3 dots in top right corner -> Share logs
Or better, App Bridge could expose an intent to share logs.
Helping merchants find “Share logs” in Shopify is so painful.
It’s buried behind:
Settings → Apps → App → Functions card → 3-dot menu
This is the kind of thing that should be one click, not a scavenger hunt. 🤷
@blairbeckwith @heymantle in case you missed it, I posted all the details here about pain points with the emails in @heymantle
I appreciate the support if these can be resolved.
@heymantle lots of great ideas here on how to improve the email editor.
Currently there is so much friction in creating even a few emails using Mantle; many steps are unnecessarily hard.
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)