Building user-aware agents. Creator/maintainer of Remnic: open-source memory + context for AI agents. Agentic commerce, MCP, evals, Magento/Adobe Commerce.
I’m working on user-aware agents: AI systems that remember responsibly, understand user context, model intent, learn preferences, and get useful work done with fewer unnecessary interruptions.
My open-source project, Remnic, explores memory and context for agents:
- scoped memory
- provenance
- retrieval quality
- correction
- boundaries
- evals
- ask-versus-act decisions
- MCP / HTTP access
The thesis:
Agents will not become truly useful by asking humans more questions. They will become useful by remembering responsibly, modeling intent, learning preferences, and acting with better context.
My background is 25+ years shipping commerce systems: Magento, Adobe Commerce, ERP integrations, checkout, deployment, client-facing architecture, developer education, and founder/operator work.
Commerce is my proving ground. Human-agent systems are the larger work.
More: https://t.co/Dcafyuaqol
I guess the 10X promo from gpt-5.5 launch party expired? I thought it was good until ~June 5th, but I just went from 60% weekly usage left to 0% in about 10 minutes under the same load as I've been running for weeks - is that right, @thsottiaux? This is on the $200 pro plan.
@fitchmultz @jturntdev I do wonder if they are in the middle of fixing it or confirming something. My usage limit after the reset was still burning crazy fast until a few hours ago, but then an hour or two ago it slowed down, under the same workload.
It is odd they are being so quiet, not like them.
@jturntdev @thsottiaux @sama @OpenAIDevs I haven't seen a reply/post yet, but I just noticed my limits reset... let's see if the underlying issue is fixed and if it goes back to the old rate of consumption...
@jturntdev @thsottiaux @sama @OpenAIDevs I am on pro and have the 10x on top of that from gpt-5.5. I have been unable to really dent my limits even with xhigh and fast mode - and I was really trying. Then suddenly yesterday I noticed I had used over 50% of my limit in about 2 days, so something definitely changed.
I have a remote Linux server setup via the 'Connect via SSH' option in macOS. Works fine from macOS. I can connect multiple devices to the same session via the Mac app. But I'm running into issues connecting via SSH from the iOS app. Is that supported yet?
Can the Codex remote control in the ChatGPT iOS app connect via SSH to the same remote Codex server that the Mac app is connected to?
@thsottiaux - can you point me to someone who can help me understand what's implemented/expected to work?
I’ve spent most of my career helping teams adopt platform shifts after the keynote ends.
Magento 2 was one version of that.
AI agents are the next one: powerful demos, unclear operating models, messy deployment, and a lot of teams asking, “Okay, but how do we actually use this?”
Agent memory without evals is vibes with a database.
A useful memory system should be able to tell whether it:
- reduced repeated context
- retrieved the right memory
- respected scope
- avoided stale context
- helped the agent ask fewer low-value questions
Otherwise it’s just hoarding notes.
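A minimal sketch of how three of those checks could be scored from logged retrievals. The `RetrievalEvent` schema, the `score` function, and the 90-day staleness cutoff are all hypothetical, made up for illustration; this is not Remnic's API or data model.

```python
from dataclasses import dataclass


@dataclass
class RetrievalEvent:
    """One logged retrieval. Hypothetical schema for illustration only."""
    retrieved_id: str    # memory the system actually returned
    expected_id: str     # memory a reviewer labeled as the right one
    memory_scope: str    # scope the returned memory belongs to
    request_scope: str   # scope the agent was working in
    age_days: float      # age of the returned memory


def score(events, max_age_days=90.0):
    """Return the fraction of events that hit, leak scope, or are stale."""
    n = len(events)
    if n == 0:
        raise ValueError("no events to score")
    return {
        # retrieved the right memory
        "retrieval_accuracy": sum(e.retrieved_id == e.expected_id for e in events) / n,
        # returned a memory from a scope other than the one requested
        "scope_violation_rate": sum(e.memory_scope != e.request_scope for e in events) / n,
        # returned a memory older than the assumed cutoff
        "stale_rate": sum(e.age_days > max_age_days for e in events) / n,
    }
```

Reduced repeated context and fewer low-value questions would need conversation-level logs, so they are left out of this sketch.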
Agentic commerce is not just product feeds plus checkout.
The real unlock is when the agent understands the buyer:
- budget
- brand preferences
- shipping constraints
- purchase authority
- risk tolerance
- “ask me before buying” rules
The catalog matters. The user model matters too.
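A rough sketch of how a buyer model like that could gate a purchase. `BuyerProfile`, `decide`, and every field name here are hypothetical, chosen only to illustrate the ask-versus-act idea; a real profile would also cover shipping constraints, purchase authority, and risk tolerance.

```python
from dataclasses import dataclass, field


@dataclass
class BuyerProfile:
    """Hypothetical buyer model, for illustration only."""
    budget: float                                      # hard spending cap
    blocked_brands: set = field(default_factory=set)   # brands the buyer rejects
    ask_before_buying_over: float = 0.0                # 0.0 means always ask first


def decide(profile, brand, price):
    """Return 'skip', 'ask', or 'act' for a candidate purchase."""
    if brand in profile.blocked_brands:
        return "skip"   # violates a brand preference
    if price > profile.budget:
        return "skip"   # over budget
    if price > profile.ask_before_buying_over:
        return "ask"    # allowed, but needs the buyer's confirmation
    return "act"        # within the limits the buyer pre-approved
```

With the default threshold of 0.0, any purchase with a positive price comes back as "ask", so the agent only acts on its own once the buyer has raised that limit.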
Most agents don’t need a longer prompt.
They need a better working model of the user:
- goals
- constraints
- preferences
- current projects
- risk tolerance
- vocabulary
- definition of good
- when to ask before acting
That’s the layer I’m exploring with Remnic.
@bettercallsalva @Chris0x88 @sama @rezoundous Similar experiences here - with previous iterations of these models it was fairly true that opus > sonnet, but now there are tasks that opus will reason over in a way that’s total overkill, and it either never gets the task done or comes back with one of these “it’s late” type responses.
@ThatMagicalFam 8 days - ideally two days for each park - gives you a chance to slow down and enjoy the details of each park, or if you have young kids it's easier to build in nap/break times and not miss out on anything. This was a lot easier to do 20 years ago before things got so expensive!
The agent that remembers nothing is useless.
The agent that remembers everything is dangerous.
The useful agent remembers the right things, for the right reasons, with the right boundaries.
That's what I'm building with Remnic.
@sama Yes! 6 months ago agents would only run for a few minutes before needing feedback, pulling my ADHD brain away from time with the kids and making me want to check my laptop again and again. Now I launch a /goal in the morning & check back during nap time, and Codex handles the rest.
@sama @rezoundous GPT-5.5 in Codex meanwhile will spend 50+ hours doggedly pursuing a /goal. It’s more literal so you have to define the stop conditions better though.
These behaviors are really similar to what makes you trust a human colleague more or less.
@sama @rezoundous Opus 4.7 makes excuses a lot more, even on trivial stuff. It’s more likely to push back, not in a safety/alignment way but in a random way, like “well, we’ve done enough for today”. It’s harder to get it to tackle a large task.