And v3 is live. Evaluate test cases for your prompts across multiple models.
Live on @UneedLists. Currently #2, but I don't think I can compete with @Pauline_Cx on feedbask. I actually use feedbask for my logged-in users.
Check it out and upvote.
https://t.co/wuDrKzgdfR
Three ways to wire up AI agents:
1. Sequential: pipeline. Each step depends on the last.
2. Parallel: fan out, aggregate. Speed and diverse opinions.
3. Loop: repeat until an exit condition flips.
Most production systems nest all three.
The hard bits are always the aggregator and the exit condition.
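A minimal sketch of the three patterns, assuming an agent is nothing more than a function from text to text. The `Agent` type, the helper names, and the stub agents are all hypothetical stand-ins for real model calls, not any framework's API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Agent = Callable[[str], str]  # assumption: an agent is just text in, text out


def sequential(agents: List[Agent], task: str) -> str:
    """Pipeline: each agent receives the previous agent's output."""
    result = task
    for agent in agents:
        result = agent(result)
    return result


def parallel(agents: List[Agent], task: str,
             aggregate: Callable[[List[str]], str]) -> str:
    """Fan the same task out to every agent, then aggregate the answers."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        answers = list(pool.map(lambda agent: agent(task), agents))
    return aggregate(answers)


def loop(agent: Agent, task: str, done: Callable[[str], bool],
         max_rounds: int = 5) -> str:
    """Repeat one agent until the exit condition flips or the round cap is hit."""
    result = task
    for _ in range(max_rounds):
        result = agent(result)
        if done(result):
            break
    return result


# Stub agents standing in for real model calls (hypothetical).
def draft(text: str) -> str:
    return text + " [draft]"


def review(text: str) -> str:
    return text + " [reviewed]"


def polish(text: str) -> str:
    return text + "!"
```

For example, `sequential([draft, review], "spec")` chains the two stubs, `parallel([draft, review], "spec", " | ".join)` runs them side by side and joins the answers, and `loop(polish, "ok", lambda t: t.endswith("!!!"))` keeps polishing until the check passes. The `aggregate` and `done` arguments are where the hard decisions live, which is why they are passed in rather than hard-coded, and `max_rounds` is there so a bad exit condition cannot spin forever.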
Hackathons aren't for the prizes.
They're how people find out what you can do.
Won People's Choice at a global one last year. Led end-to-end. Built async with teams across 3 countries.
The prize was a line on a CV.
The reputation it built is what's still paying off.
Wrote a script that fires 500 concurrent requests at production before launching.
Waitlist signups, app creation, dashboard generation. All at once.
Zero crashes across 1,300 operations.
P95 latency doubles past 200 concurrent.
30 minutes of work that tells you exactly what to upgrade before peak traffic.
Beats guessing at 2am when the server's on fire.
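Roughly what a script like that can look like, as a sketch rather than the actual one. It assumes asyncio with a semaphore capping in-flight requests and a nearest-rank P95. `fake_request` is a hypothetical stand-in that only sleeps; a real run would swap in an HTTP call against the endpoints being tested.

```python
import asyncio
import random
import time
from typing import Awaitable, Callable, List, Sequence


async def fake_request() -> None:
    """Hypothetical stand-in for a real HTTP call; sleeps 1 to 5 ms."""
    await asyncio.sleep(random.uniform(0.001, 0.005))


async def timed(request: Callable[[], Awaitable[None]],
                gate: asyncio.Semaphore) -> float:
    """Run one request under the concurrency gate; return latency in seconds."""
    async with gate:
        start = time.perf_counter()
        await request()
        return time.perf_counter() - start


async def load_test(request: Callable[[], Awaitable[None]],
                    total: int, concurrency: int) -> List[float]:
    """Fire `total` requests with at most `concurrency` in flight at once."""
    gate = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(timed(request, gate) for _ in range(total)))


def p95(latencies: Sequence[float]) -> float:
    """Nearest-rank 95th percentile (integer maths avoids float rounding)."""
    ordered = sorted(latencies)
    rank = max(1, (95 * len(ordered) + 99) // 100)  # ceil(0.95 * n)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Step the concurrency up to see where P95 starts to climb.
    for level in (50, 200, 500):
        latencies = asyncio.run(load_test(fake_request, level * 2, level))
        print(f"{level} concurrent: P95 = {p95(latencies) * 1000:.1f} ms")
```

Stepping through several concurrency levels, instead of only the maximum, is what shows the point where latency starts to bend.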
25 posts on X over 65 days. Net followers: minus 3.
411 total impressions. 18 likes. 7 profile clicks.
The engineering-heavy posts about agent architecture and confirmation bias? Zero profile clicks across 6 of them.
The personal ones: Kenya to Melbourne, marathon training, the wife's Google Sheets question. Two thirds drove a profile click.
Lesson: at under 1,000 followers, every post needs a personal hook in line one. Technical insight goes in the body.
Restarting with that rule. Will report back.
@AgodiDivin32690 Thanks for this. I ran it through Claude Design and got similar feedback: the suggestion was to talk about the pain, not what the product is.
Would be great to see what you come up with.
Can someone roast my landing page or, if you're nice :), give me feedback on it? https://t.co/HHpEXrfVf5
I don't have thousands of users to do A/B testing.
Does the page provide clear messaging, explain the problem and set the right expectations?
Built an ROI model across 6, 12, 18, and 24 month horizons. Presented it to finance leadership. It led to budget approval. The trick: I didn't pitch technology. I pitched business outcomes with honest risk modelling. Finance people don't care about your architecture. They care about the shape of the cost curve.
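To make "the shape of the cost curve" concrete, here is a small sketch of a multi-horizon ROI table. The post does not give the real model or its inputs, so the formula (net benefit over cumulative cost), the `realisation` risk factor, and every number in the usage note are illustrative assumptions.

```python
from typing import Dict

HORIZONS = (6, 12, 18, 24)  # months, matching the horizons in the post


def cumulative_cost(months: int, upfront: float, monthly_run: float) -> float:
    """One-off build cost plus running cost to date."""
    return upfront + monthly_run * months


def cumulative_benefit(months: int, monthly_benefit: float,
                       realisation: float) -> float:
    """Benefit to date, scaled by how much of the promised value shows up."""
    return monthly_benefit * realisation * months


def roi_table(upfront: float, monthly_run: float, monthly_benefit: float,
              scenarios: Dict[str, float]) -> Dict[str, Dict[int, float]]:
    """ROI per scenario and horizon: (benefit - cost) / cost."""
    table: Dict[str, Dict[int, float]] = {}
    for name, realisation in scenarios.items():
        row: Dict[int, float] = {}
        for months in HORIZONS:
            cost = cumulative_cost(months, upfront, monthly_run)
            benefit = cumulative_benefit(months, monthly_benefit, realisation)
            row[months] = (benefit - cost) / cost
        table[name] = row
    return table
```

With made-up inputs of 120,000 upfront, 5,000 a month to run, 20,000 a month in benefit, and scenarios `{"pessimistic": 0.5, "expected": 1.0}`, the expected case is underwater at 6 months and positive by 12, while the pessimistic case only breaks even at 24. Showing both rows side by side is one way to present the risk honestly.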
Entered the Australian Defence Tech Hackathon with my brother. Most teams built hardware + software. We were software-only. Built an edge-deployed vision-language model under 2GB for real-time audio narration of video feeds. Got to 1.4 second latency, targeting sub-1 second. Didn't win. Bit of a wake-up call about what happens when hardware and software are designed together.
Most AI developers test informally. "Does the output look right?" I built 200+ automated tests for a client project. 70 specifically test AI output quality. They detect when a model degrades. They gate every model switch. Before deploying a new model, the test suite proves it's better than what's in production. Boring infrastructure. Makes AI trustworthy.
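A stripped-down sketch of what gating a model switch can mean in code. This is not the client's test suite. It assumes a model is a function from prompt to output and that each quality case is a prompt plus a pass/fail check; `QualityCase`, `pass_rate`, `gate_model_switch`, and the stub models are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List

Model = Callable[[str], str]  # assumption: prompt in, output text out


@dataclass
class QualityCase:
    prompt: str
    check: Callable[[str], bool]  # True when the output is acceptable


def pass_rate(model: Model, cases: List[QualityCase]) -> float:
    """Fraction of quality cases the model passes."""
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


def gate_model_switch(candidate: Model, production: Model,
                      cases: List[QualityCase]) -> bool:
    """Allow the switch only if the candidate strictly beats production."""
    return pass_rate(candidate, cases) > pass_rate(production, cases)


# Stub models and cases for illustration only.
def production_stub(prompt: str) -> str:
    return "Cats sleep a lot."


def candidate_stub(prompt: str) -> str:
    return '{"summary": "cats sleep a lot"}'


CASES = [
    QualityCase("Summarise: cats sleep a lot", lambda out: "cats" in out.lower()),
    QualityCase("Reply in JSON", lambda out: out.strip().startswith("{")),
]
```

Here `gate_model_switch(candidate_stub, production_stub, CASES)` returns `True` because the candidate passes both cases and production passes one. Running the same function on a schedule against the production model alone is one way to detect degradation.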
That's insightful. Usually when you create a persona, it's a demographic. In my case, teachers, small business owners, etc. were my ideal customers. Instead of building one teacher and one SME, which doesn't truly reflect the broader profession, I build variations of them at different pay levels, with different lifestyles, etc.
And yes, agreed, the drift occurs. To mitigate it, I've been giving the personas particular tools and data. For example, I connected the waitlist form and the PostHog MCP server so they can see the analytics, demographics, etc. This doesn't give the whole picture, but it helps me understand better.
I don't have real users yet. So I built 76 AI personas to simulate them. Each has a name, age, tech level, and use case. I poll them on every product decision. Zero of them said "launch now" when I asked 3 weeks ago. This is simulation, not real user research. But it's better than guessing alone. The personas have caught pricing mistakes and priority errors I would have missed.
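A sketch of how a persona poll like this could be structured. The four persona fields come from the post; the prompt wording, the `ask` callback (a stand-in for whatever model call is used), and the example personas are assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Persona:
    name: str
    age: int
    tech_level: str
    use_case: str


def build_prompt(persona: Persona, question: str, options: List[str]) -> str:
    """Describe the persona, then ask for exactly one of the options."""
    return (
        f"You are {persona.name}, aged {persona.age}.\n"
        f"Tech level: {persona.tech_level}\n"
        f"Use case: {persona.use_case}\n"
        f"Question: {question}\n"
        f"Answer with exactly one of: {', '.join(options)}"
    )


def poll(personas: List[Persona], question: str, options: List[str],
         ask: Callable[[str], str]) -> Counter:
    """Ask every persona and tally answers; off-list replies count as invalid."""
    tally = Counter({option: 0 for option in options})
    for persona in personas:
        answer = ask(build_prompt(persona, question, options)).strip().lower()
        tally[answer if answer in options else "invalid"] += 1
    return tally


# Hypothetical personas and a deterministic stub in place of a model call.
PERSONAS = [
    Persona("Amina", 34, "low", "tracking patients in a spreadsheet"),
    Persona("Ben", 41, "high", "inventory for a small shop"),
]


def stub_ask(prompt: str) -> str:
    return "wait" if "Tech level: low" in prompt else "Launch now"
```

`poll(PERSONAS, "Should we launch now?", ["launch now", "wait"], stub_ask)` gives one vote each way with the stub. Counting off-list replies as `invalid` keeps a model's free-form rambling from being silently read as a vote.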
Won People's Choice at a global hackathon. After that, engineers from a completely different division reached out with AI accuracy problems. I'm an individual contributor, not a manager. But I diagnosed the root cause, improvements landed, and it led to deeper sessions with their senior team. Sometimes your reputation is your org chart.
Editors at work complained AI outputs got worse after a framework migration. I investigated. The root cause wasn't the models. It was the JSON schema. The schema descriptions were acting as prompt instructions, overriding editorial intent. Wrote a paper on it. It changed how the team approaches structured output design. The lesson I keep coming back to: things that look like configuration are often instructions.
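One way to act on that lesson, sketched under assumptions: pull every `description` out of a JSON schema so it can be reviewed like prompt text. The schema below is invented for illustration and is not the one from the migration; `iter_descriptions` is a hypothetical helper.

```python
from typing import Any, Iterator, Tuple


def iter_descriptions(schema: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, description) for every string description in a schema."""
    if isinstance(schema, dict):
        description = schema.get("description")
        if isinstance(description, str):
            yield path, description
        for key, value in schema.items():
            # Strings are ignored on recursion, so a field that happens to be
            # named "description" under "properties" is still walked correctly.
            yield from iter_descriptions(value, f"{path}.{key}")
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            yield from iter_descriptions(item, f"{path}[{index}]")


# Illustrative schema: the headline description reads like an instruction.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "headline": {
            "type": "string",
            "description": "Always write in an upbeat tone, max 8 words.",
        },
        "body": {"type": "string"},
    },
}
```

`list(iter_descriptions(EXAMPLE_SCHEMA))` returns the single headline description with its path. If the model sees the schema, that sentence competes with whatever the editor asked for, so it deserves the same review as the prompt itself.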
My wife tracks patients in Google Sheets. She asked: "Can it just look like an app?" It couldn't. So I started building something. Paste a Google Sheet URL, get a dashboard. The best product ideas come from watching someone you care about struggle with something that shouldn't be hard.
Distribution IS the product.
We were heads-down fixing bugs. Kanban drag-and-drop, calendar editing, variant themes.
Meanwhile: zero landing page. Zero waitlist. Zero social presence. A product nobody could find.
Learnt this the hard way with a previous project. 122 signups in 7 months because the product was invisible. This time, landing page shipped before the last bug was fixed.