Chi-chi Armstrong

@thisischichi

Lover of Jesus, maker of music, writer of software. Founder @AttendistHQ @assentyapp Organiser @TechFemale Formerly known as @realfreshtv

Manchester, UK

Joined January 2009

405 Following

572 Followers

3.5K Posts

thisischichi retweeted

José Valim

@josevalim

5 months ago

IMO, unsupervised coding agents are still in the "uncanny valley" of software development. Given the recent hype of AI writing "a browser with 1 million lines of code from scratch", I have been trying to give it more open-ended tasks, and every time it delivered seemingly working software but with large flaws in the implementation. Last week it was a "download manager in Rust" that completely failed at concurrent downloads. This week was a port of the Ryu algorithm for pretty-printing floating points with different precisions (f32/f16/bf16/f8). Both times the software seemed to work but had deal breaker bugs lurking. Fixing those required me to become part of the loop and supervise the agent.

Here is a breakdown of the latest experiment, for those who may be interested. TL;DR: https://t.co/bIHXXjHMZh (it says 66k lines added, but 65.5k is a generated fixture file, so the diff is more like +800/-300).

---

The goal was to implement pretty-printing of floating points (f32/f16/bf16/f8) in Numerical Elixir. I created a blank repository, wrote the problem statement, and mentioned I specifically wanted the Ryu algorithm, linking to a reference implementation in Erlang (https://t.co/388CdYIMW2) and to the paper (https://t.co/VGvtjlvP79).

I did one attempt with Sonnet, another with Opus, and while they both delivered a project with a passing test suite, both implementations were wrong and incomplete. In the first attempt, many of the tests were fabricated, to match the faulty implementation. With wrong code and wrong tests, there was not much to salvage. Time to start over.

In the second attempt, now with Opus, I suggested it could generate all possible printable values for f16 from the canonical Ryu implementation in C (since there are only 65k of them), and use that to validate the algorithm.
Once again, it delivered a passing suite, but it made one crucial mistake early on: when generating the reference table, it cast f16 to f32 before printing (a subtle mistake many would make), which led to wrong reference values. And because the reference table was wrong, it led to all sorts of wrong decisions downstream, such as adding casting and deltas.

That's when I decided to be in-the-loop and break the problem into smaller ones:

1. I asked Claude to create a f16 reference table and made it clear in the prompt that any sort of casting would lead to the wrong solution. That's the reference table you can find in the PR.
2. Then I asked it to explicitly port the Erlang algorithm, as is, and then parameterize the constants in the algorithm to make it generic (so it works for f16/f32/etc). Then write a test comparing all reference f16 values.
3. Then I moved the algorithm to Nx. Since pretty printing is now precise, it broke 150 tests, which I used Claude to fix (with specific instructions to change only the precision in numbers and not touch anything else).

Claude still made mistakes, but because I broke the problem into small steps and verified their correctness each step along the way, I avoided bad decisions cascading through the whole implementation. And yes, using Claude was still extremely helpful (honestly, if I used Claude only on step 3, it would have already been worth the price tag).

Those two experiments have been orders of magnitude smaller than the browser one; both seemed to work but were flawed upon deeper inspection. For now, I'd still advise staying in the loop and avoiding falling into this trap.
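The casting pitfall described in the thread above can be reproduced in a few lines. The sketch below is not the Ryu algorithm and not the code from the PR; it is a brute-force search, in Python, for the fewest significant digits that round-trip through a given float width, using the standard `struct` format codes `e` (f16) and `f` (f32). The helper names (`f16_from_bits`, `round_to`, `shortest`) are made up for illustration, and the search goes through Python's f64 floats, so treat it as an approximation of what a canonical implementation would print.

```python
import struct


def f16_from_bits(bits: int) -> float:
    """Decode a 16-bit pattern as an IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def round_to(fmt: str, x: float) -> float:
    """Round a Python float to the width named by a struct code ("<e" or "<f")."""
    try:
        return struct.unpack(fmt, struct.pack(fmt, x))[0]
    except OverflowError:
        # struct refuses values that do not fit; treat them as infinity.
        return float("inf")


def shortest(fmt: str, x: float) -> str:
    """Fewest significant digits whose decimal text rounds back to x in `fmt`.

    Brute force for illustration only (hypothetical helper, finite x assumed).
    """
    for digits in range(1, 18):
        text = f"{x:.{digits - 1}e}"
        if round_to(fmt, float(text)) == x:
            return text
    raise ValueError("no round-trip found")


if __name__ == "__main__":
    x = f16_from_bits(0x2E66)          # the f16 value nearest to 0.1
    print("as f16:", shortest("<e", x))
    print("as f32:", shortest("<f", x))  # same value, but more digits needed
```

The same number needs more digits once it is treated as an f32, because its f32 neighbours are much closer together than its f16 neighbours. A reference table built after such a cast therefore records longer strings than a correct f16 printer should produce.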

134

10K

thisischichi retweeted

Arvid Kahl

@arvidkahl

5 months ago

It's still a bit shaky and bleeding-edge, but the "Ralph Wiggum" plugin in Claude Code is the first version of what's to come with autonomous, agentic loops. It's a "we learn from failure"-centric approach. You define your goal condition and let the agent loop over and over until it has verifiably reached that promised goal. It might take 2 minutes or a day. But the loop continues to experiment and look at prior work to ultimately get you there. I've been seeing solid results with that. Takes some massaging and setting things up right (mostly for there not to be any interruptions), but when it works, it WORKS. You can install this inside your CC by going to /plugin and typing `ralph`


138K

thisischichi retweeted

Addy Osmani

@addyosmani

5 months ago

This is the most fun moment to be a developer in years. The AI tools are imperfect, the patterns are still emerging, and there's genuine room for experimentation. Roll up your sleeves and build something. The earthquake is further opening up what's possible. The best news about this new layer: traditional engineering skills are more valuable than ever, not less. It helps us minimize shipping slop. Developers who already invested in CI/CD, testing, documentation, and code review are having the most success with AI tools. These "boring" foundations are accelerators. They turn agents from chaos generators into productivity multipliers. The real opportunity is learning to work at a different altitude. Instead of typing syntax, we're reviewing implementations, catching edge cases, and shipping features in hours that used to take days. That's genuinely exciting. Yes, there's a learning curve. Understanding how to provide context, iterate on plans, and review AI-generated code quickly takes practice. But this is learnable through doing - build small tools, review everything, develop intuition through repetition. The multiplier potential is real when you combine AI speed with engineering judgment. We're not replacing coding skills but we're finally able to focus them on the interesting problems while delegating the tedious parts.

215

957

364K

thisischichi retweeted

Chris McCord

@chris_mccord

8 months ago

I've been saying all year that giving the agent a shell + the file system removes mountains of complex abstractions. Glad to see some return to basic stuff that works proving itself out https://t.co/C6MOglPujq

thisischichi retweeted

Simon Willison

@simonw

8 months ago

Huh, I guess I went from "agents are meaningless jargon hype that's never going to happen" in January to "Claude Code is a General Agent" in October

824

263

101K

thisischichi retweeted

José Valim

@josevalim

10 months ago

Congrats to Rust, Gleam (welcome!!) and Elixir on being the top 3 most admired languages on the StackOverflow Survey and Phoenix for being the most admired web framework for the third year in a row! https://t.co/zycBJ8ncTu

803

121

36K

thisischichi retweeted

Daniel Bergholz

@danielbergholz

10 months ago

Extremely happy with my decision to migrate from JS to Elixir + Phoenix 🔥

163

thisischichi retweeted

Saša Jurić @sasajuric

about 1 year ago

Why aren’t you pairing more with him? He types twice as fast as you. Of course he does. So does a cat having a seizure on a mechanical keyboard. But that doesn’t mean it should be writing production code. 😂 https://t.co/6pqWLqbQDE

thisischichi retweeted

Erwin

@Erwin_AI

about 1 year ago

The golden rule still mostly holds: The ones talking about numbers are seeking validation. The ones just building are the ones doing the real numbers. Do we see the founders of Lovable, Bolt, Cursor, etc show off numbers every day?

515

thisischichi retweeted

Edward Sun @edwardcreates

about 1 year ago

it’s actually insane how well-made @screenstudio is. it lives up to all the hype.

thisischichi retweeted

José Valim

@josevalim

about 1 year ago

Most BBC traffic is going through an Elixir-powered routing layer and it is all running on 12 nodes: “fewer incidents, better spike handling, more confidence”.

559

24K

thisischichi retweeted

Steve Bussey @YOOOODAAAA

over 2 years ago

Appreciating Elixir #myelixirstatus:

~1.5M ARR business with > 2000 users
3 servers running @flydotio shared-cpu-2x:4096MB
< 200MB RAM average
42ms response time average

AND, we're super overprovisioned on servers. I had a memory leak and just fixed it. Will drop to 2GB servers

300

39K

Chi-chi Armstrong @thisischichi

about 1 year ago

Thank you @arvidkahl. This and The Embedded Entrepreneur were seminal in my journey building @AttendistHQ. So much more to share very soon. Thank you for being open about your process. Nice to see a picture of me from the lockdown years.

Arvid Kahl

@arvidkahl

about 1 year ago

I still think back to this day a lot, when so many amazing people bought my first book and gave it a place in their home. Almost five years ago now. 🥰 Still selling a couple every day. Still grateful for every single one of them.

Chi-chi Armstrong @thisischichi

about 1 year ago

🔥🔥🔥

José Valim

@josevalim

about 1 year ago

Introducing Tidewave: beyond code intelligence. While working on our web apps, we run code, query the database, read logs, search docs… but AI tools are limited to compiling code. Watch Tidewave transform Claude Desktop into an agent by running a MCP server in your web app!

880

185

409

88K

thisischichi retweeted

Joel Gascoigne

@joelgascoigne

over 1 year ago

It's a little hard to believe. Fourteen years ago today, I launched Buffer from my apartment in Birmingham, in the UK. Today the business generates $1.65 million per month, serves 59,000 customers, and enables fulfilling work for 72 people.

918

149

78K

thisischichi retweeted

Michal Malewicz

@michalmalewicz

over 1 year ago

Consistency. --- most people quit here. Consistency. Consistency. Consistency. Consistency. Consistency. Consistency. Consistency. Consistency. Consistency. Consistency. Greatness. Consistency. Consistency. Consistency.

118

thisischichi retweeted

@levelsio

over 1 year ago

I think "my idea takes a long time to build" is usually a massive red flag (unless you're building a nuclear power plant or similar).

It usually just means you can't minimize the requirements enough (Elon-style) and are massively procrastinating and overcomplicating it.

The first version of your site/app/startup doesn't have to be great or polished, but it should quickly solve the primary problem a user has.

Everyone these days can put up a waiting list with an email box. I can do that in literally 1 minute; it's useless.

Don't be lazy, just build the thing.

128

158

404K

thisischichi retweeted

Max Brodeur-Urbas

@MaxBrodeurUrbas

over 1 year ago

9am - noon: everyone on the team talks to users
noon - 10pm: we sit in a room and build what they asked for

rinse and repeat, 6 days a week (we’ve been at it for ~1.5 years)

things fall into place if you do it long enough

the view gradually gets better too 🏔️

125

733K

thisischichi retweeted

Roberto Díaz

@_rbart_

over 1 year ago

@DanielLockyer @levelsio I suggest doing what @dannypostma did with his SEO course.

- Create a waitlist page right now.
- Start creating a presale page; no need to work on the course yet.
- Set a limited special price ($59/69).
- Set tiered prices based on sales volume.

Enjoy.

thisischichi retweeted

Pat Walls

@thepatwalls

almost 2 years ago

Founder doing $30M+/year sent me his "secrets".

120K
