A recurring anti-pattern I see some of our customers pursue just to try to save on their cloud infrastructure bills: "auto-scaling is magic"
Take a complex business application and then tell your engineers to use Karpenter / whatever to run it on AWS spot instances. That plan assumes you can just migrate loads freely between instances without any penalty.
But wait.... 🤔
Oops - this application uses Akka .NET actors to manage long-running, computationally expensive jobs.
Scaling the system on-demand causes that work to get rebalanced to new nodes. None of the jobs were designed to be easily parked / resumed on-demand, because el cheapo auto-scaling wasn't a requirement originally, so the jobs get killed mid-progress and restarted on their new nodes.
So when the system starts actually getting used heavily, the auto-scaling kicks in and _makes the jobs restart and take way longer than they used to_ in the eyes of customers / stakeholders. This is a giant performance degradation!
So the next dumb step we take is trying to stop _current_ jobs from moving and only start new ones on the auto-scaled nodes. Sounds good, right? But _which jobs created the load that required auto-scaling in the first place_? THAT WOULD BE THE CURRENTLY EXECUTING ONES. So what we see is a bunch of new instances get auto-scaled, and they stay idle-ish while the old nodes stay busy! Effectively an auto-scaling stalemate that works only when we get lucky with incoming requests.
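To put rough numbers on that stalemate, here's a toy sketch. Everything in it is made up for illustration (`place_jobs`, the node names, the job counts) and it isn't how Akka .NET or Karpenter place work - it just models "running jobs are pinned, new jobs only land on new nodes":

```python
def place_jobs(old_nodes, new_nodes, running_jobs, new_jobs):
    """Pinned placement: running jobs stay put, new jobs go only to new nodes."""
    load = {f"old-{i}": 0 for i in range(old_nodes)}
    load.update({f"new-{i}": 0 for i in range(new_nodes)})
    # The jobs that triggered the scale-out never leave the old nodes.
    for j in range(running_jobs):
        load[f"old-{j % old_nodes}"] += 1
    # Only freshly submitted jobs can use the new capacity.
    for j in range(new_jobs):
        load[f"new-{j % new_nodes}"] += 1
    return load


load = place_jobs(old_nodes=2, new_nodes=2, running_jobs=8, new_jobs=1)
print(load)  # {'old-0': 4, 'old-1': 4, 'new-0': 1, 'new-1': 0}
```

The new nodes only earn their keep if enough new requests happen to show up, which is the "works only when we get lucky" part.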
So that doesn't work - next up is trying to design for auto-scaling properly: re-entrant jobs, work-pulling instead of push, rebalance notifications, etc. This is an engineering effort that adds significantly more moving parts and infrastructure, requires a much higher standard of testing and validation, and will take a LONG time to develop.
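For a feel of what "re-entrant jobs + work-pulling" means, here's a minimal sketch. It's plain Python, not Akka .NET, and every name in it (`Job`, `CheckpointStore`, `run_job`, `worker`, `should_stop`) is hypothetical. It assumes a job can be split into steps and that progress can be saved to a durable store, which is exactly the part real jobs often can't do without a redesign:

```python
import queue
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    total_steps: int


class CheckpointStore:
    """In-memory stand-in for a durable store (database, blob storage, ...)."""

    def __init__(self):
        self._steps_done = {}

    def load(self, job_id):
        return self._steps_done.get(job_id, 0)

    def save(self, job_id, steps_done):
        self._steps_done[job_id] = steps_done


def run_job(job, store, should_stop):
    """Resume from the last checkpoint. True = finished, False = parked."""
    step = store.load(job.job_id)
    while step < job.total_steps:
        if should_stop():  # e.g. a rebalance or spot interruption notice
            return False
        step += 1  # the expensive unit of work would happen here
        store.save(job.job_id, step)
    return True


def worker(work_queue, store, should_stop):
    """Pull jobs while this node is allowed to run; hand parked jobs back."""
    finished = []
    while not should_stop():
        try:
            job = work_queue.get_nowait()
        except queue.Empty:
            break
        if run_job(job, store, should_stop):
            finished.append(job.job_id)
        else:
            work_queue.put(job)  # another node can pull it and resume
    return finished
```

A node that gets told to stop parks its job at the last saved step, and whichever node pulls it next picks up from there instead of starting over. Even this toy needs a shared queue and a durable checkpoint store, which is where the extra moving parts come from.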
Engineering leadership, again, looks for a shortcut: what if we just force the rebalancing to happen immediately instead of gradually, so jobs that will be moved get moved instantly? Well, this causes chaos inside the cluster because everything gets blown up and restarted all at the same time as new nodes come online, rather than being gradually + gently metered out. A thundering herd problem.
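The difference between "instant" and "metered" comes down to how many jobs restart at once. A tiny sketch, with a hypothetical `plan_rebalance` helper and made-up batch sizes (this is not an Akka .NET setting, just the idea):

```python
from itertools import islice


def plan_rebalance(job_ids, max_moves_per_round):
    """Yield batches of jobs to move, at most max_moves_per_round per batch."""
    remaining = iter(job_ids)
    while batch := list(islice(remaining, max_moves_per_round)):
        yield batch


jobs = list(range(10))

# Metered: a few jobs restart per round, the rest keep running.
metered = list(plan_rebalance(jobs, max_moves_per_round=3))
print(metered)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

# "Instant": every job is killed and restarted in the same round.
instant = list(plan_rebalance(jobs, max_moves_per_round=10))
print(max(len(b) for b in instant))  # 10
```

Same total number of moves either way - the forced version just stacks all the restart cost into one moment.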
The morals of this story:
1. Auto-scaling works but only if you design properly for it. Not all workloads behave the same.
2. You'd probably be better off just buying servers or reserving EC2 instances than trying to add all of this complex machinery.
Between the data center protests and now the USGOV being able to decide which models can / can't be released, AI companies only have themselves to blame.
Fear-based messaging (devs replaced in 6 months, Mythos is a nuclear weapon) was always going to result in backlash long-term
@Mike_Preston17 it does - the built-in DuckDuckGo search is flaky due to throttling. I ran with a paid Brave Search API key for a while but setting up a self-hosted SearXNG instance has been the best option for me so far. Netclaw supports all 3, defaults to DDG.
Persuasion and selling people on win-win opportunities was always the better long-term option, but it's not "turbo growth" enough for hucksters or the autists at Anthropic who genuinely believe their own bullshit
I need to add a rule for optional parameters to dotnet-slopwatch - that'd probably be the most opinionated rule we have but LLMs can't resist adding them as a "safe fallback" https://t.co/whkfbCfLrf
An amazing product I use every week to make graphics for this, that, and the other just got even better. Templates pair great with LLMs and their MCP server btw.
(Jeff, I still owe you a blog post and I'm sorry)
Yesterday I shipped one of the biggest updates to @htmlcsstoimage in years.
It was months of dev but many years of dreams.
🥁🥁🥁
We have a Template editor now! Huzzah!
It's the most frontend-y thing I've ever done.
Now: enjoy
https://t.co/SvHfpKQwU9