I built AI agents into Home Assistant using @Cloudflare AI Gateway and fast/cheap models on Workers AI.
So far they seem benevolent and my family is unharmed.
If you want agents that definitely won't break your house or hurt your family, I've got an integration for you... now in the default HACS catalog. https://t.co/5MktHIkdgw
hard to believe we only launched big models on workers ai ~1.5 months ago.
the craziest part about making models more efficient is that it's better for our users (faster and cheaper), but also better for us as an inference provider.
we effectively reduced our hardware costs through software optimizations. this sounds easy but is really hard in practice - there is so much to do in this space, and so much we're learning and patching as we go. as we build the foundation for serving big models, the new models that come out inherit our efficient architecture too.
if you want to join the challenge - we're hiring
if you want to consume these models and don't want to be an MLE/SRE, happy to chat
i've picked up the pen so many times to write about being a woman in tech
and every time i chicken out because there's this catch-22: to talk about being a woman in tech, you need to have credibility. and once you start talking about it as a woman, you lose said credibility
so i'm going to mortgage some of my credibility to get this off my chest, as someone who has both had a pretty successful career in tech, and leads a team with a lot of women on it:
every woman you work with has had the most insane shit happen to her — on an almost daily basis. shit that makes you look at the camera and go "how did i end up here". from wild remarks about appearance to stalking and trauma dumping, and just constant dismissal from so many directions (employees, customers...). shit that you never tell anyone because they wouldn't believe you...
i recently learned that like 97% of my followers on here are men. so my challenge to you is just to sit with that for a moment. you don't need to do anything about it (other than try not to be that person). but you should be aware that that's what every woman you work with deals with
Only 4.5 months since @replicate joined @cloudflare and we're shipping WAY faster than when we were a 50-person startup. If you told me that a year ago I'd have called BS. Happy to be wrong!
Feeling a bit emotional about launching our new inference layer for agents at @Cloudflare today!
https://t.co/GOuIIpyq9C
It's a mix of pride for doing new things, wonder that @replicate and @cloudflare got to this stage so quickly, relief for having reached a finish line, and enthusiasm for the new starting line...
From day one we've been encouraged to "just go"; Cloudflare has been a pure accelerant for @replicate's ambitions and now just a few months into the deal the Replicate/Cloudflare boundary is gone, we're just a single team doing great things.
The ground that gets covered in this blog post is impressive: aside from the launch announcement we're rapidly onboarding tons of additional models, investing in a new era for Cog (BYOM), moving Gateway from HTTP proxy to a leading control & data plane for inference + some other nice surprises that are expertly foreshadowed.
It'll all be here before you know it.
@mitsuhiko @threepointone You can do this with the new dynamic workers thing. Unclear how perf and caching are under load, but it's possible! Here's a half-baked example I extracted from real code that does it https://t.co/XoKmckARHu