Built an AI agent to draft my social posts. It worked. But was it actually good?
Stopped tweaking prompts by vibes and built an eval pipeline: code-first grading, synthetic datasets, CI gate on regressions.
Blog:
https://t.co/3n3bw9unRs
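A rough sketch of what "code-first grading" plus a "CI gate on regressions" could look like. The linked blog's actual pipeline is not shown here, so every name below (`EvalCase`, `grade`, `pass_rate`, `ci_gate`), the three checks, the 280-character limit, and the tolerance are assumptions for illustration. The hand-written cases stand in for a synthetic dataset.

```python
from dataclasses import dataclass

MAX_X_CHARS = 280  # assumed length budget for a standard X post


@dataclass
class EvalCase:
    topic: str  # what the agent was asked to write about
    draft: str  # the post the agent produced


def grade(case: EvalCase) -> dict[str, bool]:
    """Deterministic checks only: no model-as-judge, so results are repeatable."""
    text = case.draft
    return {
        "fits_length": len(text) <= MAX_X_CHARS,
        "mentions_topic": case.topic.lower() in text.lower(),
        "few_hashtags": text.count("#") <= 3,
    }


def pass_rate(cases: list[EvalCase]) -> float:
    """Fraction of cases where every check passes."""
    if not cases:
        return 0.0
    passed = sum(all(grade(c).values()) for c in cases)
    return passed / len(cases)


def ci_gate(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """True if the current pass rate has not regressed beyond the tolerance."""
    return current >= baseline - tolerance


# Hypothetical stand-in for a synthetic dataset of agent outputs.
SYNTHETIC_CASES = [
    EvalCase("evals", "Stop tuning prompts by vibes: build evals first. #ai"),
    EvalCase("evals", "x" * 300),  # too long and off-topic, should fail
]
```

In CI, a job could compute `pass_rate(SYNTHETIC_CASES)`, compare it against a stored baseline with `ci_gate`, and exit non-zero when the gate returns `False`.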
Let's all be thought leaders! AI Agent running in the background to research topics and send you LinkedIn and X draft posts for approval. As a bonus, it can post on your behalf! https://t.co/KUeKKxn2ya #agents #ai #claude #cloudflare #resend #temporal
"AI psychosis" — Mitchell Hashimoto nailed it. Entire companies where every decision gets filtered through an AI lens whether it makes sense or not. Seen it firsthand. The ones actually winning with AI are quiet about it.
https://t.co/te6Mvl2CuH
#AI #EnterpriseStrategy
I strongly believe there are entire companies right now under heavy AI psychosis and it's impossible to have rational conversations about it with them. I can't name any specific people because they include personal friends I deeply respect, but I worry about how this plays out.
I lived through the great MTBF vs MTTR (mean-time-between-failure vs. mean-time-to-recovery) reckoning of infrastructure during the transition to cloud and cloud automation. All those arguments are rearing their ugly heads again but now it's... the whole software development industry (maybe the whole world, really).
It's frightening, because the psychosis folks operate under an almost absolute "MTTR is all you need" mentality: "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" We learned in infrastructure that MTTR is great but you can't yeet resilient systems entirely.
The main issue is I don't even know how to bring this up to people I know personally, because bringing this topic up leads to immediate dismissals like "no no, it has full test coverage" or "bug reports are going down" or something, which just don't paint the whole picture.
We already learned this lesson once in infrastructure: you can automate yourself into a very resilient catastrophe machine. Systems can appear healthy by local metrics while globally becoming incomprehensible. Bug reports can go down while latent risk explodes. Test coverage can rise while semantic understanding falls. Changes happen so fast that nobody notices the underlying architecture decaying.
I worry.
arXiv: 1-year ban for hallucinated citations.
"The AI made it up" is not a defense when your name is on the paper.
This is coming to enterprise too. When AI outputs drive real decisions, accountability doesn't vanish because a model was involved.
I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different than how I use it. Remember: there is no one right way to use Claude Code -- everyone's setup is different. You should experiment to see what works for you!
I'm Boris and I created Claude Code. Lots of people have asked how I use Claude Code, so I wanted to show off my setup a bit.
My setup might be surprisingly vanilla! Claude Code works great out of the box, so I personally don't customize it much. There is no one correct way to use Claude Code: we intentionally build it in a way that you can use it, customize it, and hack it however you like. Each person on the Claude Code team uses it very differently.
So, here goes.
We now have the DeepSeek-R1-Distill-Qwen-32B model on Cloudflare Workers AI! Use it to solve math, coding and complex reasoning tasks. It's open source, hosted on Cloudflare servers, and is comparable to OpenAI's o1-mini.
What makes it shine:
🌟 It solves math, coding and complex reasoning tasks
🌟 It thinks out loud – watch its step-by-step logic!
🌟 Self-checks answers before responding
Try it out here: https://t.co/7OB5lO8io1
ChatGPT function calling is awesome! You can have it convert natural language to a JSON output that can be plugged into an API call. Example at https://t.co/HdPMjGyuz9 https://t.co/lmTVaLsPMT #ai
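To make the "natural language to JSON to API call" idea concrete, here is a minimal offline sketch. It does not call any model: the `simulated_arguments` string stands in for the JSON arguments a model would return for a request like "what's the weather in Lisbon?". The `get_weather` function, its schema, the `api.example.com` URL, and the helpers `parse_arguments` and `build_request` are all hypothetical and not taken from the linked example.

```python
import json

# Hypothetical function description in the JSON Schema style used for function calling.
WEATHER_FUNCTION = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}


def parse_arguments(raw_arguments: str, function: dict) -> dict:
    """Parse the model's JSON string and check it against the declared keys."""
    args = json.loads(raw_arguments)
    params = function["parameters"]
    missing = [k for k in params["required"] if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    unknown = [k for k in args if k not in params["properties"]]
    if unknown:
        raise ValueError(f"unexpected arguments: {unknown}")
    return args


def build_request(args: dict) -> dict:
    """Map validated arguments onto a (made-up) HTTP API request."""
    return {
        "url": "https://api.example.com/weather",  # placeholder endpoint
        "params": {"q": args["city"], "units": args.get("unit", "celsius")},
    }


# Stand-in for the arguments string a model might return.
simulated_arguments = '{"city": "Lisbon", "unit": "celsius"}'
request = build_request(parse_arguments(simulated_arguments, WEATHER_FUNCTION))
```

Validating before building the request matters because the model's output is just text: it can omit required keys or invent extra ones, so it is safer to check it before it reaches a real API.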