At @Abnormal, coding agents have become core to how we build products. Recently, we’ve noticed that they’ve become capable enough that work we used to scope into a traditional sprint can now land in as little as a day.
We realized this was a big enough shift to rethink how engineers actually work. So a few months ago we started using a new operating model on one of our new product teams, structured around engineers owning end-to-end outcomes week over week. I’m excited to share a glimpse into how it works, and our takeaways along the way!
(Link to the blog post in replies)
Stop vibe checking your vibe code!
We just released Vibe Code Bench: the first benchmark that tests whether AI models can actually build complete web applications from scratch.
Featured today in @Inc
(1/6)
The best part? This was built entirely with Claude Code from start to finish -- I didn't write a single line of code myself. All of the code is open-source and available on GitHub: https://t.co/x2tTXwT8ta
As an experiment to test out LLMs' reasoning capabilities, I built an app that lets some of the top thinking models from @OpenAI, @AnthropicAI, @GoogleDeepmind, and @xai play chess against each other. Check it out here! https://t.co/ugR5WT8ZqN
After @AnthropicAI released the Model Context Protocol last week, I built an MCP server that lets Claude view & create various AWS resources. While this was an experimental project, it’s exciting to get a glimpse into the use cases that’ll be unlocked through this open standard.