I recently joined @LaunchDarkly to work on their AI initiatives, and continue my journey of helping people build better AI products.
Launches like Claude Sonnet 4.5 are awesome. I love the benchmark scores and can't wait to try it out. But they always raise the same question - how good is it, really? Against my metrics with my use cases, my data, and my users?
That's what I love about what we're trying to build with AI Configs - to experiment and evaluate new models, prompts, and workflows in production, with your specific guardrails, and roll them out safely.
Introducing Claude Sonnet 4.5—the best coding model in the world.
It's the strongest model for building complex agents. It's the best model at using computers. And it shows substantial gains on tests of reasoning and math.
Raising An Agent - Episode 5
In this episode, @beyang interviews @thorstenball and @sqs to unpack what has happened in the world of Amp in the last five weeks: how predictions played out, how working with agents shaped how they write code, how agents are influencing and will influence model development, and, of course, all the things that have been shipped in Amp.
Timestamps:
0:00 Opening Teaser
1:24 Introduction and Casual Conversation
2:02 Changes in Coding Agents Over Five Weeks
3:28 Challenges and Misconceptions
6:52 Optimizing Tools and Prompts
9:15 User Feedback and Best Practices
15:29 Advanced User Strategies
22:41 Balancing Integration and Flexibility
24:17 Cost and Accessibility of Agents
28:05 Feedback Loops and Model Training
35:12 The Evolution of AI Coding Agents
35:35 Training Data, Tool Affinity, and Tool Integration
36:50 User Feedback and Model Training
43:22 Product Evolution and New Features
55:55 Remote Execution and Multi-Agent Workflows
1:02:47 Future Directions and Improvements
Want to give Claude 3.5 Sonnet a go for coding tasks? Better yet, do you want to compare how Claude 3.5 Sonnet stacks up against other LLMs?
With https://t.co/V9PbjEfhBF you can! Try it on any Open Source repo, compare against Claude 3 Opus, GPT-4o, and many others.
@lulumeservey Can you share the headline of the WP article? I feel like the subheading of the NYT example is more similar in tone to the WP example, but maybe the distinction is clearer with WP’s headline?
We are absolutely devastated by Dejan Milojević's sudden passing.
This is a shocking and tragic blow for everyone associated with the Warriors and an incredibly difficult time for his family, friends, and all of us who had the incredible pleasure to work with him.
We grieve with and for his wife, Natasa, and their children, Nikola and Masa.
We are excited to be welcoming @loom to the Atlassian team! Loom’s leadership in async video combined with our deep understanding of team collaboration means we can bring innovation to the market and empower our customers to collaborate in more human ways. https://t.co/n4HQ4pH4kG
This Liverpool season already feels like it’s in shambles and there are still 37 games left to play. I think I’m more annoyed at losing out on Lavia than Caicedo.
Another week, another excuse for not delivering our order. 3rd time now, just like clockwork. @Anthropologie @RyderSystemInc why are we forced to pay for delivery when you can’t get your core business in order? Why do we bother buying anything from you in the first place?
Hey @Anthropologie what’s the point of scheduling a delivery window with @RyderSystemInc if they don’t show up at all the first time, and then don’t turn up in the scheduled window the second? This is the complete opposite of ‘white glove delivery’.