@slopwareindy great, let me know how you get on and feel free to dm me if you have any questions. pi-goals looks interesting. I'm thinking about a sonar Pi plugin now!
This is some great research. Sonar's new tools help agents develop quality code (which is necessary for production applications), and now we know they work more efficiently on that code too!
SlopCodeBench is a great framework for testing out different agent configurations and the impact they have on quality. I'm excited to try out the latest release!
Very excited to announce the v1.0 release of SlopCodeBench:
- Doubling the size of the dataset
- @harborframework support
- scb-check: a CLI that flags slop anti-patterns
- Way more model results
https://t.co/RQkB8wdzAu
https://t.co/36qQR3azeE
🧵
Off to speak in San Francisco at @DeepLearningAI AI Dev 26 next week. Will be talking about how to ship enterprise quality code with agents. My first time in SF! https://t.co/0ZOo4xkDYl
Claude Code builds 🤝 SonarQube verifies. Now, they do it in the same place.
The SonarQube plugin for Claude Code is available now in the Anthropic marketplace.
I'm over trying to get agents to do everything and getting back to crafting the good stuff and leaving agents for a) the mundane b) experiments c) scripting and automating d) (re)searching and distilling e) analytics. Feels good!
@AleksejAros @SonarSource Yes, what sort of assumptions do you see the agent making? We're always looking to expand our ruleset, especially for agents. On static analysis we only release rules with a very low FP rate, so we can bring some confidence where the agent can't.
For the last 6 months I’ve been working on Agentic Analysis, a new product for @SonarSource that launched as an open beta today. It provides Sonar’s comprehensive analysis to AI agents in seconds and, together with its sister product Context Augmentation, allows the agent to produce quality code from the outset. https://t.co/DRPrNSjGzg
Thanks for sharing, really valuable! I've been experimenting with it, and by using @sonar the agent will happily keep erosion to ~0.05, making the code more human readable. However, it has no impact on the pass rates. I tried extending to 10 checkpoints and the result was the same. Did you see evidence of the agent pass rates being impacted by high erosion?
Software architecture shouldn’t be a headache. SonarQube now enables architecture management directly in your dev workflow. Visualize your structure, define goals, and stop architectural drift before it compounds. 🚀
See how it works: https://t.co/Ta4LJUfpMB
“Overall, adopting tools like Codex is not just a technical but also a deep cultural change, with a lot of downstream implications to figure out.” - yes!
Software development is undergoing a renaissance in front of our eyes.
If you haven't used the tools recently, you likely are underestimating what you're missing. Since December, there's been a step function improvement in what tools like Codex can do. Some great engineers at OpenAI yesterday told me that their job has fundamentally changed since December. Prior to then, they could use Codex for unit tests; now it writes essentially all the code and does a great deal of their operations and debugging. Not everyone has yet made that leap, but it's usually because of factors besides the capability of the model.
Every company faces the same opportunity now, and navigating it well — just like with cloud computing or the Internet — requires careful thought. This post shares how OpenAI is currently approaching retooling our teams towards agentic software development. We're still learning and iterating, but here's how we're thinking about it right now:
As a first step, by March 31st, we're aiming that:
(1) For any technical task, the tool of first resort for humans is interacting with an agent rather than using an editor or terminal.
(2) The default way humans utilize agents is explicitly evaluated as safe, but also productive enough that most workflows do not need additional permissions.
In order to get there, here's what we recommended to the team a few weeks ago:
1. Take the time to try out the tools. The tools do sell themselves — many people have had amazing experiences with 5.2 in Codex, after having churned from codex web a few months ago. But many people are also so busy they haven't had a chance to try Codex yet or got stuck thinking "is there any way it could do X" rather than just trying.
- Designate an "agents captain" for your team — the primary person responsible for thinking about how agents can be brought into the teams' workflow.
- Share experiences or questions in a few designated internal channels
- Take a day for a company-wide Codex hackathon
2. Create skills and AGENTS[.md].
- Create and maintain an AGENTS[.md] for any project you work on; update the AGENTS[.md] whenever the agent does something wrong or struggles with a task.
- Write skills for anything that you get Codex to do, and commit it to the skills directory in a shared repository
3. Inventory and make accessible any internal tools.
- Maintain a list of tools that your team relies on, and make sure someone takes point on making it agent-accessible (such as via a CLI or MCP server).
4. Structure codebases to be agent-first. With the models changing so fast, this is still somewhat untrodden ground, and will require some exploration.
- Write tests which are quick to run, and create high-quality interfaces between components.
5. Say no to slop. Managing AI generated code at scale is an emerging problem, and will require new processes and conventions to keep code quality high
- Ensure that some human is accountable for any code that gets merged. As a code reviewer, maintain at least the same bar as you would for human-written code, and make sure the author understands what they're submitting.
6. Work on basic infra. There's a lot of room for everyone to build basic infrastructure, which can be guided by internal user feedback. The core tools are getting a lot better and more usable, but there's a lot of infrastructure that currently goes around the tools, such as observability, tracking not just the committed code but the agent trajectories that led to it, and central management of the tools that agents are able to use.
Overall, adopting tools like Codex is not just a technical but also a deep cultural change, with a lot of downstream implications to figure out. We encourage every manager to drive this with their team, and to think through other action items — for example, per item 5 above, what else can prevent a lot of "functionally-correct but poorly-maintainable code" from creeping into codebases.
Bad data in = bad code out. 🤖 It's the Achilles' heel of AI code generation.
That's why we're introducing SonarSweep™, our new service that optimizes and secures training data for coding LLMs.🧹🛡️
Read the announcement: https://t.co/TwCdoVqUNb
#CodeQuality #SonarSweep
@Rubberduck203 @GAnnCampbell @tottinge @BarretBlake What version were you using? We’re now updating (almost) all our rules as new versions of C# are being released. There was a bit of a backlog as we migrated to our new Semantic Execution Engine, which is almost complete.