Ben Lavender @benlavender - Twitter Profile

Pinned Tweet

10 months ago

When AI Agents Go Rogue: A Cautionary Tale from the Trenches 🤖⚠️

This week delivered a stark reminder that AI, while transformative, can be a double-edged sword when given too much autonomy. I had a perfectly functioning API service filtering brand data. After I let an AI coding assistant run in agent mode to "optimize" some updates, it systematically corrupted the entire filtering logic. What started as returning 11 random results with poor matching degraded to returning nothing at all, despite 472 out of 484 records containing the exact filter criteria.

The recent METR study 'Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity' (https://t.co/oMd2tWJQVc) reinforces this experience with hard data: AI tools actually SLOWED DOWN experienced developers by 19%, despite both developers and experts predicting 20-40% speedups.

Key takeaways from both research and experience:

🔍 AI excels as a copilot, not an autopilot
It's brilliant for suggestions and code completion. But give it free rein to refactor your codebase? That's when things go sideways.

📊 The expertise paradox
The study found AI helped least where developers had deep familiarity with their repositories. My experience confirms this: AI couldn't grasp the nuanced business logic that makes a filtering system actually useful.

🧠 The ML comprehension gap
Perhaps most concerning: the LLMs struggled to understand and correctly modify complex machine learning code. They could write boilerplate, but when it came to understanding feature engineering, model predictions, and data pipelines? They introduced subtle bugs that broke everything downstream.

🎓 Fundamentals matter more than ever
Andrew Ng's (@AndrewYNg) Machine Learning Specialization (https://t.co/fgthWGaULe) on Coursera proved invaluable for understanding what's happening under the hood. Without that foundation, I'd be helplessly watching AI tools make decisions I couldn't evaluate or correct. Critical papers like Entity Embeddings of Categorical Variables (https://t.co/lIJ9iMb82T) showed how turning categories into vector embeddings lets neural networks understand hidden relationships and make far better predictions. This foundational knowledge is what separates informed AI collaboration from blind dependency. I'm planning to dive deeper with Andrew Ng's (@AndrewYNg) Deep Learning Specialization (https://t.co/scRR9019BH) next. If we're going to work alongside AI, we need to understand both its capabilities and limitations at a fundamental level.

The reality check: The METR researchers found developers spent 9% of their time just reviewing and cleaning AI outputs. In complex ML systems, that overhead can completely negate any productivity gains.

The bottom line: AI is a powerful amplifier of human capability, but it's not a replacement for human judgment, especially in machine learning systems where small changes can cascade into major failures. As we rush to integrate AI into every workflow, let's remember: understanding the fundamentals and maintaining human oversight isn't optional, it's essential.

Have you experienced similar AI "help" that went wrong? What guardrails have you put in place?

Thanks to Juan Perna (https://t.co/Fe1BrlUHjI) for sharing the METR paper.

#AI #MachineLearning #DeepLearning #SoftwareDevelopment #TechLeadership #CodingBestPractices
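
One possible guardrail for the kind of filtering regression described in the post is a pinned regression check that runs before any agent-made change is accepted. The sketch below is only illustrative: `Brand`, `filter_brands`, `make_fixture`, `check_filter`, and the `"retail"` tag are hypothetical names, and the records are synthetic data that merely mirror the 472-of-484 counts quoted above. It is not the author's actual service or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brand:
    """Hypothetical record type; the real service's schema is not known."""
    name: str
    tags: frozenset


def filter_brands(records, required_tag):
    """Stand-in filter: return every record whose tags contain required_tag."""
    return [r for r in records if required_tag in r.tags]


def make_fixture(total=484, matching=472, tag="retail"):
    """Build synthetic records that mirror the counts quoted in the post."""
    records = []
    for i in range(total):
        tags = frozenset({tag}) if i < matching else frozenset({"other"})
        records.append(Brand(name=f"brand-{i}", tags=tags))
    return records


def check_filter(filter_fn, records, tag, expected_count):
    """Raise AssertionError if the filter drifts from the pinned behaviour."""
    results = filter_fn(records, tag)
    if len(results) != expected_count:
        raise AssertionError(
            f"filter returned {len(results)} records, expected {expected_count}"
        )
    if not all(tag in r.tags for r in results):
        raise AssertionError("filter returned a record without the tag")
    return results


if __name__ == "__main__":
    fixture = make_fixture()
    check_filter(filter_brands, fixture, "retail", expected_count=472)
    print("filter regression check passed")
```

A check like this would flag both failure modes mentioned in the post, a filter that returns only a handful of results and one that returns nothing, because either result count differs from the pinned value.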

