The world's best engineers, founders, and researchers building with AI.
Organizers of the AIE Summit, Code Summit, Europe, and the flagship SF World's Fair.
Video of my talk at @aiDotEngineer Europe is out! 🎥
I think AI Engineer is one of the best applied AI conferences right now – and I constantly recommend their YouTube channel to people looking for strong AI content @swyx @Benghamine
I talked about the SWE-rebench leaderboard – a live benchmark with fresh SWE tasks from GitHub:
> how we collect and filter tasks
> how we run monthly evals on 30+ models and harnesses
> how we deal with infra pain
> how we caught models trying to cheat
🧵 in the thread: slides + short key points from the video
https://t.co/xxkXNFn1DK
For the third year running we are honored to have @Microsoft as presenting sponsor.
Microsoft creates platforms and tools powered by AI to deliver innovative solutions that meet the evolving needs of our customers. The technology company is committed to making AI available broadly and doing so responsibly, with a mission to empower every person and every organization on the planet to achieve more.
Skill issue: Lessons from skilling up coding agents
Getting agents to actually use Langfuse was a "skill issue" — literally. Marc Klingen from Clickhouse on teaching coding agents to use new tools, and why it's harder than you think.
https://t.co/sqlY11NkmQ
@swyx and I are curating the AI in GTM track at @aiDotEngineer on June 30.
The thing every AI engineer must realize: GTM just became an engineering problem.
Outbound = agent design.
Enrichment = retrieval + data quality.
Attribution = a knowledge-graph problem.
Forecasting = evals.
Don't Build Slop (4 Levels of AI Agent Maturity)
Most AI agents are slop. @arafatkatze from @cline breaks down the 4 levels of agent maturity — and what separates a demo from a product.
https://t.co/5CY3zLEIyT
We’re here in Melbourne for @aiDotEngineer this week.
@igorcosta is presenting a keynote on the next generation of memory for coding agents.
Why do agents forget?
What does persistent memory actually look like?
Visit us at Booth 12 and chat with the @autohandai team.
🆕 New Engineer Orientations
Based on great feedback from AIE Europe, we are hosting a series of NEOs for people who are new to AIE and have questions about making the most of AIEWF!
join us today/tomorrow online - link below
Beyond Code Coverage: Functionality Testing with Playwright
Code coverage is a vanity metric. Functionality coverage is what matters. Marlene Mhangami shows how Playwright + MCP enables testing that actually catches the bugs users hit.
https://t.co/7qQdqinqyt
Took some inspiration from @vboykis and converted my first ever talk into a blog post.
I talk about the role of agentic search in context engineering.
Together we build an intuition on the strengths and weaknesses of a selection of search tools.
🔗 https://t.co/nuGJ5Zm9Du
Agentic Search for Context Engineering
@helloiamleonie's hot take: context engineering is about 80% agentic search. The arrow from context sources to context window is doing most of the work.
https://t.co/Iu2nkU2NdZ
The workshop covers the full tool landscape: shell tools, semantic search, general-purpose query execution, agent skills, and when each one breaks. Including the part where an agent fakes semantic search by chaining grep synonyms.
Low floor tools (specialized, easy to call correctly) vs high ceiling tools (general purpose, handles the unexpected). You probably need both, and the workshop shows you how to pick.
It requires a tremendous amount of skill as well as institutional process maturity to use AI-generated code effectively. The challenge in 2026 is that the new process for using AI hasn't fully materialized. See Mike Spitz's @aiDotEngineer presentation on the topic: https://t.co/6rmHr6V6FV
Your Coding Agent Should Do AI System Engineering
Your coding agent writes code. But can it do system engineering? @ben_burtenshaw from @huggingface argues that's the real unlock — agents that understand architecture, not just syntax. https://t.co/ljgjsRE2rU