The industry has gone completely nuts.
Use tokens to generate AI code and documentation slop. Then use even more tokens to understand and review that slop.
Then judge engineers by token usage instead of how empathetic and clear their docs and code actually are, and completely neglect human comprehension.
Utter nonsense.
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
one issue that's bugging me is how do we make long-running tasks cheap and effective.
pattern two from your blog is interesting. since, we're maintaining the state effectively in a database, we can hydrate a session into any sandbox. this removes the need for long-running or idle sandboxes.
but the heavy lifting then shifts left. we now have to construct whole sessions. is this worth the effort?
Right now, flue gives us a runtime which we can use to build and run agents in sandboxed environment
building on top of flue, i'm planning to add:
* multi-tenancy for credentials (slack, github, other tools.)
* an orchestrator that listens to events and dispatches the work to specialized agents (defined in flue).
* the agents either returns outputs (artifacts) or emit events (e.g. watch ci actions)
an event-driven orchestrator on top of flue where we define the relationship between events and corresponding agentic work. (e.g. pull_request.created event should dispatch a PR reviewer agent)
the harness should be able to compose these events on its own to create automated workflows over time (if this makes sense)
We’re in the transition in moving Slash from a foreground assistant to a background one.
That’s our next step. This opens up N different agentic workflows we can build around Slash each governed by chain of events.
Broader goal is to make Slash the first point of interaction for anything/everything
it was fun building it. and now it’s more fun making it even better for the whole org
#Slash is changing the landscape of how we do software engineering at @Razorpay
We recently built an AI assistant inside @Razorpay called Slash.
It reads our entire codebase, debugs production incidents, reviews specs, writes code, reviews every single PR, answer tech queries and also raises PRs for small features.
It's easily accessible through Slack. We can tag it in any Slack thread, describe the problem in English, and it gets to work.
Six weeks ago, Slash handled 122 tasks in its first week. Last week it handled 14000+. Queries, analysis, bug fixes, PR reviews, test runs and work that earlier lived across scattered tools and teams can now be done with Slash right within Slack. 1000+ people used it in a single week because it got their work done faster. The whole adoption has been completely organic.
The numbers from last week have been very encouraging - 14,854 tasks completed. 2,150 PRs raised, 1,152 merged, 45% of those PRs shipped with zero human rework.
A payout gets stuck mid-retry during a live incident, an engineer tags Slash and within seconds, it cross-references logs with code and pinpoints a state machine bug blocking the retry-to-failed state transition. Tells the team exactly which logs to check and how to resolve the incident.
With its K8s analyzer skill, Slash scanned a single namespace, right-sized all 11 workers using 48-hour P95 pod metrics, and raised the PR. One run saved $560/month.
A marketing banner bug was fixed with few prompt iterations with a PR raised, merged to prod and deployed in minutes. No front-end developer touched the code.
Security teams ran static security testing and remediation through Slash at org scale. Thousands of findings were purged and many more got validated autonomously.
But Slash isn't just an engineering tool.
Account managers now trace stuck customer payments and integration failures through Slash instead of pinging engineers on Slack. L2 product support tickets get triaged by Slash before they reach engineering.
250+ non-engineers ran thousands of sessions last week. PMs used it for research on our payments infra, customer interviews and product features sometimes raising PRs of their own. Analytics teams built SQL pipelines. 11% of all sessions came from people outside tech and product.
On our company bakkar (watercooler) Slack thread, someone asked Slash jokingly to assign tasks to everyone and it responded in the same tone. It seamlessly started participating in inside jokes and conversations.
The quality compounds with use. Engineers who shipped 11+ Slash PRs averaged a 63% merge rate without rework. First-timers averaged 37%. Across the org, human review comments per PR have dropped more than 40% with Slash starting to do in-depth review of every single change.
We're still early. Large cross-repo refactors, fully agentic sdlc and plan mode are next. But Slash has already changed how people at Razorpay build, debug, and ship every day.
We recently built an AI assistant inside @Razorpay called Slash.
It reads our entire codebase, debugs production incidents, reviews specs, writes code, reviews every single PR, answer tech queries and also raises PRs for small features.
It's easily accessible through Slack. We can tag it in any Slack thread, describe the problem in English, and it gets to work.
Six weeks ago, Slash handled 122 tasks in its first week. Last week it handled 14000+. Queries, analysis, bug fixes, PR reviews, test runs and work that earlier lived across scattered tools and teams can now be done with Slash right within Slack. 1000+ people used it in a single week because it got their work done faster. The whole adoption has been completely organic.
The numbers from last week have been very encouraging - 14,854 tasks completed. 2,150 PRs raised, 1,152 merged, 45% of those PRs shipped with zero human rework.
A payout gets stuck mid-retry during a live incident, an engineer tags Slash and within seconds, it cross-references logs with code and pinpoints a state machine bug blocking the retry-to-failed state transition. Tells the team exactly which logs to check and how to resolve the incident.
With its K8s analyzer skill, Slash scanned a single namespace, right-sized all 11 workers using 48-hour P95 pod metrics, and raised the PR. One run saved $560/month.
A marketing banner bug was fixed with few prompt iterations with a PR raised, merged to prod and deployed in minutes. No front-end developer touched the code.
Security teams ran static security testing and remediation through Slash at org scale. Thousands of findings were purged and many more got validated autonomously.
But Slash isn't just an engineering tool.
Account managers now trace stuck customer payments and integration failures through Slash instead of pinging engineers on Slack. L2 product support tickets get triaged by Slash before they reach engineering.
250+ non-engineers ran thousands of sessions last week. PMs used it for research on our payments infra, customer interviews and product features sometimes raising PRs of their own. Analytics teams built SQL pipelines. 11% of all sessions came from people outside tech and product.
On our company bakkar (watercooler) Slack thread, someone asked Slash jokingly to assign tasks to everyone and it responded in the same tone. It seamlessly started participating in inside jokes and conversations.
The quality compounds with use. Engineers who shipped 11+ Slash PRs averaged a 63% merge rate without rework. First-timers averaged 37%. Across the org, human review comments per PR have dropped more than 40% with Slash starting to do in-depth review of every single change.
We're still early. Large cross-repo refactors, fully agentic sdlc and plan mode are next. But Slash has already changed how people at Razorpay build, debug, and ship every day.
ADHD is one of the most painful things to live with. Not just because it's loud, because it's contradictory. You're capable of anything and motivated to do almost nothing. You understand everyone around you, but can't explain what's happening inside yourself. You have brilliant ideas, but no patience to finish a single one. You're a genius who can't handle an email, an extrovert who needs to be completely alone, a person full of advice who can't follow any of it. And the worst part? You know.