Software systems become complex 🤕 very fast. If you’re not sharing a common experimentation framework, you’ll quickly find yourself in a communication and reproducibility mess. Long term, your team’s productivity will take a big hit 🥊. (9/9)
You built a new feature in your ML-driven system, you did an experiment in a notebook, numbers look good, let’s deploy? 🚀
Sorry, no ⛔️. In a team we need to make sure we’re perfectly aligned on what the numbers mean. It’s not about a lack of trust; it’s that we share responsibility 🧵 (1/9)
📣 To be clear, running experiments in notebooks & launching your own configuration of a system to experiment 🧪 with are both important things to do during development.
What I’m tackling here is when the team wants to act 🚀 on these experiments. (8/9)
I’ve seen people baffled by a system that couldn’t answer queries about the US 🇺🇸, only to discover that normalisation transformed “US” to “us”, which was then discarded 🚮 as a stop word (since it looked like the pronoun “us”) in a longer query.
Looking at 🥩 intermediate outputs allows you to verify that the various stages of your pipeline are doing what you expect. It’s a good way to uncover bugs🐞 and problematic components.
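To make the “US” 🇺🇸 story concrete, here is a minimal sketch of a toy pipeline that keeps every intermediate output so you can read them stage by stage. Everything in it is an assumption for illustration: the stage names, the tiny `STOP_WORDS` list and the `run_pipeline` helper are hypothetical, not taken from any real system.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "in", "of", "the", "us"}


def normalize(text):
    # Lowercasing is where "US" silently becomes "us".
    return text.lower()


def tokenize(text):
    # Split on runs of word characters.
    return re.findall(r"\w+", text)


def remove_stop_words(tokens):
    # Drops every token found in the stop-word list, including "us".
    return [t for t in tokens if t not in STOP_WORDS]


# Ordered stages of the toy pipeline (names are assumptions).
STAGES = [
    ("normalize", normalize),
    ("tokenize", tokenize),
    ("remove_stop_words", remove_stop_words),
]


def run_pipeline(query):
    # Run each stage and record its output so it can be inspected later.
    trace = [("input", query)]
    value = query
    for name, stage in STAGES:
        value = stage(value)
        trace.append((name, value))
    return value, trace


output, trace = run_pipeline("best pizza in the US")
for name, value in trace:
    print(f"{name:>18}: {value!r}")
```

Reading the trace top to bottom shows the country still present after `tokenize` and gone after `remove_stop_words`, which is the kind of bug 🐞 the final prediction alone would never explain.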
🍔 Individual predictions are the outputs at the end of your machine-learned system. They’re essentially what the user would see in the product you’re building.
🥩 Intermediate outputs are the results of transformations the ML system applies to the inputs (e.g. text normalization, entity tagging, document re-ranking, etc.).
Keep in mind that most machine-learned systems are pipelines or are part of a pipeline 🐮➡️🥩➡️🍔.
ML experiments (or any experiments) are not just about looking at numbers 🔢 and graphs 📈 to make binary judgements:
- good ✅ (= deploy)
- bad ❌ (= don’t deploy)
💡You must look 👀 at 🍔individual predictions and 🥩 intermediate outputs in your experiments🧵
After the WHYs❓for checklists ✅, now the HOWs ⚙️
1️⃣ Discovery 🔎: establish a single place where checklists will go. This can be an individual document (e.g. a Google doc) that contains all checklists. It can also be an index that points to individual checklists.
Checklists ✅ to the rescue!
If your team finds itself doing the same sequence of actions repeatedly 🔁, then ideally these should be automated 🤖.
If automation is not possible, the next best thing is CHECKLISTS ✅
Here are the WHYs of checklists; the HOWs will follow in a future 🧵
8️⃣ Preach the gospel of checklists 📣: You & your team don’t work in a vacuum❗️If you find something that helps you, spread the word. If checklists help you and your team, tell others around you. Checklists are used to save lives 🩺 and debug planes ✈️ mid-flight ⬇️