For years, I was hyperparameter tuning XGBoost models wrong. In 3 minutes, I'll share one secret that took me 3 years to figure out. When I did, it cut my training time 10X. Let's dive in.
1. XGBoost: XGBoost (eXtreme Gradient Boosting) is a popular machine learning algorithm, especially for structured (tabular) data. Its claim to fame is winning tons of Kaggle competitions. More importantly, it's fast, accurate, and easy to use. But it's also easy to screw up.
2. Hyperparameter Tuning: To stabilize your XGBoost models, you need to perform hyperparameter tuning. Otherwise, XGBoost can overfit your data, causing predictions to be horribly wrong on out-of-sample data.
3. My 3-Year "Beginner" Mistake: XGBoost has tons of parameters. The mistake I was making was treating all of the parameters equally. This cost me hours of tuning. And my results weren't half as good as they became once I changed my approach.
4. How I improved my hyperparameter tuning: XGBoost has one parameter that rules them all. After 3 years, I noticed that model stability was about 80% driven by this one parameter. What was it? Learning rate. When I figured this out, things started to change. My models got better. My training times went down. Win-win.
5. My Simple 2-Step Hyperparameter Tuning Method for XGBoost: What I was doing wrong was running a random grid search over all of the parameters at once. This took hours. So I made a key change. I began isolating learning rate and tuning it first. This was Step 1. With only one parameter, the search space is small and super fast to tune.
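A minimal sketch of what Step 1 could look like in Python, under stated assumptions. The function name `tune_learning_rate`, the candidate values, and `toy_evaluate` are all hypothetical. `toy_evaluate` is a made-up scoring curve standing in for a real cross-validated XGBoost score, so the numbers mean nothing beyond illustrating the loop.

```python
from typing import Callable, Dict, Iterable, Tuple


def tune_learning_rate(
    evaluate: Callable[[Dict[str, float]], float],
    candidates: Iterable[float],
) -> Tuple[float, float]:
    """Step 1: score each learning rate while every other parameter stays at its default."""
    best_lr, best_score = None, float("-inf")
    for lr in candidates:
        # Only learning_rate is passed; nothing else is searched in this step.
        score = evaluate({"learning_rate": lr})
        if score > best_score:
            best_lr, best_score = lr, score
    return best_lr, best_score


def toy_evaluate(params: Dict[str, float]) -> float:
    """Hypothetical stand-in for a cross-validated score (higher is better).

    Assumption: a made-up curve that peaks at learning_rate = 0.1.
    This is NOT real XGBoost behavior.
    """
    return -abs(params["learning_rate"] - 0.1)


# Illustrative candidates only; six values means six model evaluations.
lr_grid = [0.001, 0.01, 0.05, 0.1, 0.3, 0.5]
best_lr, best_score = tune_learning_rate(toy_evaluate, lr_grid)
print(best_lr)
```

In a real project, `evaluate` would wrap a cross-validated fit, for example scikit-learn's `cross_val_score` on an `xgboost.XGBClassifier(**params)`. That wiring is an assumption and is not shown here.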
6. What about the other parameters? Once learning rate was tuned, I opened the search space to more parameters. This is Step 2. The rest of the parameters contribute maybe 20% to performance, so I can reduce their search space dramatically.
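Step 2 can be sketched as a small random search that keeps the learning rate fixed and samples only the remaining parameters. This is an illustration under assumptions, not a definitive implementation. `tune_remaining` and `toy_evaluate` are hypothetical names, the scoring curve is made up, and the `0.1` learning rate is a placeholder for whatever Step 1 returned. `max_depth` and `subsample` are real XGBoost parameter names, but the value lists are arbitrary.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def tune_remaining(
    evaluate: Callable[[Params], float],
    fixed: Params,
    space: Dict[str, List[float]],
    n_iter: int = 20,
    seed: int = 0,
) -> Tuple[Params, float]:
    """Step 2: random search over `space` with the parameters in `fixed` held constant."""
    rng = random.Random(seed)
    best_params = dict(fixed)
    # Baseline: tuned learning rate, defaults for everything else.
    best_score = evaluate(best_params)
    for _ in range(n_iter):
        trial = dict(fixed)
        for name, values in space.items():
            trial[name] = rng.choice(values)  # sample one value per parameter
        score = evaluate(trial)
        if score > best_score:
            best_params, best_score = trial, score
    return best_params, best_score


def toy_evaluate(params: Params) -> float:
    """Hypothetical stand-in for a cross-validated score (higher is better).

    Assumption: a made-up curve that peaks at max_depth = 4, subsample = 0.8.
    """
    depth = params.get("max_depth", 6)
    subsample = params.get("subsample", 1.0)
    return -abs(depth - 4) - abs(subsample - 0.8)


fixed = {"learning_rate": 0.1}  # placeholder for the Step 1 result
space = {"max_depth": [3, 4, 5, 6, 8], "subsample": [0.6, 0.8, 1.0]}
best_params, best_score = tune_remaining(toy_evaluate, fixed, space, n_iter=20)
print(best_params, best_score)
```

Because the learning rate is no longer part of the search, `n_iter` can stay small. The baseline evaluation also guarantees that the result is never worse than the Step 1 model under the same scoring function.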
7. The big benefit: Separating tuning into 2 steps cut my training times by a factor of 10. And my models actually became better. Faster training, better models. Win-win.
There you have it: my top 7 concepts on hyperparameter tuning XGBoost models. The next problem you'll face is how to apply data science to business.
I'd like to help.
I’ve spent 100 hours consolidating my learnings into a free 5-day course, How to Solve Business Problems with Data Science. It comes with:
300+ lines of R and Python code
5 bonus trainings
2 systematic frameworks
1 complete roadmap to avoid mistakes and start solving business problems with data science, TODAY.
👉 Here it is for free: https://t.co/YXG4pL97ZN
"The Dunning-Kruger effect: "People with low qualifications reach wrong conclusions and make poor decisions, but they cannot recognize their mistakes because of their low qualifications."
Failing to understand one's mistakes leads to believing in oneself and, therefore, to
Personnel from the PNC, Bomberos Municipales, and PMT prayed and hung a banner on the Puente El Incienso, a place where many people have taken their own lives.
In addition, PNC infrastructure staff repaired the wire mesh, the Ministerio de Gobernación reported.
Ohh, Ciudad Real in Villanueva has lost its Domino's.
Someone should tag the mayor so he can explain why these things happen while he keeps seeking reelection.