With all due respect to Joey (may he rest in peace) and Jay (an excellent drummer who recently left the band), Eloy and Slipknot were made for each other. This aggression on the drums is exactly what the band needed. Eloy is a GIANT. What a moment. A Brazilian in Slipknot. 🇧🇷
Look at thiiiis!!!! What a spectacular image captured by the astronauts aboard the International Space Station, showing the Moon's shadow cast on Earth during the total solar eclipse of April 8, 2024!!! One of the great wonders of nature!!!!
In 1890, Henri Poincaré proved the non-existence of uniform first integrals of the three-body problem and the sensitive dependence of its trajectories on initial conditions.
Yet, stable solutions to it do exist and these are some of them.
https://t.co/haECKnWsfp
Time series analysis has been critical in my career. But it took me 3 years to get comfortable. In 3 minutes, I'll share 3 years of experience in time series:
1. Time Series Analysis: Time series analysis is a statistical technique that deals with time-ordered data points. It's commonly used to analyze and interpret trends, patterns, and relationships within data that is recorded over time (e.g. with timestamps).
2. Uses: Understanding and applying time series analysis concepts is critical for forecasting, detecting anomalies, and drawing insights on data that varies over time.
3. The 3 Core Concepts: There are 3 areas of time series that have been super helpful. Understanding 1. Autocorrelation, 2. Seasonal Decomposition, and 3. Calendar Effects. Let's break them down.
4. Autocorrelation: This refers to the correlation of a time series with lagged copies of itself. It measures the relationship (correlation) between a variable's current value and its past values at a given lag.
5. Partial Autocorrelation: Autocorrelation has a problem. Some of the correlation at a given lag is confounded by the shorter lags in between. Enter Partial Autocorrelation. This removes the correlation effect of those intermediate lags, leaving only the direct relationship at that lag.
6. Seasonal Decomposition (STL): Seasonal decomposition decomposes a time series into three components: trend, seasonal, and residual (irregular). STL stands for Seasonal-Trend decomposition using LOESS. It uses a "LOESS" smoother to estimate the seasonal and trend components, which are then removed from the series. STL is flexible and can handle any type of seasonality, not just fixed seasonal effects. The residuals can be analyzed for outliers since they have been de-trended and de-seasonalized.
7. Calendar Effects: Calendar effects refer to variations in a time series that can be attributed to the calendar itself. This can include effects due to day of the week, month of the year, or holidays tied to the calendar.
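To make points 4 and 5 concrete, here is a minimal sketch in plain Python. The function names and the toy series are made up for illustration, and the partial autocorrelation is shown only at lag 2, where it has a simple closed form in terms of the lag-1 and lag-2 autocorrelations. In practice you would likely reach for a statistics library instead.

```python
from statistics import fmean

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    mean = fmean(series)
    # Denominator: total variation of the series around its mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Numerator: co-variation between each value and the value `lag` steps earlier.
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

def partial_autocorrelation_lag2(series):
    """Lag-2 partial autocorrelation: the lag-2 correlation with the lag-1 effect removed."""
    r1 = autocorrelation(series, 1)
    r2 = autocorrelation(series, 2)
    return (r2 - r1 ** 2) / (1 - r1 ** 2)

# Hypothetical toy data: a pattern that repeats every 4 steps.
series = [1, 2, 3, 2] * 6

acf_lag4 = autocorrelation(series, 4)   # strongly positive: the pattern repeats every 4 steps
acf_lag2 = autocorrelation(series, 2)   # strongly negative: peaks line up with troughs
pacf_lag2 = partial_autocorrelation_lag2(series)
print(round(acf_lag4, 3), round(acf_lag2, 3), round(pacf_lag2, 3))
```

A high autocorrelation at lag 4 here is the kind of signal that hints at a seasonal period of 4.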
Understanding and applying these concepts allows analysts to better forecast future values, detect anomalies, and draw insights from data that varies over time.
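The decompose-then-inspect-residuals idea from point 6 can be sketched as follows. Note that this is NOT STL: it swaps in a simpler classical additive decomposition (centered moving average for the trend, per-position averages for the seasonality), because a LOESS smoother is too long for a short example. The function name, the weekly pattern, and the injected anomaly are all hypothetical.

```python
from statistics import fmean

def decompose_additive(series, period):
    """Classical additive decomposition (not STL). Assumes an odd `period`."""
    n, half = len(series), period // 2
    # Trend: centered moving average; left as None at the edges where it is undefined.
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = fmean(series[t - half : t + half + 1])
    # Seasonal: average detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [fmean(b) for b in buckets]
    center = fmean(raw)  # re-center so the seasonal effects average to zero
    seasonal = [raw[t % period] - center for t in range(n)]
    # Residual: whatever remains after removing trend and seasonality.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual

# Hypothetical daily data: linear trend + weekly pattern + one injected anomaly.
pattern = [3, 1, 0, -1, -2, -2, 1]
series = [10 + 0.5 * t + pattern[t % 7] for t in range(55)]
series[30] += 10

trend, seasonal, residual = decompose_additive(series, period=7)

# The largest absolute residual points at the most suspicious observation.
scored = [(abs(r), t) for t, r in enumerate(residual) if r is not None]
worst_index = max(scored)[1]
print(worst_index)
```

Because the trend and weekly pattern have been removed, the injected spike stands out in the residuals even though it would be harder to spot in the raw series.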
===
Ready to learn Data Science for Business?
I put together a free on-demand workshop that covers the 10 skills that helped me make the transition to Data Scientist: https://t.co/6Ji4GtOTzy
And if you'd like to speed it up, I have a live workshop where I'll share how to use ChatGPT for Data Science: https://t.co/Ydsmzv7trP
P.S. - If you are interested in time series, I have a High-Performance Time Series Course here: https://t.co/2H9HEuORi3
If you like this post, please reshare ♻️ it so others can get value.
When I was first learning data science, one of the things that tripped me up the most was Cross Validation. In 5 minutes, I'll share 5 years of experimentation with dozens of Cross Validation techniques. Let's dive in.
1. Goal: Cross-validation is a statistical method used to estimate the accuracy of machine learning models. It's also used to measure the stability of models when combined with hyperparameter tuning of machine learning models.
2. Principle: The main principle behind cross-validation is partitioning a sample of data into complementary subsets, performing the analysis on one subset, and validating the analysis on the other subset (called the assessment set).
3. Types of Cross Validation: There are many ways to perform cross validation. Some of the most common are K-Fold, Stratified K-Fold, Leave One Out, Group K-Fold, and Time Series Cross Validation. We'll tackle these one at a time.
4. K-Fold Cross-Validation: The data set is divided into 'k' number of subsets (folds). The holdout method is repeated 'k' times, with each of the 'k' subsets serving as the test data one by one and the remaining 'k-1' subsets as the training data. The average of the 'k' testing experiments is used as the overall result.
5. Stratified K-Fold Cross-Validation: Similar to K-Fold, but the sampling method ensures that each fold of the dataset has the same proportion of observations with a given label. This is particularly useful for imbalanced datasets.
6. Leave-One-Out Cross-Validation (LOOCV): A special case of k-fold cross-validation where 'k' is equal to the number of data points in the dataset. I never use this method because it's very time-consuming. I prefer K-fold and stratified k-fold.
7. Time Series Cross-Validation (TSCV): In time series data, the sequence of observations is important. A common approach is to use a "rolling" or "expanding" window for training and testing. Important point: if your model does not require the sequence to be kept intact, it's sometimes better to use K-Fold. I've seen this with XGBoost, where date features are used rather than lags, and K-fold outperforms TSCV. But for ARIMA, TSCV is needed because the algorithm depends on the sequence of the time series being maintained.
8. Group K-Fold Cross-Validation: The data is split into groups, and these groups are used to ensure that a group is entirely in the training or test set. This is useful for problems where the data is naturally divided into groups (e.g., customers from different store locations).
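The splitting schemes above can be sketched as plain index generators. This is a minimal illustration, assuming no shuffling and using made-up function names; it only produces train/test indices and leaves model fitting and scoring to you. Stratified K-Fold is omitted to keep it short.

```python
def k_fold_splits(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds (no shuffling)."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train, test) pairs where training data always precedes test data."""
    for i in range(n_splits):
        test_end = n_samples - (n_splits - 1 - i) * test_size
        test_start = test_end - test_size
        yield list(range(test_start)), list(range(test_start, test_end))

def group_k_fold_splits(groups, k):
    """Yield (train, test) so that every group sits entirely on one side."""
    unique = sorted(set(groups))
    for fold in range(k):
        held_out = set(unique[fold::k])  # every k-th group goes to this fold's test set
        test = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, test

# Hypothetical usage with 10 observations.
kfold_tests = [test for _, test in k_fold_splits(10, 3)]
ts_splits = list(expanding_window_splits(10, n_splits=3, test_size=2))
stores = ["A", "A", "B", "B", "B", "C", "C", "D", "D", "D"]  # made-up group labels
group_splits = list(group_k_fold_splits(stores, k=2))
print(kfold_tests)
print(ts_splits[0])
print(group_splits[0][1])
```

LOOCV is just `k_fold_splits(n, n)`, which makes its cost obvious: one model fit per data point.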
Happy Holidays,
-Matt