I will be traveling a lot in the next few weeks and couldn’t think of a better piece of tech to bring with me! Consider subscribing if you enjoy this kind of content.
https://t.co/gpvPaOnj1N
The TMCF Resume is the format I use, and others who've applied use it as well. Make sure your bullet points match the needs of the job and showcase your skills against the DUTIES and QUALIFICATIONS sections of the job description.
I discovered a compound interest calculator a few years back and I’ve been hooked ever since. A Roth IRA account is the perfect place to watch your money compound. Here’s how to open one and get started on your investing journey: https://t.co/at49pMU6u4
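For anyone who wants to see what such a calculator does under the hood, here is a minimal sketch. It assumes yearly compounding and end-of-year contributions; the function name `future_value` and the 7% rate are illustrative assumptions, not a forecast, and account-specific rules (contribution limits, fees) are ignored.

```python
def future_value(principal, annual_rate, years, annual_contribution=0.0):
    """Balance after `years` of yearly compounding.

    Each year the balance grows by `annual_rate`, then the
    contribution is added at the end of that year.
    """
    balance = principal
    for _ in range(years):
        balance = balance * (1 + annual_rate) + annual_contribution
    return balance


# Hypothetical inputs: 1,000 up front, 7% a year, 30 years, 1,000 added yearly.
print(round(future_value(1_000, 0.07, 30, annual_contribution=1_000), 2))
```

Changing `years` is the quickest way to see why starting early matters more than the starting amount.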
One mistake I’ve made in the past was telling people they could/should do more or better
Some people are actually ok with their current reality, even if they have the potential and opportunity to do more
I'm calling for a more minimalistic approach to machine learning:
Regression? Predict the mean
Robust regression? Median
Classification? Always predict majority class
Time series? Same as yesterday: Y(t) = Y(t-1)
Text generation? Lorem ipsum
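Joking aside, the first four are the usual baselines a real model has to beat. A minimal sketch using only the standard library; the function names are made up for illustration:

```python
from collections import Counter
from statistics import mean, median


def predict_mean(y_train):
    """Regression baseline: always predict the training mean."""
    return mean(y_train)


def predict_median(y_train):
    """Robust regression baseline: the median ignores extreme values."""
    return median(y_train)


def predict_majority(labels):
    """Classification baseline: always predict the most common class."""
    return Counter(labels).most_common(1)[0][0]


def predict_persistence(series):
    """Time series baseline: Y(t) = Y(t-1), i.e. repeat the last value."""
    return series[-1]


print(predict_mean([1, 2, 3, 10]))            # pulled up by the outlier
print(predict_median([1, 2, 3, 10]))          # not pulled up by the outlier
print(predict_majority(["spam", "ham", "ham"]))
print(predict_persistence([3, 5, 8]))
```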
Generalized Linear Models (GLM) - Simplified! 📊
💎 Intro to GLM
You know how in school we learned about straight lines (y = mx + b)? GLM is like that, but for more complex relationships in data. It’s a mathematical magician!
💎 Why Not Just Linear Models?
Imagine trying to fit a straight ruler on a curvy path. It doesn't always work, right? Life isn't always straight, and neither is data. GLM lets us handle those curves!
💎 Three Magic Parts of GLM:
1. The Link: This is a function that "straightens" the expected value (mean) of our response variable, making it easier to model.
2. Linear Predictor: Our good old straight line equation, but jazzed up with more variables.
3. Probability Distribution: Tells us how our data is spread. It could be normal (like a bell curve), binomial (like a coin toss) or many others.
💎 Real World Example
Suppose we want to predict if a customer will buy (yes or no). A GLM can handle this "binary" outcome and tell us how different factors (age, preferences) influence that decision.
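To make the three parts concrete for the buy / don't-buy example, here is a rough sketch in plain Python. The coefficients are hypothetical numbers picked for illustration, not fitted values, and the predictors (age, a 0/1 preference flag) are assumptions.

```python
import math


def linear_predictor(intercept, coefs, x):
    """Part 2: eta = b0 + b1*x1 + b2*x2 + ... (the jazzed-up straight line)."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))


def logit(mu):
    """Part 1: the link maps a mean in (0, 1) onto the whole real line."""
    return math.log(mu / (1 - mu))


def inv_logit(eta):
    """Inverse link: turns the linear predictor back into a probability."""
    return 1 / (1 + math.exp(-eta))


def bernoulli_pmf(y, mu):
    """Part 3: probability of outcome y (1 = buys, 0 = doesn't) given mean mu."""
    return mu if y == 1 else 1 - mu


# Hypothetical coefficients for [age, likes_category]; not estimated from data.
intercept = -2.0
coefs = [0.04, 1.2]
customer = [30, 1]  # a 30-year-old who likes the product category

eta = linear_predictor(intercept, coefs, customer)
p_buy = inv_logit(eta)
print(f"linear predictor = {eta:.2f}, P(buy) = {p_buy:.3f}")
```

Fitting a GLM means choosing the intercept and coefficients that make the observed yes/no outcomes most likely under that distribution.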
💎 Predicting Many Outcomes
Sometimes we don't just predict 'yes' or 'no'. Think of predicting whether a fruit is underripe, ripe, or overripe. Extensions of GLMs, such as multinomial and ordinal logistic regression, handle these scenarios!
💎 Benefits of GLM:
1. Flexibility with different types of data.
2. Handles complex relationships.
3. Provides insights into which factors are influential.
💎 Different Flavors of GLM
Depending on our data, GLMs can come in several styles:
• Logistic: Predict binary outcomes, like 'win' or 'lose'. It's all about odds.
• Poisson: Counting stuff? Like number of emails you get daily? Poisson's your go-to.
• Normal: Good ol’ classic, for continuous data like heights or weights.
• Gamma: Useful for positive continuous values, like the time between events.
💎 Assumptions? Yep, GLM Has Those!
Just like superheroes have weaknesses, GLMs have assumptions:
• Correct Link Function: We pick a link based on our data's distribution.
• No Multicollinearity: Our predictor variables shouldn't be too related to each other.
• No Overdispersion: A Poisson model assumes variance = mean. For count data, if variance > mean, it's a sign we might need to adjust our model (maybe a Negative Binomial Regression).
• Independence: Observations should be independent.
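A quick first look at overdispersion can be sketched as below. It assumes raw counts with no predictors; with a fitted model one would usually look at a residual-based dispersion statistic instead. The helper name `dispersion_ratio` and the sample counts are made up.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    Near 1 is consistent with a Poisson model; well above 1 hints at
    overdispersion.
    """
    return variance(counts) / mean(counts)


daily_emails = [0, 0, 0, 12]  # hypothetical counts: mostly quiet, one burst
print(dispersion_ratio(daily_emails))
```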
💎 GLM vs. Other Models
Why not just use another model? GLMs are a middle ground. They're more flexible than basic linear models but less complex than neural networks. They're the versatile multitool in a statistician's toolkit!
💎 Interpretation & Insights
After fitting a GLM, you get coefficients. Each coefficient tells how much a factor shifts the outcome on the link scale (log-odds for logistic, log-rate for Poisson). Positive? It increases the chances. Negative? Decreases them. Size? Shows the strength of influence, per unit of that factor.
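For logistic and Poisson models, coefficients are easier to read after exponentiating: the result is an odds ratio (logistic) or a rate ratio (Poisson) per one-unit increase in the factor. A small sketch with hypothetical coefficient values:

```python
import math


def effect_ratio(coef):
    """exp(coef): multiplicative change in odds (logistic) or rate (Poisson)."""
    return math.exp(coef)


# Hypothetical fitted coefficients, for illustration only.
for name, coef in [("age", 0.04), ("discount_shown", 0.7), ("price", -0.3)]:
    ratio = effect_ratio(coef)
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name}: coef={coef:+.2f}, ratio={ratio:.2f} ({direction} the odds)")
```

A ratio of 1 means no effect; the further from 1, the stronger the influence.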
💎 In Summary
Think of GLM as the Swiss Army knife for data analysis. It can handle curves, twists, and various outcomes. So next time data isn't playing nice and straight, remember GLM might just be your answer!
Happy modeling and remember, life isn't linear, but with GLMs, we can navigate the curves! 📊🧮
#Statistics #GLM #DataScience
ChatGPT influencers keep saying DATA SCIENCE IS OVER!
That's wrong.
Now that anybody can use machines to WRITE code, people who deeply UNDERSTAND what the code is doing are more VALUABLE than ever.
Here are my favorite books for data science beginners:
I just heard a student describe “preliminary analysis” as a “vibe check”
😂 Then instead of conclusions, “what is it giving?”
Talk about making research accessible lol
🚨 Hi #DataFam, I recently expanded my @Tableau x @Figma workshop into a two-part series, and all vids and learning resources are now freely available: https://t.co/3u34xYSXoJ
Dashboard: https://t.co/jiPYm2lZhA
May the materials help you learn both tools! 😊
#dataviz #design
@teneikaask_you And the perks don’t stop there. You can have multiple Amex cards and stack benefits. For example, 3 Hilton Aspire cards means 3 free nights every year anywhere in the world 🙃
Great data engineers deliver data sets that fit needs, not stakeholder asks!
Stakeholders often ask for answers to narrow, specific questions.
Your job as a data engineer is to build robust data sets that answer broad classes of questions.
If you don’t do this, you’ll be writing and rewriting SQL until the end of time!
How do you do this properly? 1/3
OLTP and OLAP data modeling and querying are fundamentally different!
OLTP:
- focused on latest state of data
- normalization and 3rd normal form are powerful
- optimized for point queries or single user queries
- query latency matters a lot
- using CTEs can sometimes produce extra latency that you don’t want
OLAP:
- care about latest state and historical data
- normalization can cause some queries to be painfully slow
- optimized for GROUP BY statements across a large number of records
- query latency matters a little
- use CTEs instead of subqueries every single time
#dataengineering
#sql
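A small sketch of the contrast, using Python's built-in `sqlite3` as a stand-in for both kinds of database. The `orders` table and its rows are hypothetical; a real OLAP workload would run on a warehouse engine, not SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    customer   TEXT,
    order_date TEXT,
    amount     REAL
);
INSERT INTO orders (customer, order_date, amount) VALUES
    ('ana', '2024-01-05', 20.0),
    ('ana', '2024-01-20', 30.0),
    ('ben', '2024-01-07', 50.0),
    ('ana', '2024-02-02', 10.0);
""")

# OLTP style: a point query on the primary key, where latency matters most.
point = conn.execute(
    "SELECT amount FROM orders WHERE order_id = ?", (1,)
).fetchone()

# OLAP style: GROUP BY across many rows, with a CTE naming the intermediate step.
monthly = conn.execute("""
WITH monthly_spend AS (
    SELECT customer,
           substr(order_date, 1, 7) AS month,
           SUM(amount)              AS total
    FROM orders
    GROUP BY customer, substr(order_date, 1, 7)
)
SELECT month, COUNT(*) AS customers, SUM(total) AS revenue
FROM monthly_spend
GROUP BY month
ORDER BY month
""").fetchall()

print(point)
print(monthly)
```

The CTE keeps the per-customer step readable and reusable, which is the main argument for preferring it over a nested subquery in analytical SQL.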