Rick Pack

@rick_pack2

@[email protected] Data Scientist (#rstats, SAS, #Alteryx). #TriPASS board member. Population health and business analysis focus, oft-geospatial.

RTP, NC

Joined February 2011

1.2K Following

532 Followers

1.5K Posts

Rick Pack @rick_pack2

over 2 years ago

Pondering: “[Roughly], causal inference tells us whether based on our causal beliefs, the association between two variables is bigger or smaller than their true casual relationship.”…and another sentence concerning beliefs about causal relationships affecting estimates

Dr Kareem Carr

@kareem_carr

over 2 years ago

What is Causal Inference? Causal Inference is a new science of causation. This field is nothing less than a revolution in how scientists understand data. Read on to learn more. This is the first post in a series based on the Book of Why by Judea Pearl. I will be reading the book and sharing the big insights with my followers. When I first started learning causal inference, I didn't have a clear idea of the problems that casual inference was trying to solve. My misconceptions made it harder to understand the material than it would have been otherwise. So, before we get into the ideas of the book, I want to help you avoid these common misconceptions. 1. Causal Inference is NOT just regular science All sciences strive to infer causes within their domain of expertise. Therefore, it might not be obvious to you what makes casual inference any different. This is the reason why I sometimes call this new field mathematical causal inference. This term emphasizes that what sets causal inference apart is the mathematical framework it uses to describe causation. 2. Causal Inference is NOT directly about inferring causes Based on the name, new learners often think causal inference is solving the following problem: Given a list of candidate variables, how can we select the ones that have a real causal effect on our outcome of interest? This is not what causal inference does. Causal inference is solving a different problem: Assuming our beliefs about the causal relationships between all the variables is accurate, what is the best estimate of the causal relationship between a particular candidate variable and the outcome of interest? Very roughly speaking, causal inference tells us whether based on our causal beliefs, the association between two variables is bigger or smaller than their true casual relationship. 3. The Example of Alice and Bob Alice thinks genes strongly affect addictive behaviors like smoking. She also thinks genes have an effect on who gets cancer. Bob agrees that genes very likely have an effect on cancer, but Bob thinks complicated social behaviors like addiction are completely due to social factors, not genes. Causal inference allows us to evaluate the same data according to both Alice's and Bob's beliefs about the underlying causal relationships. This allows for various outcomes: 1. Avoiding unnecessary arguments. If Alice and Bob get very similar estimates for the causal relationship between genes and cancer, this implies that the disagreement about the relationship between genes and behavior is not that important. This allows scientists to move forward by focusing on the factors that really matter. 2. Agreeing to disagree. If the difference in estimates of the casual relationship between genes and cancer is large, causal inference allows both Alice and Bob to continue to explore the same data according to their very different assumptions about the causal relationships. This gives scientists and policy makers autonomy to pursue different interpretations of the same data. 4. Casual Inference builds doesn't replace statistics. It makes it more powerful. Causal inference allows us to adjust our statistical estimates of the strength of particular casual relationships based on our beliefs about the casual relationships between the variables. This is why some experts in causal inference (like the epidemiologist @epiellie) prefer to use the term causal effect estimation to refer to the field causal inference. That's it for now. My next post (coming soon!) will explore how causal inference creates a mathematical model of causation and what makes this approach so special. (You can find these posts using the hashtag #KareemReads) Follow me (@kareem_carr) for more content like this. If you want to show support, like and retweet the thread.

kareem_carr's tweet photo. What is Causal Inference?

Causal Inference is a new science of causation. This field is nothing less than a revolution in how scientists understand data. Read on to learn more.

This is the first post in a series based on the Book of Why by Judea Pearl. I will be reading the book and sharing the big insights with my followers.

When I first started learning causal inference, I didn't have a clear idea of the problems that casual inference was trying to solve. My misconceptions made it harder to understand the material than it would have been otherwise.

So, before we get into the ideas of the book, I want to help you avoid these common misconceptions.

1. Causal Inference is NOT just regular science

All sciences strive to infer causes within their domain of expertise. Therefore, it might not be obvious to you what makes casual inference any different. This is the reason why I sometimes call this new field mathematical causal inference. This term emphasizes that what sets causal inference apart is the mathematical framework it uses to describe causation.

2. Causal Inference is NOT directly about inferring causes

Based on the name, new learners often think causal inference is solving the following problem:

Given a list of candidate variables, how can we select the ones that have a real causal effect on our outcome of interest?

This is not what causal inference does. Causal inference is solving a different problem:

Assuming our beliefs about the causal relationships between all the variables is accurate, what is the best estimate of the causal relationship between a particular candidate variable and the outcome of interest?

Very roughly speaking, causal inference tells us whether based on our causal beliefs, the association between two variables is bigger or smaller than their true casual relationship.

3. The Example of Alice and Bob

Alice thinks genes strongly affect addictive behaviors like smoking. She also thinks genes have an effect on who gets cancer. Bob agrees that genes very likely have an effect on cancer, but Bob thinks complicated social behaviors like addiction are completely due to social factors, not genes.

Causal inference allows us to evaluate the same data according to both Alice's and Bob's beliefs about the underlying causal relationships. This allows for various outcomes:

1. Avoiding unnecessary arguments. If Alice and Bob get very similar estimates for the causal relationship between genes and cancer, this implies that the disagreement about the relationship between genes and behavior is not that important. This allows scientists to move forward by focusing on the factors that really matter.

2. Agreeing to disagree. If the difference in estimates of the casual relationship between genes and cancer is large, causal inference allows both Alice and Bob to continue to explore the same data according to their very different assumptions about the causal relationships. This gives scientists and policy makers autonomy to pursue different interpretations of the same data.

4. Casual Inference builds doesn't replace statistics. It makes it more powerful.

Causal inference allows us to adjust our statistical estimates of the strength of particular casual relationships based on our beliefs about the casual relationships between the variables. This is why some experts in causal inference (like the epidemiologist @epiellie) prefer to use the term causal effect estimation to refer to the field causal inference.

That's it for now. My next post (coming soon!) will explore how causal inference creates a mathematical model of causation and what makes this approach so special. (You can find these posts using the hashtag #KareemReads)

Follow me (@kareem_carr) for more content like this. If you want to show support, like and retweet the thread.

513

797K

160

Rick Pack @rick_pack2

over 2 years ago

@dataandme @posit_pbc How are you doing over there @dataandme?

225

rick_pack2 retweeted

Dr Kareem Carr

@kareem_carr

over 2 years ago

Science Twitter is dying. Good science content is getting harder to find on our timelines. I think I know why. Science content doesn't look like what the new Twitter algorithm wants to see. The algorithm wants long chains of comments. Real discussions. Discussions keep us engaged with the site. They're also a more effective place for Twitter to put their ads. Great science content tends to generate plenty of likes and bookmarks, but very little real discussion. People say "Thanks!" and quickly move on. If we want Twitter to surface more scientific content, science twitter needs to evolve. We need to transition to having deeper more substantive discussions about science. This isn't a bad change. If we were having deeper more engaging discussions about science on Twitter, we'd probably all learn more and enjoy Twitter more at same time. This weekend I conducted an experiment. I tweeted out a bunch of provocative claims. Not as "bait", as some people called it, but as a jumping off point for debate and discussion. There were hundreds of comments. Three days later, it's still going. There was also quite a bit of negativity. Twitter has trained us to react badly to strong disagreement. I learned a lot from this experiment. I learned that discussions can work but we need to keep an eye on the negativity. Here is my call to action: HELP ME SAVE SCIENCE TWITTER! I'm a large account. I can be a platform for regular science discussions, but I can't do it by myself. I NEED YOU. I want to be a platform for "Question of the Day" discussions. Here's how it would work: • I post a prompt for discussion • You respond, interact with others, and have fun. Please engage with this Tweet! Share potential discussion ideas in the comments! Anything related to data analysis, math, statistics, machine learning, computer science, AI and philosophy science is fair game. If a topic idea interests you, like the comment so I know what to focus on.

142

172

158

300K

rick_pack2 retweeted

Dan Shipper 📧

@danshipper

almost 3 years ago

Using Edge’s GPT-4 chat function to ask questions about papers is such an underrated super power

Who to follow

Gergely Daróczi

@daroczig

Co-founder of @SpareCores, @RxStudioInc; @ceu lecturer; #rstats #python #data #emacs #linux #qtile maniac; Ex @system1co, @CARD, @rapporters; Father of 👦👧👧👶

Alice Broadhead

@AliceBroadhead

rick_pack2 retweeted

Rene Daughtry @RDaughtry

almost 3 years ago

I don't need to add anything to my brother @rick_pack2 thoughts. He and I think alike. The Power of Words. https://t.co/yXq0huwjE3

Rick Pack @rick_pack2

almost 3 years ago

@AIForeground If anyone sees an odd website after verification attempt, close the window and you may see your status has been verified.

rick_pack2's tweet photo. @AIForeground If anyone sees an odd website after verification attempt, close the window and you may see your status has been verified. https://t.co/CoFsAbdynD

246

rick_pack2 retweeted

Dr Kareem Carr

@kareem_carr

about 3 years ago

ChatGPT consistently gets this very basic question wrong. Does that mean ChatGPT is useless? Not necessarily! Using this as an example, let me show you how to take your prompts to the next level. 👇

kareem_carr's tweet photo. ChatGPT consistently gets this very basic question wrong.

Does that mean ChatGPT is useless? Not necessarily!

Using this as an example, let me show you how to take your prompts to the next level. 👇 https://t.co/7ImFmdG7np

392

373

410K

Rick Pack @rick_pack2

about 3 years ago

New zctaCrosswalk package by Ari Lamstein gets zip-like ZCTAs by state or county for easier focus with #tidycensus. #rstats https://t.co/SfWdrFpwqG

455

rick_pack2 retweeted

Albert Rapp @rappa753

about 3 years ago

The newest version of {gt} lets you create interactive web tables 🥳 And it's super easy to manage. Everything is funneled through one function opt_interactive() 👌 #rstats

$rappa753's tweet photo. The newest version of {gt} lets you create interactive web tables 🥳 And it's super easy to manage. Everything is funneled through one function opt_interactive() 👌 #rstats https://t.co/NUFR2WPBom$

252

104

32K

Rick Pack @rick_pack2

about 3 years ago

I recently used this package to pair mentors and mentees.

One R Package a Day @RLangPackage

over 6 years ago

matchingR - Computes matching algorithms quickly using Rcpp. Implements the Gale-Shapley Algorithm to compute the stable matching for two-sided markets, such as the stable marriage problem and the college-admissions problem. Implements Irving's ... #rstats https://t.co/OJqAAZK964

Rick Pack @rick_pack2

over 3 years ago

Anyone at least skimmed these papers? Thoughts? I plan to at least skim this weekend. I’ve used ChatGPT at least one x per week to aid programming.

Ethan Mollick

@emollick

over 3 years ago

👀Two early papers find the effects of generative AI on knowledge work are completely unprecedented in modern history Separate studies of both writers and programmers find 50% increases in productivity with AI, and higher performance and satisfaction. And this is just the start.

emollick's tweet photo. 👀Two early papers find the effects of generative AI on knowledge work are completely unprecedented in modern history

Separate studies of both writers and programmers find 50% increases in productivity with AI, and higher performance and satisfaction. And this is just the start. https://t.co/r8TeOv0Uui

713

978K

132

rick_pack2 retweeted

Kyle Walker @kyle_e_walker

over 3 years ago

Did you miss my @UMSSDAN workshop on using 2021 American Community Survey @uscensusbureau data in #rstats a few weeks ago? The workshop video is now posted, check it out! https://t.co/Q293OESCzW

Rick Pack @rick_pack2

over 3 years ago

@chrisprener I appreciate your honesty. Honest appraisals help us focus our energy on solutions we believe in

118

Rick Pack @rick_pack2

over 3 years ago

Happy to say I won a ticket to @d4conio in Tampa, FL Apr 19 - Apr 21! Thank you, @nyhackr.

NYC Data Hackers @nyhackr

over 3 years ago

This month's #rstats meetup is on Tues Feb 28th w/ David Smith (@revodavid) who will be talking about Copilot! 💻 Fully VIRTUAL 🎫 Also we'll be giving away a FREE ticket to @DataCouncilAI Austin & D4 Conference (@d4con) to 2 lucky attendees! RSVP now➡ https://t.co/b2imkKmKu5

nyhackr's tweet photo. This month's #rstats meetup is on Tues Feb 28th w/ David Smith (@revodavid) who will be talking about Copilot!

💻 Fully VIRTUAL

🎫 Also we'll be giving away a FREE ticket to @DataCouncilAI Austin & D4 Conference (@d4con) to 2 lucky attendees!

RSVP now➡ https://t.co/b2imkKmKu5 https://t.co/jQlVLUYpDq

205

Rick Pack @rick_pack2

over 3 years ago

A great read. I endorse Scott DeGeest easily after consulting with him.

Scott DeGeest, MBA PhD 📊🏳️‍🌈 @dscottdegeest

over 3 years ago

Really happy to share this essay I wrote about career advocacy and its benefits. I hope it debunks some myths and misconceptions that exist about mentoring. #ds4a #datascience #datafam #datacareers https://t.co/yvfu2sozuq

178

Rick Pack @rick_pack2

over 3 years ago

I have often used the #tidycensus R package in analyses and am happy to support author Kyle Walker’s book.

Kyle Walker @kyle_e_walker

over 3 years ago

Writing a book... can take a long time, with years of chapter overhauls and revisions. Today is finally release day of "Analyzing US Census Data"! You can buy a copy from Routledge (https://t.co/6F32TSm6IA) or Amazon (https://t.co/zdG0wmrkVy) - I appreciate it! #rstats

kyle_e_walker's tweet photo. Writing a book... can take a long time, with years of chapter overhauls and revisions. Today is finally release day of "Analyzing US Census Data"!

You can buy a copy from Routledge (https://t.co/6F32TSm6IA) or Amazon (https://t.co/zdG0wmrkVy) - I appreciate it!

#rstats https://t.co/j1udD6dTAH

218

39K

843

rick_pack2 retweeted

Kyle Walker @kyle_e_walker

over 3 years ago

My book Analyzing US Census Data is now in print! I’m really happy with how it turned out. Thanks so much to @lara_crc and the @CRCPress team. You can order your copy today! https://t.co/6F32TSlyT2; https://t.co/zdG0wmqN60 #rstats

kyle_e_walker's tweet photo. My book Analyzing US Census Data is now in print! I’m really happy with how it turned out.

Thanks so much to @lara_crc and the @CRCPress team. You can order your copy today! https://t.co/6F32TSlyT2; https://t.co/zdG0wmqN60

#rstats https://t.co/tRHP8icmRF

345

39K

rick_pack2 retweeted

Bert Wagner @bertwagner

over 3 years ago

Join us at this Tuesday's upcoming @ONSSUG meeting with @rick_pack2 on "R and Python Uses of U.S. Census Data in Kaggle GoDaddy Forecasting Competition" https://t.co/8nasjRCn1C

Rick Pack @rick_pack2

over 3 years ago

@GabeOsterhout @kyle_e_walker Nice use of packages including tidycensus and @chrisprener’s biscale!

rick_pack2 retweeted

James Wade

@JamesHWade

over 3 years ago

Do you want feedback on your #rstats code but are too 🙈 shy 🙈 to ask? 🤖 gpttools 📦 can now suggest code improvements and explains what it did 🤯! #ChatGPT It's almost like a modern-day rubber 🦆 that talks back https://t.co/9f54V4oiMs

JamesHWade's tweet photo. Do you want feedback on your #rstats code but are too 🙈 shy 🙈 to ask?

🤖 gpttools 📦 can now suggest code improvements and explains what it did 🤯! #ChatGPT

It's almost like a modern-day rubber 🦆 that talks back

https://t.co/9f54V4oiMs https://t.co/LFi76aexWh

506

306

63K

Rick Pack

@rick_pack2

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users