The latest in my #rstats / #sql adventures:
✨Using SQL in RStudio✨
Featuring SQL RMarkdown chunks, query previews in RStudio, and my first make-my-own cartoon
https://t.co/MKLsRNjq2I
@EmilyRiederer I end up using pyspark for datasets I could probably manage in pandas just bc it feels more similar to the tidyverse. Trying polars has been on my to-do for a while...I guess 2024 could be the year! 😁
@KimIversenShow @shaig Look, there was a 23andMe data breach specifically targeting those of Jewish and Chinese descent some months ago. Clearly there is genetic data on this.
https://t.co/19JJ18vQPX
🚨Update on the 23andMe Data Breach:
A hacker known as Golem leaked millions more user records from the genetic testing company.
This new dataset contains records of four million users and was published on a known cybercrime forum, BreachForums.
10 days of writing in one tweet: I am sad and angry and appalled. Revenge is not a strategy. A child is a child is a child. Freedom and dignity for all. Release the civilian hostages now. Let the aid in. Stop and think. Without a vision for a shared future, this does not end.
If you want to stand with and for Palestinians, you need to put your body between any thug and Jewish individuals they want to harm or harass. You need to shut down and push back against any antisemitic bullshit. Jewish people are not the enemy. So many are our staunchest allies
Israel has the right to defend itself. We must make sure they have what they need to protect their people today and always.
At the same time, Prime Minister Netanyahu and I have discussed how Israel must operate by the laws of war. That means protecting civilians in combat as best as they can.
We can’t ignore the humanity of innocent Palestinians who only want to live in peace. That’s why I secured an agreement for the first shipment of humanitarian assistance for Palestinian civilians in Gaza.
And we cannot give up on a two-state solution.
@IsraelinUSA there are claims that the following message is fake. Can you confirm/deny?
The Israeli Embassy in Washington DC is looking for English-speaking Israelis...write an email to [email protected]
I love when some new AI text model comes out and everyone is putting up crazy screenshots like “This weighted string generator is so terrifying, it will overthrow the world order” and then it spits out something like, “Random forests are common throughout Northern Europe.”
What is the craziest thing you've built or seen in SQL?
Mine is probably the single select statement with 22 joins... I've also written SQL that writes dynamic SQL to do pivoting, which is also pretty gnarly.
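For anyone who hasn't seen the "SQL that writes SQL" pivot trick, here is a minimal sketch of the idea in Python with the standard-library sqlite3 module. Everything in it is an assumption made for illustration: the `sales` table, its columns, and the helper name `build_pivot_sql` are hypothetical, and the original query was presumably written in pure SQL on another database.

```python
import sqlite3


def build_pivot_sql(conn, table, row_key, pivot_col, value_col):
    """Return a SELECT that turns each distinct pivot_col value into a column.

    Hypothetical helper. Identifiers are interpolated directly, so they must
    come from trusted code, and pivot values are assumed to be text.
    """
    # Step 1: ask the database which values should become columns.
    values = [row[0] for row in conn.execute(
        f'SELECT DISTINCT "{pivot_col}" FROM "{table}" ORDER BY 1'
    )]

    # Step 2: write one SUM(CASE ...) expression per distinct value.
    columns = []
    for value in values:
        literal = str(value).replace("'", "''")  # escape quotes in the literal
        alias = str(value).replace('"', '""')    # escape quotes in the alias
        columns.append(
            f"SUM(CASE WHEN \"{pivot_col}\" = '{literal}' "
            f"THEN \"{value_col}\" ELSE 0 END) AS \"{alias}\""
        )

    # Step 3: assemble the generated expressions into the final query text.
    select_list = ",\n  ".join([f'"{row_key}"'] + columns)
    return (
        f"SELECT\n  {select_list}\n"
        f'FROM "{table}"\nGROUP BY "{row_key}"\nORDER BY "{row_key}"'
    )


# Toy data: long format, one row per (region, quarter) sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "Q1", 10.0), ("east", "Q2", 20.0),
     ("west", "Q1", 5.0), ("west", "Q1", 2.5)],
)

pivot_sql = build_pivot_sql(conn, "sales", "region", "quarter", "amount")
print(pivot_sql)  # the generated query text
for row in conn.execute(pivot_sql):
    print(row)    # one row per region, one column per quarter
```

The gnarly part is that the column list depends on the data, so the query text has to be generated at run time. That is also why the identifiers need to come from trusted input.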
Data science cannot be fully automated.
No amount of parameter tuning, benchmarking, automation, model comparison, automated feature engineering, etc., can automatically figure out what data column location_123_old contains and whether it should be a feature or not.
@MilesMcBain We've done PySpark for heavier data computation and then R for reporting/analysis of outputs (usually dumped to a DB or S3). We tried sparklyr/SparkR early on but it was smoother with PySpark (bigger online community, immediate access to new features, etc)
Psych folks who know #rstats but are intimidated by SQL databases (i.e., industry work)--you already know the core concepts. Let me help you translate. 🧵
1. The "database" is analogous to your R environment. It can store many tables. (1/n)
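A small sketch of that analogy, written in Python with the standard-library sqlite3 module rather than R. The table names and the helper `list_tables` are hypothetical. Listing the tables in a database plays roughly the role that `ls()` plays for objects in an R environment.

```python
import sqlite3

# An in-memory database stands in for "the environment".
conn = sqlite3.connect(":memory:")

# Each table is roughly a data frame stored inside it (hypothetical examples).
conn.execute("CREATE TABLE participants (id INTEGER, age INTEGER)")
conn.execute(
    "CREATE TABLE responses (participant_id INTEGER, item TEXT, score INTEGER)"
)


def list_tables(conn):
    """Return the names of all tables in the database, sorted alphabetically."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(query)]


tables = list_tables(conn)
print(tables)
```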