Statsguyphd

@statsguyphd

Just created this to answer a student's question. I do not care for social media, but maybe I can help demystify or correct some misconceptions about statistics

Joined June 2020

17 Following

2.1K Followers

169 Posts

Pinned Tweet

Statsguyphd @statsguyphd

over 5 years ago

I meant to provide this link (someone put this kind of work in Jupyter and began expanding the datasets): https://t.co/8WBT390l6m

Statsguyphd @statsguyphd

over 5 years ago

@SamKoebrich @qcsoarer This is actually something I was thinking of talking about this morning. A good thing to do would be to separate precinct counts from all States into urban and non-urban areas (roughly blue/red). Then conduct a Benford's analysis to determine goodness-of-fit and compare.

Statsguyphd @statsguyphd

over 5 years ago

@ReposeGuru I think it is clear we have work to do to improve our sampling processes. I would hate to think the errors we see are intentional. I think it is more likely that previously tried-and-true sampling techniques have not caught up to the modern world, both technologically and socially.

Statsguyphd @statsguyphd

over 5 years ago

Thank you to whoever took this over and developed an interest in the data collection and exploration/analysis. I am glad people are getting interested in learning this stuff!

Statsguyphd @statsguyphd

over 5 years ago

@VirtuArete I really am going to bed now but I saw this and can't help but enthusiastically say: anything by Charu Aggarwal (Outlier Analysis is particularly good!)

Statsguyphd @statsguyphd

over 5 years ago

Because all the ladies knew he could...be discrete beforehand!

Statsguyphd @statsguyphd

over 5 years ago

For the few of you that liked the stats jokes, I will end the night with one: Why was Mr. Bernoulli Bayes known as the "man about town"? (answer in the reply)

Statsguyphd @statsguyphd

over 5 years ago

@ninjatannerman No, this is the most ever.

Statsguyphd @statsguyphd

over 5 years ago

Who knew that with less than 100 lines of code you could make half the country wish you were dead, the other half appreciate math, and approximately 0.00000001% laugh at what you thought were high quality stats jokes.

Statsguyphd @statsguyphd

over 5 years ago

@ReznoirA Also, don't do coke. You're better than that.

Statsguyphd @statsguyphd

over 5 years ago

@ReznoirA You are absolutely correct from an efficiency standpoint. I would even say that I can do it in one line with Perl (because I can do anything in Perl in one line). But I wrote the code the way I did for instructional purposes. Feel free to hate it.

Statsguyphd @statsguyphd

over 5 years ago

@thingcreator Yes, but Benford's is a discrete distribution and is really easy to create. Chi-squared is also easy to code (I say that even though I used scipy, less typing). Also, in anything I am mocking up as an example that needs to collect data, I am going to use Python. But I do love R!
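To illustrate the point about how little code this takes, here is a minimal sketch of Benford's first-digit distribution and a hand-rolled Pearson chi-squared statistic. It is not the code linked in this feed: the function names are hypothetical, the sample values are made up for illustration, and it uses only the standard library rather than scipy.

```python
import math
from collections import Counter

def benford_probs():
    """Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])

def chi_squared_stat(values):
    """Pearson chi-squared statistic of observed first digits vs. Benford."""
    observed = Counter(first_digit(v) for v in values if v > 0)
    total = sum(observed.values())
    stat = 0.0
    for d, p in benford_probs().items():
        expected = total * p
        stat += (observed.get(d, 0) - expected) ** 2 / expected
    return stat

# Made-up illustrative inputs (assumptions, not real election data).
benford_like = [2 ** k for k in range(1, 60)]  # powers of 2 track Benford closely
uniform_like = list(range(100, 1000))          # every first digit equally common

print(chi_squared_stat(benford_like))
print(chi_squared_stat(uniform_like))
```

With nine digit categories there are 8 degrees of freedom, so the usual 5% critical value is about 15.507. A statistic above that suggests a poor fit to Benford's distribution; it does not by itself say why the fit is poor.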

Statsguyphd @statsguyphd

over 5 years ago

@Prof_JTaylor I want to say thanks but....I don't think having followers is a good thing. I would rather people learned some cool statistical techniques and how to code (I mean, come on, Python is so incredibly accessible, I grew up on C and Perl).

Statsguyphd @statsguyphd

over 5 years ago

@halfacanuck Oh my, I don't think I will be posting any others here. I'll take data collection requests and I can write analytic code, but you post the results lest someone call me Putin again.

Statsguyphd @statsguyphd

over 5 years ago

@thecatalvarado I think a lot of people (and yes, that includes you) are reading into what I've posted with their own bias. I have clearly talked about goodness-of-fit to Benford's distribution and the anomalousness relative to parallel in-context sets (i.e., one set of frequencies vs. another).
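One way to compare one set of first-digit frequencies against another is a chi-squared test of homogeneity on a 2 x 9 table of counts. The sketch below is an assumption about how such a comparison could be done, not the method used in the linked code; the function names and sample groups are hypothetical.

```python
from collections import Counter

def first_digit_counts(values):
    """Counts of leading digits 1..9 among the positive integers in values."""
    counts = Counter(int(str(v)[0]) for v in values if v > 0)
    return [counts.get(d, 0) for d in range(1, 10)]

def homogeneity_stat(group_a, group_b):
    """Pearson chi-squared statistic for a 2 x 9 table of first-digit counts."""
    rows = [first_digit_counts(group_a), first_digit_counts(group_b)]
    row_totals = [sum(row) for row in rows]
    col_totals = [a + b for a, b in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            if expected > 0:  # skip digits that appear in neither group
                stat += (observed - expected) ** 2 / expected
    return stat

# Made-up illustrative groups (assumptions, not real precinct counts).
group_a = [2 ** k for k in range(1, 60)]
group_b = list(range(100, 1000))

print(homogeneity_stat(group_a, group_a))  # identical groups
print(homogeneity_stat(group_a, group_b))  # groups with different digit patterns
```

A large statistic says the two groups have different first-digit patterns relative to each other. That is a separate question from whether either group fits Benford's distribution.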

Statsguyphd @statsguyphd

over 5 years ago

I am making these tweets to explain in one place some analysis that was done last night. 1 - I was asked offline about doing Benford's on election data. I explained that this is common and a useful way to detect anomalies in data that are driven by an artificial process (e.g., fraud).

Statsguyphd @statsguyphd

over 5 years ago

@thecatalvarado Note, the code scrapes the data for you and runs the analysis, but feel free to evaluate it. Also, keep in mind that it's not meant to be the most efficient code (got some valid criticism there). It is written to be instructional.

Statsguyphd @statsguyphd

over 5 years ago

@thecatalvarado Totally get it, econ rocks, you guys are basically stat cousins (and many people fail to realize so are the agriculture guys, stats foundations there!). Here's the link to the data: https://t.co/2CfNJvp0WJ and here is the link to the code: https://t.co/zO7lqY1ZGA

Statsguyphd @statsguyphd

over 5 years ago

@thecatalvarado If my data were subject to an IRB you would be somewhat correct (there are actually other more important hurdles in that case). This was just a brief explanation of how to perform the data collection and analysis. There is no peer review in this scenario other than the PUBLIC.
