Do you want to become a world-class data analyst in just 6 months?
I’ve been there, and I know how overwhelming it can be to learn data analysis from scratch.
That’s why I’ve created a 6-month roadmap of free, high-quality resources that will teach you everything you need to know, from statistics and Excel to Python and cloud computing.
In this thread, I’ll share with you my ultimate roadmap and how you can download it for free.
Let’s get started! 👇
Spoke at the Leeds Data Science meetup last night: "From £8,000 to £40: What I Learned Shipping AI in a UK Engineering Company."
4,700 engineering drawings. 4 weeks of manual work → 45 minutes. The model was the least interesting part.
Sharp room, great questions. Thanks @jumping_uk for the invite
Interested in the intersection of data science, electoral politics, and scenario analysis? Don't miss @DataSenseiObi's latest exploration, written just ahead of the May 7 UK elections. https://t.co/g5dPDrmTvF
Hey #datafam
UK local elections today. Just published in @TDataScience
Across 64 English authorities and six modelled scenarios, the strongest shock is only 13% of the calibrated uncertainty band.
Scenarios sit inside the noise.
https://t.co/LibEFMTnmV
Ahead of the May 7 elections in the UK, @DataSenseiObi continues his analysis and scenario modeling series, focusing on calibrated uncertainty, historical error, and why some models are most useful when they refuse to forecast. https://t.co/g5dPDrmTvF
Hey #datafam
Ahead of the UK local elections tomorrow, I just published a Tableau scenario dashboard for the 2026 cycle.
Finding: the strongest modelled shock is only 13% of the median uncertainty band. Scenarios sit inside the noise.
https://t.co/TWg77P2gFA
For his latest deep dive, @DataSenseiObi presents a data-quality case study on English local elections, covering categorical normalisation, metric validation, and why raw labels should never define analytical groups. https://t.co/xN0YcbBJpb
I got it wrong.
A bug flipped my entire result: fragmentation didn’t rise in 66/67 councils. It rose in just 18.
What actually changed? Voter churn doubled. The party system didn’t fragment.
@TDataScience https://t.co/JDCS8eVj54
@tableau https://t.co/0qplL03M9P
Whether you're interested in British politics or in smart data-analysis techniques, @DataSenseiObi's data quality case study is a must-read. https://t.co/xN0YcbBJpb
Follow along with @DataSenseiObi's thorough project walkthrough to learn how a hybrid PyMuPDF + GPT-4 Vision pipeline replaced £8,000 in manual engineering effort, and why the latest models weren’t the answer. https://t.co/D31qMtWVNz
We're thrilled to welcome @DataSenseiObi to TDS! Click below to explore his comprehensive walkthrough on designing a streamlined, rapid document-extraction system. https://t.co/D31qMtWVNz
I replaced £8,000 of manual work with ~$15 in API calls.
4,700 PDFs. 45 minutes. 96% accuracy.
Published on @TDataScience 👇
https://t.co/ujAFYYgDvc
Here’s the architecture (and why GPT-5 wasn’t the answer) 🧵
@DataSenseiObi This article is very well explained. As someone new to the field, I was able to clearly understand the problem the approach solves. I especially appreciated the discussion around why different approaches were chosen and how they align with stakeholder needs.
What actually broke the system:
→ revision tables mistaken for current values
→ grid letters (A, B, C) read as revisions
→ PDFs rotated with incorrect metadata
None of this showed up in small tests.
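A rough sketch of how the first two failure modes above could be guarded against. This is not the pipeline from the article: the `Token` type, the normalised coordinates, the `margin` threshold, and the `(revision, ISO date)` row format are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical extracted text token with page-normalised coordinates."""
    text: str
    x: float  # 0.0 = left edge, 1.0 = right edge
    y: float  # 0.0 = top edge, 1.0 = bottom edge


def is_grid_label(tok: Token, margin: float = 0.03) -> bool:
    """Treat a single letter hugging the page border as a grid reference.

    The 3% margin is an assumed threshold, not a value from the article.
    """
    near_edge = (
        tok.x < margin or tok.x > 1 - margin
        or tok.y < margin or tok.y > 1 - margin
    )
    return len(tok.text) == 1 and tok.text.isalpha() and near_edge


def current_revision(revision_rows: list[tuple[str, str]]) -> str:
    """Return the revision with the latest date from a revision table.

    Rows are assumed to be (revision, ISO date) pairs, so comparing the
    date strings orders them chronologically.
    """
    return max(revision_rows, key=lambda row: row[1])[0]


# Example inputs (made up for illustration)
border_letter = Token("A", x=0.01, y=0.50)
title_block_letter = Token("C", x=0.90, y=0.92)
rows = [("A", "2023-01-10"), ("B", "2023-06-02"), ("C", "2024-02-15")]

candidates = [t for t in (border_letter, title_block_letter) if not is_grid_label(t)]
latest = current_revision(rows)
```

The idea is that position, not the character itself, separates a border grid letter from a revision letter, and that the current revision is chosen explicitly from the table rather than taken from whichever row is read first. Rotated pages with wrong metadata would need a separate check and are not covered here.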