Adding a macOS 27-inspired sidebar glow to https://t.co/0AoQbS3QmQ
It watches the canvas and generates the glow in real time using TypeGPU.
Had a lot of fun cooking it.
Refactored the single analysis flow into a multi-agent approach, orchestrated as a 3-step pipeline: understanding data → creating charts → generating insights.
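A rough sketch of what a 3-step pipeline like this could look like. Everything here is hypothetical: the `AnalysisState` shape, the step functions, and the stubbed logic stand in for real agents, which would call an LLM at each step.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisState:
    """Shared state handed from one step to the next (hypothetical shape)."""
    rows: list
    profile: dict = field(default_factory=dict)
    charts: list = field(default_factory=list)
    insights: list = field(default_factory=list)


def understand_data(state: AnalysisState) -> AnalysisState:
    # Step 1: profile the columns. A real agent would reason over the schema here.
    columns = sorted({key for row in state.rows for key in row})
    numeric = [
        c for c in columns
        if all(isinstance(r.get(c), (int, float)) for r in state.rows)
    ]
    state.profile = {"columns": columns, "numeric": numeric}
    return state


def create_charts(state: AnalysisState) -> AnalysisState:
    # Step 2: propose one chart spec per numeric column found in step 1.
    state.charts = [
        {"type": "histogram", "column": c} for c in state.profile["numeric"]
    ]
    return state


def generate_insights(state: AnalysisState) -> AnalysisState:
    # Step 3: turn chart specs into placeholder text insights.
    state.insights = [f"Distribution of {c['column']}" for c in state.charts]
    return state


# The orchestrator is just an ordered list of steps applied to shared state.
PIPELINE = [understand_data, create_charts, generate_insights]


def run_pipeline(rows: list) -> AnalysisState:
    state = AnalysisState(rows=rows)
    for step in PIPELINE:
        state = step(state)
    return state
```

Each step only reads what the previous step wrote, so any one of them can be swapped or retried without touching the others.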
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
used my own tool to analyze the dataset I’m training a model on.
Sierra Graph surfaced what matters:
→ OverallQual is the strongest pricing driver
→ neighborhood premium can triple the base price
→ 3-car garage adds a significant jump vs 2-car
#MachineLearning #DataVisualization
Second Kaggle submission - House Prices regression.
key factors:
→ train on log(SalePrice), not raw price
→ engineer HouseAge, TotalSF, TotalBath features
→ fastai tabular_learner with embeddings for categoricals
→ layers=[1000, 500, 250]
current score: 15,161 RMSE
more optimization to come.
https://t.co/SaGXx6mFmc
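A minimal sketch of the log target and the three engineered features in plain Python. The formulas are assumptions, not necessarily the ones used in the submission: the column names come from the Kaggle House Prices dataset, but how HouseAge, TotalSF, and TotalBath are defined (including counting half baths as 0.5) is a guess at a common recipe.

```python
import math


def engineer_features(row: dict) -> dict:
    """Add HouseAge, TotalSF, and TotalBath to one listing (assumed formulas)."""
    out = dict(row)
    # Age of the house at the time of sale.
    out["HouseAge"] = row["YrSold"] - row["YearBuilt"]
    # Basement plus both above-ground floors.
    out["TotalSF"] = row["TotalBsmtSF"] + row["1stFlrSF"] + row["2ndFlrSF"]
    # Half baths count as 0.5 (assumption).
    out["TotalBath"] = (
        row["FullBath"] + 0.5 * row["HalfBath"]
        + row["BsmtFullBath"] + 0.5 * row["BsmtHalfBath"]
    )
    return out


def to_log_target(sale_price: float) -> float:
    # Train on log(SalePrice) so errors are relative rather than absolute.
    return math.log(sale_price)


def from_log_target(prediction: float) -> float:
    # Invert the transform to get a price back in dollars.
    return math.exp(prediction)
```

Predictions come out in log space, so they need `from_log_target` before being written to the submission file.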
25,000 people used @screenstudio last month.
We’re still a team of 5. No investors. Every hire came directly from our revenue.
There’s something humbling about that. But also something that’s been holding us back.
It’s time to think bigger.
Delta Lake stores data as Parquet files.
The _delta_log directory is what makes it a lakehouse.
each JSON file = one atomic commit:
→ version 0: initial write
→ versions 1–6: subsequent pipeline runs
this log gives you:
- ACID transactions
- time travel (read any version)
- schema evolution
- upserts via MERGE
same Parquet files. completely different guarantees.
#DataEngineering #DeltaLake
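To make the "each JSON file = one commit" idea concrete, here is a simplified sketch that replays the log by hand to find which Parquet files make up a given version. It assumes a local table path, no checkpoint files, and an untruncated log; real readers (Spark, delta-rs) handle all of that for you. The function name `active_files` is made up for this example.

```python
import json
from pathlib import Path


def active_files(table_path: str, version: int) -> set:
    """Replay commits 0..version and return the data files live at that version."""
    log_dir = Path(table_path) / "_delta_log"
    files = set()
    for v in range(version + 1):
        # Commit files are named with a zero-padded 20-digit version number.
        commit = log_dir / f"{v:020d}.json"
        # Each line in a commit file is one JSON action.
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files
```

Time travel falls out of this for free: stop replaying at an earlier version and you get the file set as it was back then, without copying any Parquet data.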
304k car listings. 41 makes. 50 states. $14,995 median.
all from a pipeline I built in a few days: Spark → Delta Lake → dbt → DuckDB → Streamlit.
#DataEngineering
Medallion architecture explained with real data:
🟤 Bronze - raw Delta Lake on MinIO. immutable. never touched. 49 Parquet files, 1.5GB, _delta_log for ACID transactions.
⚪ Silver - Spark cleans it. removes duplicates, drops prices outside $500–$200k, normalizes strings.
🟡 Gold - dbt aggregates it. median price by make, depreciation curves, state heatmap.
#DataEngineering #DeltaLake #dbt #Medallion
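A toy version of the Silver and Gold steps in plain Python, just to show the logic. This is not the Spark or dbt code: the field names (`id`, `make`, `state`, `price`), deduplicating on `id`, and the exact string normalization are all assumptions.

```python
from collections import defaultdict
from statistics import median

# Price bounds taken from the Silver step description.
MIN_PRICE, MAX_PRICE = 500, 200_000


def to_silver(bronze_rows: list) -> list:
    """Deduplicate, drop out-of-range prices, normalize strings."""
    seen = set()
    silver = []
    for row in bronze_rows:
        price = row.get("price")
        if price is None or not (MIN_PRICE <= price <= MAX_PRICE):
            continue  # bad or missing price
        if row["id"] in seen:
            continue  # duplicate listing (assumes `id` identifies a listing)
        seen.add(row["id"])
        silver.append({
            "id": row["id"],
            "make": row["make"].strip().title(),
            "state": row["state"].strip().upper(),
            "price": price,
        })
    return silver


def to_gold(silver_rows: list) -> dict:
    """Aggregate: median price per make."""
    prices_by_make = defaultdict(list)
    for row in silver_rows:
        prices_by_make[row["make"]].append(row["price"])
    return {make: median(prices) for make, prices in prices_by_make.items()}
```

Bronze is never modified here either: `to_silver` builds new rows instead of editing the input, which mirrors the "immutable raw layer" rule.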