Module 7 of Data Engineering Zoomcamp done!
- Kafka producers and consumers
- PyFlink tumbling and session windows
- Real-time taxi data analysis
- Redpanda as Kafka replacement
My solution: https://t.co/RFZPTlKLsn
Free course by @DataTalksClub: https://t.co/WHjU41wPQU
⚡ Module 6 of Data Engineering Zoomcamp done!
- Batch processing with Spark 🔥
- PySpark & DataFrames
- Parquet file optimization
- Spark UI on port 4040
My solution: https://t.co/t1dn4GdKiU
Module 5 of Data Engineering Zoomcamp done!
- Data Platforms with Bruin
- End-to-end ELT pipelines
- Data quality & lineage
- Deployment to BigQuery
My solution: https://t.co/MHIcnl6NEt
Free course by @DataTalksClub: https://t.co/WHjU41wPQU
Module 4 of Data Engineering Zoomcamp done!
- Analytics Engineering with dbt
- Transformation models & tests
- Data lineage & dependencies
- NYC taxi revenue analysis
My solution: https://t.co/VHAfeyxLRT
Free course by @DataTalksClub: https://t.co/WHjU41xnGs
Module 3 of Data Engineering Zoomcamp done!
- BigQuery & GCS
- External vs materialized tables
- Partitioning & clustering
- Query optimization
My solution: https://t.co/JldVW0MUCQ
Free course by @DataTalksClub: https://t.co/WHjU41wPQU
Module 2 of DE Zoomcamp by @DataTalksClub done!
- Kestra workflow orchestration
- ETL pipelines for taxi data
- Backfill & scheduling
- Variables & dynamic flows
My solution: https://t.co/adf1F0ml06
If anyone is interested in this course:
https://t.co/WHjU41wPQU