It looks like the https://t.co/LHWS1eZnFo Maven repo was shut down by its owners.
If you are having build failures, DM @cwensel for options to get back up.
Do you miss the meetups of the Big Data era, when you could learn about the Raft protocol or map-side joins over pizza and beer? Us too! My friend @cwensel has launched the SF Distributed Systems Meetup.
First meetup is next week, with talks by me and @conor_power23 https://t.co/MmPRkvtPq8
If your company is still running Cascading+HadoopMR applications and is itching to drop MR for Spark, DM me.
Some companies have 100s of @Cascading apps in production, and moving them off MR with a couple of lines of code would be a huge cost savings.
Also true for @scalding.
I always suspected that Parquet and ORC weren't very good, but TIL from @ViktorLeis that one can do much, much better with BtrBlocks. https://t.co/GpuZvQ0jaw
Cascading 4.5.1 was just released to Maven Central.
https://t.co/lAWvrayV91
This release includes minor bug fixes and dependency updates.
https://t.co/ottloXkhwo
I'm hanging my shingle out again.
I have some new insights I want time to work out into a developer tool for cloud data engineers, but I want to bootstrap it with contract/consulting revenue.
This is how I started @Cascading in 2007.
https://t.co/NfE0NE7UMx
Please boost!
Scala folks: if there are open source GitHub Twitter libraries you like, you might want to make a clone/fork now. Seems like we are in a situation where anything can happen.
Really interested in adding a mergeable summary as a datatype to @cascading. Think daily/hourly aggregate tables, then a daily/hourly job that runs a last-N-day/hour aggregation against those partials. A cost-efficient way to have rolling percentiles/aggregates.
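A minimal sketch of the mergeable-summary idea, not Cascading code. It assumes a fixed-bucket histogram as the summary; a real implementation would more likely use a proper sketch such as a t-digest. `BOUNDS`, `Summary`, `summarize`, and `rolling` are hypothetical names made up for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from functools import reduce
from math import ceil
from typing import Dict, Iterable, List

# Assumed bucket upper bounds (e.g. latency in ms); larger values land in an overflow bucket.
BOUNDS = [10, 50, 100, 500, 1000]


@dataclass
class Summary:
    count: int = 0
    total: float = 0.0
    buckets: List[int] = field(default_factory=lambda: [0] * (len(BOUNDS) + 1))

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.buckets[bisect_left(BOUNDS, value)] += 1

    def merge(self, other: "Summary") -> "Summary":
        # Merging is plain addition, so partials can be combined in any order.
        return Summary(
            self.count + other.count,
            self.total + other.total,
            [a + b for a, b in zip(self.buckets, other.buckets)],
        )

    def percentile_bound(self, p: float) -> float:
        """Upper bound of the bucket that holds the p-th percentile (an estimate)."""
        if self.count == 0:
            raise ValueError("empty summary")
        target = max(1, ceil(p / 100 * self.count))
        seen = 0
        for i, n in enumerate(self.buckets):
            seen += n
            if seen >= target:
                return BOUNDS[i] if i < len(BOUNDS) else float("inf")
        return float("inf")


def summarize(values: Iterable[float]) -> Summary:
    s = Summary()
    for v in values:
        s.add(v)
    return s


def rolling(partials: Dict[str, Summary], days: List[str]) -> Summary:
    """Build a last-N-days summary from stored daily partials, without rereading raw data."""
    return reduce(Summary.merge, (partials[d] for d in days), Summary())


# One partial per day, as a daily aggregate table would hold.
raw = {"d1": [5, 20, 70], "d2": [200, 900], "d3": [5000]}
partials = {day: summarize(vals) for day, vals in raw.items()}
last_two = rolling(partials, ["d2", "d3"])
```

The point of the datatype is that `rolling` only touches the small partials, so the last-N window can be recomputed every day or hour without rescanning the raw events.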
Cross your fingers, we might have a @cascading 4.1 release tonight, and a quick 4.5 release a day later w/ Hadoop 3 support. Both will include native @ApacheParquet support to replace the support dropped by Parquet. Plus some well-worn enhanced Parquet support in 4.6
For fun, I really want to fork the @cascading local mode planner to support RocksDB as the backend for joins instead of the memory-only hash maps. Local mode was never intended to scale in joins.
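A rough sketch of the general idea, assuming nothing about the Cascading planner itself: keep the build side of a join in an on-disk store and stream the probe side past it, so the join is no longer bounded by heap size. SQLite from the Python standard library stands in for RocksDB here, and `disk_backed_join` plus the sample rows are hypothetical.

```python
import json
import os
import sqlite3
import tempfile
from typing import Dict, Iterable, Iterator


def disk_backed_join(
    build_rows: Iterable[Dict],
    probe_rows: Iterable[Dict],
    key: str,
    db_path: str,
) -> Iterator[Dict]:
    """Inner join where the build side lives on disk instead of in a hash map."""
    conn = sqlite3.connect(db_path)
    try:
        # Spill the build side to disk, keyed by the join key.
        conn.execute("CREATE TABLE build (k TEXT, row TEXT)")
        conn.executemany(
            "INSERT INTO build VALUES (?, ?)",
            ((str(r[key]), json.dumps(r)) for r in build_rows),
        )
        conn.execute("CREATE INDEX build_k ON build (k)")
        conn.commit()
        # Stream the probe side; each lookup is an indexed read from disk.
        for probe in probe_rows:
            matches = conn.execute(
                "SELECT row FROM build WHERE k = ?", (str(probe[key]),)
            )
            for (row_json,) in matches:
                yield {**json.loads(row_json), **probe}
    finally:
        conn.close()


users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
clicks = [
    {"user_id": 1, "url": "/a"},
    {"user_id": 1, "url": "/b"},
    {"user_id": 3, "url": "/c"},
]

with tempfile.TemporaryDirectory() as tmp:
    joined = list(
        disk_backed_join(users, clicks, "user_id", os.path.join(tmp, "join.db"))
    )
```

Only one probe row and its matches are in memory at a time; the trade-off is a disk lookup per probe row instead of a hash map hit.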
@cwensel @PinterestEng Google gs connector does do aligned block reads... both that and this one work well for backwards seeks (footers, stripe footers -> stripes)
/2
Improving efficiency and reducing runtime using S3 read optimization by @PinterestEng https://t.co/IBGHYnYQo5 << do you have a link to the git repo?