Independent, critical and technical thinking on the use cases, architectural patterns and technologies relating to the preparation of data for exploitation.
Cloudera have announced the release of #ClouderaStreamsManagement - bundling their Kafka management console and replication tool #odenews https://t.co/dTon4OP3FN
#InfluxDBCloud (their time series database as a service on AWS, Azure and GCP) has hit 2.0 and has now gone serverless #odenews https://t.co/cm1oNc1n6I
#QuboleDataService is now available on #GoogleCloud if you're looking for a cloud agnostic Hadoop as a service offering (that runs on Google Cloud) #odenews https://t.co/DHhVUICNqZ
Interested in replicating data between #Kafka clusters? Cloudera have a post on MirrorMaker 2, which is based on #KafkaConnect #odenews https://t.co/0t9UjsQhEk
#ApacheCalcite 1.21 is out - you might never have heard of it, but it's probably being used by many of the data tools you use on a daily basis for query parsing and optimization #odenews https://t.co/d6uny7WAnp
And Amazon have also announced #AmazonEMR 6.0, with support for Hadoop 3.1 and running Spark jobs in Docker containers #odenews https://t.co/acYUdFLPoA
Looking for an open source object storage gateway that supports the Amazon S3 protocol? #ZenkoCloudserver has just released version 8.2 #odenews @zenko https://t.co/YB2Hlxd7HP
Using #GoogleCloudStorage with #Hadoop - Google have a new version of their Cloud Storage Connector for Hadoop out with a bunch of performance improvements and locking for directory modifications #odenews https://t.co/dybFTpk0WE
From the ever-excellent The Morning Paper, a review of a paper that used "the TPC-H benchmark to assess Redshift, Redshift Spectrum, Athena, Presto, Hive, and Vertica to find out what works best and the trade-offs involved" #odenews https://t.co/Zl8ajREP6r