Solution Engineer @Cloudera. Formerly @Hortonworks and @SpringSource. Passionate about open source, data, integration, and analytics. Thoughts are my own.
IDC Vendor Profile | Cloudera: IoT: https://t.co/TfkS6APQQw <- "Most IoT projects will require a hybrid architecture, where some data is processed and analyzed at the edge in near real time, and other data is analyzed and processed in the cloud (or other centralized datacenter)."
CDE Triggered By AWS Lambda: https://t.co/e7XHxApFCh <-- How to trigger a #cloudera Data Engineering Experience #saas job via #aws #lambda. Useful trigger for immediate processing via #Spark when data lands in #s3
Using CDE to Analyze the PPP Data: https://t.co/LnhrYBMY3z <-- how CDE, using Apache Spark, can be used to produce reports based on the PPP data while addressing the challenges of working across a multi-stage analytics process against a very large, continuously evolving dataset
Automated Deployment of Apache Spark Jobs in Cloudera Data Engineering: https://t.co/MK8dbvYdqb <-- using CDE to extract, transform, and load data from an S3 bucket into Hive and then report off it using the Cloudera Data Warehouse
Spark Structured Streaming example with Cloudera Data Engineering (CDE): https://t.co/2Zde6204MM <-- using CDE to stream data from Kafka with Spark Structured Streaming
Getting Started with Cloudera Data Engineering on CDP: https://t.co/LQZtzQ5B5o <-- a new way for Data Engineers (both admins and users) to provision, track, schedule, deploy, monitor, and troubleshoot Spark workloads in a centralized production environment
I spent last year building this code. It allows you to learn how to fully automate Cloudera Data Platform in AWS and Azure. I think it's worth sharing if you are using CDP today. #cloudera #cdp #hybridcloud #aws #azure #devops https://t.co/21IBQIlgrw
Securely transfer data from anywhere to CDP Public Cloud using Nifi site-to-site protocol. How? This is how:
#cloud #hybridcloud #nifi https://t.co/6Vm6nxPeB0
Apache Submarine 0.4.0 Release: What’s New and Coming? https://t.co/oWURqcbfEu <-- Submarine is ONE PLATFORM to support Data Scientists from exploring data pipeline creation, model training (experiments), to pushing the model to production, including model serving and monitoring
I’ve interviewed hundreds of people for numerous companies over the past 20 years of building businesses.
I’ve experimented with many interview questions and most are only semi useful, but one, above all, has been the most useful.
A thread...
The introduction of CDP Private Cloud, built on @RedHat OpenShift, accelerates data-driven #digitaltransformation across private and #hybridcloud with cloud-native speed, scale, and economics. Learn more here: https://t.co/s2YiSldLO7
With the latest CDP-DC 7.1 release, I just spun up a Kafka 2.4 compute cluster w/ SMM (monitoring), SRM (replication), Registry (schema), new Cruise Control (rebalancer) & Connect support. All integrated with SDX. Powerful new Kafka mgmt services in the new release!
https://t.co/hD970JCNIk