Scaling multimodal workloads ≠ scaling rows of numbers.
🖼️ PDFs, images, video = 💥 OOMs + endless tuning.
That’s why we rebuilt Daft’s distributed engine, Flotilla.
Here’s what broke, what we built, and what we learned 👇
Wow, the team at @daftengine cooked!
You can now read/write to 🤗Hugging Face with Daft!
> DataFrame engine in 🦀 runs distributed and supports multimodal datasets to train/eval models
Best part: it's optimized for Xet, the dedupe-based HF storage that makes uploads crazy fast!
🚨 Announcing Open Engines™, a quick + reliable way to deploy @trinodb, @raydistributed, and @ApacheFlink, making it easy to choose the right engine for analytics, streaming, or ML/DS.
Read the details 👉
https://t.co/Gvg7J95NYR
📢 NEW RELEASE
Hudi 1.0.0-beta1 is out 📢! This release brings database-like qualities to the data lake!
🔐 Non-blocking concurrency control for high-volume streaming writes
🎾 Filegroup reader & Page Skips
🗂️ Functional indexes
🌲 LSM-tree-based timeline
This @AWSOpen blog w/ Onehouse makes it easy to leverage open source on #AWSCloud! Onehouse accelerates building a Universal Data Lakehouse with MSK, RDS, S3, EMR, Athena, Glue, and Redshift. Powered by #apachehudi, #apacheiceberg, #deltalake and @OnetableOSS
https://t.co/K5HI71mh4P
Are you building a data lakehouse? Check out this guide to help you consider tradeoffs between a DIY approach vs. a managed solution. Learn how you can get your lakehouse off the ground faster 🚀
https://t.co/HpflpCsgKE
#datalakehouse #apachehudi #apacheiceberg #deltalake
There's that saying that if you "wrote the book on something" you know your stuff. It's not true. You know your stuff when you wrote the thing that someone else wrote the book on.
Below is how Kafka actually works by the guy who (more than anyone) actually wrote Kafka!
I am extremely fortunate to be working with such inspirational women across Confluent, working together to #BreakTheBias and build an equitable workplace! Happy International Women's Day!
#BreaktheBias #WIN-erg #IWD #WHM #Confluent
Six years ago we built an entirely new data infrastructure to set #datainmotion. Our journey is built on the passion, energy, and dedication of our team, community, customers, and partners. We’re here because of you, THANK YOU! https://t.co/La89FhXAVg #ConfluentIPO