Data replication into @ApacheIceberg isn’t easy. From CDC to schema changes, it can get tricky fast.
That’s why at OLake, we’re building not just the fastest replication framework for Iceberg, but also the resources to help you do it right.
Check out this thread 🧵
In our recent session with Arsham (co-founder, Greybeam), we dove deep into the evolving @ApacheIceberg catalog ecosystem.
One highlight: Apache Polaris
Here’s what it is — and why it matters 👇
#DataEngineeringStudy
Today we’ll be diving into one of the most important concepts of the Iceberg ecosystem — Catalogs.
Expect deep dives into @apachepolaris, Lakekeeper, and AWS Glue, along with demos, why they matter, and how they fit into the bigger picture.
👉 Don’t miss out — Register now!
Iceberg in production? Your catalog choice decides governance, cost, performance & interoperability.
Join Arsham Eslami (Greybeam) on Sept 4 for a deep dive on catalogs:
Raw Iceberg vs. catalog-backed
Live demos: Glue, Polaris, Lakekeeper
2025 updates: Polaris 1.0, Snowflake
Are people here using or planning to use Iceberg V3?
We are planning to use Iceberg in production. Just a quick question before we start development.
Has anybody deployed it in production? If yes, what problems did you face? Are the integrations enough to start?
Kickoff session for @_olake week 🚀
Shivji + Saurabh from @nutanix dive into:
- @ApacheIceberg table ops
- @ClickHouseDB best practices
- Performance tips
- Live demos + Q&A
📅 Aug 28
Save your seat https://t.co/8MZnLCRWJz
Make sure to keep an eye on the week!
166M rows synced in a single day on a v0.1 release of an OSS project can only mean one thing: @ApacheIceberg is here to stay and is the future of open data in the AI world.
Optimizing Queries in Apache Iceberg
As data grows, queries slow down. Sorting helps, but not always with multi-column filters.
That’s where Z-ordering comes in: it clusters data, reduces file scans, and speeds up queries.
The key is knowing when to apply it.
🧵 A thread
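To make the clustering idea concrete, here is a minimal sketch of a Z-order (Morton) key for two integer columns. It is an illustration only, not Iceberg's or any engine's actual implementation; the function name `interleave_bits` and the 4x4 sample grid are assumptions made up for this example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x land on even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y land on odd positions
    return key


# Sample rows with two filter columns (x, y) on a small 4x4 grid.
rows = [(x, y) for x in range(4) for y in range(4)]

# Sorting by the interleaved key keeps rows that are close in BOTH columns
# near each other, which a plain sort on x alone would not do for y.
z_sorted = sorted(rows, key=lambda r: interleave_bits(r[0], r[1]))
```

Because neighbouring rows in `z_sorted` share ranges on both columns, files written in that order tend to have tighter min/max statistics per column, which is what lets a query engine skip files for multi-column filters.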
At our @ApacheIceberg NYC Meetup, we discussed the problems teams face with traditional tools like @fivetran and @AirbyteHQ and why solving them is integral for companies scaling to PBs of data.
From pain points → to Iceberg’s capabilities → to how OLake achieves them.
A thread 🧵
Hey devs, OLake Community Week is here!
Bringing together industry leaders, speakers and the community!
📅 Aug 28 – ClickHouse + Iceberg session
🤝 Aug 29 – 8th Community Call
📅 Sep 4 – Current landscape & Future Catalog insights
📌 Register here 👉 https://t.co/5Z1tkiOaaM
The SC’s directive to remove all stray dogs from Delhi-NCR is a step back from decades of humane, science-backed policy.
These voiceless souls are not “problems” to be erased.
Shelters, sterilisation, vaccination & community care can keep streets safe - without cruelty.
Blanket removals are cruel, shortsighted, and strip us of compassion.
We can ensure public safety and animal welfare go hand in hand.
Supreme Court bans them for biting, scaring, and threatening.
For a second, I thought we were discussing human beings. And aren’t they a bigger menace to society than these stray dogs?
DoorDash saved millions switching to Iceberg format
25-49% storage reduction
40-70% compute cost savings
But Iceberg or Delta Lake: what's right for your ML pipelines?
Our comparison breaks down when to choose what
https://t.co/WNEZJwJfso
Day 1: Learnt about Apache Iceberg's architecture and how the metadata layer (metadata files, manifest lists, and manifest files) helps provide ACID properties in a data lake.
Humans took thousands of years to reach where we are today; AI took just a couple of years.
The rate of evolution is definitely exciting. Is 2025 the year AI takes over from humans in logical thinking in most domains?
Goodbye ChatGPT
It’s only been 5 days since DeepSeek R1 dropped, and the world is already blown away by its potential.
13 examples that will blow your mind (Don't miss the 5th one):