Vinod Chelladurai

@vintechie

I write about Data, Java, Apache Kafka, Apache Flink & Tennis • YouTube @ Data Engineering Minds • Kafka Catalyst 2023-2024

Berlin, Germany

Joined October 2020

190 Following

84 Followers

684 Posts

Pinned Tweet

Vinod Chelladurai @vintechie

over 2 years ago

I’m very thrilled to share that I’ve been selected as a Kafka catalyst in the Kafka community by Confluent for the year 2023-2024 ❤️🎉 Many thanks to @ena_9428 for the amazing swag 🙏 Not able to make the Kafka Current this year, but hope to make the Kafka Summit in 2024

vintechie's tweet photo. I’m very thrilled to share that I’ve been selected as a Kafka catalyst in the Kafka community by Confluent for the year 2023-2024 ❤️🎉

Many thanks to @ena_9428 for the amazing swag 🙏

Not able to make the Kafka Current this year, but hope to make the Kafka Summit in 2024 https://t.co/OmJPSXT4bY

845

Vinod Chelladurai @vintechie

2 months ago

How good is the idea of writing e-books at this point of time? I ask because at least most of the folks I know just use AI to skim everything and read only the summary.

Vinod Chelladurai @vintechie

12 months ago

@zettibar10 @ItzzZain10 You are almost close but Fritz choked

211

Vinod Chelladurai @vintechie

12 months ago

@purosanantonio_ @Wimbledon I don’t think Carlos got lucky but Fritz choked. He was not too brave to go for the shots when he had chances on every point after 5-4 in the tie break

315

Who to follow

Mickael Maison

@MickaelMaison

Working on @ApacheKafka at @IBM. Opinions expressed are my own.

Simon Aubury

@SimonAubury

And this is where you add a funny & creative bio 🔗 Mastodon: @[email protected]

TileDB

@tiledb

TileDB is foundational software designed by scientists for scientific discovery.

Vinod Chelladurai @vintechie

over 1 year ago

Event Streaming at scale is hard. Real-time Analytics at scale is hard. Building and maintaining Lakehouse at scale is hard. Life as a Data Engineer in the future without knowing above is hard. Choose your hard 🤷‍♂️

Vinod Chelladurai @vintechie

over 1 year ago

Dear Flink users and Data Engineers, I feel the managed Flink from AWS is not really suitable for intensive stream loads with respect to both low memory and low compute nodes offered by AWS for their Flink clusters. Does anyone feel the same? #Flink #dataengineering

vintechie retweeted

Clive Chan

@itsclivetime

over 1 year ago

wtf is in apple silicon? my single threaded code is 3x faster on macbook than on server while burning like half the power. if apple bothered making a 64-core part they'd take over every datacenter

158

10K

283

vintechie retweeted

Santiago

@svpino

over 1 year ago

Data pipelines will put you in the top 1% of the market. If you could only learn one skill for the next decade, I can't think of anything more critical than learning to move and process data at scale. I like to tell people I'm a Machine Learning Engineer, but in reality, 90% of the value I produce comes from my ability to move data around consistently and correctly. In the field, we like to use the term "orchestration" when talking about coordinating workflows that move and process data. At a high level, there are three main steps you need to worry about: 1. Getting the data from its source 2. Processing and cleaning that data 3. Delivering the cleaned data to the right place You might have also heard about "ETL" (Extract, Transform, Load). That's how most people refer to the process above. Of course, building a simple ETL system isn't complex; most developers can do it without too much trouble. The problem is designing resilient, scalable, and fault-tolerant systems. You can't code your way to a production-ready orchestration platform (ask me how I know.) I started with AirFlow and eventually moved to @kestra_io because of its event-driven architecture. Event-driven means you can kick off a workflow automatically based on different triggers. For instance, when somebody uploads a new file to a folder, an app updates a database table, or there's a new message in a queue. It's hard to summarize everything you get from Kestra, but here are some of the highlights: • Kestra is free and open-source • You install it from a Docker container • Workflows as Code using YAML <--- this is awesome • Scales to millions of executions • It integrates with every cloud platform you've seen • Language agnostic (but I still like Python the most) Here is a link to their GitHub repository: https://t.co/h8mWoYcymb Here are the three things I recommend: 1 - Take a look at their live demo in their GitHub repo 2 - Build a simple workflow (it will take 5 minutes) 3 - Talk to your boss. Where can you plug this into your company? I started using Kestra at the height of the pandemic. It's an awesome tool, and I'm proud that they are sponsoring my writing. I hope you find it helpful as well.

svpino's tweet photo. Data pipelines will put you in the top 1% of the market.

If you could only learn one skill for the next decade, I can't think of anything more critical than learning to move and process data at scale.

I like to tell people I'm a Machine Learning Engineer, but in reality, 90% of the value I produce comes from my ability to move data around consistently and correctly.

In the field, we like to use the term "orchestration" when talking about coordinating workflows that move and process data. At a high level, there are three main steps you need to worry about:

1. Getting the data from its source
2. Processing and cleaning that data
3. Delivering the cleaned data to the right place

You might have also heard about "ETL" (Extract, Transform, Load). That's how most people refer to the process above.

Of course, building a simple ETL system isn't complex; most developers can do it without too much trouble. The problem is designing resilient, scalable, and fault-tolerant systems.

You can't code your way to a production-ready orchestration platform (ask me how I know.) I started with AirFlow and eventually moved to @kestra_io because of its event-driven architecture.

Event-driven means you can kick off a workflow automatically based on different triggers. For instance, when somebody uploads a new file to a folder, an app updates a database table, or there's a new message in a queue.

It's hard to summarize everything you get from Kestra, but here are some of the highlights:

• Kestra is free and open-source
• You install it from a Docker container
• Workflows as Code using YAML <--- this is awesome
• Scales to millions of executions
• It integrates with every cloud platform you've seen
• Language agnostic (but I still like Python the most)

Here is a link to their GitHub repository:

https://t.co/h8mWoYcymb

Here are the three things I recommend:

1 - Take a look at their live demo in their GitHub repo
2 - Build a simple workflow (it will take 5 minutes)
3 - Talk to your boss. Where can you plug this into your company?

I started using Kestra at the height of the pandemic. It's an awesome tool, and I'm proud that they are sponsoring my writing. I hope you find it helpful as well.

149

91K

vintechie retweeted

Madza 👨‍💻⚡

@madzadev

over 1 year ago

16 Open Source Alternatives for Popular SaaS Software (save them):

851

111

216K

Vinod Chelladurai @vintechie

over 1 year ago

@BarclayCard18 Did they shake hands after delpotro withdrew? I remember watching the YouTube video of this one like a decade back but not able to find the full match. And yeah, at US open in the same year, when Murray won, they were embracing at the net.

126

vintechie retweeted

Gunnar Morling 🌍

@gunnarmorling

over 1 year ago

The biggest problem of #Java is poor perception. It's technically super-solid, but too often folks discard it based on misconceptions or information outdated years ago.

127

771

164K

Vinod Chelladurai @vintechie

over 1 year ago

@Archimusik @Lil_Miss_AM @AustinTunnell Even when an employee wants to quit, they need to follow a fair process. Serve notice period of 3 months. They can’t do it as they wish the next day. I recently had a conversation with my friend who is in US and can understand where this is coming from.

Vinod Chelladurai @vintechie

over 1 year ago

Most of the Data Engineers & Analysts don’t realise their power and the impact they can create organisation-wide despite working in this field for several years. #DataEngineers #data #dataengineering

Vinod Chelladurai @vintechie

over 1 year ago

To all those aspiring Data Engineers, Kindly make sure you put enough effort to get the computer basics right - CPU, main memory, caching, multithreading, hyper threading, I/O, etc.. This topic is very commonly overlooked by many DEs unfortunately. #dataengineering #data

Vinod Chelladurai @vintechie

over 1 year ago

How do I tackle this issue? Go for multiprocessing. With multiprocessing, you are guaranteed to launch multiple processes which is then set to run on dedicated CPU cores available on your machine with its own memory thus bypassing the limitations of the GIL per process.

Vinod Chelladurai @vintechie

over 1 year ago

You have a CPU bound task to implement in Python such as computing a sum of squares of 50 million integers. You want to run on your macOS/linux with multiple cores. You want to smartly do your task by making use of all the available cores. #Python 🧵

Vinod Chelladurai @vintechie

over 1 year ago

And when each thread attempt to read the Python byte code, they are blocked by the current thread reading the byte code since the Global Interpreter Lock allows only one thread per process to read the Python byte code at a time.

Vinod Chelladurai @vintechie

over 1 year ago

If you would like read more data engineering and data analytics content like this, please follow me 🙏

Vinod Chelladurai @vintechie

over 1 year ago

What is a graph database and why should you use it when you’re already doing analytics efficiently with relational databases like PostgreSQL or Snowflake?* 🧵 #graph #graphdatabase

110

Vinod Chelladurai @vintechie

over 1 year ago

If your analytics is based on a graph over a window period or you want to serve real-time graph analytics, you need to consider databases like memgraph which processes the entire graph in memory. Choosing the right database has its own trade-offs.

Vinod Chelladurai

@vintechie

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users