Daniel Pérez Atanacio @daniel_pro4 - Twitter Profile

2 months ago

📚 Colocation matters. Cognitive load matters. Boundaries matter. High cohesion matters. Yes, even in the age of AI (maybe even more so). Enter the vertical codebase: https://t.co/mkbwB9p3Km

8

337

31

218

19K

daniel_pro4 retweeted

PyQuant News 🐍

@pyquantnews

4 months ago

10 free Python PDF ebooks for download:

10

1K

228

2K

89K

daniel_pro4 retweeted

Prisma

@prisma

4 months ago

If your Prisma schema is getting messy, don’t keep everything in one file. Prisma ORM supports multiple schema files; they are easy to set up and much cleaner to maintain. Small change, big clarity 👇

7

69

11

17

3K

daniel_pro4 retweeted

Dominik 🔮 @TkDodo

4 months ago

📚 Creating thin abstractions is easy, until you’re trying to build them on top of functions that heavily rely on generics. I wrote about the tradeoffs of wrapping useQuery and why type inference makes this trickier than it looks. https://t.co/IHaPOtqodI

15

363

41

213

34K

daniel_pro4 retweeted

Carlos Torres

@CarlosTorresF_

5 months ago

If you are going to Puebla, better avoid it. In the @PueblaAyto, where morena governs with @pepechedrauimx, the auto-parts 🐀 operate with total impunity because they know nothing will happen to them. Neither the 4T's parking meters nor the little soldier they brought in to the @SSC_Pue did any good. #CapitalImparable of crime.

126

2K

1K

79

37K

daniel_pro4 retweeted

Miguel Ángel Durán

@midudev

6 months ago

SQL cheat sheet for all levels:

4

2K

215

1K

68K

daniel_pro4 retweeted

Alex Xu

@alexxubyte

almost 2 years ago

What is the best way to learn SQL?

In 1986, SQL (Structured Query Language) became a standard. Over the next 40 years, it became the dominant language for relational database management systems. Reading the latest standard (ANSI SQL 2016) can be time-consuming. How can I learn it?

There are 5 components of the SQL language:
- DDL: data definition language, such as CREATE, ALTER, DROP
- DQL: data query language, such as SELECT
- DML: data manipulation language, such as INSERT, UPDATE, DELETE
- DCL: data control language, such as GRANT, REVOKE
- TCL: transaction control language, such as COMMIT, ROLLBACK

For a backend engineer, you may need to know most of it. As a data analyst, you may need to have a good understanding of DQL. Select the topics that are most relevant to you.

Over to you: What does this SQL statement do in PostgreSQL: “select payload->ids->0 from events”?

Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://t.co/uc5M7CdXXC
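
The statement families listed in the tweet above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not part of the original tweet: the `events` table and its `kind` column are hypothetical names chosen for illustration, and DCL is left out because SQLite has no GRANT or REVOKE.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the (hypothetical) table.
cur.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT NOT NULL)")

# DML: insert two rows, then make them permanent.
cur.execute("INSERT INTO events (kind) VALUES (?)", ("signup",))
cur.execute("INSERT INTO events (kind) VALUES (?)", ("login",))
conn.commit()  # TCL: COMMIT

# DML again, but this change is discarded.
cur.execute("UPDATE events SET kind = 'logout' WHERE kind = 'login'")
conn.rollback()  # TCL: ROLLBACK undoes the uncommitted UPDATE

# DQL: read the data back.
kinds = [row[0] for row in cur.execute("SELECT kind FROM events ORDER BY id")]
print(kinds)
```

Because the UPDATE is rolled back before the SELECT runs, the query sees only the two committed rows with their original values.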

4

693

155

627

67K

daniel_pro4 retweeted

Nelson Djalo | Amigoscode

@AmigosCode

almost 2 years ago

Top System Design Building Blocks You Must Know
1. Distributed Messaging Queues
2. DNS (Domain Name System)
3. Load Balancer
4. Distributed Caching
5. Database
6. Distributed Task Scheduler
7. Observability
8. Unstructured Data Storage
9. Scaling Services
10. Publish-Subscribe Model (Pub-Sub)
11. Unique ID Generator
12. Rate-Limiting
👍🏿 Subscribe to our newsletter - https://t.co/c8wbDRZiXN #systemdesign #coding #interviewtips

3

263

63

239

12K

daniel_pro4 retweeted

Math Hub

@mathhub_vn

almost 3 years ago

Build and Deploy a Full Stack Next.js 13 Application with React, TypeScript, & Tailwind CSS #nextjs #react #typescript #tailwindcss https://t.co/BovAb3meP1

0

151

45

75

11K

daniel_pro4 retweeted

Master.dev (Formerly Frontend Masters)

@MasterDotDev

almost 3 years ago

Observer pattern in JavaScript by @lydiahallie From her "Tour of JavaScript & React Patterns" course: https://t.co/IMtNtQJ48g

4

593

67

269

74K

daniel_pro4 retweeted

Aurimas Griciūnas

@Aurimas_Gr

almost 3 years ago

What is the difference between Partitioning and Bucketing in Spark?

When working with big data, there are many important concepts to consider about how the data is stored both on disk and in memory. We should try to answer questions like:

➡️ Can we achieve the desired parallelism?

➡️ Can we skip reading parts of the data?
✅ This question is addressed by both partitioning and bucketing.

➡️ How is the data colocated on disk?
✅ This question is mostly addressed by bucketing.

So what are the procedures of Partitioning and Bucketing? Let's zoom in.

Partitioning.

➡️ Partitioning in the Spark API is implemented by the .partitionBy() method of the DataFrameWriter class.
➡️ You pass the method one or more columns to partition by.
➡️ The dataset is written to disk split by the partitioning column; each partition is saved into a separate folder on disk.
➡️ Each folder can hold multiple files. The number of files depends on how many in-memory partitions contain rows for that folder at write time; after a shuffle, that count is set by spark.sql.shuffle.partitions.

✅ Partitioning enables Partition Pruning. If we filter on a column that the dataframe was partitioned by, Spark can plan to skip reading the files that do not match the filter condition.

Bucketing.

➡️ Bucketing in the Spark API is implemented by the .bucketBy() method of the DataFrameWriter class.

1: We have to save the dataset as a table, since the bucket metadata has to be stored somewhere. Usually, a Hive metastore is used here.
2: You need to provide the number of buckets you want to create. The bucket number for a given row is assigned by hashing the bucket column and taking the result modulo the number of buckets.
3: Rows of a bucketed dataset are assigned to a specific bucket and colocated when saved to disk.

✅ If Spark performs a wide transformation between two dataframes, it might not need to shuffle the data, because it is already colocated correctly in the executors and Spark is able to plan for that.

❗️ There are conditions that need to be met between the two datasets for bucketing to have the desired effect.

When to Partition and when to Bucket?

✅ If you will often filter on a given column and it is of low cardinality, partition on that column.
✅ If you will be performing complex operations like joins, groupBys and windowing, and the column is of high cardinality, consider bucketing on that column.

❗️ Bucketing is hard to get right, as there are many caveats and nuances you need to know. More on it in future posts.

--------

Follow me to upskill in #MLOps, #MachineLearning, #DataEngineering, #DataScience and the overall #Data space.

Also hit 🔔 to stay notified about new content.
Don’t forget to like 💙, share and comment!

Join a growing community of Data Professionals by subscribing to my Newsletter: https://t.co/qgNCnGtF4A
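
The two ideas in the tweet above, one folder per partition value and bucket number as hash modulo bucket count, can be sketched in plain Python without a Spark cluster. This is an illustration under stated assumptions, not Spark code: the sample rows, column names, and helper functions are hypothetical, and `zlib.crc32` is only a stand-in for Spark's own hash function (Spark uses Murmur3).

```python
import zlib
from collections import defaultdict

# Hypothetical sample data for illustration only.
rows = [
    {"user_id": "u1", "country": "MX", "amount": 10},
    {"user_id": "u2", "country": "US", "amount": 25},
    {"user_id": "u3", "country": "MX", "amount": 7},
    {"user_id": "u1", "country": "US", "amount": 3},
]

def partition_layout(rows, column):
    """Group rows into Hive-style folders, one per distinct column value."""
    folders = defaultdict(list)
    for row in rows:
        folders[f"{column}={row[column]}"].append(row)
    return dict(folders)

def prune(folders, column, value):
    """Partition pruning: touch only the folder that matches the filter."""
    return folders.get(f"{column}={value}", [])

def bucket_id(key, num_buckets):
    """Hash the bucket column, then take it modulo the bucket count.

    CRC32 is an assumption made for this sketch; Spark uses Murmur3.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets

folders = partition_layout(rows, "country")
mx_rows = prune(folders, "country", "MX")
buckets = {row["user_id"]: bucket_id(row["user_id"], 4) for row in rows}

print(sorted(folders))
print(mx_rows)
print(buckets)
```

The same key always lands in the same bucket, which is what lets two tables bucketed the same way on the join column be joined without a shuffle.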

3

368

104

216

53K

daniel_pro4 retweeted

Math Hub

@mathhub_vn

almost 3 years ago

Master React in 10 Hours: Build and Deploy 4 Real-World Apps #react #javascript https://t.co/vLAOhcIRAm

0

246

62

146

19K

daniel_pro4 retweeted

George Moller

@_georgemoller

almost 3 years ago

❌ Don't import server components directly from client components ✅ Instead send them as props to client components

7

435

67

120

43K

daniel_pro4 retweeted

Wes Bos

@wesbos

almost 3 years ago

Here is a neat API that doesn't mess with your async/await when working with Web Streams. The body of a fetch() response is a Web Stream! Great for logging progress of a large download. .clone() the response, log the progress from the clone, and then return the original.

wesbos's tweet photo: https://t.co/U0Hj3tuw9U

12

531

60

282

71K

daniel_pro4 retweeted

DevTalles @DevTalles

almost 3 years ago

📝 Raúl Luján from the DevTalles community shares a practical application that implements a complete login, using #deeplink to reset passwords. This example follows the structure of our Flutter courses 😉 I'll leave the link in the thread: https://t.co/tzRMY4YqLo

1

30

5

7

2K

daniel_pro4 retweeted

Héctor de León (El loco de los perros) ⛧

@powerhdeleon

almost 3 years ago

FREE C# course. Feel free to RT, it may help one of your followers: https://t.co/X0OL39AIEB

6

276

115

59

24K

daniel_pro4 retweeted

Fernando Herrera

@Fernando_Her85

over 3 years ago

General practice exercises, free (YouTube) https://t.co/a70ypn9AwZ Tip: if you look up a video on YouTube and put a "-" between the t and the u, it plays in full screen and without ads.

Fernando_Her85's tweet photo: https://t.co/ZKbTYjgoDC

6

248

19

50

23K

daniel_pro4 retweeted

Gabriel Vergnaud @GabrielVergnaud

over 3 years ago

TypeScript Tip 👇 Writing _type guard functions_ by hand is cumbersome and unsafe. Consider using TS-Pattern instead, it's terser, safer and narrows types for you:

GabrielVergnaud's tweet photo: https://t.co/nrnIGGG0AQ

6

362

30

191

66K

daniel_pro4 retweeted

Abraham John 🦄🦓

@Abmankendrick

over 3 years ago

If you are a UI/UX designer looking for wireframing software, here are 5 amazing wireframe tools that can help you when designing. I trust you're having a great weekend! Retweets are highly appreciated 💜

Abmankendrick's tweet photo: https://t.co/6uDxr684ac

31

905

299

932

97K
