arjun @naagarjunsa - Twitter Profile

sometimes deep research does pay off when adapting a technology stack for scaling systems. ran into a similar dev exp vs efficiency rabbit hole while researching how to set up db infra. understanding the persistence layer is so important, be it indexing, cost management or just efficiency of the software product. vibe coders can never..

Vicent Martí

@vmg

8 months ago

Most takes about ORM performance I see online are way off base. I hear about abstraction overhead (very unlikely to matter in practice) and sub-optimal queries (also quite rare, because database query planners are very good these days). The main way an ORM is going to nuke your app’s performance is with implicit transactions.

Take a look at this attached graph that shows the Query Plan Types for a very busy Vitess cluster deployed on PlanetScale. You can see that roughly 30% of all queries are either planned as Begin or Commit (i.e. transaction starts and ends). This is not strange by itself, except for the fact that we’re not explicitly using transactions in our codebase at all. Not even once.

So what’s going on here? Well, we’re using Prisma. That’s what’s going on. And I don’t particularly want to put Prisma on blast here because this issue affects a lot of ORMs (and I quite like the DX of Prisma in practice; it’s a good ORM overall), but let’s take a look at the implications of the way it generates some queries.

ORMs must always walk a thin line between being convenient for the developers and being efficient. Prisma is definitely in the camp of “convenient”: all high level write operations, unless carefully tweaked, default to returning the data that is being modified. When you run an `insert`, `update`, `delete`, etc, the output of these APIs is the full set of affected rows.

This is possibly an acceptable default when you’re using Postgres, because it supports `INSERT ... RETURNING` and `UPDATE ... RETURNING`, so it can return the modified row as part of the query. But this is not standard SQL and hence it’s not available in MySQL (it is on MariaDB though, if that’s your poison). If your ORM wants to have the same behavior when accessing a MySQL database, you will necessarily require every `INSERT`, `UPDATE` and `DELETE` to be in an explicit transaction followed by a `SELECT` of the last affected row.
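
To make the read-back pattern concrete, here is a minimal sketch of what emulating `INSERT ... RETURNING` can look like on a database without it. It is only an illustration: SQLite (via Python's standard `sqlite3` module) stands in for MySQL so the snippet runs without a server, the `users` table and the helper names are hypothetical, and the statement sequence is an assumption about the general pattern, not the SQL that Prisma actually emits.

```python
import sqlite3


def open_db():
    # isolation_level=None puts the sqlite3 module in autocommit mode, so the
    # only transactions are the ones this sketch issues explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
    )
    return conn


def run(conn, log, sql, params=()):
    # Record the leading keyword of every statement so the two strategies
    # can be compared afterwards.
    log.append(sql.split()[0].upper())
    return conn.execute(sql, params)


def insert_with_readback(conn, name):
    """Hypothetical emulation of INSERT ... RETURNING: write, then re-read."""
    log = []
    run(conn, log, "BEGIN")
    cur = run(conn, log, "INSERT INTO users (name) VALUES (?)", (name,))
    row = run(
        conn, log, "SELECT id, name FROM users WHERE id = ?", (cur.lastrowid,)
    ).fetchone()
    run(conn, log, "COMMIT")
    return row, log


def insert_plain(conn, name):
    """The same write without the read-back: a single statement."""
    log = []
    run(conn, log, "INSERT INTO users (name) VALUES (?)", (name,))
    return log
```

In this toy version one logical write turns into four statements, two of which are pure transaction bookkeeping, while the plain insert is a single statement.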
This is a performance footgun, and some very surprising behavior if you’re not aware of this and are e.g. trying to benchmark your application’s performance between Postgres and MySQL (particularly for update performance, where MySQL often blows Pg out of the water, if you’re not adding hidden transactions, that is).

How do you work around this? Obviously, try not to use transactions unless you actually _need_ to. Transactions are the enemy of throughput in a hyper-scalar system (whether Vitess, Aurora Limitless, whatever). They need to be orchestrated at the proxy layer, and they pin connections to the backend, limiting other queries whether they’re transactional or not.

Watch carefully what your ORM is doing, and try to design your application so it doesn’t require (and doesn’t retrieve) the full contents of a row after each insert or update. This often means having no dynamic `DEFAULT` values in the row, and generating the contents of all columns in your app.

Most importantly: in MySQL, the ID of the row being inserted is always returned in a packet header as part of the wire protocol, if it’s a numeric ID. This is implicit and super efficient, so a very good reason to use integer PKs in your database. You can get very far (and insert A LOT of data per second) simply by being aware of this feature.

Unfortunately, this is quite hard to do in Prisma, because the last inserted row ID is not exposed even when falling back to the raw SQL query builder. That’s the cost of having an ORM that abstracts the behaviors of different databases!
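
The workaround described above (no dynamic `DEFAULT` values, every column generated in the app, and the integer key taken from the insert response) can be sketched roughly as follows. This is an illustration under assumptions: SQLite's `cursor.lastrowid` from Python's standard `sqlite3` module stands in for the last insert ID that MySQL reports over the wire, and the `posts` table and `create_post` helper are hypothetical names.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "title TEXT NOT NULL, "
    "created_at TEXT NOT NULL)"  # no DEFAULT: the app supplies the value
)


def create_post(conn, title, now=None):
    # Every column value is computed in the application, so nothing has to
    # be read back from the database after the write.
    created_at = (now or datetime.now(timezone.utc)).isoformat()
    cur = conn.execute(
        "INSERT INTO posts (title, created_at) VALUES (?, ?)",
        (title, created_at),
    )
    # The driver reports the generated integer key as part of the insert
    # result, so no follow-up SELECT and no wrapping transaction are needed.
    return {"id": cur.lastrowid, "title": title, "created_at": created_at}
```

The caller ends up with the full row contents after a single statement, because the only value it did not already know, the integer ID, arrives with the insert result.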

arjun @naagarjunsa

9 months ago

@trashh_dev watch zubimendi score a banger 🥹

arjun @naagarjunsa

9 months ago

two years since sapta released; time does fly. always felt the lyrics and ambience of the song die for you by joji feel apt to the mood of certain aspects of the movie. here is an interpretation of dreams and motifs in sapta sagaradaache ello as a short music video. @hemanthrao11 @rakshitshetty

arjun @naagarjunsa

9 months ago

@GabbbarSingh someone please tell me this is AI

arjun @naagarjunsa

9 months ago

@sriniously damn this made sense

arjun @naagarjunsa

10 months ago

@omarsar0 should try this sounds fun

arjun @naagarjunsa

10 months ago

@arpit_bhayani nobody has

arjun @naagarjunsa

10 months ago

@GadagHeritage @HKPatilINC @SpGadag @GadagZoo @ZP_Gadag @DIPRGadag @dcfgadag @hublimandi this is amazing.

arjun @naagarjunsa

10 months ago

@pmddomingos im working on db migration and this is my new nightmare.

arjun @naagarjunsa

10 months ago

@Keshavatearth isn't that x pro?

arjun @naagarjunsa

10 months ago

@sriniously what a great read. well said. I was grappling with this and the post arrived at the right time. your work on YouTube is great btw.

arjun @naagarjunsa

10 months ago

@jorandirkgreef - garbage collection is heavily solved for, which in turn helps choice of (not so heavy) runtime -> efficient use of resources.

arjun @naagarjunsa

10 months ago

caught #SuFromSo today. the emotional core struck me, loved the performances… the project had a soul which was unmistakably there. i so rooted for the win in the end, and characters were so believable despite the seemingly absurd situation comedy going full throttle. and the music was so fresh and so good, we are past the ravi basrur loud thumping soundtracks for middle aged movie goer in me. so the formula to take away is authentic gen z inspired rooted stories with emotional core and believable characters. fun stuff. worth watching for sure.
