Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This seems fundamentally unfixable with current ORMs. You either have to pick between lazy loading per entity, or eager load and carthesian product everything, which breaks badly if you are dealing with multiple one-to-many relations.

Our solution was to write our own ORM-like system that “remembers” which entities you’ve loaded within the context of the transaction, and then will fetch the relation for all of them. So if you access a1.b then it will also fetch a2.b and cache it if you loaded a2 within that same transaction. The call to a2.b will then resolve instantly. So instead of n queries with n being the number of entities loaded in lazy loading, you’re doing n queries with n being the number of relations touched.

The one bad failure case is if you update entities one-by-one but also accessing their relations in a loop since you invalidate the cache for that entity type on every write.



> So if you access a1.b then it will also fetch a2.b and > cache it if you loaded a2 within that same transaction.

Ah ha! I'd implemented the same trick in this older Java ORM:

https://github.com/stephenh/joist-java/blob/master/domain/sr...

It worked well!


Django (python) does lazy loading by default, cartesian product with "select_related()", and has a third option called "prefetch_related()" where it does only 2 queries then does the join programmatically for you instead of in the database. It's kind of like your custom system except you do have to specify ahead of time which fields to prefetch.

It's also had this for over a decade so I have to wonder how rare it actually is in ORMs in general...


Interesting. Does it do this separate query for all entities in scope, or per entiry like Hibernates SELECT FetchMode? I find a separate SELECT is usually possible, but doesn’t quite solve this in the general case.

Perhaps I missed Django, but as far as I could tell, Hibernate and jooq can’t do this, Ecto can’t do it, ActiveRecord can’t, and the entirety of ORMs for JS can’t (TypeORM, Prisma, Drizzle, Sequelize and a bunch more).


I'm not sure I understand the question. If I'm reading the Hibernate examples right, I already answered that in my first comment: Lazy loading (separate query for each row, 1+N problem) is the default, one additional query per each field with prefetch_related() (1+1 if you only need 1 relationship, 1+2 for 2 relationships, etc).

Django also specifies foreign keys on the model, which are used by name when doing queries. From their examples in prefetch_related() [0], this query would involve 2 levels of lazy loading, except prefetch_related() tells Django to do 3 total queries and join them together in code:

  Restaurant.objects.prefetch_related("best_pizza__toppings")
[0] https://docs.djangoproject.com/en/5.0/ref/models/querysets/#...


jOOQ doesn't get involved in any such prefetch/eager fetch shenanigans, which are often wrong. They work for some cases and do too little/too much for others. So, specifying the exact data to fetch in every query is a much better approach, ideally with nested collections: https://blog.jooq.org/jooq-3-15s-new-multiset-operator-will-...


Not really true. You can have separate queries to avoid cartesian explosions. EF Core implements it and its standard practice to use - but not enabled by default, because if you don't order by something unique you'll have a bad time.

https://learn.microsoft.com/en-us/ef/core/querying/single-sp...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: