Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

(not the OP but...) I have had 3 cases in the last year where a postgres instance with less than millions of rows per table has decided to join with fancy hash algorithms that result in tens of seconds per query instead of the 5ms that it would take when it uses nested loops (i.e. literally start with the table in the from clause, apply some where clause, join to next table, apply more where clause, join to next table, and so on)

I do believe the planner was coming up with vast mis-estimates in some of those cases. 2 of the 3 were cases where the fully joined query would have been massive, but we were displaying it in a paged interface and only wanted 100 rows at a time.

One was a case where I was running a “value IN (select ...)” subquery where the subquery was very fast and returned a very small number of rows, but postgres decided to be clever and merge that subquery into the parent. I fixed that one by running two separate queries, plugging the result of the first into the second.

For one of the others, we actually had to re-structure the table and use a different primary key that matched the auto-inc id column of its peer instead of using the symbolic identifier (which was equally indexed). In that case we were basically just throwing stuff at the wall to see what sticks.

I have no idea what we’d do if one of these problems just showed up suddenly in production, which is kind of scary.

I’m sure the postgres optimizer is doing nice things for us in places of the system that we don’t even realize, but I’m sorely tempted to just find some way to disable it entirely and live with whatever performance we get from nested loops. Our data is already structured in a way that matches our access patterns.

The most frustrating part of it all is how much time we can waste fighting the query planner when the solution is so obvious that even sqlite could handle it faster.

For context, I’ve only been using postgres professionally for about a year, having come from mysql, sql server, and sqlite, and I’m certainly still on the learning curve to figure out how the planner works and how to live with it. Meanwhile, postgres feature set is so much better than mysql or sql server I’d never consider going back.



The feature set from an application perspective is killer. I love window functions especially; all sorts of clever things can be done in a single query which would otherwise require painful self-joins or multiple iterated queries and application-side joins in less sophisticated dialects.


My favorite killer feature is jsonb_agg / jsonb_object_agg which let me pull trees of data in a single query without the exponential waste you’s get from cartesian products, and even deliver it to the frontend without needing to assemble the json myself.




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: