> Yes, correct, because SQL is awful at representing business logic, and foreign key constraints are business logic!
This strikes me as a unnecessary all-or-nothing argument. I would agree that you shouldn't put all your business logic in the database but that doesn't mean you shouldn't put any business logic in your database. If foreign key constraints are business logic then so are table definitions and column types. It's not awful for those things as it's literally built right in.
My advice is to ensure, as best as possible, that invalid states are impossible to represent. If you can't have an address without being attached to a client, make that impossible to represent in the database. It's one less thing you'll ever have to worry about.
> This strikes me as a unnecessary all-or-nothing argument
I completely understand why it comes across that way. HN Comment length is limited and the current practice is a comfy local optimum, so any one change away from it is going to feel bad, and it's hard to get across my full theory of database design in comment form. I would not recommend most existing teams simply stop using foreign keys, unless they also adapted a lot of other changes that coexist with it. Still, there is a better world for database design, and that world probably does not use foreign key constraints.
> I would agree that you shouldn't put all your business logic in the database but that doesn't mean you shouldn't put any business logic in your database.
I actually would (lightly) argue that does mean you shouldn't put any business logic in your database, as I find that highly coherent design philosophy leads to highly coherent code. The most important question to me is about how the system can handle change over time. Foreign keys are one part of a larger set of "design principles" that everyone seems to take for granted that lead to a system that cannot handle change.
> If foreign key constraints are business logic then so are table definitions and column types.
I disagree on this one. Table definitions and column types are a "level below" business logic, as evidenced by the fact that you could probably refactor a lot of schemas for better/different performance characteristics without changing any of the actual business logic that drive them. The business logic doesn't (shouldn't) care whether you've used a bitfield or a bunch of boolean columns or a child set-inclusion table to represent a bunch of yes/no facts about some entity. They're "persistence logic", and the database is exactly the right place for that.
> My advice is to ensure, as best as possible, that invalid states are impossible to represent.
The problem is that an "invalid state" is highly dependent on your perspective. A state that is invalid to use in one part of your codebase is perfectly valid in other parts. The ubiquity of "draft", "draft final", and "final" shows that humans want to represent "invalid states" all the time! Aren't you glad your IDE lets you represent invalid code? Okay, fine, that data is not ready for the annual submittal, but does that mean I'm really not allowed to save it at all?
And these "invalid states" are extremely volatile as your focus expands around your domain. All contact info must have a primary contact method -- well, unless they're just "contact stubs". All facilities need a facility manager -- well, unless they're 3rd party facilities, or storage facilities, or in Region 9, or start with the letter 'P'.
Haven't you been 300 hours into the development of a system when suddenly someone springs on you, "Oh yeah, Bob has two bosses. Why should that be a problem?" And you're screaming internally, "Of course it's a problem! You assured me six months ago that everyone only has one supervisor, and now you're telling me this one guy has two!? This changes EVERYTHING!"
But why? Why does it change everything? Why does every little change to our database schemas always seem to propagate to the entire system? It's because we're designing databases wrong. And foreign keys are a (small) part of that wrongness.
> The problem is that an "invalid state" is highly dependent on your perspective. A state that is invalid to use in one part of your codebase is perfectly valid in other parts. The ubiquity of "draft", "draft final", and "final" shows that humans want to represent "invalid states" all the time! Aren't you glad your IDE lets you represent invalid code? Okay, fine, that data is not ready for the annual submittal, but does that mean I'm really not allowed to save it at all?
It's perfectly possible to define a schema that can represent draft or incomplete data, syntactically invalid code (as a VARCHAR/TEXT), etc. while still also enforcing that a complete and valid domain object must be available when the record is in a non-draft state.
> Haven't you been 300 hours into the development of a system when suddenly someone springs on you, "Oh yeah, Bob has two bosses. Why should that be a problem?" And you're screaming internally, "Of course it's a problem! You assured me six months ago that everyone only has one supervisor, and now you're telling me this one guy has two!? This changes EVERYTHING!"
Schema updates and migrations should not be this difficult in any reasonably designed system.
> Haven't you been 300 hours into the development of a system when suddenly someone springs on you, "Oh yeah, Bob has two bosses. Why should that be a problem?" And you're screaming internally, "Of course it's a problem! You assured me six months ago that everyone only has one supervisor, and now you're telling me this one guy has two!? This changes EVERYTHING!"
I have had this issue and it does change everything -- all the systems that depend on that data, the calculations involved, the user interface. The change in the database schema is actually relatively minor compared to the rest of the code. But with an enforced schema there's a never a case where it will be wrong -- the code with either work all the way one way or the other but never both at the same time.
Neither of examples provided has anything to do with foreign key relationships. If Bob can have two bosses, that's changing a column into a relation. That's going to have an effect beyond the database. If the facilities manager is optional then that's changing a column to be nullable. Again most the work after that change is outside of the database.
Even if drafts or temporary invalid states exist, I'm never going to want orphaned data. You said table definitions are column types are a "level below" business logic but the table definition is directly tied to what you're trying to store and how you're storing it. I think I would simply argue that foreign key definitions are also part of that "level below". It might be that not all facilities need a facility manager but if one is specified it better exist and you better not be able to remove it without affecting the facility.
I have a bug right now where in some race-condition situation the database is catching the issue with the constraints. I haven't found the bug yet (it's low priority) but at least it's not filling my database with invalid data -- it is just crashing that operation.
> I have a bug right now where in some race-condition situation the database is catching the issue with the constraints. I haven't found the bug yet (it's low priority) but at least it's not filling my database with invalid data -- it is just crashing that operation.
This right here. The thing that all proponents for nosqls and "this one trick
will turn sql hell into sql heaven" don't seem to take into account is that a
properly designed schema will save your ass when the front end(s) have bugs;
and yes, properly designing a schema is work, up-front work that needs to be
thought about before writing the first line of code (or sql).
Thanks for posting this, at least I know I'm not the only one that feels
this way.
For whatever it is worth, I have reached the same conclusion and I have been building systems like you describe for the last few years.
Recently I changed jobs. In the new team they love their ORM with all the foreign keys and direct mapping between business logic data objects and the database schema.
Needless to say, it is a mess. Despite all the automated tooling they have around migrations that "should make it easy" to change things, the reality is that the whole application depends on what the database schemas look like and migrating the existing representation to a new one would break everything.
This has become an elephant in the room that nobody talks about and everyone works around.
This strikes me as a unnecessary all-or-nothing argument. I would agree that you shouldn't put all your business logic in the database but that doesn't mean you shouldn't put any business logic in your database. If foreign key constraints are business logic then so are table definitions and column types. It's not awful for those things as it's literally built right in.
My advice is to ensure, as best as possible, that invalid states are impossible to represent. If you can't have an address without being attached to a client, make that impossible to represent in the database. It's one less thing you'll ever have to worry about.