I still don't get why they didn't separate clients on a database level. Sure, put many clients on one database server to save resources. But why not use different databases? They cost nothing and provide perfect separation. It also drastically lowers the attack surface as you can set all permissions via database software. And if they had done that, this would've never been a multi-day outage.
If Jira was a product used by individuals I'd get it. Maybe a database is overkill for a sole developer. But pretty much all users of Jira are companies with tens or hundreds of users on average. I don't see how separating on a db level is overkill in that situation.
Using separate databases or schemas per tenant comes with the following problems:
* Managing schema migrations across every DB
* You can't query across DBs. Want to know some cross-tenant thing for ops? That's now a lot harder
* Connection pooling and resource usage can be harder to manage
Most systems I've worked on use a single DB with a `tenant_id` column on every relevant table, and it's easy to have your query builder slap in the auth'd tenant ID. This approach does come with its own issues, like saving and restoring an individual tenant's data.
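To make the `tenant_id` approach concrete, here's a rough sketch of what I mean by the query builder adding the tenant for you. It uses Python's built-in `sqlite3` just so it runs anywhere; the `issues` table and the `TenantScopedDB` helper are made up for illustration, not taken from any real system.

```python
import sqlite3


class TenantScopedDB:
    """Hypothetical helper: every statement is bound to one authenticated tenant."""

    def __init__(self, conn, tenant_id):
        self.conn = conn
        self.tenant_id = tenant_id

    def insert_issue(self, title):
        # The caller never passes tenant_id; the helper always supplies it.
        self.conn.execute(
            "INSERT INTO issues (tenant_id, title) VALUES (?, ?)",
            (self.tenant_id, title),
        )

    def select_issues(self):
        # The tenant filter is part of the query itself, so it cannot be forgotten.
        return self.conn.execute(
            "SELECT id, title FROM issues WHERE tenant_id = ? ORDER BY id",
            (self.tenant_id,),
        ).fetchall()


def make_demo_conn():
    """Create an in-memory database with one shared, tenant-keyed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE issues ("
        "id INTEGER PRIMARY KEY, "
        "tenant_id INTEGER NOT NULL, "
        "title TEXT NOT NULL)"
    )
    return conn
```

The isolation here lives entirely in application code, which is exactly the trade-off: cheap to run, but one raw query that bypasses the helper can see every tenant's rows.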
> why not use different databases? They cost nothing and provide perfect separation.
I understand the sentiment, but this is a pretty simplistic take that I very much doubt will hold true for meaningful traffic. Many databases have licensing considerations that aren't amenable. Beyond that you get into density and resource problems as simple as IO, processes, threads, etc. But most of all there's the time and effort burden in supporting migrations, schema updates, etc.
Yes, layered logical separation is a really good idea. It's also really expensive once you start dealing with organic growth and a meaningful number of discrete customers.
Disclaimer: Principal at AWS who has helped build and run services with both multi tenant and single tenant architectures.
Don't you usually license based on server resources? Or do you now really have to pay per database/schema? At least on-prem licenses tend to be based on resource usage, not on the number of databases or schemas. I'm not talking about different db processes, just databases/schemas within a database.
And for migrations and schema updates I'd see this as a huge advantage. Migrating customers one by one is much easier than everyone at once. You also never have the issue that operations at one customer could cause a global lock affecting other customers.
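Roughly what I have in mind for one-by-one migrations, as a sketch. Each tenant gets its own SQLite database here purely as a stand-in for a separate database or schema on a shared server; the `MIGRATIONS` list and the function names are invented for the example, and the version counter is SQLite's `PRAGMA user_version`.

```python
import sqlite3

# Ordered list of schema changes; index + 1 is the version each one produces.
MIGRATIONS = [
    "CREATE TABLE issues (id INTEGER PRIMARY KEY, title TEXT NOT NULL)",
    "ALTER TABLE issues ADD COLUMN status TEXT NOT NULL DEFAULT 'open'",
]


def schema_version(conn):
    """Read the schema version stored inside this tenant's database."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, target=len(MIGRATIONS)):
    """Apply only the migrations this tenant has not seen yet."""
    for version in range(schema_version(conn), target):
        conn.execute(MIGRATIONS[version])
        # PRAGMA does not accept bound parameters; version is always an int.
        conn.execute(f"PRAGMA user_version = {version + 1}")
    conn.commit()


def migrate_tenants(tenants):
    """Migrate tenants one at a time; a failure stays local to that tenant."""
    failed = []
    for name, conn in tenants.items():
        try:
            migrate(conn)
        except sqlite3.Error:
            failed.append(name)
    return failed
```

Tenants can sit at different versions for a while, so a bad migration can be stopped after the first customer instead of hitting everyone, at the price of application code that has to cope with mixed schema versions during the rollout.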
Of course resource sharing isn't easy in this scenario, but you'd never want to connect data between customers anyway so I don't see the issue with that.
But maybe it's harder in a cloud environment where more is abstracted away.
Ah, when you said "database" I assumed you meant a dedicated single tenant instance of an RDBMS (or similar), and not necessarily something like dedicated tables. I will admit to being a decade out of touch with the vagaries of "processor", server, and client access licensing. In my relevant past I've only worried about (RDS/EMR/Redshift/etc) instances and tables.
Very fair call out on having more granular, discrete, instances for things like DML/schema updates and expensive queries. I love fault isolation and have had many sad days oncall when we exceeded the capabilities of The Database.
I wouldn't say it's harder because it's more abstract. I think the general motivation is to desperately avoid anything that scales cost/effort with the number of users. Even if it's sublinear, a team can really drown under the cost of scaling up a service. And that's a serious consideration when a baseline expectation is to go from 0 to 10,000 or 50,000 active customers in just a few years. The care and feeding of (for example) 10 multi tenant partitions is just simpler than having to monitor & operate 10,000 independent databases with wildly divergent usage profiles. I will grant this hyper growth is not a common scenario for the industry, or if it is then it's "one of them good problems."
I'd also say I have worked on a project that did have independent data tables for each customer instance. And we spent a meaningful amount of time abstracting away table creation/migration/etc, building a common DAL that abstracted away the multitude of tables, common monitoring, etc. It has made some things around data migration & management easier, but I honestly don't know if it's more efficient than multi tenant clusters in the long term. But the only way the economics and operational effort have worked is by going "all in" on using "serverless" technologies that efficiently scale to zero and have no carrying cost when idle.
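For a flavor of what "a common DAL that abstracts away the multitude of tables" can look like, here is a toy sketch. It is not the project's actual code: the `PerTenantTables` class, the `issues_<tenant>` naming scheme, and the use of `sqlite3` are all assumptions made to keep it small and runnable.

```python
import re
import sqlite3

# Table names cannot be bound as SQL parameters, so tenant names are
# restricted to a safe character set before being spliced into a statement.
_SAFE_TENANT = re.compile(r"[a-z0-9_]+")


class PerTenantTables:
    """Hypothetical data access layer: one table per tenant, one API for callers."""

    def __init__(self, conn):
        self.conn = conn

    def _table(self, tenant):
        if not _SAFE_TENANT.fullmatch(tenant):
            raise ValueError(f"unsafe tenant name: {tenant!r}")
        return f"issues_{tenant}"

    def ensure_table(self, tenant):
        # Table creation is hidden from callers and is idempotent.
        self.conn.execute(
            f"CREATE TABLE IF NOT EXISTS {self._table(tenant)} "
            "(id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
        )

    def add_issue(self, tenant, title):
        self.ensure_table(tenant)
        self.conn.execute(
            f"INSERT INTO {self._table(tenant)} (title) VALUES (?)", (title,)
        )

    def list_issues(self, tenant):
        self.ensure_table(tenant)
        rows = self.conn.execute(
            f"SELECT title FROM {self._table(tenant)} ORDER BY id"
        )
        return [row[0] for row in rows]
```

Even at this size you can see where the effort goes: naming, creation, and validation all have to live in one place, and monitoring and migration tooling would need the same treatment.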