
My question was more about hard splits: what do you do when you have two DCs and they lose connectivity to each other?

The larger pool can still have a majority, but the smaller pool cannot. Does the smaller pool change membership (because it can't see the larger pool) and continue on its own? What if the smaller pool does a cold start and has no idea about the larger pool?

My point is that there needs to be some consensus on pool size to decide what constitutes a majority vote.

Implementations can make reasonable assumptions and rules for this, but afaict it's not covered by the protocol itself.
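To make the point concrete, here's a minimal sketch (hypothetical helper, not from any particular implementation) of why a quorum check has to be computed against the agreed configured cluster size rather than against whichever peers a node can currently see:

```python
# Hypothetical sketch: quorum must be judged against the configured
# cluster size, not the set of currently reachable peers.

def has_quorum(configured_size: int, reachable_votes: int) -> bool:
    """True only if a strict majority of the configured membership
    (the node itself included) is reachable."""
    return reachable_votes > configured_size // 2

# A 5-node cluster split 3/2 between two DCs:
print(has_quorum(5, 3))  # larger pool: True
print(has_quorum(5, 2))  # smaller pool: False

# If the smaller pool instead recomputed "cluster size" from what it
# can see, it would grant itself a bogus quorum -- split brain:
print(has_quorum(2, 2))  # True, which is exactly the failure mode
```

This is why the configured membership itself needs to be part of the consensus state, which is the gap the comment above is pointing at.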



If you lose connectivity, the non-majority pool will just keep trying to reconnect to the larger pool, which happily continues on since it has a majority. The non-majority pool will not change membership on its own: membership changes require consensus among the original cluster participants, which the non-majority pool cannot achieve. If connectivity is re-established, the protocol gracefully brings those nodes up to speed on any transactions they missed. If the majority pool meanwhile changed membership to kick out the non-majority nodes, they'll simply be ignored when they reconnect.
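A sketch of that rule (assuming Raft-like semantics, where a membership change is itself a log entry that must be committed by a majority of the current configuration; the function names here are illustrative, not from any real library):

```python
# Assumed Raft-like rule: a configuration change is a log entry and
# commits only with votes from a majority of the *current* config,
# so a minority partition can never reconfigure itself.

def can_commit(current_config: set, reachable: set) -> bool:
    """Can the reachable nodes commit an entry under current_config?"""
    voters = current_config & reachable
    return len(voters) > len(current_config) // 2

config = {"a", "b", "c", "d", "e"}
minority = {"d", "e"}           # the partitioned pair
majority = {"a", "b", "c"}

print(can_commit(config, minority))  # False: cannot adopt a new config
print(can_commit(config, majority))  # True: may commit, e.g. removing d and e
```

Once the majority side commits a config that drops d and e, any later messages from those nodes under the old config are rejected, which is the "they'll just be ignored" behavior described above.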


A two-datacenter configuration doesn't give you much if you want to survive a temporary outage of either datacenter. Observe that if you only wanted to survive the loss of the DC holding the minority of nodes, you could just as well place all the nodes in the other DC. This is one of the reasons AWS tries to have three availability zones per region. In practice things are messier than this, because DCs don't always fail as a unit.
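The arithmetic behind that can be sketched quickly (back-of-envelope check under an assumed crash-stop model; the helper is hypothetical):

```python
# With n voters, a majority-quorum system tolerates floor((n-1)/2)
# node losses -- but if a single DC holds a majority of the voters,
# losing that DC loses quorum entirely.

def survives_any_dc_loss(placement: list) -> bool:
    """placement[i] = number of voters in DC i.
    True if the cluster keeps quorum after losing any single DC."""
    n = sum(placement)
    return all((n - dc) > n // 2 for dc in placement)

print(survives_any_dc_loss([3, 2]))     # two DCs: False (losing the 3-node DC kills quorum)
print(survives_any_dc_loss([2, 2, 1]))  # three DCs: True
```

No way of splitting voters across two DCs passes this check, which is why three zones is the usual minimum.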



