That is a much stronger guarantee than your documentation currently claims. One site falling over and being rebuilt without loss is great. One site losing power, corrupting the local state, then propagating that corruption to the rest of the cluster would not be fine. Different behaviours.
I think this is one where the behaviour is obvious to you but not to people first running across the project. In particular, it would help to spell out whether power loss could cause any of:
- you lose whatever writes to S3 haven't finished yet, if any
- the local node will need to repair itself a bit after rebooting
- the local node is now trashed and will have to copy all data back over
- all the nodes are now trashed and it's restore from backup time
I've been kicking the tyres for a bit and I think it's the happy case in the above, but lots of software out there completely falls apart on crashes, so it's not generally a safe assumption. For comparison, I think SQLite on ZFS doesn't care about having the power cable pulled out, while LMDB is a bit further down the list.