Am I misreading or did he create a non-redundant pool spanning the 2 SSD drives? I don't think scrubbing will keep him from losing data when one of those drives fails or gets corrupted.
Edit: Looked again and he's getting redundancy by running multiple NASes and rsyncing between them. Still seems like a risky setup though.
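For reference, the gap between the two layouts is a single keyword at pool creation time; a quick sketch with hypothetical device names:

    # striped across both SSDs, no redundancy: losing either drive loses the pool
    zpool create tank /dev/sda /dev/sdb

    # mirrored: either SSD can fail, and a scrub can actually repair bad blocks
    zpool create tank mirror /dev/sda /dev/sdb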
In the just-released 2.2.0 you can correct blocks from a remote backup copy:
> Corrective "zfs receive" (#9372) - A new type of zfs receive which can be used to heal corrupted data in filesystems, snapshots, and clones when a replica of the data already exists in the form of a backup send stream.
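Assuming the release notes describe it the way I read them, healing from the replica on the other NAS would look roughly like this (pool and dataset names made up):

    # on the backup NAS: regenerate a send stream for the affected snapshot
    zfs send backup/data@snap > /tmp/data-snap.zstream

    # on the damaged pool: corrective receive heals the bad blocks in place
    zfs receive -c tank/data@snap < /tmp/data-snap.zstream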
> I don't think scrubbing will keep him from losing data when one of those drives fails or gets corrupted.
It pains me to see a ZFS pool with no redundancy because instead of being "blissfully" ignorant of bit rot, you'll be alerted to its presence and then have to attempt to recover the files manually via the replica on the other NAS.
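Concretely, that manual recovery would look something like this (paths hypothetical):

    # a scrub flags the damaged files; -v lists their paths under "errors:"
    zpool scrub tank
    zpool status -v tank

    # then pull clean copies of just those files back from the replica NAS
    rsync -a othernas:/tank/photos/IMG_1234.jpg /tank/photos/IMG_1234.jpg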
I appreciate that the author recognizes what his goals are, and that pool level redundancy is not one of them, but my goals are very different.
In my setup I combine one machine running ZFS with another running btrfs, syncing between them with rsync/restic (see the sketch below). The Apple devices use APFS. I'd rather use ZFS everywhere (or, in future, bcachefs), but Apple went with their own next-gen CoW filesystem, and I don't think ZFS is available for Synology DSM. Though perhaps I should simply replace DSM with something more to my liking (Debian-based, Proxmox).
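The sync itself is nothing fancy; roughly this kind of thing, with the repo location and hostnames being placeholders:

    # back up the ZFS box's data into a restic repo hosted on the btrfs box
    restic -r sftp:backup@nas2:/srv/restic backup /tank/data

    # and periodically verify the repo's integrity
    restic -r sftp:backup@nas2:/srv/restic check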
On the other hand, I use a similar approach myself. For home use I prefer redundancy across several cheaper machines without RAID 1/10/5/6/Z over a more expensive single machine with disk redundancy. And I'd rather spend the extra money on ECC RAM for all the machines.
For home use this is a reasonable tradeoff. Imagine my non-tech-savvy wife for some reason has to get at the data when one of the NASes has malfunctioned because the ZFS pool encryption was badly configured. Explaining "zfs receive", let alone what ZFS, Linux, and SSH are, is going to be grounds for divorce. Heck, I don't even want to read ZFS man pages myself on weekends; I have enough of those problems at work. Besides, you still want two physical locations.
It's going to be less optimal and less professional. That's OK: for something as important as backups, keep it boring. Simply starting up the second box is simple, stupid, and has well-understood failure modes. Maybe someone like me should just buy an off-the-shelf NAS.
That seems like a pretty reasonable plan, actually. If I ever build a NAS I was thinking of having one disk for storing files and a pair of disks for backing up both that first disk and my laptop.
That way everything has 3 copies, but I'm not backing up a compressed backup that might not deduplicate well, and 4 copies would be a little excessive anyway.
It wasn't mentioned in the blog, but you can set `copies=N` on ZFS filesystems and ZFS will keep N copies of user data. This provides redundancy that protects against bitrot or minor corruption, even if your zpool isn't mirrored. Of course, it provides no protection against complete drive failure.
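Something like this, with the caveat that it only applies to data written after the property is set (dataset name made up):

    # keep two copies of every block in this dataset (roughly doubles space usage)
    zfs set copies=2 tank/photos
    zfs get copies tank/photos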
Clustered filesystems are old. Red Hat at some point took over the company behind GFS; that's perhaps worth looking into. I've also seen people in the edge-computing space go for Longhorn instead of Ceph, but I don't know anything about it apart from the fact that Ceph is available in Proxmox.
My homelab is a Turing Pi 2 running RPi4 compute modules and an Nvidia Jetson. Each also has a small SSD (one native, two via mini-PCIe to SATA, and one USB). They could be used with k8s or a clustered filesystem, but I haven't played with that yet.
> ceph is never the answer. Its a nice idea, but in practice its just not that useful.
Depends. At my last job we used it for our OpenStack block/object store and it was performant enough. When we started it was HDD+SSD, but by the time I left the plan was to go all-NVMe (once BlueStore became a thing).
My understanding after reading and testing is that we're talking at least a two-orders-of-magnitude performance difference unless you have a lot of disks (certainly more than the 6 I tried with) and quite beefy hardware.
Haven't tried all-SSD cluster yet though, only spinning rust.