Hacker News

I have a script that creates a single hash based on all the files in a directory ("photos 2004"), then saves the hash separately to a text file.

I have 3 copies, so I can check the archive version, the active storage volume, and the local version to see if any of them lost integrity in the transfer process.
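A minimal sketch of one way such a script might work (the poster's actual script is not shown, so the details here are assumptions): walk the directory tree in a deterministic order and feed every file's relative path and contents into a single SHA-256.

```python
import hashlib
import os

def tree_hash(root):
    """Return one hex digest covering every file under root."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the walk order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Include the relative path so renames also change the hash.
            h.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
    return h.hexdigest()

# Save the digest next to the archive, then recompute it on each copy
# later and compare; the directory name here is hypothetical:
# open("photos-2004.sha256", "w").write(tree_hash("photos 2004"))
```

Any copy whose recomputed digest differs from the saved one lost integrity somewhere in transfer or storage.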

I’m curious how it would compare against my old CDs and DVDs that were previous backups. My work does something similar for tape drive data.



If you are willing to sacrifice some storage space on the disk, dvdisaster (https://en.wikipedia.org/wiki/Dvdisaster) can add extra ECC data that allows recovery even if some percentage of the disk errors out on a later read.

Granted, if one no longer has the mechanical drive, or if the disk errors out beyond the threshold where the extra ECC can correct the errors, the data's still lost. But it (dvdisaster) does provide some protection from the "bit-rot" case where the disk slowly degrades.


Par2 is also very good for resilient storage. It uses parity files that can reconstruct bitrotted files. https://en.wikipedia.org/wiki/Parchive
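Par2 actually uses Reed-Solomon codes, but a toy single-parity-block example shows the core idea: redundancy computed ahead of time can rebuild a damaged block later. (This XOR scheme is a simplified stand-in, not what Par2 really does.)

```python
def make_parity(blocks):
    """XOR equal-length data blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def recover(blocks_with_one_missing, parity):
    """Rebuild the single missing (None) block from the survivors + parity."""
    missing = bytearray(parity)
    for block in blocks_with_one_missing:
        if block is not None:
            for i, byte in enumerate(block):
                missing[i] ^= byte
    return bytes(missing)

blocks = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(blocks)
# If any one block bit-rots, the other blocks plus the parity restore it:
print(recover([b"aaaa", None, b"cccc"], parity))  # → b'bbbb'
```

With one parity block you can lose any single data block; Par2 generalizes this with Reed-Solomon so that N recovery blocks can repair up to N damaged blocks.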


DVDs use Reed–Solomon coding, so the format already stores error-detection and recovery data for you. When a sector is damaged beyond what the code can correct, reading that sector fails.


For this purpose, I think it would be nice to access the raw data, to see any errors that would otherwise be masked. As someone in the comments suggested, one might compare the number of corrected errors at 1, 2, and 5 years against the number of redundant bits stored, to estimate the expected longevity of the medium.
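A back-of-envelope sketch of that estimate, with entirely invented numbers: fit a linear trend to the corrected-error counts measured over the years, and extrapolate to when the error rate would exceed the code's correction capacity.

```python
def years_until_exhausted(measurements, capacity):
    """measurements: [(years_elapsed, corrected_errors)], assumed to grow
    linearly; capacity: total errors the ECC can correct (hypothetical)."""
    (t0, e0), (t1, e1) = measurements[0], measurements[-1]
    rate = (e1 - e0) / (t1 - t0)        # corrected errors per year
    return t1 + (capacity - e1) / rate  # years from burn until capacity is hit

# e.g. 120 corrected errors after 1 year, 600 after 5 years, against a
# made-up correction budget of 3000 errors:
print(years_until_exhausted([(1, 120), (5, 600)], 3000))  # → 25.0
```

Real media degradation is unlikely to be linear, so this is only the shape of the calculation, not a trustworthy lifetime prediction.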


dvdisaster might already be able to do this analysis.




