Handling the transfer of 25TB of data
May 24, 2008 at 10:51 pm
So, your RAID becomes slightly unreliable, you’ve spent 2 months using rsync to transfer 25 million files off the failing array, and having rebuilt it, you want to transfer those files back, faster this time. How do you go about it?
The above is a problem that I’m currently dealing with, and I’ve found a hacky yet elegant solution to it. I’m a huge believer in lftp: it’s a brilliant piece of software that lends itself to many different situations. lftp can also set permissions and ownership on the files it creates, which is a key feature in handling the above problem. On top of that, it parallelizes transfers particularly well, so small files aren’t stalled behind large multi-gigabyte files that transfer slowly.
Unfortunately, I discovered that lftp doesn’t handle directory ownership/permissions _at all_. My original idea was just to set off lftp, walk away, and in a few days marvel at the 25TB that had been transferred; however, I really needed to maintain the permissions across the board. So rsync came into play again:
rsync -avz --stats --progress --include "*/" --exclude "*" ip.address.goes.here:/path/to/files/* /path/
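The above command creates just the directories that exist on the other end and skips all files: the include rule matches every directory, and the exclude rule drops everything else. Because of -a, the directories keep their permissions and (when run as root) their ownership. This sets up the structure that I need for lftp to work.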
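To see how the two filter rules produce a directory-only copy, here is a minimal sketch that applies the same --include/--exclude pair to a local tree instead of a remote host. The helper names copy_dirs_only and list_dirs are hypothetical, made up for this illustration, and the compression and progress flags are left out since they don’t affect the filtering.

```shell
#!/bin/sh
# Sketch only: copy_dirs_only and list_dirs are hypothetical helper names,
# and local paths stand in for the remote host used in the post.

# Copy only the directory skeleton of $1 into $2.
# --include "*/" matches every directory, so rsync keeps recursing;
# --exclude "*" then drops everything else, i.e. all files.
copy_dirs_only() {
    rsync -a --include "*/" --exclude "*" "$1/" "$2/"
}

# Print the directories under $1 as sorted relative paths,
# which makes two trees easy to compare.
list_dirs() {
    ( cd "$1" && find . -type d | LC_ALL=C sort )
}
```

Running copy_dirs_only on a small test tree and comparing the list_dirs output of source and destination is a cheap way to confirm the skeleton is complete before starting the multi-day file transfer.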
Once that’s completed, it’s a simple case of:
lftp sftp://root@ip.address.here -e "mirror -c --parallel=50 --allow-chown --allow-suid /path/to/files ./"
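If you want to experiment with the degree of parallelism, the mirror command can be assembled from a few variables. The following is a rough sketch, not a tested transfer script: mirror_cmd is a hypothetical helper name, the host and paths are the same placeholders as above, and it assumes paths without spaces.

```shell
#!/bin/sh
# Sketch only: mirror_cmd is a hypothetical helper that builds the
# lftp mirror command string used in the post.
# $1 = remote source directory, $2 = local target, $3 = parallel transfers
mirror_cmd() {
    printf 'mirror -c --parallel=%s --allow-chown --allow-suid %s %s' "$3" "$1" "$2"
}

# Example invocation (commented out; host and paths are placeholders).
# The trailing "quit" makes lftp exit once the mirror finishes,
# since -e otherwise leaves you at the lftp prompt.
# lftp sftp://root@ip.address.here -e "$(mirror_cmd /path/to/files ./ 50); quit"
```

Keeping the parallel count as a parameter makes it easy to start with a lower value and work up if the server struggles with 50 simultaneous transfers.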