Data backup thread

Stadtaffe

Woodpecker
Orthodox
Gold Member
About a week ago I did something I've been meaning to do for a while - synced my whole linux user folder to an encrypted usb hard drive in a clean way. I don't bother with the system partition as all that stuff is replaceable, but what is now backed up is very valuable and it could avert disaster having it there now. This is also relevant as many of us don't want to use Google or any other clouds.

I thought a thread here might help us encourage and inspire each other to back up as it is easy to put off. Certainly the mechanical hard drives are capable of mechanical failure, not sure if the SSD's can suffer a similar fate, but there's always the possibility of theft or loss and in linux, you can accidentally just type in the wrong command and it's all gone.

When I do a --dry-run and the first sync, I will post the experience of that.

About 6 months ago I accidentally deleted a file. Realised immediately, and was able to recover it with a special command, although it took a while. Same thing happened a few weeks ago, but didn't realise for a while and it was gone for good. So that was the inspiration to get on with this neglected but important task.

So I was going past an electronics store and called in to get an additional drive. There was a choice between 1, 2, 3, 5 or 7 TB. My user partition is 400GB but I decided to go with 2TB so there is a bit spare for things like the monero blockchain and who knows what else. It was about €70. The larger sizes were also physically fatter, as well as more expensive. I imagine they are for people who like to keep a collection of full length movies.

I'm not proud to say that this was the first time I ever created an encrypted folder in linux. Just one big partition on the drive, possibly safer than several. Used this guide:


The important commands are:

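# Install the LUKS tooling
sudo apt install cryptsetup
# Encrypt the partition (this wipes whatever is on /dev/sdb1)
sudo cryptsetup --verbose --verify-passphrase luksFormat /dev/sdb1
# Unlock it as /dev/mapper/sdb1
sudo cryptsetup luksOpen /dev/sdb1 sdb1
# Create an ext4 filesystem inside the unlocked volume
sudo mkfs.ext4 /dev/mapper/sdb1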

Actually, the thing mounts automatically and asks for the password in the gui.
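
For sessions where the GUI prompt is not available, the unlock and mount steps can also be done by hand. A minimal sketch, assuming the same /dev/sdb1 partition as above and a hypothetical mount point /mnt/backup; both will differ per machine, so it is worth checking the device name with lsblk first. The function and variable names are made up for illustration.

```shell
# Sketch only: DEVICE, MAPPER_NAME and MOUNT_POINT are assumptions, adjust to your drive.
DEVICE="${DEVICE:-/dev/sdb1}"
MAPPER_NAME="${MAPPER_NAME:-sdb1}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/backup}"

# Unlock the LUKS partition (asks for the passphrase) and mount the filesystem inside it.
open_backup_drive() {
    sudo cryptsetup luksOpen "$DEVICE" "$MAPPER_NAME" &&
    sudo mkdir -p "$MOUNT_POINT" &&
    sudo mount "/dev/mapper/$MAPPER_NAME" "$MOUNT_POINT"
}

# Unmount and lock the partition again before unplugging the drive.
close_backup_drive() {
    sudo umount "$MOUNT_POINT" &&
    sudo cryptsetup luksClose "$MAPPER_NAME"
}
```

Run open_backup_drive, do the sync, then close_backup_drive.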

Then, I knew rsync was probably going to be the tool of choice. It is not the only option, but it is the one I have used a fair bit before for smaller syncs, though never for the whole user folder. There are a lot of customisable options with rsync, for example whether it decides what to copy from file size and modification time alone or compares checksums as well. There's a lot, and I don't want to get stressed out by trying to know what all of it is. If anyone reading is an expert with it, share any thoughts. Used this guide:


..and adapted the command slightly to:
time rsync -aAXv --stats --progress /home/username/ /media/username/6b3ce0e5-491c-43d0-b15b-b81731e19721/my-PC-user-folder-synced/ && date

..that way it times itself and prints the date after, you can copy the last bit of the output to a logfile, could even pipe it there if you cared enough.
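
For anyone who does care enough to log it automatically, something along these lines might do. It is only a sketch: the logfile location and the function names preview_home and backup_home are hypothetical, and the rsync flags are the ones from the command above, with --dry-run added for the preview.

```shell
# Sketch only: LOGFILE, preview_home and backup_home are hypothetical names.
LOGFILE="${LOGFILE:-$HOME/backup.log}"

# Preview: list what rsync would transfer without changing anything.
preview_home() {
    rsync -aAXv --dry-run "$1" "$2"
}

# Real run: sync, then append the statistics and a timestamp to the logfile.
backup_home() {
    {
        echo "=== sync started: $(date) ==="
        rsync -aAX --stats "$1" "$2"
        status=$?
        echo "=== sync finished: $(date), rsync exit status $status ==="
    } 2>&1 | tee -a "$LOGFILE"
}
```

Usage would be: backup_home /home/username/ /media/username/6b3ce0e5-491c-43d0-b15b-b81731e19721/my-PC-user-folder-synced/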

It went quicker than expected - just under 400GB took two hours. I thought it would go all night..

In keeping with the old fashioned, analogue way of doing things (no clouds), there happens to be a mechanical-only, key-operated safe built into the wall where I'm staying, and the drive is in there now, locked up. Who knows if it will be safe if the place burns down, but anyway it's an improvement. I used to work with someone who did this, and he had two drives in two separate locations for redundancy.
 

aynrus

Pelican
I need stuff to be decryptable/encryptable on both Linux and Windows, so I used the VeraCrypt GUI tool to encrypt. cryptsetup is for Linux only.
I'm having too many headaches from Linux. I'm traveling all the time right now, so I need quick working functionality without becoming a system administrator and digging through pages of debugging threads on stackexchange. Stuff definitely doesn't work out of the box much.
 

Stadtaffe

Woodpecker
Orthodox
Gold Member
I just reran the rsync command from my first post here and then did some tests. Basically I wanted to be sure about what it is doing so as not to fool myself, and also because I find the rsync manual too complicated and lack the patience to sit down with it. Can confirm the following about the behaviour of that command:
  • It seems to be additive in every sense, in that it never deletes anything in the destination folder. It does update things there however. Exactly what rule it uses I am not sure, but the important thing is that it is additive which is what I wanted.
  • So just to check the above, if you delete a file in the source folder, it will not delete in the destination.
  • If you change the name of a file in the source folder, the file shows up under the new name in the destination folder, but the copy under the old name stays there too, since nothing gets deleted.
  • If you change the contents of a file in the source folder, the contents of the file in the destination folder will change.
  • If you have done a cp on a file in the first folder to somewhere else, it will be synced in both places in the destination folder. Not sure what would happen if you did a mv in the source folder, I didn't test it, but guess it will also be in both places as the command does not seem to remove anything.
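
Checks like the ones above can be repeated safely in a throwaway sandbox rather than on real data. A minimal sketch, assuming rsync is installed; sandbox_check is a hypothetical helper name and the file names are made up. It only uses plain -a, not the full command from the first post.

```shell
# Sketch only: everything happens inside a temporary directory that is removed at the end.
sandbox_check() {
    work="$(mktemp -d)"
    mkdir -p "$work/src" "$work/dest"
    echo "one"   > "$work/src/keep.txt"
    echo "two"   > "$work/src/old-name.txt"
    echo "three" > "$work/src/doomed.txt"

    rsync -a "$work/src/" "$work/dest/"                     # first sync

    rm "$work/src/doomed.txt"                               # delete a file in the source
    mv "$work/src/old-name.txt" "$work/src/new-name.txt"    # rename a file in the source

    rsync -a "$work/src/" "$work/dest/"                     # second sync, no --delete

    ls "$work/dest"                                         # show what the destination holds now
    rm -rf "$work"
}
```

Calling sandbox_check should list doomed.txt, keep.txt, new-name.txt and old-name.txt, i.e. the deleted file and the old name both survive in the destination because the sync is run without --delete.
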
That is all rather boring, but something a bit more interesting prompted me to post here. Just discovered the fdupes command. Install with :
Code:
sudo apt install fdupes
Test it on the documents directory with
Code:
fdupes -r ~/Documents
Display manual with
Code:
man fdupes
Homepage is https://github.com/adrianlopezroche/fdupes

It is quite unbelievable. It compares file sizes and MD5 hashes (followed by a byte-by-byte check) and tells you what is duplicated, even under different names. It can delete duplicates or replace them by links. It took under five minutes to go through 400 GB and there were a lot of duplicates.
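
To get a feel for the idea of finding duplicates by content rather than by name, here is a rough illustration with standard GNU tools. It is not how fdupes is implemented, it is much slower on large trees, and list_duplicates is a hypothetical helper name. It assumes GNU md5sum and GNU uniq are available.

```shell
# Sketch only: group files by MD5 hash and print the groups that have more than one member.
list_duplicates() {
    # Hash every regular file, sort by hash, then keep only lines whose
    # first 32 characters (the MD5 hash) occur more than once.
    find "$1" -type f -exec md5sum {} + | sort | uniq -w32 --all-repeated=separate
}
```

For example, list_duplicates ~/Documents prints each set of identical files as a block of lines, with a blank line between sets.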

Have been looking for something like that. One day I will use it for real. The thing is, I organise my hard drive as a human, and just as you can clutter up your apartment, you can clutter up your hard drive, but also with redundant copies of stuff. I think the people who use Google Drive, which I have never used, probably have tools that help them with that, but if it is self-maintained, this will help.

The thing is, the hard drive fills up, so you think the best bet is to just go buy a bigger one. I have done that. But if you were not careful you could end up with a pile of them and they would effectively be filled mostly with rocks and sand, i.e. dead weight. You don't know anymore which one really matters..
 

soli.deo.gloria

Woodpecker
Orthodox Inquirer
Gold Member
Stadtaffe said: "I thought a thread here might help us encourage and inspire each other to back up as it is easy to put off. Certainly the mechanical hard drives are capable of mechanical failure, not sure if the SSD's can suffer a similar fate, but there's always the possibility of theft or loss and in linux, you can accidentally just type in the wrong command and it's all gone."
An interesting thing about SSDs that many people may not know is that they can lose data if left unpowered for long periods of time (and this can happen MUCH faster if the temperature is high, in as little as 7 days). It may even make sense to store them in the fridge when not in use (I'd suggest inside a heavy-duty ziploc bag).

 

Mike0060

Sparrow
I back up my laptop's HDD daily.

I also have local network storage running on my old desktop, with an SSD boot drive and 2 HDDs for storage. I back up to this daily.

I'm also using sync.com for cloud backup. (People can whine about cloud storage privacy, but I think they're adequate.)

The setup is a bit of overkill, but once you set up the process and app settings for everything, you don't have to think about anything. It's automatic.

I could learn about encrypting my files but that's a lesson for another day I guess...
 

Valentine

Kingfisher
Gold Member
rsync is superb but rclone is better suited for off-site backup because it is compatible with many cheap cloud storage hosts such as Amazon S3, Google Drive, Mega, OneDrive, Sia, etc. Just set up the crypt backend first so that your files are encrypted client-side before they're uploaded, which keeps them private.
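
As a rough sketch of what that looks like in practice: the commands below assume a crypt remote has already been created interactively with rclone config. The remote name "secret", the path "home-backup" and the function names are hypothetical.

```shell
# Sketch only: assumes a crypt remote named "secret" already exists in the rclone config.
CRYPT_REMOTE="${CRYPT_REMOTE:-secret:home-backup}"

# Preview what would be uploaded, without transferring anything.
preview_offsite() {
    rclone sync "$1" "$CRYPT_REMOTE" --dry-run
}

# Upload through the crypt remote, so files are encrypted before they leave the machine.
sync_offsite() {
    rclone sync "$1" "$CRYPT_REMOTE" --progress
}
```

Note that rclone sync makes the destination match the source, which includes deleting files there; rclone copy is the variant that only adds and updates.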

However note that rsync and rclone are sync tools, not backup - i.e. they don't store a history of your deleted files. If you want to be able to retrieve a file, folder or your entire disk state from a specific date 3 months ago then you'll need a backup tool instead. borg and restic are great choices for this.
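
To illustrate the difference, a snapshot-style workflow with restic might look roughly like this. It is a sketch under assumptions: the repository path and the function names are hypothetical, and restic will ask for the repository password on each call.

```shell
# Sketch only: REPO is a hypothetical repository location on the backup drive.
REPO="${REPO:-/media/username/backup-drive/restic-repo}"

# One-time setup: create the encrypted repository.
snapshot_init() { restic -r "$REPO" init; }

# Take a new snapshot of the home folder; earlier snapshots are kept.
snapshot_home() { restic -r "$REPO" backup "$HOME"; }

# Show the history of snapshots with their IDs and dates.
snapshot_list() { restic -r "$REPO" snapshots; }

# Restore a given snapshot ID (or "latest") into a target directory.
snapshot_restore() { restic -r "$REPO" restore "$1" --target "$2"; }
```

Because every run of snapshot_home adds a snapshot instead of overwriting the previous state, a file deleted last month can still be pulled out of an older snapshot with snapshot_restore.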
 

Stadtaffe

Woodpecker
Orthodox
Gold Member
@Valentine thanks for the commands, had not heard of crypt, borg, restic, rclone. Yes, probably I have been doing a sync rather than a backup, except that as mentioned above the command is additive. Probably not optimal but better than nothing and does the job.

There is a political aspect to backing up your data. OneDrive is Microsoft, you have listed all the big tech clouds. Even if you put something encrypted there, it is probably helping them, giving them a chance to play with it and try to crack it. My data is on a hard drive in a safe. Ideally there should be two copies of it in two separate buildings for redundancy in case one burns down.

If it were going to go in a cloud and I had to choose would probably go with :
or
 
I don’t encrypt my local data myself. I don’t really see the need for it unless I individually were to be pursued by government agencies. No encryption also makes any data recovery from failing hardware easier.

I have my Synology DiskStation, which auto-syncs any changes in my home folder to the DiskStation. The critical data on the DiskStation is in turn automatically and periodically backed up to an external USB HDD.

This way I have four copies of my data at any time: two laptops, the DiskStation and the external HDD. I’m sure this could be done better, but it has been working for me for six or seven years. I have had a laptop HDD become corrupted and a lightning strike take out my Synology box; the HDDs were fine though.
 