Finally have btrfs setup in RAID1

A little under 3 years ago, I started exploring btrfs for its ability to help me limit data loss. Since then I’ve implemented a snapshot script to take advantage of the Copy-on-Write features of btrfs. But I hadn’t yet had the funds or the PC case space to do RAID1. I was finally able to implement it for my photography hard drive. This means that, together with regular scrubs, there should be only a minuscule chance of bit rot ruining any photos it hasn’t already corrupted.
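
In case it’s useful, here’s roughly what I mean by regular scrubs: a cron entry that kicks off a monthly scrub of the photo filesystem. This is just a sketch; the schedule and the /usr/sbin/btrfs path are my own choices, so adjust them for your system.

# Run a scrub of the photo filesystem at 3am on the 1st of every month
# -B waits for the scrub to finish, -q keeps the output quiet
0 3 1 * * /usr/sbin/btrfs scrub start -Bq /media/Photos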

Here’s some documentation of the commands I ran and how I got the drives into RAID1:

 

Before RAID:

# btrfs fi df -h /media/Photos
Data, single: total=2.31TiB, used=2.31TiB
System, DUP: total=8.00MiB, used=272.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=3.50GiB, used=2.68GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

# btrfs fi usage /media/Photos
Overall:
    Device size:                   2.73TiB
    Device allocated:              2.32TiB
    Device unallocated:          423.48GiB
    Device missing:                  0.00B
    Used:                          2.31TiB
    Free (estimated):            425.29GiB      (min: 213.55GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 5.64MiB)

Data,single: Size:2.31TiB, Used:2.31TiB
   /dev/sdd1       2.31TiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sdd1       8.00MiB

Metadata,DUP: Size:3.50GiB, Used:2.68GiB
   /dev/sdd1       7.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sdd1       4.00MiB

System,DUP: Size:8.00MiB, Used:272.00KiB
   /dev/sdd1      16.00MiB

Unallocated:
   /dev/sdd1     423.48GiB

   
[root@supermario ~]# btrfs device add /dev/sda1 /media/Photos/
/dev/sda1 appears to contain an existing filesystem (btrfs).
ERROR: use the -f option to force overwrite of /dev/sda1
[root@supermario ~]# btrfs device add /dev/sda1 /media/Photos/ -f

[root@supermario ~]# btrfs fi usage /media/Photos
Overall:
    Device size:                   6.37TiB
    Device allocated:              2.32TiB
    Device unallocated:            4.05TiB
    Device missing:                  0.00B
    Used:                          2.31TiB
    Free (estimated):              4.05TiB      (min: 2.03TiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:2.31TiB, Used:2.31TiB
   /dev/sdd1       2.31TiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sdd1       8.00MiB

Metadata,DUP: Size:3.50GiB, Used:2.68GiB
   /dev/sdd1       7.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sdd1       4.00MiB

System,DUP: Size:8.00MiB, Used:272.00KiB
   /dev/sdd1      16.00MiB

Unallocated:
   /dev/sda1       3.64TiB
   /dev/sdd1     423.48GiB


[root@supermario ~]# btrfs balance start -dconvert=raid1 -mconvert=raid1 /media/Photos/

Done, had to relocate 2374 out of 2374 chunks
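
A conversion over 2+ TiB takes quite a while. While it runs, you can keep an eye on it from another terminal; a quick sketch, assuming the same mount point:

# Show how far the balance has gotten (chunks considered/relocated so far)
btrfs balance status /media/Photos/

# A running balance can also be paused and resumed if the machine is needed for something else
btrfs balance pause /media/Photos/
btrfs balance resume /media/Photos/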

Post-RAID:

[root@supermario ~]# btrfs fi usage /media/Photos
Overall:
    Device size:                   6.37TiB
    Device allocated:              4.63TiB
    Device unallocated:            1.73TiB
    Device missing:                  0.00B
    Used:                          4.62TiB
    Free (estimated):            891.01GiB      (min: 891.01GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,RAID1: Size:2.31TiB, Used:2.31TiB
   /dev/sda1       2.31TiB
   /dev/sdd1       2.31TiB

Metadata,RAID1: Size:7.00GiB, Used:2.56GiB
   /dev/sda1       7.00GiB
   /dev/sdd1       7.00GiB

System,RAID1: Size:64.00MiB, Used:368.00KiB
   /dev/sda1      64.00MiB
   /dev/sdd1      64.00MiB

Unallocated:
   /dev/sda1       1.32TiB
   /dev/sdd1     422.46GiB
   
   
[root@supermario ~]# btrfs fi df -h /media/Photos
Data, RAID1: total=2.31TiB, used=2.31TiB
System, RAID1: total=64.00MiB, used=368.00KiB
Metadata, RAID1: total=7.00GiB, used=2.56GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

And here’s the status of my first scrub to test out the commands:

[root@supermario ~]# btrfs scrub status /media/Photos/
scrub status for 27cc1330-c4e3-404f-98f6-f23becec76b5
 scrub started at Tue Mar 21 17:18:13 2017, running for 00:09:10
 total bytes scrubbed: 145.57GiB with 0 errors
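
For reference, the scrub itself is kicked off with one command from stock btrfs-progs (nothing from my snapshot script involved), and it can be stopped and picked back up later:

# Start a scrub of the mounted filesystem (runs in the background by default)
btrfs scrub start /media/Photos/

# Cancel a running scrub, or resume a cancelled/interrupted one where it left off
btrfs scrub cancel /media/Photos/
btrfs scrub resume /media/Photos/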

Profiting from Inefficiencies?

I went with Backblaze first because they were highly recommended by LifeHacker. Then I chose Crashplan for my main Linux computer because Backblaze doesn’t do Linux. Crashplan offers a family plan that covers 2-10 computers, but I only need to cover 2 computers (my laptops don’t have anything that needs backing up). Covering two computers on Crashplan is more expensive than doing one computer on Crashplan and one on Backblaze. So the less efficient and more complicated setup is the cheaper one; oh well.

Exploring btrfs for backups Part 6: Backup Drives and Changing RAID Levels in the VM

Hard drives are relatively cheap, especially nowadays. But I still want to stay within my budget as I set up my backups and system redundancies. So, ideally, for my backup RAID I’d take advantage of btrfs’ ability to change RAID types on the fly and start off with one drive. Then I’d add another and go to RAID1. Then another and RAID5. Finally, the fourth drive and RAID6. At that point I’d have to be under some sort of Job-like God/Devil curse for all my drives to fail at once, negating the point of the RAID. The best thinking right now is that you want to have backups, but you want to try not to have to use them, both because of the offline time and because a restore is never as clean as you hope it’ll be.
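
The general pattern I’m counting on for each of those steps is the same two commands, sketched here with a placeholder device name (the sdX1 partition and the /media/backup mount point are just examples):

# Add the new disk's partition to the existing btrfs filesystem
btrfs device add /dev/sdX1 /media/backup/

# Then rewrite the existing chunks into the new profile (raid1, raid5, or raid6)
btrfs balance start -dconvert=raid1 -mconvert=raid1 /media/backup/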

Let’s get started! I’ve added a third drive to my VM. Time to format and partition the drive. I do this with GParted. Interestingly, after the last post, GParted is a bit confused about what’s going on; again, btrfs isn’t exactly transparent about what it’s doing, especially once you have RAID set up. It shows the data as being stored on the second hard drive and nothing on the first one. The fact that sdb1 and sdb2 show up as unknown, empty file systems seems to go along with what I said yesterday: I don’t think the system was properly set up to be able to boot should the main hard drive die. So if you followed along with part 5, make sure you take care of that if you’re doing RAID on your boot hard drive.

OK, now that the partition is created, I create a mount point at /media/backup and mount the new filesystem there. Then I create a btrfs subvolume inside it.

$ sudo mount -t btrfs /dev/sdc1 /media/backup/
$ sudo btrfs subvolume create /media/backup/backups
Create subvolume '/media/backup/backups'
$ sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.83GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3
Label: 'Backup' uuid: 7042f4b7-9815-44f4-aef9-81103fc5855b
 Total devices 1 FS bytes used 208.00KiB
 devid 1 size 8.00GiB used 855.00MiB path /dev/sdc1

Looks like I’m in a good place. Just need to add this to fstab. Alright, everything should now be set to grow this into a RAID1. Time to shut off the VM to add another hard drive.
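
For reference, the fstab entry for this backup filesystem should look roughly like the line below. Only the UUID comes from the btrfs fi show output above; mounting the backups subvolume directly and sticking with plain defaults are my own assumptions.

UUID=7042f4b7-9815-44f4-aef9-81103fc5855b  /media/backup  btrfs  defaults,subvol=backups  0 0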

Alright, so far everything is working correctly. I created a couple of folders and a file to make sure the data survives intact. So, in yesterday’s post I said I wouldn’t be using the sfdisk trick, but I think it’s probably a great shortcut to make sure everything is right in case I do something weird in GParted.

$ sudo sfdisk -d /dev/sdc | sudo sfdisk /dev/sdd
sfdisk: Checking that no-one is using this disk right now ...
sfdisk: OK
Disk /dev/sdd: 1044 cylinders, 255 heads, 63 sectors/track
sfdisk: /dev/sdd: unrecognized partition table type
Old situation:
sfdisk: No partitions found
New situation:
Units: sectors of 512 bytes, counting from 0
Device Boot Start End #sectors Id System
/dev/sdd1 1 16777215 16777215 ee GPT
/dev/sdd2 0 - 0 0 Empty
/dev/sdd3 0 - 0 0 Empty
/dev/sdd4 0 - 0 0 Empty
sfdisk: Warning: partition 1 does not end at a cylinder boundary
sfdisk: Warning: no primary partition is marked bootable (active)
This does not matter for LILO, but the DOS MBR will not boot this disk.
Successfully wrote the new partition table
Re-reading the partition table ...
sfdisk: If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)

Looks good. Time to add it to btrfs. Strangely, it doesn’t appear to work; it complains there isn’t an sdd1. OK, let’s try GParted, then.

$ sudo btrfs device add -f /dev/sdd1 /media/backup/

I had to use -f because I’d put a btrfs partition on there – I probably should have selected unallocated or something. Let’s make sure this makes sense:

$ sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.84GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3
Label: 'Backup' uuid: 7042f4b7-9815-44f4-aef9-81103fc5855b
 Total devices 2 FS bytes used 208.00KiB
 devid 1 size 8.00GiB used 855.00MiB path /dev/sdc1
 devid 2 size 8.00GiB used 0.00 path /dev/sdd1

Looks good to me. Time to RAID1 it.

$ sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /media/backup/
Done, had to relocate 5 out of 5 chunks

It took 30 seconds because there wasn’t really anything to move.

$ sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.84GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3
Label: 'Backup' uuid: 7042f4b7-9815-44f4-aef9-81103fc5855b
 Total devices 2 FS bytes used 464.00KiB
 devid 1 size 8.00GiB used 1.28GiB path /dev/sdc1
 devid 2 size 8.00GiB used 1.28GiB path /dev/sdd1

And, checking the RAID levels:

$ sudo btrfs fi df /media/backup/
Data, RAID1: total=1.00GiB, used=320.00KiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=256.00MiB, used=128.00KiB
unknown, single: total=16.00MiB, used=0.00

Excellent! But this isn’t anything special over yesterday’s post. Now let’s add a third backup hard drive. This time I go straight for GParted and use a partition type of unformatted.

Now a quick check that my files are there:

$ tree /media/backup/
/media/backup/
├── test1
└── test2
 └── Iamintest2

Yup! Let’s keep going.

$ sudo btrfs device add /dev/sde1 /media/backup/

No errors this time. As always, a double-check.

$ sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.85GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3
Label: 'Backup' uuid: 7042f4b7-9815-44f4-aef9-81103fc5855b
 Total devices 3 FS bytes used 464.00KiB
 devid 1 size 8.00GiB used 1.28GiB path /dev/sdc1
 devid 2 size 8.00GiB used 1.28GiB path /dev/sdd1
 devid 3 size 8.00GiB used 0.00 path /dev/sde1

Perfect. Time for RAID5!

$ sudo btrfs balance start -dconvert=raid5 -mconvert=raid5 /media/backup/
Done, had to relocate 3 out of 3 chunks

Again, a quick finish (10-20 seconds) because there aren’t many files there. So, the checks:

$ sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.85GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3
Label: 'Backup' uuid: 7042f4b7-9815-44f4-aef9-81103fc5855b
 Total devices 3 FS bytes used 720.00KiB
 devid 1 size 8.00GiB used 1.16GiB path /dev/sdc1
 devid 2 size 8.00GiB used 1.16GiB path /dev/sdd1
 devid 3 size 8.00GiB used 1.16GiB path /dev/sde1

And RAID check:

$ sudo btrfs fi df /media/backup/
Data, RAID5: total=2.00GiB, used=576.00KiB
System, RAID5: total=64.00MiB, used=16.00KiB
Metadata, RAID5: total=256.00MiB, used=128.00KiB
unknown, single: total=16.00MiB, used=0.00

Alright. No issues expected with that. So let’s see if RAID6 is just as easy. I created another file to see if the RAID5 overhead caused any issues. None that I could see, but I’m not exactly running this on a critical database or something.

$ tree /media/backup/
/media/backup/
├── test1
│   └── hahaha
└── test2
 └── Iamintest2

OK, let’s get to it! GParted again. Add the device:

$ sudo btrfs device add /dev/sdf1 /media/backup/

And the usual checks:

$ sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.85GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3
Label: 'Backup' uuid: 7042f4b7-9815-44f4-aef9-81103fc5855b
 Total devices 4 FS bytes used 720.00KiB
 devid 1 size 8.00GiB used 1.16GiB path /dev/sdc1
 devid 2 size 8.00GiB used 1.16GiB path /dev/sdd1
 devid 3 size 8.00GiB used 1.16GiB path /dev/sde1
 devid 4 size 8.00GiB used 0.00 path /dev/sdf1

Looks fine. Also, I just realized the devids don’t do the CS thing of counting from 0. OK, moment of truth:

$ sudo btrfs balance start -dconvert=raid6 -mconvert=raid6 /media/backup/
Done, had to relocate 3 out of 3 chunks

Alright! Balance check:

$ sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.85GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3
Label: 'Backup' uuid: 7042f4b7-9815-44f4-aef9-81103fc5855b
 Total devices 4 FS bytes used 720.00KiB
 devid 1 size 8.00GiB used 1.16GiB path /dev/sdc1
 devid 2 size 8.00GiB used 1.16GiB path /dev/sdd1
 devid 3 size 8.00GiB used 1.16GiB path /dev/sde1
 devid 4 size 8.00GiB used 1.16GiB path /dev/sdf1

And RAID check:

$ sudo btrfs fi df /media/backup/
Data, RAID6: total=2.00GiB, used=576.00KiB
System, RAID6: total=64.00MiB, used=16.00KiB
Metadata, RAID6: total=256.00MiB, used=128.00KiB
unknown, single: total=16.00MiB, used=0.00

Perfect. So, it’s just that easy. This has been a great demonstration for me because it means I can buy my hard drives little by little instead of all at once. Sure, all at once is nicer in that you don’t have to spend hours rebalancing, but sometimes that’s just not an option, and it’s nice to know that btrfs can handle it on a live system. No offline time necessary.
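
The other live operation worth knowing about here is whole-drive replacement. If one of these disks ever starts dying, it can be swapped out with the filesystem still mounted; a sketch, where /dev/sde1 is the failing member and /dev/sdg1 is a hypothetical new partition at least as large:

# Copy everything from the failing device onto the new one and swap it into the filesystem
btrfs replace start /dev/sde1 /dev/sdg1 /media/backup/

# Check how far along the replacement is
btrfs replace status /media/backup/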

With my current grad school load, the next post will most likely be me converting my home subvolume on SuperMario to RAID1. Then, probably this winter, setting up my backup drive on SuperMario and setting up my Snap-in-Time scripts to send snapshots from the main system to the backup drive.  See ya then!

Exploring btrfs for backups Part 5: RAID1 on the Main Disks in the VM

So, back when I started this project, I laid out that one of the reasons I wanted to use btrfs on my home directory (I don’t think it’s ready for / just yet) is that with RAID1, btrfs is self-healing. Obviously, magic can’t be done, but a checksum is stored as part of the data’s metadata, and if the data doesn’t match the checksum on one disk but does on the other, the file can be fixed. This can help protect against bit rot, which is the biggest thing that’s going to keep our children’s digital photos from lasting as long as the ones printed on archival paper. So, like I did the first time, I’ll first be trying it out in a Fedora VM that mostly matches my version, kernel, and btrfs-progs version. So, I went and added another virtual hard drive of the same size to my VM.
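
A related command worth knowing (a sketch; point it at whatever btrfs mount you’re checking) is the per-device error counters, which is where the checksum failures btrfs catches show up:

# Per-device counters: read/write/flush I/O errors plus corruption and generation errors
btrfs device stats /home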

[Image: btrfs RAID1 with two hard drives in VirtualBox]

Next comes a part that I won’t be doing on my real machine, because on the non-VM system the drives I want to RAID1 don’t hold the root filesystem:

#>sudo sfdisk -d /dev/sda | sudo sfdisk /dev/sdb
[sudo] password for ermesa: [sudo] password for ermesa: 
sfdisk: Checking that no-one is using this disk right now ...
sfdisk: OK
Disk /dev/sdb: 1044 cylinders, 255 heads, 63 sectors/track
sfdisk: /dev/sdb: unrecognized partition table type
Old situation:
sfdisk: No partitions found
Sorry, try again.
New situation:
Units: sectors of 512 bytes, counting from 0
Device Boot Start End #sectors Id System
/dev/sdb1 * 2048 1026047 1024000 83 Linux
/dev/sdb2 1026048 2703359 1677312 82 Linux swap / Solaris
/dev/sdb3 2703360 16777215 14073856 83 Linux
/dev/sdb4 0 - 0 0 Empty
sfdisk: Warning: partition 1 does not end at a cylinder boundary
sfdisk: Warning: partition 2 does not start at a cylinder boundary
sfdisk: Warning: partition 2 does not end at a cylinder boundary
sfdisk: Warning: partition 3 does not start at a cylinder boundary
sfdisk: Warning: partition 3 does not end at a cylinder boundary
Successfully wrote the new partition table
Re-reading the partition table ...
sfdisk: If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)

OK, so now I have to install grub. Again, I wouldn’t do this on SuperMario, but since the VM has btrfs on the whole system, I’m going to do it here.

#>sudo grub2-install /dev/sdb
Installation finished. No error reported.

Excellent. Now the btrfs-specific parts.

#> sudo btrfs device add /dev/sdb1 /

Before the (hopefully) last step, let’s see what this gives us in the current btrfs filesystem:

 #>sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.82GiB
 devid 1 size 6.71GiB used 4.07GiB path /dev/sda3
 devid 2 size 500.00MiB used 0.00 path /dev/sdb1

Oh, this allowed me to catch something: it should have been sdb3, not sdb1. Let me see if deleting the device fixes things.

#>sudo btrfs device delete /dev/sdb1 /
#> sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 1 FS bytes used 2.82GiB
 devid 1 size 6.71GiB used 4.07GiB path /dev/sda3

OK, good. That appears to have put us back where we started. Let me try the correct parameters this time.

#>sudo btrfs device add /dev/sdb3 /
#> sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.82GiB
 devid 1 size 6.71GiB used 4.07GiB path /dev/sda3
 devid 2 size 6.71GiB used 0.00 path /dev/sdb3

Much better. See that both devices are the same size? Good. A df shows me that /boot is on sda1, so I’m not 100% convinced we’ll end up with a system that can boot no matter which hard drive fails. I’m not going to worry about that since on SuperMario I’ll just be doing a home drive, but you may want to check the documentation if you’re doing this for your boot hard drive as well. Time for the final command to turn it into a RAID 1:

#>sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /

That hammers the system for a while. I got the following error:

ERROR: error during balancing '/' - Read-only file system

I wonder what happened. And we can see that it truly was not balanced.

#>sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.88GiB
 devid 1 size 6.71GiB used 5.39GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.34GiB path /dev/sdb3

Hmm. Checking dmesg shows that it’s a systemd issue. I’ll reboot the VM in case the filesystem ended up in a weird state; it has definitely been acting a bit strange. The fact that it doesn’t want to reboot isn’t encouraging. Since it’s just a VM, I decide to go for a hard reset. When I tried to run the balance again, it said the operation was now in progress. I guess it saw that it wasn’t able to complete it last time? I’m not sure. If so, that’d be awesome. And maybe that’s why the reboot wouldn’t happen? But it gave me errors, so that’s a bit unintuitive if that’s what was going on.

Here’s what dmesg showed:

[ 224.975078] BTRFS info (device sda3): found 12404 extents
[ 243.538313] BTRFS info (device sda3): found 12404 extents
[ 244.061442] BTRFS info (device sda3): relocating block group 389611520 flags 1
[ 354.881373] BTRFS info (device sda3): found 14154 extents
[ 387.088152] BTRFS info (device sda3): found 14154 extents
[ 387.450010] BTRFS info (device sda3): relocating block group 29360128 flags 36
[ 404.492103] hrtimer: interrupt took 3106176 ns
[ 417.499860] BTRFS info (device sda3): found 8428 extents
[ 417.788591] BTRFS info (device sda3): relocating block group 20971520 flags 34
[ 418.079598] BTRFS info (device sda3): found 1 extents
[ 418.832913] BTRFS info (device sda3): relocating block group 12582912 flags 1
[ 421.570949] BTRFS info (device sda3): found 271 extents
[ 425.489926] BTRFS info (device sda3): found 271 extents
[ 426.188314] BTRFS info (device sda3): relocating block group 4194304 flags 4
[ 426.720475] BTRFS info (device sda3): relocating block group 0 flags 2

So it does look like it’s working on that. When it was done, I had:

#>sudo btrfs fi show
Label: 'fedora' uuid: e5d5f485-4ca8-4846-b8ad-c00ca8eacdd9
 Total devices 2 FS bytes used 2.83GiB
 devid 1 size 6.71GiB used 3.62GiB path /dev/sda3
 devid 2 size 6.71GiB used 3.62GiB path /dev/sdb3

It’s encouraging that the same space is taken up on each drive. But how to confirm it’s RAID1? The answer is found in btrfs’ version of df. btrfs needs its own version because, as of right now, a lot of what makes it an awesome COW filesystem means that the usual GNU programs don’t truly know how much free space you have. So, let’s try it:

#>sudo btrfs fi df /
Data, RAID1: total=3.34GiB, used=2.70GiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=256.00MiB, used=133.31MiB
unknown, single: total=48.00MiB, used=0.00

I’m slightly nervous about the “unknown” entry – but a quick Google shows that it’s no big deal.

3.15 has this commit, it's the cause of the unknown.  We'll roll the progs patch 
into the next progs release, but it's nothing at all to worry about.

-chris

Author: David Sterba <dsterba <at> suse.cz>
Date:   Fri Feb 7 14:34:12 2014 +0100

    btrfs: export global block reserve size as space_info

    Introduce a block group type bit for a global reserve and fill the space
    info for SPACE_INFO ioctl. This should replace the newly added ioctl
    (01e219e8069516cdb98594d417b8bb8d906ed30d) to get just the 'size' part
    of the global reserve, while the actual usage can be now visible in the
    'btrfs fi df' output during ENOSPC stress.

    The unpatched userspace tools will show the blockgroup as 'unknown'.

    CC: Jeff Mahoney <jeffm <at> suse.com>
    CC: Josef Bacik <jbacik <at> fb.com>
    Signed-off-by: David Sterba <dsterba <at> suse.cz>
    Signed-off-by: Chris Mason <clm <at> fb.com>

So, there you go: it’s relatively simple to set up RAID1 on a btrfs system. It took just under an hour, but it was only 3 GB to balance. Larger drives take longer (which is why RAID6 is better, as you can have another drive fail while you are balancing in your replacement drive). The best thing is that it all runs on a live system, so you don’t need to suffer being unable to use the computer while the balance runs. Again, if you’re doing this on your boot drive, use Google to confirm that /boot and all that is set up correctly, or you won’t quite have the redundancy protection you think you do. Next time’s going to get a bit interesting as I simulate what I want to do with my backup btrfs hard drives. After that it’ll either be more Snap-In-Time code or my live migration to RAID1 on my home btrfs subvolume.

Exploring btrfs for backups Part 4: Weekly Culls and Unit Testing

Back in August I finally had some time to do some things I’d been wanting to do with my Snap-in-Time btrfs program for a while now. First of all, I finally added the weekly code. So now my snapshots are cleaned up every three days and then every other week. Next on the docket is quarterly cleanups followed by yearly cleanups. Second, the big thing I’d wanted to do for a while now: come up with unit tests! Much more robust than my debug code and testing scripts, they helped me find corner cases. If you look at my git logs you can see that they helped me little by little figure out just what I needed to do, as well as when my “fixes” broke other things. Yay! My first personal project with regression testing!

A small note: to accommodate the unit testing, I had to change the file name – so the one you want to use is the one without dashes. I am not 100% sure how to get rid of the old file without losing commit history, but I think it’s not too big of a deal for now.

If I can get my way, the next update will be when I set up the self-healing RAID 1, followed by setting up the backup hard drive and btrfs send/receive.

Exploring btrfs for backups Part 3: The Script in Practice

Night of the second day:

# btrfs sub list /home
ID 275 gen 3201 top level 5 path home
ID 1021 gen 3193 top level 275 path .snapshots
ID 1023 gen 1653 top level 275 path .snapshots/2014-03-13-2146
ID 1024 gen 1697 top level 275 path .snapshots/2014-03-13-2210
ID 1025 gen 1775 top level 275 path .snapshots/2014-03-13-2300
ID 1027 gen 1876 top level 275 path .snapshots/2014-03-14-0000
ID 1028 gen 1961 top level 275 path .snapshots/2014-03-14-0100
ID 1029 gen 2032 top level 275 path .snapshots/2014-03-14-0200
ID 1030 gen 2105 top level 275 path .snapshots/2014-03-14-0300
ID 1031 gen 2211 top level 275 path .snapshots/2014-03-14-0400
ID 1032 gen 2284 top level 275 path .snapshots/2014-03-14-0500
ID 1033 gen 2357 top level 275 path .snapshots/2014-03-14-0600
ID 1035 gen 2430 top level 275 path .snapshots/2014-03-14-0700
ID 1036 gen 2506 top level 275 path .snapshots/2014-03-14-0800
ID 1037 gen 2587 top level 275 path .snapshots/2014-03-14-0900
ID 1038 gen 2667 top level 275 path .snapshots/2014-03-14-1700
ID 1039 gen 2774 top level 275 path .snapshots/2014-03-14-1800
ID 1040 gen 2879 top level 275 path .snapshots/2014-03-14-1900
ID 1041 gen 2982 top level 275 path .snapshots/2014-03-14-2000
ID 1042 gen 3088 top level 275 path .snapshots/2014-03-14-2100
ID 1043 gen 3193 top level 275 path .snapshots/2014-03-14-2200

Morning of the third day:

# btrfs sub list /home
ID 275 gen 4602 top level 5 path home
ID 1021 gen 4558 top level 275 path .snapshots
ID 1025 gen 1775 top level 275 path .snapshots/2014-03-13-2300
ID 1027 gen 1876 top level 275 path .snapshots/2014-03-14-0000
ID 1028 gen 1961 top level 275 path .snapshots/2014-03-14-0100
ID 1029 gen 2032 top level 275 path .snapshots/2014-03-14-0200
ID 1030 gen 2105 top level 275 path .snapshots/2014-03-14-0300
ID 1031 gen 2211 top level 275 path .snapshots/2014-03-14-0400
ID 1032 gen 2284 top level 275 path .snapshots/2014-03-14-0500
ID 1033 gen 2357 top level 275 path .snapshots/2014-03-14-0600
ID 1035 gen 2430 top level 275 path .snapshots/2014-03-14-0700
ID 1036 gen 2506 top level 275 path .snapshots/2014-03-14-0800
ID 1037 gen 2587 top level 275 path .snapshots/2014-03-14-0900
ID 1038 gen 2667 top level 275 path .snapshots/2014-03-14-1700
ID 1039 gen 2774 top level 275 path .snapshots/2014-03-14-1800
ID 1040 gen 2879 top level 275 path .snapshots/2014-03-14-1900
ID 1041 gen 2982 top level 275 path .snapshots/2014-03-14-2000
ID 1042 gen 3088 top level 275 path .snapshots/2014-03-14-2100
ID 1043 gen 3193 top level 275 path .snapshots/2014-03-14-2200
ID 1044 gen 3305 top level 275 path .snapshots/2014-03-14-2300
ID 1045 gen 3418 top level 275 path .snapshots/2014-03-15-0000
ID 1046 gen 3529 top level 275 path .snapshots/2014-03-15-0100
ID 1047 gen 3640 top level 275 path .snapshots/2014-03-15-0200
ID 1048 gen 3754 top level 275 path .snapshots/2014-03-15-0300
ID 1049 gen 3872 top level 275 path .snapshots/2014-03-15-0400
ID 1050 gen 3986 top level 275 path .snapshots/2014-03-15-0500
ID 1052 gen 4102 top level 275 path .snapshots/2014-03-15-0600
ID 1053 gen 4216 top level 275 path .snapshots/2014-03-15-0700
ID 1054 gen 4331 top level 275 path .snapshots/2014-03-15-0800
ID 1055 gen 4445 top level 275 path .snapshots/2014-03-15-0900
ID 1056 gen 4558 top level 275 path .snapshots/2014-03-15-1000

As you can see, it has removed the first two snapshots. Since all three snapshots for the first day were in the last quarter of the day, that is the correct behaviour. Tomorrow we will have a much better demonstration that it is 100% working as it should. To see the cron job go to part 2 of this series. To get the Python script, go to Github.
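
Under the hood, culling a snapshot is just deleting a read-only subvolume; done by hand, removing one of the snapshots from the listing above would look like this:

# Delete a snapshot we no longer need; only extents not shared with other snapshots are freed
btrfs subvolume delete /home/.snapshots/2014-03-13-2146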

Exploring btrfs for backups Part 2: Installing on My /Home Directory and using my new Python Script

I got my new hard drive that would replace my old, aging /home hard drive. As you read in part 1, I wanted to put btrfs on it. This is my journey to get it up and running. Plugged it into my hard drive toaster and ran gparted.

[Images: GParted screenshots of the new drive (1-4)]

Because of the peculiarities of btrfs and what I wanted to do with subvolumes, that wasn’t enough. I’d started an rsync, but forgot to create a home subvolume first. The rsync still had a long way to go, so it wasn’t so bad when I deleted the rsync’d folder. Created the subvolume.

# btrfs subvolume create /run/media/username/Home1/home
Create subvolume '/run/media/username/Home1/home'

# btrfs sub list /run/media/username/Home1/
ID 275 gen 199 top level 5 path home

ran the rsync again

# nice rsync --human-readable --verbose --progress --perms --times --links --acls --xattrs --recursive --one-file-system --exclude=/.Trash-0 /home/ /run/media/username/Home1/home/

Even though I’d lost about an hour’s worth of transfers, I took advantage of the restart to rid myself of a few pointless files. There were so many images in Gwibber’s cache (which I don’t even use anymore) that it was too many files for an rm * command. Sure, they’re small files, but I’m sure the overhead of each file was why rm was in the folder for over half an hour. Then again, maybe each file was tiny but the folder wasn’t. I didn’t do a close enough comparison, but based on what I remember from my last df, I think I gained 20-30 GB by deleting that cache. WOW! And I still have other things to consider deleting in the folder, like Beagle; I don’t use that for search anymore. Not sure anyone does.

The transfer was at USB 2.0 speeds (I don’t have eSATA on that computer), which meant 50 MB/s max (that I saw while I was watching the command line) and closer to the mid-20s on most files. Not 100% sure why, but the rsync would only run as root. As it’s currently running, even though I passed it --perms, it appears that everything is owned by root. Not sure if rsync will fix that at the end. If it doesn’t, I’ll have to do that manually after the rsync is done.

I’m live-blogging this part of the process as it happens. I’m up to 42595 for the ir-chk denominator (which I’d assumed was how many files it needed to copy). It’s now iterating through a fairly large Firefox cache. Lesson to others migrating their home drives with rsync: clear those caches! While waiting, I looked up what the numbers mean. Yes, that’s the number of files, but right now it’s ir-chk instead of to-chk, which means rsync hasn’t finished counting yet. Well, enough live-blogging for now; I’m going to go shower and other things. I’ll be back when it’s done. Given that it has done 18 GB in about an hour and it has to do 760 GB, it’ll probably be an overnight job. It did take quite a few hours. When I looked in dmesg there were a lot of read errors on the old drive. It’d failed SMART a year ago. Looks like I replaced it in the nick of time.

Ran a df and it matched up. Unfortunately, everything was now owned by root so I had to chown and chmod the home directory.

Time to edit fstab. First it was time to figure out the block id.

# blkid
/dev/sda1: LABEL="/boot" UUID="6f51a1f6-4267-45eb-84eb-45ade094b037" TYPE="ext3" PARTUUID="000a338b-01" 
/dev/sda2: LABEL="SWAP-sda2" UUID="45897b4e-f7d6-4ae5-bdf5-6f2a23c9a769" TYPE="swap" PARTUUID="000a338b-02" 
/dev/sda3: LABEL="/" UUID="c4710ace-faf2-4f0f-8609-cb0be82dce34" TYPE="ext3" PARTUUID="000a338b-03" 
/dev/sdb1: LABEL="/home" UUID="ea558d02-2dc2-4c8e-ad82-12432050746b" TYPE="ext3" PARTUUID="000475af-01" 
/dev/sdc1: LABEL="backup" UUID="8b1f32ff-edc5-48b8-8ec5-c711fb2083b3" TYPE="ext3" PARTUUID="0008b01a-01" 
/dev/sdd1: LABEL="Home1" UUID="89cfd56a-06c7-4805-9526-7be4d24a2872" UUID_SUB="590a694e-f873-45ae-997b-0bdc4a8bc5eb" TYPE="btrfs" PARTUUID="c8a5d7c4-002f-467d-8414-722d3a65a6a5"

The fact that there’s a UUID_SUB makes me believe that it’s the UUID for my /home subvolume, which would solve the problem that, if I mounted /dev/sdd1 at /home, I’d have a home directory inside the home directory. Perhaps that’s not an issue, as I mounted it into a test directory and it didn’t double up on the homes. My google-fu says I should just use the regular UUID. My fstab looked like this at first:

#
# /etc/fstab
# Created by anaconda on Sat May 16 03:24:18 2009
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or vol_id(8) for more info
#
UUID=c4710ace-faf2-4f0f-8609-cb0be82dce34 /                       ext3    defaults        1 1
UUID=ea558d02-2dc2-4c8e-ad82-12432050746b /home                   ext3    defaults,user_xattr        1 2
UUID=6f51a1f6-4267-45eb-84eb-45ade094b037 /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
UUID=45897b4e-f7d6-4ae5-bdf5-6f2a23c9a769 swap                    swap    defaults        0 0

#added by Eric for share drive
babyluigi.mushroomkingdom:/fileshares	/media/nfs/babyluigi	nfs	rsize=1024,wsize=1024,auto,users,soft,intr	0 0
babyluigi.mushroomkingdom:/media/xbmc   /media/nfs/xbmc-mount	nfs	rsize=1024,wsize=1024,auto,users,soft,intr	0 0
#added by Eric for backup drive
UUID=8b1f32ff-edc5-48b8-8ec5-c711fb2083b3 /media/backup ext3 defaults 0 0

Afterwards it looked like this:

#
# /etc/fstab
# Created by anaconda on Sat May 16 03:24:18 2009
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or vol_id(8) for more info
#
UUID=c4710ace-faf2-4f0f-8609-cb0be82dce34 /                       ext3    defaults        1 1

#old home on 1TB drive
#UUID=ea558d02-2dc2-4c8e-ad82-12432050746b /home                   ext3    defaults,user_xattr        1 2

#new home on btrfs
UUID=89cfd56a-06c7-4805-9526-7be4d24a2872 /home			  btrfs	defaults,subvol=home 0 2

UUID=6f51a1f6-4267-45eb-84eb-45ade094b037 /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
UUID=45897b4e-f7d6-4ae5-bdf5-6f2a23c9a769 swap                    swap    defaults        0 0

#added by Eric for share drive
babyluigi.mushroomkingdom:/fileshares	/media/nfs/babyluigi	nfs	rsize=1024,wsize=1024,auto,users,soft,intr	0 0
babyluigi.mushroomkingdom:/media/xbmc   /media/nfs/xbmc-mount	nfs	rsize=1024,wsize=1024,auto,users,soft,intr	0 0
#added by Eric for backup drive
UUID=8b1f32ff-edc5-48b8-8ec5-c711fb2083b3 /media/backup ext3 defaults 0 0

Time to reboot and see what happens before I put the hard drive into the system. SUCCESS! Now I just need to figure out which hard drive was the old home. I’ll disconnect one at a time and check to see which block IDs are missing. I’ll be rebooting each time because, even though SATA is plug and play, I’d rather not risk ruining things. Wrong on the first try. Luckily there was only a 50% chance of being wrong, as I knew which one had root. Time to try again. Success. Not sure why, but my Amarok music collection was lost, although all the files are there; bummer. (It’s OK: after the re-import this time it hadn’t lost the first/last played dates! yay!) My other KDE programs seemed to be OK. Interestingly, the computer’s running a bit faster. Maybe because all the read errors on the old drive were making things slower?
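
A quick way to double-check what actually got mounted at /home after a reboot like this (nothing assumed here beyond the subvol option from fstab):

# Shows the backing device and the subvol= option actually in effect for /home
findmnt /home

# Or, without findmnt, pull the line straight out of the mount table
grep ' /home ' /proc/mounts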

Now it’s time to create the snapshot directory to get things ready for the snapshots/backups.

# btrfs sub create /home/.snapshots
Create subvolume '/home/.snapshots'

OK, it’s time to use the snapshot/backup script I’ve created. You can get it at Github. It’s good for the daily level culling at this stage.

# ./snap-in-time.py 
Create a readonly snapshot of '/home' in '/home/.snapshots/2014-03-13-2146'

# btrfs sub list /home
ID 275 gen 1653 top level 5 path home
ID 1021 gen 1653 top level 275 path .snapshots
ID 1023 gen 1653 top level 275 path .snapshots/2014-03-13-2146

OK, final step for now is to add it to cron to run every hour.

#btrfs snapshots of /home 
0 * * * * /root/bin/snap-in-time.py

I had to change the script to give the full path of btrfs for it to work in cron.
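
An alternative to hard-coding the full path inside the script is to give cron a PATH at the top of the crontab; a sketch of what that could look like (the PATH value here is just a typical Fedora sbin/bin set, not something from my actual crontab):

PATH=/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin

#btrfs snapshots of /home
0 * * * * /root/bin/snap-in-time.py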

Next time we go back to the VM to learn how to setup the btrfs RAID. See you then!

Exploring btrfs for backups Part 1

Recently I once again came across an article about the benefits of the btrfs Linux file system. Last time I’d come across it, it was still in alpha or beta, and I also didn’t understand why I would want to use it. However, the more I’ve learned about the fragility of our modern storage systems, the more I’ve thought about how I want to protect my data. My first step was to sign up for offsite backups. I’ve done this on my Windows computer via Backblaze. They are pretty awesome because it’s a constant backup, so it meets the key requirement of never forgetting to do it. The computer doesn’t even need to be on at a certain time or anything. I’ve loved using them for the past 2+ years, but one thing that makes me consider their competition is that they don’t support Linux. That’s OK for now because all my photos are on my Windows computer, but it leaves me in a sub-optimal place. I know this isn’t an incredibly influential blog and I’m just one person, but I’d like to think writing about this would help them realize that they could a) lose a customer and b) be making more money from those with Linux computers.

The second step is local backups, because I really don’t want to actually USE the offsite backups I have. Getting all my data back from Backblaze would cost a few hundred dollars (for the hard drives they’d send it back on). Right now I don’t have a good solution for that on my Windows computer, but on Linux I’m using Back In Time. Back In Time creates an incremental backup of my home drive. It works very well and has saved me a few times when settings files have gotten corrupt. But there’s an even more insidious problem that regular backups don’t solve: bit rot.

If you want to see the problem with bit rot, check out this Ars Technica article (the one that got me thinking about btrfs again). That just happens because we have ephemeral magnetic storage. Bits get flipped and your photos get ruined and other files don’t open and so forth. As the article shows, with a RAID 1 configuration, btrfs file systems can be self-healing against bit rot. (This doesn’t solve the problem with my Windows machine, but Linux has everything but my photos on it) So I want to set up a btrfs home directory in RAID 1 and have it back up to a backup drive.

Btrfs has some even cooler tricks up its sleeve. Btrfs can create snapshots which allow you to go back in time through your files in case you make a change you didn’t mean to. Think of it as stage 1 of the file backup system I’m trying to have here. So as long as your hard drive is fine, the snapshots let you recover deleted files or older versions of files that have been accidentally changed. (Say your kid wrecked it and then hit save) Stage 2 is RAID 1, which is protecting against bit rot as well as allowing one hard drive to fail without needing to dip into your backups. It allows you to keep working until the replacement hard drive arrives. Stage 3 is the backup hard drive which protects against failure of both drives from the RAID 1 as well as some file corruption protection in case btrfs’ bit rot healing fails. Finally, stage 4 is offsite backup. Don’t have that right now on Linux.
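
Stage 3 is where btrfs send/receive comes in. Once the backup drive is also btrfs, a read-only snapshot can be shipped over wholesale, and later snapshots can be sent incrementally against a parent snapshot that exists on both sides. A sketch with made-up snapshot names and a /media/backup mount point (run as root):

# First time: send a full copy of a read-only snapshot to the backup filesystem
btrfs send /home/.snapshots/2014-03-15-0900 | btrfs receive /media/backup/backups/

# After that: send only the differences relative to a snapshot both sides already have
btrfs send -p /home/.snapshots/2014-03-15-0900 /home/.snapshots/2014-03-15-1000 | btrfs receive /media/backup/backups/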

This article is meant to be a series of tutorials for myself as I set this up in the future as well as for anyone out there who wants to have a similar setup. I’m going to first do this on a test VM running Fedora 20. That will allow me to perfect things without damaging my data on my real computer. Then I’ll develop my backup script on my VM and then I’ll finally be ready to do things on my main computer.

OK, first things first. Let’s get things set up with the first snapshot.

$ sudo btrfs sub list / 
ID 256 gen 443 top level 5 path root 
ID 258 gen 443 top level 5 path home 

$ sudo btrfs sub create /home/.snapshots 
Create subvolume '/home/.snapshots' 

$ sudo btrfs sub snapshot -r /home /home/.snapshots/myfirstsnapshot
 Create a readonly snapshot of '/home' in '/home/.snapshots/myfirstsnapshot'

$ sudo btrfs sub list /
 ID 256 gen 448 top level 5 path root
 ID 258 gen 447 top level 5 path home
 ID 263 gen 447 top level 5 path home/.snapshots
 ID 264 gen 447 top level 5 path home/.snapshots/myfirstsnapshot

I was able to ls into that snapshot and see that it did indeed contain a copy of my home directory. This is where I’ll end things today. The next step is to start on the script that’s going to automatically snapshot the home directory every hour, cull the snapshots intelligently, and, finally, back it up to the backup drive. I’ll be doing this on GitHub and I’ll have a blog post when I get the first part of it running.
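
And that’s really all recovery takes at this stage: since a snapshot is just a directory tree, pulling back an accidentally deleted or mangled file is an ordinary copy. A sketch with a hypothetical file path under the VM user’s home:

# Copy a file back out of the read-only snapshot into the live home directory
cp -a /home/.snapshots/myfirstsnapshot/ermesa/important.txt /home/ermesa/important.txt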

Oh, a thank you to Duncan on the btrfs mailing list for helping me out with this. Also, the way this is set up, will it not recursively end up with snapshots within snapshots because .snapshots is in /home?