23 November 2008 @ 11:55 am
tar vs. dump  
Some people claim dump is irrelevant. Linus Torvalds claimed at some point that dump was a relic of the past. The real issue was that there was no way in Linux to synchronize a file system at the time due to a silly bug in the kernel.

Well, a lot of people still find dump a useful tool. It's easy to use and it's fast. In fact, it's really fast. tar and just about every other backup tool access the filesystem through the directory structure. The filesystem on disk is not laid out in the same order as its directory structure, so a lot of time is spent seeking. dump opens the underlying device and reads the data in its native order.

I ran a primitive benchmark just now:
  1. sync the filesystem (an ext3 filesystem on an encrypted volume).
  2. flush out the page and dentry caches (echo 3 > /proc/sys/vm/drop_caches)
  3. run the backup
I did this for four different backup jobs:
  1. full backup with tar
  2. incremental backup with tar
  3. full backup with dump
  4. incremental backup with dump
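The three steps can be sketched as a small shell function. This is only a sketch under a few assumptions: cold_run and the DROP_CACHES variable are made-up names, writing to drop_caches needs root, and the archive is piped through wc -c instead of being redirected to /dev/null, because GNU tar is documented to minimize I/O when it sees that the archive is /dev/null.

```shell
#!/bin/sh
# DROP_CACHES is a hypothetical override so the function can be tried without root.
DROP_CACHES=${DROP_CACHES:-/proc/sys/vm/drop_caches}

# cold_run CMD [ARGS...]: time one backup command against cold caches.
cold_run() {
    sync                                   # step 1: flush dirty data to disk
    if [ -w "$DROP_CACHES" ]; then
        echo 3 > "$DROP_CACHES"            # step 2: drop page cache, dentries and inodes
    else
        echo "warning: cannot write $DROP_CACHES, caches stay warm" >&2
    fi
    start=$(date +%s)
    bytes=$("$@" | wc -c | tr -d ' ')      # step 3: run the backup, count the archive bytes
    end=$(date +%s)
    echo "$((end - start))s $bytes bytes"
}
```

As root, a call would look like `cold_run tar cf - /home/perbu` or `cold_run dump -f - /dev/vg0/perbu`. The exit status of the backup command is lost in the pipeline, so this is only good for timing, not for real backups.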
The results:
tar cf - /home/perbu                                            37m 55s
tar --after-date 2008-11-01 -cf - /home/perbu                    3m 59s
dump -f - /dev/vg0/perbu                                        13m 22s
dump -f - -T 'Fri Nov 01 00:00:00 2008 +0100' /dev/vg0/perbu     2m 22s
The results are quite clear. Dump is far superior to tar performance-wise: the full dump finished in roughly a third of the time tar needed. A lot of sysadmins struggle to keep the backup within its window, and dump is a very useful tool for those people.
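To put numbers on that, the timings can be converted to seconds and divided. A quick sketch, assuming the four timings read as 37m 55s, 3m 59s, 13m 22s and 2m 22s (to_seconds and speedup are helper names made up for this example):

```shell
#!/bin/sh
# to_seconds MIN SEC: convert a "Xm Ys" timing to seconds.
to_seconds() { echo $(( $1 * 60 + $2 )); }

# speedup SLOW FAST: how many times faster FAST is than SLOW.
speedup() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

full_tar=$(to_seconds 37 55)     # 2275 s
full_dump=$(to_seconds 13 22)    # 802 s
incr_tar=$(to_seconds 3 59)      # 239 s
incr_dump=$(to_seconds 2 22)     # 142 s

echo "full: $(speedup "$full_tar" "$full_dump")x"          # full: 2.84x
echo "incremental: $(speedup "$incr_tar" "$incr_dump")x"   # incremental: 1.68x
```

So on this single run dump was a bit under three times faster for the full backup and about 1.7 times faster for the incremental one.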

I would guess that on an SSD tar and dump would perform more or less the same, since seek times are close to zero. If someone gets me an SSD I'll make a post about it. :-)

However, there is a price for this performance. If your filesystem is very active, there might be changes that are not yet flushed out to disk, and that data might not be backed up completely. To be 100% sure everything is backed up you might want to take a snapshot of the device and dump that. For a personal computer, however, the risk is negligible.
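Since the volume here sits on LVM (/dev/vg0/perbu), the snapshot approach could look roughly like the sketch below. It is untested and rests on assumptions: the snapshot name, the 1G snapshot size and the dump_snapshot and RUN names are made up for illustration, and the snapshot has to be large enough to hold everything written to the origin while dump runs.

```shell
#!/bin/sh
# Set RUN=echo to print the commands instead of executing them (hypothetical dry-run switch).
RUN=${RUN:-}

# dump_snapshot VG LV OUTFILE: level 0 dump of a temporary LVM snapshot of VG/LV.
dump_snapshot() {
    vg=$1 lv=$2 out=$3
    snap="${lv}-snap"                      # assumed snapshot name
    # Freeze a point-in-time view of the volume; 1G is an assumed size.
    $RUN lvcreate --snapshot --size 1G --name "$snap" "/dev/$vg/$lv" || return 1
    # Dump the frozen snapshot instead of the live device.
    $RUN dump -0 -f "$out" "/dev/$vg/$snap"
    status=$?
    # Always drop the snapshot afterwards, even if dump failed.
    $RUN lvremove -f "/dev/$vg/$snap"
    return $status
}
```

Running `RUN=echo` followed by `dump_snapshot vg0 perbu /backup/perbu.dump` prints the three commands without touching anything, which is a sensible first step before trying it as root.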

Happy dumping!

iolandamignorte on November 12th, 2010 03:53 pm (UTC)
I've never used the incremental feature of a backup program, so I have a question for you: it usually takes me several days to realize that I have a big problem on my computer. Can I choose from a list of previous backups, or will the incremental feature back up the problem as well?
