Late last year while in New Zealand, I attended a {Product} Overview session. Unfortunately, I had to leave after an hour, and during that time only managed to hear a couple of new things about {Product}. Instead, my colleagues and I spent most of the time trying to correct the {Vendor} technical specialist, who was regurgitating some old FUD about NetWorker.
One of the funniest pieces of FUD that I heard from the {Vendor} rep was that “workgroup” (ha!) products like NetWorker will waste time during the recovery process when differential backups have been run. Drawing up a simple table like the following, the rep's argument ran along these lines:
- Weekend – Full backup (i.e., 100%)
- Monday – Backup 5% change
- Tuesday – Backup 10% change
- Wednesday – Backup 15% change
- Thursday – Backup 20% change
- Friday – Backup 25% change
Now, as I point out in my book, while one must consider the possibility that the unique changed files in a set of differential backups may be 100% on each day, that’s not always going to be the case. In fact, only in fairly niche areas or situations will this be so. More realistically, a differential backup model may look like:
- Weekend – Full backup (i.e., 100%)
- Monday – Backup 5% change
- Tuesday – Backup 7% change
- Wednesday – Backup 9% change
- Thursday – Backup 10% change
- Friday – Backup 11% change
(That is – in most sites where differentials are used, the unique files that change each day will be minimal.)
Now, regardless of which model happens within an environment, the {Vendor} representative bravely then tried to assert that “with NetWorker, that means a full recovery on Friday would need to pull back 125% of the data!”
That statement is of course about as accurate as “croc shoes are cool”.
There are two types of implied FUD in this statement – and both are incorrect. They are:
- The FUD that if you back up the same file in both a full and a differential, NetWorker would recover both files, first the one from the full, then the one from the differential, in order to complete the recovery.
- The FUD that a filesystem recovery from fulls + X might pull back all files that were backed up, rather than a point in time view of the filesystem as of the last backup.
Thankfully, like Elmer, both of these FUDs are relatively easy to put to rest. I’ll do them in reverse order, since disproving the second puts us in an easy position to disprove the first FUD.
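Before the practical test, it helps to put numbers on the claim. The following is a rough sketch only, using the rep's illustrative percentages from the first table; the function names are mine, and it assumes the changed files are modified versions of existing files rather than brand new ones, so the filesystem stays at 100%:

```python
# Illustrative figures from the rep's table, not measurements.
FULL_PCT = 100
FRIDAY_DIFF_PCT = 25  # share of the filesystem changed since the full


def claimed_recovery_pct(full_pct, last_diff_pct):
    """The FUD version: read the entire full, then the entire differential."""
    return full_pct + last_diff_pct


def point_in_time_recovery_pct(full_pct, last_diff_pct):
    """Each file is read once, from its newest backup.

    Assumption: files in the differential replace older copies held in
    the full, so those older copies are never read.
    """
    unchanged_from_full = full_pct - last_diff_pct
    return unchanged_from_full + last_diff_pct


print(claimed_recovery_pct(FULL_PCT, FRIDAY_DIFF_PCT))        # 125
print(point_in_time_recovery_pct(FULL_PCT, FRIDAY_DIFF_PCT))  # 100
```

The 125% figure only appears if every file in the differential is also pulled back from the full, which is exactly the behaviour the test below checks for.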
Scenario:
- Schedule called “TestDiff”: full, 5, 5, 5, 5, 5, 5
- Group called “TestDiff”: Using schedule “TestDiff”
- Client tara in group “TestDiff” has save set: /root/casestudy
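For anyone unfamiliar with numbered levels, here is a small sketch of why a schedule of "full, 5, 5, 5, 5, 5, 5" gives differentials. It assumes the usual numbered-level rule, where a level N backup captures everything changed since the most recent backup at a lower level; treating a full as level 0 and the function name `reference_backup` are my own modelling choices, not NetWorker internals:

```python
FULL = 0  # modelling choice: a full sorts below every numbered level


def reference_backup(history, level):
    """Return the position of the most recent earlier backup with a lower
    level, or None if there is none (i.e. nothing to be relative to)."""
    for position in range(len(history) - 1, -1, -1):
        if history[position] < level:
            return position
    return None


schedule = [FULL, 5, 5, 5, 5, 5, 5]  # the "TestDiff" schedule
history = []
references = []
for level in schedule:
    references.append(reference_backup(history, level))
    history.append(level)

print(references)  # [None, 0, 0, 0, 0, 0, 0]
```

Every level 5 points back at the full in position 0 rather than at the previous level 5, so each one carries all changes since the full.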
Initial content of /root/casestudy:
    [root@tara ~]# ls -al /root/casestudy
    total 30796
    drwxr-xr-x  2 root root     4096 Feb  2 03:41 .
    drwxr-x--- 22 root root    20480 Feb  2 03:34 ..
    -rw-r--r--  1 root root 10485760 Feb  2 03:41 full1.dat
    -rw-r--r--  1 root root 10485760 Feb  2 03:41 full2.dat
    -rw-r--r--  1 root root 10485760 Feb  2 03:41 full3.dat
So that’s 30MB of data. Our first backup will by necessity be a full, and we’ll follow that with an mminfo so we can see how much data has been backed up:
    [root@tara ~]# savegrp -l full TestDiff
    Feb 2 03:44:35 tara logger: NetWorker media: (waiting) Waiting for 1 writable volume(s) to backup pool 'TestDiff' tape(s) on tara.pmdg.lab
    [root@tara ~]# mminfo -q "name=/root/casestudy"
    volume    client         date        size  level  name
    800844L4  tara.pmdg.lab  02/02/2011  30 MB  full  /root/casestudy
Now that we’ve got that initial backup done, we’ll populate a couple more files into the directory, and do two level 5 differential backups:
    [root@tara ~]# dd if=/dev/zero bs=1024k count=10 of=/root/casestudy/1stdiff-1.dat
    10+0 records in
    10+0 records out
    10485760 bytes (10 MB) copied, 0.02353 seconds, 446 MB/s
    [root@tara ~]# dd if=/dev/zero bs=1024k count=10 of=/root/casestudy/1stdiff-2.dat
    10+0 records in
    10+0 records out
    10485760 bytes (10 MB) copied, 0.080027 seconds, 131 MB/s
    [root@tara ~]# ls -al /root/casestudy
    total 51308
    drwxr-xr-x  2 root root     4096 Feb  2 03:45 .
    drwxr-x--- 22 root root    20480 Feb  2 03:34 ..
    -rw-r--r--  1 root root 10485760 Feb  2 03:45 1stdiff-1.dat
    -rw-r--r--  1 root root 10485760 Feb  2 03:45 1stdiff-2.dat
    -rw-r--r--  1 root root 10485760 Feb  2 03:41 full1.dat
    -rw-r--r--  1 root root 10485760 Feb  2 03:41 full2.dat
    -rw-r--r--  1 root root 10485760 Feb  2 03:41 full3.dat
    [root@tara ~]# savegrp -l5 TestDiff
    [root@tara ~]# !mminfo
    mminfo -q "name=/root/casestudy"
    volume    client         date        size  level  name
    800844L4  tara.pmdg.lab  02/02/2011  30 MB  full  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    [root@tara ~]# savegrp -l5 TestDiff
    [root@tara ~]# mminfo -q "name=/root/casestudy"
    volume    client         date        size  level  name
    800844L4  tara.pmdg.lab  02/02/2011  30 MB  full  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
All of that looks completely normal. So, now we’ll put a couple more files in the directory – this time of a different size – and run a differential backup as well as the mminfo command again:
    [root@tara ~]# dd if=/dev/zero bs=512k count=10 of=/root/casestudy/3rddiff-1.dat
    10+0 records in
    10+0 records out
    5242880 bytes (5.2 MB) copied, 0.022319 seconds, 235 MB/s
    [root@tara ~]# dd if=/dev/zero bs=512k count=10 of=/root/casestudy/3rddiff-2.dat
    10+0 records in
    10+0 records out
    5242880 bytes (5.2 MB) copied, 0.011584 seconds, 453 MB/s
    [root@tara ~]# savegrp -l5 TestDiff
    [root@tara ~]# !mminfo
    mminfo -q "name=/root/casestudy"
    volume    client         date        size  level  name
    800844L4  tara.pmdg.lab  02/02/2011  30 MB  full  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  30 MB     5  /root/casestudy
Right, next step is to delete some files – I’m going to delete the “1stdiff*” files, then run a new backup:
    [root@tara ~]# rm /root/casestudy/1stdiff-*
    rm: remove regular file `/root/casestudy/1stdiff-1.dat'? y
    rm: remove regular file `/root/casestudy/1stdiff-2.dat'? y
    [root@tara ~]# ls -l /root/casestudy
    total 41032
    -rw-r--r-- 1 root root  5242880 Feb  2 03:48 3rddiff-1.dat
    -rw-r--r-- 1 root root  5242880 Feb  2 03:48 3rddiff-2.dat
    -rw-r--r-- 1 root root 10485760 Feb  2 03:41 full1.dat
    -rw-r--r-- 1 root root 10485760 Feb  2 03:41 full2.dat
    -rw-r--r-- 1 root root 10485760 Feb  2 03:41 full3.dat
    [root@tara ~]# savegrp -l5 TestDiff
    [root@tara ~]# !mminfo
    mminfo -q "name=/root/casestudy"
    volume    client         date        size  level  name
    800844L4  tara.pmdg.lab  02/02/2011  30 MB  full  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  30 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  10 MB     5  /root/casestudy
Right, {Vendor} FUD – are you with us now? We’ll now delete all the files in the directory and do a recovery and see what we pull back. By rights, it should be only those files in the directory as of the time of backup – full1.dat, full2.dat, full3.dat, 3rddiff-1.dat and 3rddiff-2.dat:
    [root@tara casestudy]# cd /root/casestudy
    [root@tara casestudy]# rm *
    rm: remove regular file `3rddiff-1.dat'? y
    rm: remove regular file `3rddiff-2.dat'? y
    rm: remove regular file `full1.dat'? y
    rm: remove regular file `full2.dat'? y
    rm: remove regular file `full3.dat'? y
    [root@tara casestudy]# recover -s tara
    Current working directory is /root/casestudy/
    recover> ls
     3rddiff-1.dat 3rddiff-2.dat full1.dat full2.dat full3.dat
    recover> add *
    5 file(s) marked for recovery
    recover> volumes
    Volumes needed (all on-line):
        800844L4 at /dev/nst1
    recover> recover
    Recovering 5 files into their original locations
    Volumes needed (all on-line):
        800844L4 at /dev/nst1
    Total estimated disk space needed for recover is 41 MB.
    Requesting 5 file(s), this may take a while...
    Requesting 1 recover session(s) from server.
    ./full1.dat
    ./full2.dat
    ./full3.dat
    ./3rddiff-1.dat
    ./3rddiff-2.dat
    Received 5 file(s) from NSR server `tara'
    Recover completion time: Wed 02 Feb 2011 03:51:43 AM EST
    recover> quit
    [root@tara casestudy]# ls -l
    total 41032
    -rw-r--r-- 1 root root  5242880 Feb  2 03:48 3rddiff-1.dat
    -rw-r--r-- 1 root root  5242880 Feb  2 03:48 3rddiff-2.dat
    -rw-r--r-- 1 root root 10485760 Feb  2 03:41 full1.dat
    -rw-r--r-- 1 root root 10485760 Feb  2 03:41 full2.dat
    -rw-r--r-- 1 root root 10485760 Feb  2 03:41 full3.dat
So, {Vendor} FUD #2 is toast. If we do multiple differential backups (or for that matter, incrementals!) with file deletes happening between backups, NetWorker just recovers the filesystem as of the last point it was backed up – it doesn’t try to repopulate files that didn’t exist as of the last backup.
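The same sequence can be replayed as a toy model. To be clear, this is a sketch of the idea and not how NetWorker's indexes are implemented: the save set dictionaries, the `listing` field and both function names are made up for illustration, and file sizes are nominal MB values:

```python
def run_backup(savesets, fs, level):
    """Append a save set. A full saves everything; a level 5 saves whatever
    differs from the most recent full. The directory listing at backup
    time is recorded alongside the saved files."""
    if level == "full":
        saved = dict(fs)
    else:
        last_full = next(s for s in reversed(savesets) if s["level"] == "full")
        saved = {n: f for n, f in fs.items() if last_full["saved"].get(n) != f}
    savesets.append({"level": level, "saved": saved, "listing": set(fs)})


def recover_latest(savesets):
    """Map each file present at the last backup to the newest save set
    holding it. Files absent from the last listing are not recovered."""
    plan = {}
    for name in sorted(savesets[-1]["listing"]):
        for idx in range(len(savesets) - 1, -1, -1):
            if name in savesets[idx]["saved"]:
                plan[name] = idx
                break
    return plan


# name -> (size in MB, version)
fs = {f"full{i}.dat": (10, 1) for i in (1, 2, 3)}
savesets = []
run_backup(savesets, fs, "full")
fs.update({"1stdiff-1.dat": (10, 1), "1stdiff-2.dat": (10, 1)})
run_backup(savesets, fs, "5")
run_backup(savesets, fs, "5")
fs.update({"3rddiff-1.dat": (5, 1), "3rddiff-2.dat": (5, 1)})
run_backup(savesets, fs, "5")
del fs["1stdiff-1.dat"], fs["1stdiff-2.dat"]
run_backup(savesets, fs, "5")

sizes = [sum(size for size, _ in s["saved"].values()) for s in savesets]
plan = recover_latest(savesets)
print(sizes)         # [30, 20, 20, 30, 10]
print(sorted(plan))  # the five files that existed at the last backup
```

The save set sizes line up with the mminfo output above, and the deleted "1stdiff" files never make it into the recovery plan even though they sit in three of the save sets.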
Let’s return now to {Vendor} FUD #1 about differential backups in NetWorker. We’ve got a bunch of files for which we’ve done differential backups, and so far all of those files have been backed up to a single volume – 800844L4. So, what I’m going to do is unmount that tape, mark it as full, then overwrite the file ‘full3.dat’, which will mean it’ll need a new backup:
    [root@tara casestudy]# nsrjb -u 800844L4
    Info: Operation `Eject' in progress on device `/dev/nst1'
    Jukebox operation finished with status: succeeded
    [root@tara casestudy]# nsrmm -o full 800844L4
    Mark LTO Ultrium-4 tape 800844L4 as full? y
    [root@tara casestudy]# dd if=/dev/zero bs=1024k count=50 of=full3.dat
    50+0 records in
    50+0 records out
    52428800 bytes (52 MB) copied, 0.249461 seconds, 210 MB/s
    [root@tara casestudy]# !savegrp
    savegrp -l5 TestDiff
    Feb 2 03:56:29 tara logger: NetWorker media: (waiting) Waiting for 1 writable volume(s) to backup pool 'TestDiff' tape(s) on tara.pmdg.lab
    [root@tara casestudy]# !mminfo
    mminfo -q "name=/root/casestudy"
    volume    client         date        size  level  name
    800843L4  tara.pmdg.lab  02/02/2011  81 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  30 MB  full  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  20 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  30 MB     5  /root/casestudy
    800844L4  tara.pmdg.lab  02/02/2011  10 MB     5  /root/casestudy
Our new backup is sitting on 800843L4 – a different tape. I’ll now delete and recover full3.dat, and demonstrate that NetWorker doesn’t do anything as stupid as recovering the file twice:
    [root@tara casestudy]# rm full3.dat
    rm: remove regular file `full3.dat'? y
    [root@tara casestudy]# recover
    Current working directory is /root/casestudy/
    recover> add full3.dat /root/casestudy
    1 file(s) marked for recovery
    recover> volumes
    Volumes needed (all on-line):
        800843L4 at /dev/nst2
    recover> recover
    Recovering 1 file into its original location
    Volumes needed (all on-line):
        800843L4 at /dev/nst2
    Total estimated disk space needed for recover is 51 MB.
    Requesting 1 file(s), this may take a while...
    Requesting 1 recover session(s) from server.
    ./full3.dat
    Received 1 file(s) from NSR server `tara.pmdg.lab'
    Recover completion time: Wed 02 Feb 2011 03:57:57 AM EST
    recover> quit
Now, just to prove that I’m not incorrectly trusting NetWorker, here’s the nsr_render_log output for the daemon.raw from the time of recovery – see if you can spot how many tapes we used:
    70920 02/02/2011 03:57:42 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:root browsing
    70919 02/02/2011 03:57:51 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:root done browsing
    70920 02/02/2011 03:57:51 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:root browsing
    70911 02/02/2011 03:57:53 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:/root/casestudy (2/02/11) starting read from 800843L4 of 51 MB
    70904 02/02/2011 03:57:57 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:/root/casestudy (2/02/11) done reading 51 MB
    70919 02/02/2011 03:57:57 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:root done browsing
    70920 02/02/2011 03:57:57 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:root browsing
    42506 02/02/2011 03:57:57 AM 2 0 0 2426823904 26733 0 tara.pmdg.lab nsrd recover info: User root on tara.pmdg.lab successfully recovered tara.pmdg.lab's files
    70919 02/02/2011 03:58:00 AM 0 0 2 2426823904 26733 0 tara.pmdg.lab nsrd tara.pmdg.lab:root done browsing
Wait for it, wait for it … now let’s see, it used 800843L4. That’s one tape. That was our second backup of the file. Hmmm, but it didn’t pull back the first copy of the file, because that was on 800844L4, and the logs tell us it only read from a single tape.
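The volume selection can be sketched in a few lines. Again, this is a hypothetical model rather than NetWorker's media database: the `Copy` record and both function names are invented for illustration, and the index only lists the two copies of full3.dat that this test is concerned with (the original in the full on 800844L4 and the rewritten file in the last level 5 on 800843L4):

```python
from typing import NamedTuple


class Copy(NamedTuple):
    sequence: int  # order in which the backup ran (higher is newer)
    level: str
    volume: str


# Hypothetical index entries for full3.dat, following the test above.
index = {
    "full3.dat": [
        Copy(1, "full", "800844L4"),
        Copy(6, "5", "800843L4"),
    ],
}


def volumes_needed(index, names):
    """Volumes required when each file is read once, from its newest copy."""
    newest = (max(index[name], key=lambda c: c.sequence) for name in names)
    return sorted({copy.volume for copy in newest})


def volumes_if_every_copy_were_read(index, names):
    """What the FUD implies: every copy of every file gets pulled back."""
    return sorted({copy.volume for name in names for copy in index[name]})


print(volumes_needed(index, ["full3.dat"]))                   # ['800843L4']
print(volumes_if_every_copy_were_read(index, ["full3.dat"]))  # ['800843L4', '800844L4']
```

Only the first result matches what the recover session and the daemon log reported.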
{Vendor} FUD #1 put to rest too.
The real pity about vendors flinging FUD about other vendors’ products is that it takes away time that could otherwise be used productively. In this case, I had been looking forward to getting at least an hour of a {Product} technical briefing. Lamentably, that’s not what I got.
Maybe next time.