There’s currently a bug in NetWorker: if a 32-bit Windows client has a filesystem large enough that a saveset exceeds 2TB, the savegroup completion report shows a massively truncated size for it. For a 2,510 GB saveset, the savegroup completion report looks like this:

Start time:   Sat Nov 14 17:42:52 2009
End time:     Sun Nov 15 06:58:57 2009

--- Successful Save Sets ---
* cyclops:Probe savefs cyclops: succeeded.
* cyclops:C:\bigasms 66135:(pid 3308): NSR directive file (C:\bigasms\nsr.dir) parsed
 cyclops: C:\bigasms               level=full,   1742 MB 13:15:56    255 files
 trash.pmdg.lab: index:cyclops     level=full,     31 KB 00:00:00      7 files
 trash.pmdg.lab: bootstrap         level=full,    213 KB 00:00:00    198 files

However, when checked through NMC, nsrwatch or mminfo, you’ll find that the correct size for the saveset is actually shown:

[root@trash ~]# mminfo
 volume        client       date      size   level  name
XFS.002        cyclops   11/14/2009 2510 GB   full  C:\bigasms
XFS.002.RO     cyclops   11/14/2009 2510 GB   full  C:\bigasms

The misreporting doesn’t affect recoverability, but if you’re reviewing savegroup completion reports, the data sizes will likely either (a) be a cause for concern or (b) break any automated parsing you’re doing of the savegroup completion report.
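If you do parse completion reports automatically, a cross-check against mminfo is one way to catch this. What follows is only a minimal sketch in Python, not anything shipped with NetWorker: the regular expression is inferred from the sample report lines above, the function names (parse_savesets, to_kb, sizes_disagree) are made up for illustration, and the factor of 1024 between units is an assumption rather than a documented NetWorker convention.

```python
import re

# Pattern inferred from the sample report lines above (an assumption, not a
# documented format). Lines starting with "*" are status lines and are skipped.
SAVESET_LINE = re.compile(
    r"^\s*(?P<client>[^\s:*]+):\s+(?P<name>.+?)\s+level=(?P<level>\w+),"
    r"\s+(?P<size>\d+)\s+(?P<unit>KB|MB|GB|TB)"
    r"\s+(?P<elapsed>\d+:\d+:\d+)\s+(?P<files>\d+)\s+files\s*$"
)

# Assumption: each unit step is a factor of 1024.
UNIT_STEPS = {"KB": 0, "MB": 1, "GB": 2, "TB": 3}


def to_kb(size, unit, factor=1024):
    """Convert a reported size to KB so sizes in different units can be compared."""
    return size * factor ** UNIT_STEPS[unit]


def parse_savesets(report):
    """Return one dict per saveset summary line found in the report text."""
    records = []
    for line in report.splitlines():
        match = SAVESET_LINE.match(line)
        if match:
            record = match.groupdict()
            record["size"] = int(record["size"])
            record["files"] = int(record["files"])
            records.append(record)
    return records


def sizes_disagree(report_kb, mminfo_kb, tolerance=0.05):
    """True when the two sizes differ by more than the given fraction."""
    return abs(report_kb - mminfo_kb) > tolerance * max(report_kb, mminfo_kb)


report = r"""
* cyclops:Probe savefs cyclops: succeeded.
 cyclops: C:\bigasms               level=full,   1742 MB 13:15:56    255 files
 trash.pmdg.lab: index:cyclops     level=full,     31 KB 00:00:00      7 files
"""
records = parse_savesets(report)
big = records[0]
reported_kb = to_kb(big["size"], big["unit"])
mminfo_kb = to_kb(2510, "GB")  # the size mminfo showed for the same saveset
print(big["client"], big["name"], sizes_disagree(reported_kb, mminfo_kb))
```

The tolerance is deliberately loose so that rounding and any difference in unit conventions don’t trigger false alarms; a 1742 MB versus 2510 GB discrepancy is flagged regardless.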

I’ve managed to secure a fix for 7.4.4 for this, with requests in to get it ported to 7.5.1 as well, and to get it integrated into the main trees for permanent inclusion upon the next service packs, etc. If you’ve been putting up with this problem for a while or have just noticed it and want it fixed, the escalation patch number was NW110493.

(It’s possible that this problem affects more than just 32-bit Windows clients; it could affect other 32-bit clients as well. I’d be interested in knowing if someone has spotted it on another operating system. I’d test, but my lab environment is currently otherwise occupied, and generating 2+TB of data, even at 90MB/s, takes a wee bit long.)

 

While it turned out to be unrelated, a recent customer question made me think back to the impact of client side compression on the reported saveset size, and for the life of me I couldn’t remember how client side compression affected saveset size reporting.

Of course, it’s relatively simple to test. So I created a 1GB file on my backup server using:

# dd if=/dev/zero bs=1024k count=1024 of=/root/test.dat

Next, to test, I configured a client entry with a saveset of just ‘/root/test.dat’, and set the backup running without any client side compression. The savegroup completion email showed the sort of size you’d expect:

--- Successful Save Sets ---

* tara.pmdg.lab:Probe savefs tara.pmdg.lab: succeeded.
 tara.pmdg.lab: /root/test.dat     level=full,   1048 MB 00:00:13      3 files
 tara.pmdg.lab: index:tara.pmdg.lab level=full,     3 KB 00:00:00      4 files
 tara.pmdg.lab: bootstrap          level=full,     91 KB 00:00:01    177 files

The next step was to enable client side compression. Being lazy and not wanting to launch NMC, I created /root/.nsr with the following content:

<< . >>
compressasm: test.dat

With the backup re-run, I got conclusive evidence that the reported saveset size is the amount of data written to media (or transferred from the client), not the size of the data itself:

--- Successful Save Sets ---

* tara.pmdg.lab:Probe savefs tara.pmdg.lab: succeeded.
* tara.pmdg.lab:/root/test.dat 66135:save: NSR directive file (/root/.nsr) parsed
* tara.pmdg.lab:/root/test.dat 66135:save: NSR directive file (/root/.nsr) parsed
 tara.pmdg.lab: /root/test.dat     level=full,    124 MB 00:00:07      3 files
 tara.pmdg.lab: index:tara.pmdg.lab level=full,     5 KB 00:00:00      5 files
 tara.pmdg.lab: bootstrap          level=full,    102 KB 00:00:01    186 files
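To put a number on the difference between the two runs, the reported sizes can be compared directly. This is just a small worked calculation in Python using the two figures from the completion reports (1048 MB uncompressed, 124 MB compressed); the function name compression_summary is made up for illustration.

```python
def compression_summary(uncompressed_mb, compressed_mb):
    """Return (ratio, saving) for two reported saveset sizes in the same unit."""
    ratio = uncompressed_mb / compressed_mb
    saving = 1 - compressed_mb / uncompressed_mb
    return ratio, saving


# Sizes reported for /root/test.dat without and with client side compression.
ratio, saving = compression_summary(1048, 124)
print(f"{ratio:.1f}:1, {saving:.1%} less data written")
```

Both inputs are sizes as reported by the savegroup completion, so the result describes what was written to media in each run, not anything about the file on disk.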

So the next question is – is this a good thing?

The answer is a little fluid. The correct answer, I think, is that both sizes should be recorded. Clearly, for the purposes of backwards compatibility, current sizing values need to continue to report the data written to media. However, there is significant merit in adding another field to the database (e.g., clsize) that would report the amount of data the client reads for the backup. This would save a lot of hassle. (The “totalsize” field is not used for this, by the way.)

In the meantime, we just have to keep in mind that the size reported by mminfo, the savegroup completion, etc., is the size written to media, or, if you will, the size transferred from the client to the storage node.

© 2012 The NetWorker Blog