Basics – Understanding NetWorker Dependency Tracking

 Backup theory, NetWorker  Comments Off on Basics – Understanding NetWorker Dependency Tracking
Sep 162017

Dependency tracking is an absolutely essential feature within a backup product. It’s there to ensure you can recover data through the entire specified retention period for your backups, regardless of what mix of full, differential and/or incremental backups you do. It’s staggering to think there are some backup products out there (*cough* net *cough* ‘backup’), that treat backup retention with such contempt that they don’t bother to enforce dependency preservation.

Without dependency tracking, you’ve always got the risk that a recovery you want to do on the edge of your specified retention period might fail.

NetWorker does dependency tracking by default. In fact, it only does dependency tracking. To understand how dependency tracking works, and what that means for protecting your backups, check out my video below. (Make sure to switch it into High Definition – it’s not about being able to see more of my beard, but it is to make sure you can see all the screen content!)

Dependency tracking is such an important feature in data protection that you’ll find it’s also covered in my book, Data Protection: Ensuring Data Availability.

On another note, I’m starting a new project. I may work in IT, but I’ve always been a fan of philosophy, too. The new project is called Fools Rush In, and it’s going to be an ongoing weekly exploration of topics relating to ethics in IT and modern technology. It’s going to be long-form in its approach – the perfect thing to sit down and read over a cup of coffee or tea. This’ll be an exciting journey, and I’d love it if you joined me on it. The introductory article is …where angels fear to tread, and the latest post, What is Ethics? gives a bit of a primer on schools of ethical thought and how we can start approaching ethics in IT/technology.

Jan 082010

I thought it about time that I cited the two key reasons why, if faced with a choice between NetWorker and NetBackup, I would choose NetWorker every time.

As you might expect, given my focus on backup as insurance, both of these reasons are firmly focused on recovery. In fact, so much so that I still don’t really understand why EMC doesn’t go to market with these points time and time and time again and just smack Symantec around until it’s blue in the face and begging for mercy.

Reason 1: NetBackup does not implement backup dependencies

I struggle to call NetBackup an “enterprise” backup product because of this simple fact. Honestly, backup dependencies are critically important when it comes to guaranteeing anything but last-backup recoverability.

What does this mean?

In short, as soon as a backup hits its retention period in NetBackup, it’s toast – it’s a goner.

Irrespective of whether there are any backups of the same filesystem/data set that requires the “outside retention” backup for recovery purposes.

I can’t sum this up any other way: in a backup product, I see this as recklessly irresponsible. It provides a focus on media savings that even the most miserly bean cruncher would admire. Well, until the bean cruncher’s system can’t be recovered from 6 weeks ago to fulfil audit requirements.

Reason 2: True Image Recovery is “optional”

If you’ve grown up in a NetWorker world, where the emphasis has always been, and will always continue to be on recovery, this will, like the reason above, make you soil yourself. Imagine having a full backup plus six incremental backups of a directory, and wanting to recover the filesystem from last night. Now imagine just selecting the full plus the incrementals for recovery and getting back everything generated during that time.

Even the files that had been deleted between backups. I.e., you don’t get back what the filesystem looked like at the time of the backup that you’re recovering from, but what it looked like for every backup that you’re recovering from.

NetWorker, once, in the 5.5.x stream implemented this. It was called a BUG. In NetBackup, it’s a “feature”. In order to enable a correct recovery, you have to turn on “true image recovery”, something that takes extra resources, and is typically advised  that you keep the data just for a small cycle (e.g., 7 days) rather than the complete retention time for the backups.

There’s another word for this: Joke.

On another front…

As recently as December I mentioned that I wished EMC would get their act together and implement inline cloning – one of the few things where I saw that NetBackup had a distinct competitive advantage over NetWorker.

Maybe it was the glow of the cider, but I had an epiphany in Copacabana on a hill watching (probably illegal) fireworks in Avoca and Terrigal on new years eve. Inline cloning is no longer a compelling factor in a backup product. Why? Media streaming speeds have reached a point where companies with serious amounts of data just should not be implementing direct-to-tape backup solutions any more. Inline cloning was developed at a time when you’d want to generate both sets of tapes as quickly as possible, but only companies with very small data sets will find themselves not backing up to some disk unit first (be it say, ADV_FILE, or VTL, in NetWorker), and those companies won’t be constrained on backup/clone windows to a point where they’d need inline cloning anyway.

When not backing up direct-to-tape, there are several factors that mitigate the need to do inline cloning. In organisations with a very strong need for offsiting, there’s replication at a VTL or disk backup unit layer. In organisations that just need a second copy generated “as soon as possible”, doing disk/virtual tape to physical tape cloning following the backup should be fast enough to handle the cloning at appropriate performance levels.

In other words: there’s no need for EMC to implement inline cloning. As a technology, it’s a dead-end from a tape-only time. I feel somewhat silly this didn’t occur to me sooner.

%d bloggers like this: