Zeroth tier: the backup director

(OK, I just made that term up; nowhere within the NetWorker framework is there any reference to a “zeroth” tier. That doesn’t preclude me from using the term, though.)

The classic 3-tier architecture of NetWorker is:

  • Backup Server
  • One or more storage nodes (one of which is the backup server)
  • Clients

As a standard environment grows, you typically see clients hived off to storage nodes, such that the backup server handles only a portion of the backups, with the remainder going to the storage nodes.

One thing that’s not always considered is what I’d call the ability to configure the NetWorker server in a zeroth tier; that is, acting only as a backup director, and not being responsible for the data storage or retrieval of any client.
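
To make that concrete: the mechanism is the “storage nodes” attribute of each client resource, which lists, in order of preference, the hosts that client’s data may be written to (the default is the special keyword nsrserverhost – i.e., the backup server). Here’s a minimal sketch using nsradmin, with hypothetical names – a client “mars” and storage nodes “sn1” and “sn2”; with the server absent from the list, that client’s backup data never touches it:

    # Inside an nsradmin session against the backup server
    # (nsradmin -s backupserver), select the client resource and
    # point its data at the storage nodes only:
    nsradmin> . type: NSR client; name: mars
    nsradmin> update storage nodes: sn1, sn2
    nsradmin> print

Repeat for every client in the datazone and the server is left handling control traffic and client file indexes – directing, not storing.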

Is this a new tier? Well, technically no, but it’s a configuration that a lot of companies, even bigger companies, seem reluctant to engage in. For the most part this seems to be due to the perception that by elevating the backup server to a directorial role only, the machine is ‘wasted’ or the solution is ‘costly’. Unfortunately, this means many organisations that could really, really benefit from having a backup server in this zeroth tier continue to limp along with solutions that suffer random, sporadic failures that can’t be accounted for, or that require periodic restarts of services just to “reset” everything.

Now, the backup server still has to have at least one backup device attached to it – the design of NetWorker requires the server itself to write out its media database and resource database. There’s a good reason for this, in fact – if you allow such bootstrap-critical data to be written solely to a remote device (i.e., a storage node device), you create too many dependencies and setup tasks in a disaster recovery scenario.
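
If you want to keep an eye on this, it’s easy enough to check where the bootstraps are actually landing. A quick sketch – both commands are standard NetWorker tools, though the output format varies between releases:

    # Report recent bootstrap save sets, including the volume each
    # one was written to:
    mminfo -B

    # List the configured devices and what's mounted in each, to
    # confirm those volumes live on the server's own device(s):
    nsrmm

If the bootstrap volumes turn out not to be ones mounted on the server’s local devices, that’s worth fixing before a disaster recovery forces the issue.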

However, if you’re at the point where you need a NetWorker server in the zeroth tier, you should be able to find the budget to allocate at least one device to the NetWorker server. (E.g., a bit of dynamic drive sharing, or dedicated VTL drives, would be one option.) Preferably, of course, that would be two devices, so that cloning can be handled device-to-device rather than across the network to a storage node – but I don’t want to focus too much on the device requirements of a directorial backup server.
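
For what it’s worth, with two local devices the cloning can stay entirely inside the server. A rough sketch, assuming the stock “Default Clone” pool and a daily cloning window (adjust the query and pool to taste):

    # Gather the save set IDs from the last day's backups and clone
    # them; with source and target devices both attached to the
    # server, nothing crosses the network:
    mminfo -q "savetime>=yesterday" -r ssid | xargs nsrclone -b "Default Clone" -S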

There’s actually a surprising amount of work that goes into just directing a backup (some of it can be seen in the probe sketch after this list). This covers such activities as:

  • Checking to see what backups need to be run at any point (e.g., multiple groups)
  • Enumerating what clients need to be backed up in any group
  • Communicating with each client
  • Receiving index data from each client
  • Coordinating device access
  • Updating media records
  • Updating jobs records
  • Updating configuration database records
  • etc.
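
As promised, here’s a way to watch some of that meta-data work without moving any data at all: savegrp’s probe mode walks a group, communicates with each client and builds the work list, but stops short of running any backups. (The group name “Production” below is hypothetical, and exact options vary a little between releases.)

    # Probe the group: enumerate its clients, contact each one, and
    # report what would be saved - without performing any backup:
    savegrp -p -v Production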

In the grand scheme of things, where you don’t have “a lot” of clients, this doesn’t represent a substantial overhead. What we have to consider, though, is that there are two different types of communication going on – data and meta-data. Everything in the above list is meta-data related; none of it is actually the backup data itself.

So add to the above list the data streams that have one purpose in a normal backup environment – to saturate network links to maximise throughput to backup devices.

Evaluating these two types of communication – meta-data streams and data streams – there’s one very obvious conclusion: they don’t coexist happily. That is, the data stream is by necessity going to be as greedy with bandwidth as it can be, while the meta-data stream must equally have the bandwidth it requires, or else failures start to happen.

So, as an environment grows (or as NetWorker is deployed into a very large environment), the solution should be equally logical – if it gets to the point where the backup server can’t facilitate both meta-data bandwidth and regular data bandwidth, there’s only one communications stream that can be cut from its workload – the data stream.

I’m not suggesting that every NetWorker datazone needs to be configured this way; many small datazones operate perfectly well with no storage nodes at all (other than the backup server itself), and others operate perfectly well with one or more storage nodes deployed and the backup server also operating as a storage node. However, if the environment grows to the point where the backup server can be kept fully occupied just by directing the backups, then cut the cord and let it be the director.
