For the most part we run standard backups once every 24 hours – daily. A lot of the time, if you need to meet recovery point objectives smaller than this, you’ll be looking at complementing backups with snapshots, CDP, etc.

However, sometimes snapshots and other high-availability options aren’t really what we want – we just want to be able to run a backup more often than once every 24 hours, and have it run automatically. (For instance, on particularly busy Oracle systems, you might want archived redo logs backed up every 4 hours, with logs deleted after 2 backups.)

Thankfully, NetWorker supports this (and has done for quite some time), via the interval setting in groups. By default, this is set to “24:00” – 24 hours. It can, however, be set to a smaller value, which will trigger the group to run more frequently.

Before we consider smaller intervals, let’s first revisit the key timing settings involved in a traditional group:

  • Start Time – The time the group is configured to run. (Defaults to 03:33*).
  • Interval – How often the group is configured to run. (Defaults to 24 hours).
  • Restart Window – The number of hours after the start time during which a restarted group will re-run only those savesets that failed or never ran, instead of re-running the entire group. (Defaults to 12 hours.)

Now, all these options are still used (and required) under higher frequency backups, with their meaning as follows:

  • Start Time – When the group is first run. This can be anything within a standard 24 hour window.
  • Interval – How often the group will re-run. This is not affected by when the group finishes.
  • Restart Window – Same as for standard interval backups.
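To make the restart window concrete, here is a minimal sketch of the decision it describes. This is not NetWorker code: the function name, the return strings, and the choice to treat the window boundary as inclusive are all assumptions made purely for illustration.

```python
from datetime import datetime, timedelta


def restart_mode(group_start: datetime, restarted_at: datetime,
                 restart_window_hours: float) -> str:
    """Illustrative only: decide what a restarted group would re-run.

    Assumption: a restart exactly on the window boundary still counts as
    being inside the window.
    """
    if restarted_at < group_start:
        raise ValueError("a group cannot be restarted before it started")
    inside_window = (restarted_at - group_start) <= timedelta(hours=restart_window_hours)
    # Inside the window: only savesets that failed or never ran are re-run.
    # Outside the window: the entire group is re-run.
    return "failed-or-unrun-savesets-only" if inside_window else "entire-group"


start = datetime(2012, 1, 1, 0, 1)
print(restart_mode(start, datetime(2012, 1, 1, 1, 30), 2))
print(restart_mode(start, datetime(2012, 1, 1, 3, 0), 2))
```

With a group started at 00:01 and a 2 hour restart window, a restart at 01:30 falls inside the window, while a restart at 03:00 falls outside it.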

So, let’s go back to that sample requirement – Oracle archived redo log backups run every 4 hours. Let’s consider setting up a new group that does this, with the backups starting at 00:01 initially, then running every 4 hours after that – i.e.,

  • 00:01
  • 04:01
  • 08:01
  • 12:01
  • etc
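The full list of start times for a day can be worked out from the start time and interval alone. The sketch below does that arithmetic; the function name is hypothetical, and it assumes runs simply repeat at a fixed interval from the start time (independent of when the previous run finished), with the interval written in the same “H:MM” style as the group setting.

```python
from datetime import datetime, timedelta


def run_times(start_time: str, interval: str) -> list[str]:
    """List the HH:MM times a group would start over one 24-hour period."""
    start = datetime.strptime(start_time, "%H:%M")
    # Parse the interval by hand, since "24:00" is not a valid clock time.
    hours, minutes = (int(part) for part in interval.split(":"))
    step = timedelta(hours=hours, minutes=minutes)
    if step <= timedelta(0):
        raise ValueError("interval must be positive")

    times = []
    elapsed = timedelta(0)
    while elapsed < timedelta(hours=24):
        times.append((start + elapsed).strftime("%H:%M"))
        elapsed += step
    return times


print(run_times("00:01", "4:00"))
# ['00:01', '04:01', '08:01', '12:01', '16:01', '20:01']
```

With the default interval of “24:00” the same function returns a single start time, which matches the traditional once-a-day group.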

Here’s what this group configuration would look like in NMC:

Group settings in NMC (1 of 2)


In the first pane, it looks fairly standard – setting a start time of 00:01, and enabling autostart. It’s the second pane where things are a little different:

Group settings in NMC (2 of 2)


Here, we set the interval to 4 hours, and the restart window to 2 hours.


* I’m told that there were some ‘fun’ numbers used by early NetWorker programmers. E.g., one of the original index checks used to run every π weeks (or more correctly, every 22/7 weeks). It’s possible, however, that the critical situation engineer who told me this may have been pulling my leg. I do think, though, that given how many people dislike backups, 03:33 may have been chosen as a start time as a play on 6:66!

 

One of the most common configuration issues I see is where multiple NetWorker groups are configured to start simultaneously. For example, you might see a situation where, say, the following groups all start at the same time:

  • Daily Servers
  • Monthly Servers
  • Yearly Servers

A common response when I express concern over this is “even though they all start at once, only one group will ever be backing up”. (I.e., skips are deployed appropriately.)

This isn’t sufficient.

Starting multiple groups simultaneously causes what I like to refer to as server spikes – that is, sudden, sharp increases in server resource usage. By ‘resource usage’, I’m not necessarily referring to memory and CPU, though spikes in those can occur, but to internal NetWorker communications and resource usage.

When server spikes occur, odd things can happen – randomly and often intermittently, but they do happen. Savesets might unexpectedly drop communications with the server and need to be restarted (or worse, hang, then continue once a second saveset for the same client/data is started by the server, creating load on the client); a single media load instruction might fail; or a single nsrmmd process might time out and be restarted.

There’s an easy solution for this, and one which everyone should follow:

Never, ever, have more than one group start at the same time.

You don’t have to have a big gap. I’ve typically found that 5-10 minutes is ample. If each group starts on its own, the server behaves considerably more smoothly, and fewer weird intermittent/random failures occur. (If your response is that your backup windows don’t allow a five minute gap between groups, I’d argue with reasonable confidence, even having not seen your site, that your backup configuration needs to be re-evaluated.)
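If you have a lot of groups, checking for clashes by eye gets tedious. Below is a rough sketch of the check, assuming you have already pulled each group’s name and start time into a dictionary by whatever means suits your site. The function names, the 21:00 start times, and the 5 minute default gap are illustrative assumptions, not anything NetWorker provides, and the check looks only at each group’s configured start time (it does not expand groups that repeat on a shorter interval).

```python
from itertools import combinations

MINUTES_PER_DAY = 24 * 60


def to_minutes(hhmm: str) -> int:
    """Convert an HH:MM start time to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def colliding_groups(start_times: dict[str, str],
                     min_gap_minutes: int = 5) -> list[tuple[str, str]]:
    """Return pairs of groups whose start times are closer than the minimum gap."""
    collisions = []
    for (name_a, time_a), (name_b, time_b) in combinations(start_times.items(), 2):
        diff = abs(to_minutes(time_a) - to_minutes(time_b))
        # Measure the gap around the clock, so 23:58 and 00:01 count as 3 minutes apart.
        gap = min(diff, MINUTES_PER_DAY - diff)
        if gap < min_gap_minutes:
            collisions.append((name_a, name_b))
    return collisions


# Hypothetical start times for the three groups mentioned above.
groups = {
    "Daily Servers": "21:00",
    "Monthly Servers": "21:00",
    "Yearly Servers": "21:00",
}
print(colliding_groups(groups))
```

Moving the hypothetical Monthly and Yearly groups to 21:10 and 21:20 makes the function return an empty list.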

© 2012 The NetWorker Blog