Smoothing Fulls

A common convention in organisations is that full backups are run on the weekend – regardless of whether that’s every weekend, or just one weekend a month.

While there are undoubtedly some businesses that must run backups like this, it is by and large done out of convention rather than consideration; i.e., it is frequently done because “it’s the done thing” rather than “it’s the necessary thing”.

This, unsurprisingly, puts pressure on backup resources, hardware requirements and data growth management – pressure that may actually be totally unnecessary. In short: if you do your full backups on the weekend because “that’s the way it’s always been done” then you might want to stop and consider the alternative.

Let’s take a sample small business, and map out their weekly backup cycle. This isn’t actually one of my customers, but it is an average of several of them:

  • Monday – 150 GB backup.
  • Tuesday – 203 GB backup.
  • Wednesday – 168 GB backup.
  • Thursday – 317 GB backup.
  • Friday – 2114 GB backup.
  • Saturday – 3619 GB backup.
  • Sunday – 744 GB backup.

If you’re doing a basic weekly backup cycle in a small to medium company then this sort of minimum, maximum and peak loading probably looks awfully familiar.

Graphing this out, we get an interesting view of the backup size peaks and troughs:

GB backup non-smoothed (bar)

“Streuth!”, an ocker Australian would say, “That’s a bloody big difference in backup sizes!”

In fact, if we look at those backup sizes on a comparative pie chart, we get the following:

GB backup non-smoothed (pie)

Viewed this way, the weekend backups take up a significant percentage of the overall backup activity – which means they become a dominating factor in determining an optimum backup environment size. In fact, it shows us that about 88.5% of the total amount of data backed up in a week is backed up in just 43% of the week – 3 out of the 7 days. The remaining 11.5% of data, backed up Monday through Thursday, places almost no pressure on the backup environment.
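For anyone wanting to run the same check against their own numbers, here is a minimal sketch of the arithmetic behind those percentages. The daily sizes are the ones listed above; the variable names are mine and purely illustrative.

```python
# Daily backup sizes in GB, taken from the weekly cycle listed above.
daily_gb = {
    "Monday": 150,
    "Tuesday": 203,
    "Wednesday": 168,
    "Thursday": 317,
    "Friday": 2114,
    "Saturday": 3619,
    "Sunday": 744,
}

# The days on which the full backups run in this example.
peak_days = ("Friday", "Saturday", "Sunday")

total_gb = sum(daily_gb.values())
peak_gb = sum(daily_gb[day] for day in peak_days)

# Fraction of the week's data moved on the peak days,
# and the fraction of the week those days represent.
peak_share = peak_gb / total_gb
week_share = len(peak_days) / len(daily_gb)

print(f"Total for the week: {total_gb} GB")
print(f"Fri-Sun: {peak_gb} GB "
      f"({peak_share:.1%} of the data in {week_share:.0%} of the week)")
```

Swap in your own daily totals from your backup reports to see how lopsided your week is.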

If we come up with an average backup speed – let’s say 50MB/s for a smaller environment – we can see how long, in average terms, each day’s backup takes:

Hours to backup (non-smoothed) at 50MB/s

Ouch – our system barely needs to tick over Monday through to Thursday, but once the weekend hits, it’s really having to work hard to get everything backed up.
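The hours in that graph come from a simple size-over-throughput calculation, which can be sketched as below. One assumption to flag: I'm treating 1 GB as 1024 MB here; if your reporting uses decimal gigabytes, set the constant to 1000 and the results will come out slightly lower.

```python
MB_PER_GB = 1024        # assumption: binary gigabytes; use 1000 for decimal
THROUGHPUT_MB_S = 50    # average backup speed used in this article


def hours_to_back_up(size_gb, throughput_mb_s=THROUGHPUT_MB_S):
    """Return the hours needed to move size_gb at a sustained throughput."""
    seconds = size_gb * MB_PER_GB / throughput_mb_s
    return seconds / 3600


# Daily backup sizes in GB for the weekend-fulls schedule.
daily_gb = {
    "Monday": 150,
    "Tuesday": 203,
    "Wednesday": 168,
    "Thursday": 317,
    "Friday": 2114,
    "Saturday": 3619,
    "Sunday": 744,
}

hours = {day: hours_to_back_up(gb) for day, gb in daily_gb.items()}

for day, h in hours.items():
    print(f"{day:<9} {h:5.2f} h")
```

This is an average-speed model only. It ignores things like parallel streams, tape changes and slow clients, so treat the output as a rough guide rather than a prediction.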

The net result? Backup windows may be regularly overrun, and even a moderate amount of data growth may necessitate new capital investment.

Or will it? Let’s instead consider the same amount of data backed up, but with full backups spread out over the entire week. Now, admittedly here I’m not averaging numbers, but spreading sizes pseudo-randomly out over the week to roughly match the previous weekly total (7,365 GB here versus 7,315 GB before). So our numbers instead look like:

  • Monday – 983 GB
  • Tuesday – 733 GB
  • Wednesday – 842 GB
  • Thursday – 928 GB
  • Friday – 1357 GB
  • Saturday – 1536 GB
  • Sunday – 986 GB

[Edit: Qualification – an anonymous reader here questioned whether I meant doing a full backup every day. I didn’t quite explain my thinking here, sorry. I mean spreading out the full backups so that instead of trying to do them all over a short period, they’re spread out over the week. E.g., instead of every server doing a full backup on the weekend, some would do full backups on Monday, some on Tuesday, some on Wednesday, etc.]
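One way to decide which server gets its full on which day can be sketched as follows. To be clear, this is not how I produced the numbers above (those were spread pseudo-randomly); it's a simple greedy approach that takes the largest server first and gives it to whichever day currently has the lightest load. The server names and sizes are hypothetical, and the sketch only balances the fulls, ignoring the incrementals that run on the other days.

```python
import heapq

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def spread_fulls(server_sizes_gb, days=DAYS):
    """Assign each server's full backup to a day of the week.

    Greedy heuristic: take servers largest-first and place each one on
    the day with the smallest total so far. Returns (schedule, loads).
    """
    # Heap entries are (load in GB, day index, day name); the index
    # breaks ties in week order so day names are never compared.
    heap = [(0, i, day) for i, day in enumerate(days)]
    heapq.heapify(heap)
    schedule = {day: [] for day in days}

    by_size = sorted(server_sizes_gb.items(),
                     key=lambda item: item[1], reverse=True)
    for server, size_gb in by_size:
        load, i, day = heapq.heappop(heap)
        schedule[day].append(server)
        heapq.heappush(heap, (load + size_gb, i, day))

    loads = {day: load for load, _, day in heap}
    return schedule, loads


# Hypothetical servers and full backup sizes in GB (illustrative only).
servers_gb = {
    "srv01": 820, "srv02": 640, "srv03": 610, "srv04": 540,
    "srv05": 480, "srv06": 450, "srv07": 400, "srv08": 380,
    "srv09": 310, "srv10": 260, "srv11": 220, "srv12": 180,
    "srv13": 150, "srv14": 120,
}

schedule, loads = spread_fulls(servers_gb)
for day in DAYS:
    print(f"{day:<9} {loads[day]:5d} GB  {', '.join(schedule[day])}")
```

In practice you would also want to account for business constraints, such as a database that must have its full on a particular night, before settling on a schedule.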

If we graph that, using the same minimum/maximum as before, the spreading of full backups has smoothed the daily backup sizes considerably:

GB backup smoothed (bar)

Moving on to a pie graph, we can see that no single day dominates like before:

GB backup smoothed (pie)

While Friday/Saturday/Sunday still create a reasonable hit in the backup sizing, together they now account for just 53% of the weekly total. So the balancing has substantially reduced the strain of weekend backups – sure, each week day the system has to do a bit more, but the overall pressure is considerably less. This is strongly demonstrated by looking at the daily hours of operation, at 50MB/s:

Hours to backup at 50MB/s (smoothed)

Instead of minimum run times of under an hour and maximum run times of over 20 hours, we can now see a much more manageable peak run time of 8.74 hours.
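To put the two schedules side by side, here is a rough sketch that compares the peak run time of each and estimates how much data growth the busiest day could absorb before overrunning a backup window. The 12-hour window is a hypothetical figure of mine, not something from either schedule, and the sketch assumes growth is spread evenly across every day and that 1 GB is 1024 MB.

```python
MB_PER_GB = 1024        # assumption: binary gigabytes
THROUGHPUT_MB_S = 50    # average backup speed used in this article
WINDOW_HOURS = 12       # hypothetical backup window, for illustration only

# Daily sizes in GB, Monday through Sunday, from the two lists above.
weekend_fulls_gb = [150, 203, 168, 317, 2114, 3619, 744]
spread_fulls_gb = [983, 733, 842, 928, 1357, 1536, 986]


def peak_hours(daily_gb):
    """Run time in hours of the busiest day at the assumed throughput."""
    return max(daily_gb) * MB_PER_GB / THROUGHPUT_MB_S / 3600


def growth_headroom(daily_gb, window_hours=WINDOW_HOURS):
    """Fractional growth the busiest day can absorb within the window.

    A negative value means the busiest day already overruns the window.
    """
    return window_hours / peak_hours(daily_gb) - 1


for label, sizes in (("Weekend fulls", weekend_fulls_gb),
                     ("Spread fulls", spread_fulls_gb)):
    print(f"{label:<14} peak {peak_hours(sizes):5.2f} h, "
          f"headroom {growth_headroom(sizes):+.0%}")
```

With these assumptions, the weekend-fulls schedule is already past the window on its busiest day, while the spread schedule still has room to grow before any new hardware is needed.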

Next time you notice that your full backups are overrunning, or causing stress on your backup windows, stop for a moment and ask yourself: can you smooth your backup load by spreading the fulls out across the week? You may be surprised by the answer.

2 thoughts on “Smoothing Fulls”

    1. I may not have been as precise as I intended 🙂

      I’m referring to spreading out the full backups over the course of a week. E.g., taking easy numbers, consider a small site where there are 14 servers. You can either do full backups of all 14 servers over the weekend, or you could do a full backup of 2 servers every day. That way the fulls are smoothed over the course of the week, rather than being concentrated into a single small period.
