This is part 2 in the series, “Data Lifecycle Management”.
Penny-wise data lifecycle management refers to a situation where companies take the attitude that spending time and/or money on data lifecycle ageing is costly. It’s the old problem – penny-wise, pound-foolish; losing sight of long-term real cost savings by focusing on avoiding short-term expenditure.
Traditional backup techniques centre around periodic full backups with incrementals and/or differentials in-between the fulls. If we evaluate a 6 week retention strategy, it’s easy to see where the majority of the backup space goes. Let’s consider weekly fulls, daily incrementals, with a 3% daily change rate, and around 4TB of actual data.
- Week 1 Full – 4TB.
- Week 1 Day 1 Incr – 123 GB
- Week 1 Day 2 Incr – 123 GB
- Week 1 Day 3 Incr – 123 GB
- Week 1 Day 4 Incr – 123 GB
- Week 1 Day 5 Incr – 123 GB
- Week 1 Day 6 Incr – 123 GB
Repeat that over 6 weeks and you have:
- 6 x 4 TB of fulls – 24 TB.
- 6 x 6 incrementals – 4.3 TB.
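For anyone who wants to check those figures, here’s a minimal Python sketch of the sizing arithmetic. The binary TB/GB conversion and the rounding to 123 GB are my reading of the figures above, not anything prescribed:

```python
# Back-of-envelope sizing for the 6-week cycle described above.
# Assumptions from the text: 4 TB of actual data, weekly fulls,
# daily incrementals at a 3% change rate, 6 weeks of retention.
# 1 TB = 1024 GB is assumed, which reproduces the ~123 GB incrementals.

DATA_TB = 4
CHANGE_RATE = 0.03
WEEKS = 6
INCRS_PER_WEEK = 6

incr_gb = DATA_TB * 1024 * CHANGE_RATE                # ~123 GB per incremental
fulls_tb = WEEKS * DATA_TB                            # 24 TB of fulls
incrs_tb = WEEKS * INCRS_PER_WEEK * incr_gb / 1024    # ~4.3 TB of incrementals

print(f"Each incremental: {incr_gb:.0f} GB")
print(f"Fulls over {WEEKS} weeks: {fulls_tb} TB")
print(f"Incrementals over {WEEKS} weeks: {incrs_tb:.1f} TB")
```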
Now, let’s assume that 30% of the data in the full backups represents stagnant data – data which is no longer being modified. It may be periodically accessed, but it’s certainly not being modified any longer. At just 30%, that’s 1.2TB of a 4TB full, or 7.2TB of the total 24 TB stored in full backups across the 6 week cycle.
Now, since this is a relatively small amount of data, we’ll assume the backup speed is a sustained maximum throughput of 80MB/s. A 4 TB backup, at 80MB/s, will take 14.56 hours to complete. On the other hand, a 2.8 TB backup at 80MB/s will take 10.19 hours to complete.
On any single full backup then, not backing up the stagnant data would save 1.2TB of space and 4.37 hours of time. Over that six week cycle though, it’s a saving of 7.2 TB, and 26.22 hours of backup time. This is not insubstantial.
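Again, a quick sketch rather than anything definitive, using the same binary units and the 80MB/s sustained rate assumed above:

```python
# Backup window and space savings when 30% of each full is stagnant data.
# Assumptions from the text: 4 TB full, 80 MB/s sustained, 6 weekly fulls.

RATE_MB_S = 80
STAGNANT = 0.30
FULL_TB = 4
WEEKS = 6

def hours(tb):
    """Hours to stream `tb` terabytes at the assumed sustained rate."""
    return tb * 1024 * 1024 / RATE_MB_S / 3600

trimmed_tb = FULL_TB * (1 - STAGNANT)              # 2.8 TB per full
space_saved_tb = FULL_TB * STAGNANT * WEEKS        # 7.2 TB across the cycle
time_saved_h = (hours(FULL_TB) - hours(trimmed_tb)) * WEEKS

print(f"Full backup: {hours(FULL_TB):.2f} h; trimmed full: {hours(trimmed_tb):.2f} h")
print(f"Over {WEEKS} weeks: {space_saved_tb:.1f} TB and {time_saved_h:.2f} h saved")
```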
There are two ways we can deal with the stagnant data:
- Delete it or
- Archive it
Contrary to popular opinion, before we look at archiving data, we actually should evaluate what can be deleted. That is – totally irrelevant data should not be archived. What data is relevant for archiving and what data is irrelevant will be a site-by-site decision. Some examples you might want to consider would include:
- Temporary files;
- Installers for applications whose data is past long-term and archive retention;
- Installers for operating systems whose required applications (and associated data) are past long-term archive;
- Personal correspondence that’s “crept into” a system;
- Unnecessary correspondence (e.g., scanned faxes confirming purchase orders for stationery from 5 years ago).
The notion of deleting stagnant, irrelevant data may seem controversial to some, but only because of the “storage is cheap” notion. When companies paid significant amounts of money for physical document management, with that physical occupied space costing real money (rather than just being a facet in the IT budget), deleting was most certainly a standard business practice.
While data deletion is controversial in many companies, consideration of archive can also cause challenges. The core problem with archive is that when evaluated from the perspective of a bunch of individual fileservers, it doesn’t necessarily seem like much of a space saving. A few hundred GB here, maybe a TB there, with the savings largely dependent on the size of each fileserver and the age of the data on it.
Therefore, when we start talking to businesses about archive, we often start talking about fileserver consolidation – either to fewer traditional OS fileservers, or to NAS units. At this point, a common reason to balk is the perceived cost of such consolidation – so we end up with the perceptions that:
- Deleting is “fiddly” or “risky”, and
- Archive is expensive.
Either way, it effectively comes down to a perceived cost, whether that’s a literal capital investment or time taken by staff.
Yet we can still talk about this from a cost perspective and show savings for eliminating stagnant data from the backup cycle. To do so we need to talk about human resources – the hidden cost of backing up data.
You see, your backup administrators and backup operators cost your company money. Of course, they draw a salary regardless of what they’re doing, but you ultimately want them to be working on activities of maximum importance. Yes, keeping the backup system running by feeding it media is important, but a backup system is there to provide recoveries, and if your recovery queue has more items in it than the number of staff you have allocated to backup operations, it’s too long.
To calculate the human cost of backing up stagnant data, we have to start categorising the activities that backup administrators do. Let’s assume (based on the above small amounts of data) that it’s a one-stop shop where the backup administrator is also the backup operator. That’s fairly common in a lot of situations anyway. We’ll designate the following categories of tasks:
- Platinum – Recovery operations.
- Gold – Configuration and interoperability operations.
- Silver – Backup operations.
- Bronze – Media management operations.
About the only thing that’s debatable there is the relative ranking of configuration/interoperability operations and backup operations. My personal preference is the above, for the simple reason that backup operations should be self-managing once configured, but periodic configuration adjustments will be required, as will ongoing consideration of interoperability requirements with the rest of the environment.
What is not debatable is that recovery operations should always be seen to be the highest priority activity within a backup system, and media management should be considered the lowest priority activity. That’s not to say that media management is unimportant, it’s just that people should be doing more important things than acting as protein-based autoloaders.
The task categorisation allows us to rank the efficiency and cost-effectiveness of the work done by a backup administrator. I’d propose the following rankings:
- Platinum – 100% efficiency, salary-weight of 1.
- Gold – 90% efficiency, salary-weight of 1.25.
- Silver – 75% efficiency, salary-weight of 1.5.
- Bronze – 50% efficiency, salary-weight of 3.
What this allows us to do is calculate the “cost” (in terms of effectiveness, and impact on other potential activities) of the backup administrator spending time on the various tasks within the environment. So, this means:
- Platinum activities represent maximised efficiency of job function, and should not incur a cost.
- Gold activities represent reasonably efficient activities that only incur a small cost.
- Silver activities are still mostly efficient, with a slightly increased cost.
- Bronze activities are at best a 50/50 split between being inefficient or efficient, and have a much higher cost.
So, if a backup administrator is being paid $30 per hour, and does 1 hour each of the above tasks, we can assign hidden/human resource costs as follows:
- Platinum – $30 per hour.
- Gold – 1.1 * 1.25 * $30 = $41.25 per hour.
- Silver – 1.25 * 1.5 * $30 = $56.25 per hour.
- Bronze – 1.5 * 3 * $30 = $135 per hour.
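One way to read that arithmetic is hourly cost = (1 + inefficiency) x salary-weight x base rate. The formula is my interpretation of the figures rather than a formal model, but it reproduces each of the numbers above:

```python
# Hidden hourly cost per task category, reconstructed from the worked figures.
# The (1 + inefficiency) * weight * rate formula is an interpretation, not gospel.

BASE_RATE = 30  # dollars per hour

categories = {
    # name: (efficiency, salary_weight) as listed in the rankings above
    "Platinum": (1.00, 1.0),
    "Gold":     (0.90, 1.25),
    "Silver":   (0.75, 1.5),
    "Bronze":   (0.50, 3.0),
}

for name, (efficiency, weight) in categories.items():
    cost = (1 + (1 - efficiency)) * weight * BASE_RATE
    print(f"{name}: ${cost:.2f} per hour")   # 30.00, 41.25, 56.25, 135.00
```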
Some might argue that the above is not a “literal” cost, and sure, you don’t pay a backup administrator $30 for recoveries and $135 for media management. However, what I’m trying to convey is that not all activities performed by a backup administrator are created equal. Some represent best bang for buck, while others progressively represent less palatable activities for the backup administrator (and for the company to pay the backup administrator to do).
You might consider it thusly – if a backup administrator can’t work on a platinum task because a bronze task is “taking priority”, then that’s the penalty – $105 per hour of the person’s time. Of course though, that’s just the penalty for paying the person to do a less important activity. Additional penalties come into play when we consider that other people may not be able to complete work because they can’t get access to the data they need, etc. (E.g., consider the cost of a situation where 3 people can’t work because they need data to be recovered, but the backup administrator is currently swapping media in the tape library to ensure the weekend’s backups run…)
Once we know the penalty though, we can start to factor in additional costs of having a sub-optimal environment. Assume for instance, a backup administrator spends 1 hour on media management tasks per TB backed up per week. If 1.2TB of data doesn’t need to be backed up each week, that’s 1.2 hours of wasted activity by the backup administrator. With a $105 per hour penalty, that’s $126 per week wasted, or $6,552 per year.
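Spelled out as a calculation (the 1 hour per TB per week media handling rate comes from the scenario above; the 52-week year is my assumption):

```python
# Annual cost of backing up stagnant data, using the penalty model above.

BRONZE_COST = 135    # $/hour for media management, from the table above
PLATINUM_COST = 30   # $/hour for recovery work
PENALTY = BRONZE_COST - PLATINUM_COST   # $105/hour opportunity penalty

STAGNANT_TB_PER_WEEK = 1.2   # data that needn't be backed up each week
HOURS_PER_TB = 1.0           # assumed media management effort per TB

weekly_waste = STAGNANT_TB_PER_WEEK * HOURS_PER_TB * PENALTY   # $126/week
annual_waste = weekly_waste * 52                               # $6,552/year

print(f"Weekly waste: ${weekly_waste:.0f}; annual waste: ${annual_waste:,.0f}")
```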
So far then, we have the following costs of not deleting/archiving:
- Impact on backup window;
- Impact on media usage requirements (i.e., what you’re backing up to);
- Immediate penalty of excessive media management by backup administrator;
- Potential penalty of backup administrator managing media instead of higher priority tasks.
The ironic thing is that deleting and archiving is something that smaller businesses seem to get better than larger businesses. For smaller, workgroup style businesses, where there’s no dedicated IT staff, the people who do handle the backups don’t have the luxury of tape changers, large capacity disk backup or cloud (ha!) – every GB of backup space has to be carefully apportioned, and therefore the notion of data deletion and archive is well entrenched. Yearly projects are closed off, multiple duplicates are written, but then those chunks of data are removed from the backup pool.
When we start evaluating the real cost, in terms of time and money, of continually backing up stagnant data, the reasons against deleting or archiving data seem far less compelling. Ultimately, for safe and healthy IT operations, the entire data lifecycle must be followed.
In the next posts, we’ll consider the risks and challenges created by only archiving, or only deleting.
Or you could back up to, e.g., Data Domain, which means that the extra retention of “unchanged” data will not cost you additional space/storage, since deduplication will remove any duplicate data.
Backing up stagnant data, even via deduplication, is still a wasteful endeavour. Of course there’ll be some level of stagnant data that is always backed up, but the solution is not to keep backing up the same stale data again and again; even if we’re not storing large amounts more data in the backup, it still takes time to process, it still occupies primary storage, and it still costs.