Every now and then the question arises of whether snapshots are backups.
This is going through a resurgence at the moment, as NetApp has dropped development of their VTL systems, and there are some indications that they’re going to revert to recommending that people use snapshots and replication for backup.
So this raises the question again – is a snapshot a backup? I’ll start by quoting from my book here:
A backup is a copy of any data that can be used to restore the data as/when required to its original form. That is, a backup is a valid copy of data, files, applications, or operating systems that can be used for the purposes of recovery.
On the face of this definition, a snapshot is indeed a backup, and I’d agree that on a per-instance basis snapshots can act as backups. However, I’d equally argue that building your entire backup and recovery system on the basis of snapshots and replication is like building a house of cards on shifting sand in the face of an oncoming storm. In short, I don’t believe that snapshots and replication alone provide:
- Sufficient long-term protection.
- Sufficient long-term management.
- Sufficient long-term performance.
I’ll be the first to argue that in a system with high SLAs, having snapshots and/or replication is going to be very close to a hard requirement. You can’t meet a 1-hour data loss deadline if you only back up once every 24 hours – and backing up every hour using conventional backup systems is rarely appropriate (and often doesn’t even work). So I’m not dismissing snapshots at all.
It’s easy to discuss the theoretical merits of using snapshots in lieu of backup/recovery software as a total backup system, but I think the practical considerations quickly overcome any theoretical discussion. So let’s consider a situation where you want to keep your backups for 6 months. (These days that’s a fairly short period.) Do you really want to keep 6 months of snapshots around? Let’s assume we keep hourly snapshots for 2 weeks, then one snapshot per day for the rest of the time. Taking 6 months as 26 weeks, that’s 336 hourly snapshots plus 168 daily ones: 504 snapshots per system – in fact, normally per NAS filesystem. Say you’ve got 4 NAS units and 30 filesystems on each one – that’s 120 filesystems, or around 60,000 snapshots retained over the course of 6 months.
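For anyone who wants to check the numbers or plug in their own retention policy, here is a minimal sketch of the arithmetic. The function and constant names are made up for illustration, and treating 6 months as 26 weeks (182 days) is an assumption chosen because it reproduces the 504 figure.

```python
# Rough snapshot-count estimate for the retention scheme described above.
# Assumption: "6 months" is treated as 26 weeks (182 days).

HOURLY_DAYS = 14            # keep hourly snapshots for 2 weeks
RETENTION_DAYS = 26 * 7     # total retention window in days (assumed)
NAS_UNITS = 4
FILESYSTEMS_PER_NAS = 30


def snapshots_per_filesystem(hourly_days: int, retention_days: int) -> int:
    """Hourly snapshots for the first period, then one per day afterwards."""
    hourly = hourly_days * 24
    daily = retention_days - hourly_days
    return hourly + daily


per_fs = snapshots_per_filesystem(HOURLY_DAYS, RETENTION_DAYS)
total = per_fs * NAS_UNITS * FILESYSTEMS_PER_NAS

print(f"Snapshots per filesystem: {per_fs}")
print(f"Snapshots across all filesystems: {total}")
```

Changing the retention window or the number of filesystems shows how quickly the total grows; even modest environments end up in the tens of thousands.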
What are 60,000+ snapshots going to do to:
- Primary production storage performance?
- Storage and backup administrator management?
- Storage costs?
- Indexing costs?
The argument that snapshots and replication alone can replace a healthy enterprise backup system (or act in lieu of it) just doesn’t wash as far as I’m concerned. It looks good on paper to some, but on closer inspection, it’s a paper tiger. In environments with heavy SLAs, snapshots and replication are likely to form part of the data protection solution, but they shouldn’t be the only solution.












