A question I get asked periodically is “can I back up my filesystem and database at the same time?”
As is often the case, the answer is: “it depends”.
Or, to put it another way: it depends on what the specific client can handle at the time.
For the most part, backup products have a fairly basic design requirement: get the data from the source (let’s say “the client”, ignoring options like ProtectPoint for the moment) to the destination (protection storage) as quickly as possible. The faster the better, in fact. So if we want backups done as fast as possible, wouldn’t it make sense to back up the filesystem and any databases on the client at the same time? Well – the answer is “it depends”, and it comes down to two things: the impact it has on the client, and whether the client is compatible with the process.
First, let’s consider compatibility – if the filesystem and database backup processes both use the same snapshot mechanism, for instance, and only one snapshot can be operational at any given time, that immediately rules out doing both at once. That’s the most obvious scenario, but the more subtle one almost comes back to the age-old parallelism problem: how fast is too fast?
If we’re conducting a complete filesystem read (say, in the case of a full backup) while simultaneously reading an entire database, and the database and filesystem both reside on the same physical LUN, there is the potential for the two reads to be counter-productive. If the underlying physical LUN is in fact a single disk, you’re practically guaranteed that’s the case. We wouldn’t normally want RAID-less storage for pretty much anything in production, but just slipping RAID into the equation doesn’t guarantee we can achieve both reads simultaneously without impact to the client – particularly if the client is already doing other things. Production things.
Virtualisation doesn’t write a blank cheque, either. Image level backup with databases in the image is a bit of a holy grail in the backup industry, but even in those situations where it may be supported, it’s not supported for every database type. So it’s still more common than not to see virtual/image level backups of the guest for crash consistency on the file and operating system components, and then an in-guest database agent running for that true guaranteed database recoverability. Do you want a database and image based backup happening at the same time? Your hypervisor is furiously reading the image file while the in-guest agent is furiously reading the database.
In each case that’s just at a per client level. Zooming out for a bit: in a datacentre with hundreds or thousands of hosts all accessing shared storage via shared networking, usually via shared compute resources as well, “how long is a piece of string?” becomes an exponentially harder question as the number of shared resources, and the number of items sharing them, grows.
Unless you have an overflow of compute resources and SSD offering more IO than your systems can ever need, “can I back up my filesystem and databases at the same time?” is very much a non-trivial question. In fact, it becomes a bit of an art, as does all performance tuning. So rather than directly answering the question, I’ll make a few suggestions to be considered along the way as you answer the question for your environment:
- Recommendation: Particularly for traditional filesystem agent + traditional database agent backups, never start the two within five minutes of each other, and preferably leave a half-hour gap between starts. I.e., overlap is OK, but starting both at the same moment should be avoided where possible.
- Recommendation: Make sure the two functions can be concurrently executed. I.e., if one blocks the other from running at the same time, you have your answer.
- Remember: It’s all parallelism. Rather than a former CEO leaping around stage shouting “developers, developers, developers!” imagine me leaping around shouting “parallelism, parallelism, parallelism!”* – at the end of the day each concurrent filesystem backup uses a unit of parallelism and each concurrent database backup uses a unit of parallelism, so if you exceed what the client can naturally do based on memory, CPU resources, network resources or disk resources, you have your answer.
- Remember: Backup isn’t ABC, it’s CDE: Compression, Deduplication, Encryption. Each function will adjust the performance characteristics of the host you’re backing up – sometimes subtly, sometimes not so. Compression and encryption are easier to understand: if you’re doing either as a client-CPU function you’re likely going to be hammering the host. Deduplication gets trickier of course – you might be doing a bit more CPU processing on the host, but over a shorter period of time if the net result is a 50-99% reduction in the amount of data you’re sending.
- Remember: You need the up-close and big picture view. It’s rare we have systems so isolated any more that you can consider this in the perspective of a single host. What’s the rest of the environment doing or likely to be doing?
- Remember: ‘More magic’ is better than ‘magic’. (OK, it’s unrelated, but it’s always a good story to tell.)
- Most importantly: Test. Once you’ve looked at your environment, once you’ve worked out the parallelism, once you’re happy the combined impact of a filesystem and database backup won’t go beyond the operational allowances on the host – particularly on anything remotely approaching mission critical – test it.
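As a starting point for that kind of check, here is a minimal Python sketch of the two mechanical suggestions above: the gap between start times, and the "units of parallelism" sum. Everything in it is hypothetical (the `BackupJob` class, the function names, the stream counts, the client limit); it models no particular backup product, and it pessimistically assumes the two jobs overlap for their whole run. It tells you nothing about CPU, network or disk contention, which is what the testing is for.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the recommendation above; all names are hypothetical.
MIN_START_GAP = timedelta(minutes=5)         # never start within five minutes
PREFERRED_START_GAP = timedelta(minutes=30)  # preferably half an hour apart


@dataclass
class BackupJob:
    name: str
    start: datetime
    streams: int  # units of parallelism this job consumes on the client


def start_gap_verdict(a: BackupJob, b: BackupJob) -> str:
    """Classify the gap between two job start times."""
    gap = abs(a.start - b.start)
    if gap < MIN_START_GAP:
        return "too close"
    if gap < PREFERRED_START_GAP:
        return "acceptable"
    return "preferred"


def fits_parallelism(jobs: list[BackupJob], client_parallelism: int) -> bool:
    """True if the jobs' combined streams stay within the client's limit.

    Assumes the worst case: every job is running at the same time.
    """
    return sum(job.streams for job in jobs) <= client_parallelism


# Illustrative values only; substitute what your client can actually sustain.
fs = BackupJob("filesystem", datetime(2016, 1, 1, 22, 0), streams=4)
db = BackupJob("database", datetime(2016, 1, 1, 22, 30), streams=2)

print(start_gap_verdict(fs, db))                         # preferred
print(fits_parallelism([fs, db], client_parallelism=4))  # False
```

In this made-up example the start times are fine, but the combined six streams exceed a client limit of four, so on the parallelism test alone you would already have your answer.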
If you were hoping there was an easy answer, the only one I can give you is “don’t”, but that’s just making a blanket assumption that you can never or should never do it. It’s the glib/easy answer – the real answer is: only you can answer the question.
But trust me: when you do, it’s immensely satisfying.
On another note: I’m pleased to say I made it into the EMC Elect programme for another year – that’s every year since it started! If you’re looking for some great technical people within the EMC community (partners, employees, customers) to keep an eye on, make sure you check out the announcement page.
—
* Try saying “parallelism, parallelism, parallelism!” three times fast when you had a speech impediment as a kid. It doesn’t look good.