A Simple Linux KVM Backup Framework using NetWorker

Of late I’ve been playing around a little bit with KVM. I must admit, it’s not really my virtualisation system of choice, but I’ve been using VMware in some form or another since the late 90s, so that could as much as anything be a case of what-I’m-used-to-syndrome.

Because I’m a backup geek, once I’ve been using something for a short while, my thoughts invariably turn to the question: how would I back this up? This leads to gratitude that NetWorker and Avamar are both framework-based backup products: sure, they come with a lot of built-in functionality, but if you want something that’s not there by default, and you have a little bit of determination, you can always come up with a solution.

The simplest solution for KVM would be to just shut down the virtual machines when I needed to, then back them up. That’s fine for a lab environment, which is where I’m running them, but I don’t like doing things as if I’m in a lab environment; instead, I wanted to come up with something that could be run in a more life-like situation. So that meant being able to do backups around snapshots.

With Linux KVM, that’s a bit of work. My test environment was a CentOS 7 server, which meant I had to go through the process of enabling the out-of-band QEMU options for CentOS, because the default KVM install includes older API versions that don’t support live snapshot processes. On the CentOS 7 server, I deployed a simple CentOS 7 guest, too. Ideally, if you want to be able to quiesce KVM guest filesystems when you back up (i.e., commit writes to better capture the content of the filesystem during backup), you’ll need to ensure the virtual machine has the qemu-guest-agent package installed as well.

The short view of the end-to-end process works like this:

  • Identify the virtual machine(s) running on a KVM server
  • For each virtual machine:
    • Identify the virtual disk(s)
    • Take a snapshot of the virtual machine’s disks
    • Back up the original virtual machine files (QCOW2 format)
    • Release the snapshot and delete the snapshot file(s)
    • Generate an XML configuration dump of the virtual machine
    • Back up the XML configuration dump
    • Delete the XML configuration dump
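As a rough illustration of those steps, here is a minimal Python sketch that only builds the command list for one guest and executes nothing. It is not the savekvm.pl script: the virsh calls are one common way of doing an external snapshot and merging it back, the overlay file naming, the snapshot name and the /tmp working directory are assumptions of mine, and the NetWorker save call leaves out retention.

```python
import shlex
from typing import Dict, List


def parse_domblklist(output: str) -> Dict[str, str]:
    """Map disk targets (vda, vdb, ...) to image paths.

    Expects the text printed by `virsh domblklist <guest> --details`.
    """
    disks = {}
    for line in output.splitlines():
        fields = line.split(None, 3)
        # Data rows look like: file  disk  vda  /path/to/image.qcow2
        if len(fields) == 4 and fields[1] == "disk":
            disks[fields[2]] = fields[3].strip()
    return disks


def plan_guest_backup(hypervisor: str, guest: str, disks: Dict[str, str],
                      server: str, pool: str, quiesce: bool = False,
                      workdir: str = "/tmp") -> List[List[str]]:
    """Return the ordered commands for one guest. Nothing is executed."""

    def save(path: str, item: str) -> List[str]:
        # NetWorker save with a named saveset; retention is omitted here.
        name = f"{hypervisor}:KVM:{guest}:{item}"
        return ["save", "-s", server, "-b", pool, "-N", name, path]

    # Assumed overlay naming: guest writes land here during the backup.
    overlays = {t: f"{src}.backup-snap" for t, src in disks.items()}

    snapshot = ["virsh", "snapshot-create-as", guest, "backup-snap",
                "--disk-only", "--atomic", "--no-metadata"]
    if quiesce:
        snapshot.append("--quiesce")  # relies on qemu-guest-agent in the guest
    for target, overlay in overlays.items():
        snapshot += ["--diskspec", f"{target},file={overlay}"]

    plan = [snapshot]
    plan += [save(src, target) for target, src in disks.items()]
    for target, overlay in overlays.items():
        # Merge the overlay back into the base image, then remove it.
        plan.append(["virsh", "blockcommit", guest, target,
                     "--active", "--pivot"])
        plan.append(["rm", "-f", overlay])

    xml = f"{workdir}/{guest}.xml"
    plan.append(["sh", "-c",
                 f"virsh dumpxml {shlex.quote(guest)} > {shlex.quote(xml)}"])
    plan.append(save(xml, "config"))
    plan.append(["rm", "-f", xml])
    return plan
```

The guest names for the outer loop would come from something like `virsh list --name`, and each guest's disk map from feeding `virsh domblklist <guest> --details` output into `parse_domblklist`. Building a plan first and running it afterwards also makes it easy to log the commands without executing them.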

It was relatively easy to do the backup as an ad-hoc, manual process (and, of course, verify recovery!), but I wanted to go a little further, so I decided to write a bit of a proof of concept script. In fact, I was able to verify the backup and recovery process using both NetWorker and Avamar, but for the script I decided to roll my sleeves up and work with NetWorker.

The end result gives me a backup process that looks like the following:

[Figure: KVM Backup Command and mminfo]

In the above, I’ve executed the script, savekvm.pl, with a 7-day retention time (-r 7), for the BoostBackup pool (-b BoostBackup), to the backup server orilla (-s orilla), with the quiesce guest option (-Q) enabled. The backup writes reasonably verbose logging to the programmed log file (/nsr/applogs/savekvm.log), but is designed to be reasonably quiet on the console itself. (After all, it’s just a proof of concept.)
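To make those four options concrete, here is a small Python sketch of how a wrapper might parse them. The real script is Perl and its other options are not covered here; the default values and the attribute names below are placeholders I have made up, not the script's actual defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the four options described above (defaults are placeholders)."""
    parser = argparse.ArgumentParser(prog="savekvm")
    parser.add_argument("-r", dest="retention_days", type=int, default=31,
                        help="retention time in days")
    parser.add_argument("-b", dest="pool", default="Default",
                        help="NetWorker pool to write to")
    parser.add_argument("-s", dest="server", default="localhost",
                        help="NetWorker backup server")
    parser.add_argument("-Q", dest="quiesce", action="store_true",
                        help="quiesce guest filesystems before the snapshot")
    return parser


# The invocation described in the text: -r 7 -b BoostBackup -s orilla -Q
args = build_parser().parse_args(["-r", "7", "-b", "BoostBackup",
                                  "-s", "orilla", "-Q"])
print(args.retention_days, args.pool, args.server, args.quiesce)
```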

However, as you can see from the subsequent mminfo command, I’m making use of the NetWorker option to generate a name for the saveset to logically name the content that I’ve backed up – in this case, a naming format of:

hypervisor:KVM:vmGuest:{config|diskName}
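A short Python sketch of that naming scheme, with a helper to split a name back into its parts. The function names are mine, purely for illustration; only the colon-separated format comes from the script.

```python
from typing import Tuple


def saveset_name(hypervisor: str, guest: str, item: str) -> str:
    """Build a saveset name; item is a disk name such as 'vda', or 'config'."""
    return f"{hypervisor}:KVM:{guest}:{item}"


def parse_saveset_name(name: str) -> Tuple[str, str, str]:
    """Split a saveset name back into (hypervisor, guest, item)."""
    parts = name.split(":", 3)
    if len(parts) != 4 or parts[1] != "KVM":
        raise ValueError(f"not a KVM saveset name: {name!r}")
    return parts[0], parts[2], parts[3]


names = [saveset_name("oa", "cent7", item) for item in ("vda", "vdb", "config")]
print(names)
```

Because the guest name is a fixed field, a query for every saveset whose third field is a given guest would find its backups regardless of which hypervisor wrote them.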

So in this example, I’ve got a virtual machine guest called ‘cent7’ running on the hypervisor/server ‘oa’; ‘cent7’ has two virtual disks, vda and vdb (effectively the KVM equivalent of sda and sdb), and of course there’s also a backup of the configuration file.

The full command syntax for savekvm.pl, by the way, is:

[Figure: KVM Command Syntax]

As you can see, there are some defaults built into the script for the server, the pool and the retention time. There are also options such as logging the details of what commands would be executed, without actually executing them; that’s useful if you want to see what would go on during a backup process. The log file you get when you execute looks a little like the following:

[Figure: KVM savekvm.log]

To test this out in a live scenario, I set up a script on my virtual machine which would generate content to one of its filesystems, and started running it just before I started the backup process. So working across two terminal sessions, the combined view looked like:

[Figure: KVM Changing VMs while backing up]

So on the left-hand terminal, there’s the ssh into the guest virtual machine to run a script called /root/fillerup.sh which just executes dd commands into /d/data. After starting that, I then flicked across to the second terminal to run the backup. Ergo, when we restore the virtual machine, we should get back some content in /d/data: at least, whatever had been generated and could be quiesced at the start of the snapshot process.
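If you want a similar load generator without dd, a Python equivalent might look like the sketch below. This is not the contents of fillerup.sh; the file names, counts and sizes are arbitrary choices of mine.

```python
import os
from pathlib import Path
from typing import List


def fill_directory(target: str, files: int = 5,
                   size_bytes: int = 1024 * 1024) -> List[Path]:
    """Write `files` files of random data into `target` and return their paths."""
    directory = Path(target)
    directory.mkdir(parents=True, exist_ok=True)
    written = []
    for index in range(files):
        path = directory / f"filler_{index:04d}.dat"
        with open(path, "wb") as handle:
            handle.write(os.urandom(size_bytes))
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk, not just cache
        written.append(path)
    return written


# Example (inside the guest): fill_directory("/d/data", files=100)
```

Syncing each file matters for this kind of test, since the point is to have real writes in flight on the virtual disk when the snapshot is taken.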

After the backup was completed, in order to test, I needed to destroy the virtual machine so I could recover it. (Relax, I’ve tested this a few times.) The process therefore was:

[Figure: KVM delete virtual machine]

So in the above, the deletion of the virtual machine consisted of: shutting it down, undefining it, then deleting the virtual disks. After that was done, I could use the NetWorker recover command to get the data back. This is a two-step recovery: recover the virtual disks, and then recover the configuration file. Here’s the virtual disk recovery process:

[Figure: KVM Recover Virtual Disks]
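Since the screenshot isn’t reproduced as text, here is a hedged Python sketch of what a non-interactive, relocated NetWorker recovery of the disk files could look like. It uses the recover command’s -s (server), -d (destination) and -a (automatic, path-based) options; the original disk paths shown are hypothetical, and the invocation I actually ran may have differed.

```python
from typing import List


def recover_cmd(server: str, destination: str, paths: List[str]) -> List[str]:
    """Build a relocated, non-interactive NetWorker recover command."""
    if not paths:
        raise ValueError("at least one path is required")
    return ["recover", "-s", server, "-d", destination, "-a", *paths]


# Hypothetical original locations of the two virtual disks.
disk_paths = ["/var/lib/libvirt/images/cent7.qcow2",
              "/var/lib/libvirt/images/cent7-data.qcow2"]
command = recover_cmd("orilla", "/home/kvm/cent7", disk_paths)
print(" ".join(command))
```

The same helper, pointed at the configuration dump’s original path, would cover the second recovery step.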

After the virtual disks were recovered, I then recovered the configuration file. Now, I could have recovered them all at the same time, but I generate the configuration file to a different location, which means a relocated recovery into a single directory would result in sub-directories being created, and I wanted to keep it simple:

[Figure: KVM Recover Config File]

So at that point, that’s left me with the configuration file and .qcow2 files in the /home/kvm/cent7 directory. To complete the process, all I need to do is:

  • Import the virtual machine from the restored configuration dump file
  • Start the virtual machine
  • Log onto the virtual machine and confirm there were some files created in /d/data

Here’s what that looked like:

[Figure: KVM Recreation and Verification]
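The recreation steps can be sketched in Python as well. Before importing, it’s worth checking that the disk paths recorded in the restored configuration dump match where the .qcow2 files actually landed, so the sketch below reads them out of the libvirt domain XML and then builds the `virsh define` and `virsh start` commands. The sample XML and file names are illustrative only, not my real configuration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def disk_sources(domain_xml: str) -> Dict[str, str]:
    """Map each disk target (e.g. 'vda') to the image file the XML points at."""
    root = ET.fromstring(domain_xml)
    sources = {}
    for disk in root.findall("./devices/disk"):
        source, target = disk.find("source"), disk.find("target")
        if source is not None and target is not None and source.get("file"):
            sources[target.get("dev")] = source.get("file")
    return sources


def recreate_cmds(guest: str, xml_path: str) -> List[List[str]]:
    """Commands to import the guest from its XML dump and start it."""
    return [["virsh", "define", xml_path], ["virsh", "start", guest]]


# Trimmed, illustrative domain XML (a real dump has many more elements).
sample_xml = """
<domain type='kvm'>
  <name>cent7</name>
  <devices>
    <disk type='file' device='disk'>
      <source file='/home/kvm/cent7/cent7.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='cdrom'>
      <target dev='hda' bus='ide'/>
    </disk>
  </devices>
</domain>
"""
print(disk_sources(sample_xml))
print(recreate_cmds("cent7", "/home/kvm/cent7/cent7.xml"))
```

If the paths in the XML differ from the recovery location, the dump would need editing before `virsh define`, or the guest will fail to start.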

So there you have it: using NetWorker to back up and recover KVM.

This is a classic example of what I’ve always said about enterprise backup products: it doesn’t matter what options they ship with so long as you’ve got the capability of extending their functionality to suit your needs. The script I’ve created, for instance, could in theory be called as part of a pre command when doing a hypervisor backup, to first back up the guest virtual machines. (Or it could be run by the KVM administrator.) Because in this case we name the savesets based on the hypervisor, guest and disk/config, it theoretically makes it easy to track virtual machines if they get moved between hypervisors (though I don’t have a full environment to test that under).

If you’ve got KVM guests on Linux that you want to back up using NetWorker, feel free to check out version 1 of my savekvm script, which you can download from here. (As always when you download a script from somewhere: test it out, review it carefully, and use it at your own risk!)
