Expecting a more complete backup

As I point out in my book, there’s a lot of stuff that ends up in datacentres completely unprotected. That includes such components as:

  • Network switch configurations
  • Storage switch configurations
  • PABXs
  • etc.

By “unprotected”, I mean not regularly backed up and monitored within the centralised enterprise backup framework.

However, it also covers backups for databases that don’t have application modules. Over the last few years there’s been a lot of outcry over a lack of support for MySQL – personally I’d prefer more attention be given to PostgreSQL, but either way, these two open source databases also frequently get neglected in centralised backup solutions as well due to the lack of module support.

When it comes to backing up what I’d traditionally refer to black box devices – switches, PABXs, etc., you’re never going to get an application module. That doesn’t mean though that backups for these devices need to stay in the dark ages where every now and then someone logs on to it, retrieves the configuration, saves it to a fileserver somewhere and hopes that it doesn’t get deleted before the next backup – or that the backups for it are retained long enough.

To centralise and automate backups for ‘non-traditional’ components, you need to start exploring scripting languages. Sure, some devices now support ssh and scp, but even so, there’ll reach a point where a certain level of interractivity is required to backup a non-traditional device or database, and at that point you need to script.

One of my favourite languages for this purpose is a lesser known one (particularly outside of Unix circles) called expect. If you don’t want to follow that link yet, but need the elevator pitch for expect, it’s a language that allows you to script an interactive session with a program that would normally reject scripts and require you to manually type the commands. That is, it’s an automation tool.

As I advocate in my book, by using utilities like expect, you can design a solution such as the following:

  • Traditional clients receive standard backups
  • For non-traditional clients:
    • Define a special client (e.g., another instance of the backup servers’ client resource) that makes use of savepnpc for its backups;
    • The savepnpc component, will, for a pre-script, log on to the non-traditional devices, retrieve their configuration dumps, backups, etc.;
    • That retrieved data will then be saved as files on the backup server, preferably both in some human-readable format (where applicable), and also in the appropriate format for re-loading the configuration;
    • Once the savepnpc activities are complete, the local filesystems will be backed up normally using NetWorker, allowing the centralise and automated backup of non-traditional clients.

Similarly, the same can be achieved for non-supported databases such as MySQL or PostgreSQL:

  • Database server is configured with savepnpc for its backup command;
  • The pre-script for the backup server generates a snapshot or exports a dump of the database;
  • The exported or snapshot region is backed up as part of the traditional filesystem backup.

In essence, what I’m saying is there’s very little, if no reason, why you can’t automate and centralise your non-traditional backups in the same way that you use an enterprise class backup product to automate and centralise your traditional backups.

For example, let’s consider backing up a PostgreSQL database via expect, and integrating that backup with regular filesystem backups for the database server. In this case, it’s only a small database, and PostgreSQL supports hot exports*, so we’ll do the backup via the PostgreSQL pg_dump command.

The pg_dump command leverages the security of the database(s) being dumped; thus, if you have a standard PostgreSQL configuration that allows anyone on the local host to connect to any database, you don’t need any special scripting. But assuming you’re in an enterprise environment and you have password protected access to the database, you will need to supply a password to the pg_dump command for the database to be backed up. The pg_dump command however is one of those pesky commands that refuses to accept a password as a command line argument**, so we need to look at using expect to supply a password for us.

So you’d start with an expect script file, which for a simple database called “tododb”, might resemble the following:

#!/usr/bin/expect -f
spawn /usr/local/pgsql/bin/pg_dump -U backup_user -W -f /nsr/backups/tododb.dump
expect "Password: "
send "a9d8ea8d12b4b47db8bd833b8fade7b2r"
sleep 120

(For the purposes of what we’re doing, we’ll call this file pgbackup.e and assume it’s in the /home/postgresql directory.)

So what does this do? First, it spawns (or executes) the pg_dump command. Then it waits for the “Password: ” prompt to be supplied by the pg_dump command, and when it receives that, it sends the encrypted password. (No, that’s not the real password I use!)

Because it’s only a small database, it then waits 2 minutes for the dump to complete before exiting.

You’d either add into, or wrap around this script, additional commands or scripts to say, confirm that the database dump has been completed successfully, or ensure that multiple copies are being kept on-line, etc., but for the purposes of brevity I’ll leave those options as exercises for the reader. Since NetWorker doesn’t provide any environment to the precmd or pstcmd in a savepnpc action, you will need to ensure that at bare minimum you setup an execution environment that configures the PATH and the PGDATA path correctly. This might resemble the following:

#!/bin/bash

export PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/pgsql/bin
export PGDATA=/usr/local/pgsql/data
/home/postgresql/pgbackup.e

For the purposes of this exercise, we’ll assume this is saved as /home/postgresql/pgbackup.sh.

Next, the client needs to be configured to use savepnpc. Assuming you wanted to do this for the client cerberus in the group Daily Databases, you would need to create in /nsr/res on cerberus a file named “Daily Databases.res”, with the following content:

type: savepnpc;
precmd: "/home/postgresql/pgbackup.sh";
pstcmd: "/bin/sleep 5";
timeout: "12:00:00";
abort precmd with group: No;

Note that the pstcmd content isn’t really important in this scenario, as we only need to do something before the backup runs; I don’t like to delete this entry though because in the past I’ve encountered issues with savepnpc backups that were missing either a precmd entry or a pstcmd entry (admittedly that was some time ago).

With this file in place, all you would need to do is set the backup command for the client cerberus in the Daily Databases group to savepnpc; NetWorker will handle the rest.

For more in-depth coverage of choosing what to backup, extending your backups beyond traditional filesystems and databases, and making risk vs cost decisions on backup strategies, check out my book.


* I.e., can generate an export from an in-use database that can be recovered from later. Not all databases support this.

** Something the programmers of it should be applauded for.

7 thoughts on “Expecting a more complete backup”

  1. On the network/FC switch backup. They tend to be a “set and forget” config. We all hope that everyone has a change control process that will ensure the current config is saved (backed up) and then then new config is also saved. The later however always seems to get missed, since the change always seems to take place at 4am and that is the LAST thing on your mind at the time. Snuggles with your SO seem like a better choice ;)If they are awake.

  2. That’s a good example of where the ‘adhoc’ nature of such backup decisions often fail; or if you want to be more generalised you could equally say that as soon as a person has to act to make the backup go ahead, you should assume a high failure rate – i.e., failure to automate means failure to backup.

  3. I tried executing the pg_dump using expect.

    When i give the password in “send” as plain text it dumps the given database correctly. But when i give the encrypted password, it does not dumps the database. And in log “Password authentication failure” error is there. I encrypted the password using the md5sum command.

    So how to send the encrypted password correctly to the pg_dump command ?

    1. That would seem to me to indicate that either the account you’re using is configured with an unencrypted password, or the database itself that you’re trying to backup doesn’t have md5sum authentication enabled. For instance, you might want to check the pg_hba.conf file to make sure that the access method is ‘md5’, though that will obviously affect how other user accounts can then access the database, without additional tweaking. (E.g., if all users currently access via the ‘password’ method, you might set the backup user to instead access via the ‘md5’ method.)

  4. Thanks for the answer. But after doing the specified change too i don’t get the correct result.

    I changed the access method as ‘md5’ and restarted the postgresql service. And now executed the expect script, again in the postgresql log, the same error ‘ Authentication failure ‘.

    Misc:
    1. I can login to the psql prompt by providing the normal password only !
    2. If so, How does the pg_dump knows whether the user is giving normal password or an encrypted password.
    3. What is the identification does the pg_dump has to identify whether the password is being supplied is md5 or normal …

    1. Unfortunately PostgreSQL administration isn’t my strong point – I know enough to allow for what I need to do, but past that unless I actually need to solve a particular problem myself I tend not to have a lot of exposure to the administration side of it. My guess would be that you still need to configure the user for encrypted access as well, as per the “alter user” command:

      template1=# help alter user;
      Command: ALTER USER
      Description: change a database role
      Syntax:
      ALTER USER name [ [ WITH ] option [ … ] ]

      where option can be:

      SUPERUSER | NOSUPERUSER
      | CREATEDB | NOCREATEDB
      | CREATEROLE | NOCREATEROLE
      | CREATEUSER | NOCREATEUSER
      | INHERIT | NOINHERIT
      | LOGIN | NOLOGIN
      | CONNECTION LIMIT connlimit
      | [ ENCRYPTED | UNENCRYPTED ] PASSWORD ‘password’
      | VALID UNTIL ‘timestamp’

      After that though if that doesn’t work I’d suggest you touch base with members of the PostgreSQL administration community…

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.