
Aug 05 2017
 

It may be something to do with my long Unix background, or maybe it’s because my first system administration job saw me administer systems over insanely low link speeds, but I’m a big fan of being able to use the CLI whenever I’m in a hurry or just want to do something small. GUIs may be nice, but CLIs are fun.

Under NetWorker 8 and below, if you wanted to run a server initiated backup job from the command line, you’d use the savegrp command. Under NetWorker 9 onwards, groups are there only as containers, and what you really need to work on are workflows.


There’s a command for that – nsrworkflow.

At heart it’s a very simple command:

# nsrworkflow -p policy -w workflow

That’s enough to kick off a backup job. But there are some additional options that make it more useful, particularly in larger environments. To start with, you’ve got the -a option, which I really like. That tells nsrworkflow you want to perform an ‘adhoc’ execution of a job. Why is that important? Say you’ve got a job you really need to run today, but it’s configured to skip … running it as adhoc will disregard the skip for you.
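For instance – assuming the same Gold policy and Finance workflow used in the examples below – an adhoc run might look something like this:

# nsrworkflow -a -p Gold -w Finance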

The -A option allows you to specify overrides to individual actions. For instance, if I wanted to run a workflow today from the command line as a full rather than an incremental, I might use something like the following:

# nsrworkflow -p Gold -w Finance -A "backup -l full"

The -A option there effectively allows me to specify overrides for individual actions – name the action (backup) and name the override (-l full).

Another useful option is -c component, which allows you to run the job against just a single component, or a small list of components – e.g., clients. Extending from the above, if I wanted to run a full for a single client called orilla, it might look as follows:

# nsrworkflow -p Gold -w Finance -c orilla -A "backup -l full"

Note that specifying the action there doesn’t mean it’s the only action you’ll run – you’ll still run the other actions in the workflow (e.g., a clone operation, if it’s configured) – it just means you’re specifying an override for the nominated action.

For virtual machines, the easiest way I’ve found to start an individual client is to use the VM ID – effectively what the saveset name is for a virtual machine backed up via a proxy. Now, to get that name, you have to do a bit of mminfo scripting:

# mminfo -k -r vmname,name

 vm_name name
vulcan vm:500f21cd-5865-dc0d-7fe5-9b93fad1a059:caprica.turbamentis.int
vulcan vm:500f21cd-5865-dc0d-7fe5-9b93fad1a059:caprica.turbamentis.int
win01 vm:500f444e-4dda-d29d-6741-d23d6169f158:caprica.turbamentis.int
win01 vm:500f444e-4dda-d29d-6741-d23d6169f158:caprica.turbamentis.int
picon vm:500f6871-2300-47d4-7927-f3c799ee200b:caprica.turbamentis.int
picon vm:500f6871-2300-47d4-7927-f3c799ee200b:caprica.turbamentis.int
win02 vm:500ff33e-2f70-0b8d-e9b2-6ef7a5bf83ed:caprica.turbamentis.int
win02 vm:500ff33e-2f70-0b8d-e9b2-6ef7a5bf83ed:caprica.turbamentis.int
vega vm:5029095d-965e-2744-85a4-70ab9efcc312:caprica.turbamentis.int
vega vm:5029095d-965e-2744-85a4-70ab9efcc312:caprica.turbamentis.int
krell vm:5029e15e-3c9d-18be-a928-16e13839f169:caprica.turbamentis.int
krell vm:5029e15e-3c9d-18be-a928-16e13839f169:caprica.turbamentis.int
krell vm:5029e15e-3c9d-18be-a928-16e13839f169:caprica.turbamentis.int

What you’re looking for is the vm:a-b-c-d portion, stripping off the :vcenter at the end of the name.
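If you just want to grab that ID for a single virtual machine, a quick-and-dirty one-liner will do it too. This is just a rough sketch reusing the mminfo report above – it plucks out the rows for the VM you’re after (krell here; adjust to suit) and strips the trailing :vcenter portion, leaving the bare vm:ID:

# mminfo -k -r vmname,name | awk '$1 == "krell" {print $2}' | sed 's/:[^:]*$//' | sort -u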

Now, I’m a big fan of not running extra commands unless I really need to, so I’ve actually got a vmmap.pl Perl script which you’re free to download and adapt/use as you need to streamline that process. Since my lab is pretty basic, the script is too, though I’ve done my best to make the code straightforward. You simply run vmmap.pl as follows:

[root@orilla bin]# vmmap.pl -c krell
vm:5029e15e-3c9d-18be-a928-16e13839f169

With the ID in hand, we can invoke nsrworkflow as follows:

# nsrworkflow -p VMware -w "Virtual Machines" -c vm:5029e15e-3c9d-18be-a928-16e13839f169
133550:nsrworkflow: Starting Protection Policy 'VMware' workflow 'Virtual Machines'.
123316:nsrworkflow: Starting action 'VMware/Virtual Machines/backup' with command: 'nsrvproxy_save -s orilla.turbamentis.int -j 705080 -L incr -p VMware -w "Virtual Machines" -A backup'.
123321:nsrworkflow: Action 'VMware/Virtual Machines/backup's log will be in '/nsr/logs/policy/VMware/Virtual Machines/backup_705081.raw'.
123325:nsrworkflow: Action 'VMware/Virtual Machines/backup' succeeded.
123316:nsrworkflow: Starting action 'VMware/Virtual Machines/clone' with command: 'nsrclone -a "*policy name=VMware" -a "*policy workflow name=Virtual Machines" -a "*policy action name=clone" -s orilla.turbamentis.int -b BoostClone -y "1 Months" -o -F -S'.
123321:nsrworkflow: Action 'VMware/Virtual Machines/clone's log will be in '/nsr/logs/policy/VMware/Virtual Machines/clone_705085.raw'.
123325:nsrworkflow: Action 'VMware/Virtual Machines/clone' succeeded.
133553:nsrworkflow: Workflow 'VMware/Virtual Machines' succeeded.

Of course, if you’re in front of NMC, you can also start individual clients from the GUI:

Starting an Individual Client

But it’s always worth knowing what your command line options are!

NetWorker 9.2 Capacity Measurement

Aug 03 2017
 

As I’ve mentioned in the past, there are a few different licensing models for NetWorker, but capacity licensing (e.g., 100 TB front end backup size) gives considerable flexibility, effectively enabling all product functionality within a single license and thereby allowing NetWorker usage to adapt to suit the changing needs of the business.


In the past, measuring utilisation has typically required either the use of DPA or asking your DellEMC account team to review the environment and provide a report. NetWorker 9.2, however, gives you a new, self-managed option – the ability to run, whenever you want, a capacity measurement report to determine what your utilisation ratio is.

This is done through a new command line tool, nsrcapinfo, which is incredibly simple to run. In fact, running it without any options at all will give the default 60 day report, providing utilisation details for each of the key data types as well as a summary. For instance, against my lab server, here’s the output:

<?xml version="1.0" encoding="UTF8" standalone="yes" ?>
<!--
~ Copyright (c) 2017 Dell EMC Corporation. All Rights Reserved.
~
~ This software contains the intellectual property of Dell EMC Corporation or is licensed to
~ Dell EMC Corporation from third parties. Use of this software and the intellectual property
~ contained therein is expressly limited to the terms and conditions of the License
~ Agreement under which it is provided by or on behalf of Dell EMC.
-->
<Capacity_Estimate_Report>
<Time_Stamp>2017-08-02T21:21:18Z</Time_Stamp>
<Clients>13</Clients>
<DB2>0.0000</DB2>
<Informix>0.0000</Informix>
<IQ>0.0000</IQ>
<Lotus>0.0000</Lotus>
<MySQL>0.0000</MySQL>
<Sybase>0.0000</Sybase>
<Oracle>0.0000</Oracle>
<SAP_HANA>0.0000</SAP_HANA>
<SAP_Oracle>0.0000</SAP_Oracle>
<Exchange_NMM8.x>0.0000</Exchange_NMM8.x>
<Exchange_NMM9.x>0.0000</Exchange_NMM9.x>
<Hyper-V>0.0000</Hyper-V>
<SharePoint>0.0000</SharePoint>
<SQL_VDI>0.0000</SQL_VDI>
<SQL_VSS>0.0000</SQL_VSS>
<Meditech>0.0000</Meditech>
<Other_Applications>2678.0691</Other_Applications>
<Unix_Filesystems>599.9214</Unix_Filesystems>
<VMware_Filesystems>360.3535</VMware_Filesystems>
<Windows_Filesystems>27.8482</Windows_Filesystems>
<Total_Largest_Filesystem_Fulls>988.1231</Total_Largest_Filesystem_Fulls>
<Peak_Daily_Applications>2678.0691</Peak_Daily_Applications>
<Capacity_Estimate>3666.1921</Capacity_Estimate>
<Unit_of_Measure_Bytes_per_GiB>1073741824</Unit_of_Measure_Bytes_per_GiB>
<Days_Measured>60</Days_Measured>
</Capacity_Estimate_Report>

That’s in XML by default – and the numbers are in GiB.
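Because it’s plain XML written to standard output, it’s also trivial to feed into whatever reporting you already run. As a minimal sketch – the element name is exactly as it appears in the report above – you could pull out just the headline figure with:

# nsrcapinfo | grep '<Capacity_Estimate>'
<Capacity_Estimate>3666.1921</Capacity_Estimate>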

If you do fulls on longer cycles than the default 60 day measurement window, you can extend the data sampling range by using -d nDays in the command (e.g., “nsrcapinfo -d 90” would provide a measurement over a 90 day window). You can also, if you wish, generate additional reports for further analysis (see the command reference guide, or man nsrcapinfo if you’re on Linux, for the full details). One of those reports that I think will be quite popular with backup administrators is the client report. An example of that is below:

[root@orilla ~]# nsrcapinfo -r clients
"Hostname", "Client_Capacity_GiB", "Application_Names" 
"abydos.turbamentis.int", "2.3518", "Unix_Filesystems"
"vulcan", "16.0158", "VMware_Filesystems"
"win01", "80.0785", "VMware_Filesystems"
"picon", "40.0394", "VMware_Filesystems"
"win02", "80.0788", "VMware_Filesystems"
"vega", "64.0625", "VMware_Filesystems"
"test02", "16.0157", "VMware_Filesystems"
"test03", "16.0157", "VMware_Filesystems"
"test01", "16.0157", "VMware_Filesystems"
"krell", "32.0314", "VMware_Filesystems"
"faraway.turbamentis.int", "27.8482", "Windows_Filesystems"
"orilla.turbamentis.int", "1119.5321", "Other_Applications Unix_Filesystems"
"rama.turbamentis.int", "2156.1067", "Other_Applications Unix_Filesystems"

That’s a straight-up simple view of the FETB estimation for each client you’re protecting in your environment.
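If you want a quick tally of those per-client figures – just a rough sketch, and treat the Capacity_Estimate in the default report as the authoritative number – you can sum the second column of the CSV output:

# nsrcapinfo -r clients | awk -F'", "' 'NR > 1 {total += $2} END {printf "%.4f GiB\n", total}'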

There you have it – capacity measurement in NetWorker as a native function in version 9.2.

NetWorker 9.2 – A Focused Release

Jul 29 2017
 

NetWorker 9.2 has just been released. Now, normally I pride myself on having kicked the tyres on a new release for weeks before it comes out via the beta programmes, but unfortunately my June, June and July taught me new definitions of busy (I was busy enough that I did June twice), so instead I’ll be rolling the new release into my lab this weekend, after I’ve done this initial post about it.


I’ve been working my way through NetWorker 9.2’s new feature set, though, and it’s impressive.

As you’ll recall, NetWorker 9.1 introduced NVP, or vProxy – the replacement for the Virtual Backup Appliance introduced in NetWorker 8. NVP is incredibly efficient for backup and recovery operations, and delivers hyper-fast file level recovery from image level backups. (Don’t just take my written word for it though – check out this demo where I recovered almost 8,000 files in just over 30 seconds.)

NetWorker 9.2 expands on the virtual machine backup integration by adding the capability to perform Microsoft SQL Server application consistent backup as part of a VMware image level backup. That’s right, application consistent, image level backup. That’s something Avamar has been able to do for a little while now, and it’s now being adopted in NetWorker, too. We’re starting with Microsoft SQL Server – arguably the simplest one to cover, and the most sought after by customers, too – before tackling other databases and applications. In my mind, application consistent image level backup is a pivot point for simplifying data protection – in fact, it’s a topic I covered as an emerging focus for the next several years of data protection in my book, Data Protection: Ensuring Data Availability. I think in particular app-consistent image level backups will be extremely popular in smaller/mid-market customer environments where there’s not guaranteed to be a dedicated DBA team within the IT department.

It’s not just DBAs that get a boost with NetWorker 9.2 – security officers do, too. In prior versions of NetWorker, it was possible to integrate Data Domain Retention Lock via scripting – now in NetWorker 9.2, it’s rolled into the interface itself. This means you’ll be able to establish retention lock controls as part of the backup process. (For organisations not quite able to go down the path of having a full isolated recovery site, this will be a good mid-tier option.)

Beyond DBAs and security officers, those who are interested in backing up to the cloud, or in the cloud, will be getting a boost as well – CloudBoost 2.2 has been introduced with NetWorker 9.2, and this gives Windows 64-bit clients the CloudBoost API as well, allowing a direct to object storage model from both Windows and Linux (which got CloudBoost client direct in an earlier release). What does this mean? Simple: it’s a super-efficient architecture leveraging an absolute minimum footprint, particularly when you’re running IaaS protection in the Cloud itself. Cloud protection gets another option as well – support for DDVE in the Cloud: AWS or Azure.

NMC isn’t left out – as NetWorker continues to scale, there’s more information and data within NMC for an administrator or operator to sort through. If you’ve got a few thousand clients, or hundreds of client groups created for policies and workflows, you might not want to scroll through a long list. Hence, there’s now filtering available in a lot of forms. I’m always a fan of speeding up what I have to do within a GUI, and this will be very useful for those in bigger environments, or who prefer to find things by searching rather than visually eye-balling while scrolling.

If you’re using capacity licensing, otherwise known as Front End TB (FETB) licensing, NetWorker now reports license utilisation estimation. You might think this is a cinch, but it’s only a cinch if you count whitespace everywhere. That’s not something we want done. Still, if you’ve got capacity licensing, NetWorker will now keep track of it for you.

There’s a big commitment within DellEMC for continued development of automation options within the Data Protection products. NetWorker has always enjoyed a robust command line interface, but a CLI can only take you so far. The REST API that was introduced previously continues to be updated. There’s support for the Data Domain Retention Lock integration and the new application consistent image level backup options, just to name a couple of new features.

NetWorker isn’t just about the core functionality, either – there are also the various modules for databases and applications, and they’ve not been left unattended.

SharePoint and Exchange get tighter integration with ItemPoint for granular recovery. Previously it was a two step process to mount the backup and launch ItemPoint – now the NMM recovery interface can automatically start ItemPoint, directing it to the mounted backup copies for processing.

Microsoft SQL Server is still of course supported for traditional backup/recovery operations via the NetWorker Module for Microsoft, and it’s been updated with some handy new features. Backup and recovery operations no longer need Windows administrative privileges in all instances, and you can do database exclusions now via wild-cards – very handy if you’ve got a lot of databases on a server following a particular naming convention and you don’t need to protect them all, or protect them all in a single backup stream. You also get the option during database recovery now to terminate other user access to the database; previously this had to be managed manually by the SQL administrator for the target database – now it can be controlled as part of the recovery process. There’s also a bunch of new options for SQL Always On Availability Groups, and backup promotion.

In addition to the tighter ItemPoint integration mentioned previously for Exchange, you also get the option to do ItemPoint/Granular Exchange recovery from a client that doesn’t have Exchange installed. This is particularly handy when Exchange administrators want to limit what can happen on an Exchange server. Continuing the tight Data Domain Cloud Tier integration, NMM now handles automatic and seamless recall of data from Cloud Tier should it be required as part of a recovery option.

Hyper-V gets some love, too: there’s processes to remove stale checkpoints, or merge checkpoints that exceed a particular size. Hyper-V allows a checkpoint disk (a differencing disk – AVHDX file) to grow to the same size as its original parent disk. However, that can cause performance issues and when it hits 100% it creates other issues. So you can tell NetWorker during NMM Hyper-V backups to inspect the size of Hyper-V differencing disks and automatically merge if they exceed a certain watermark. (E.g., you might force a merge when the differencing disk is 25% of the size of the original.) You also get the option to exclude virtual hard disks (either VHD or VHDX format) from the backup process should you desire – very handy for virtual machines that have large disks containing transient or other forms of data that have no requirement for backup.

Active Directory recovery browsing gets a performance boost too, particularly for large AD trees.

SAP IQ (formerly known as Sybase IQ) gets support in NetWorker 9.2 NMDA. You’ll need to be running v16 SP11 and a simplex architecture, but you’ll get a variety of backup and recovery options. A growing trend within database vendors is to allow designation of some data files within the database as read-only, and you can choose to either backup or skip read-only data files as part of a SAP IQ backup, amongst a variety of other options. If you’ve got a traditional Sybase ASE server, you’ll find that there’s now support for backing up database servers with >200 databases on them – either in sequence, or with a configured level of parallelism.

DB2 gets some loving, too – NMDA 9.1 gave support for PowerLink little-endian DB2 environments, but with 9.2 we also get a Boost plugin to allow client-direct/Boost backups for DB2 little-endian environments.

(As always, there’s also various fixes included in any new release, incorporating fixes that were under development concurrently in earlier releases.)

As always, when you’re planning to upgrade NetWorker, there are a few things you should do as a matter of course. There’s a new approach to making sure you’re aware of these steps – when you go to support.emc.com and click to download the NetWorker server installer for either Windows or Linux, you’ll initially find yourself redirected to a PDF: the NetWorker 9.2 Recommendations, Training and Downloads for Customers and Partners. Now, I admit – in my lab I have a tendency sometimes to just leap in and start installing new packages, but in reality when you’re using NetWorker in a real environment, you really do want to make sure you read the documentation and recommendations for upgrades before going ahead with updating your environment. The recommendations guide is only three pages, but it’s three very useful pages – links to technical training, references to the documentation portfolio, where to find NetWorker focused videos on the Community NetWorker and YouTube, and details about licensing and compatibility. There’s also a very quick summary of the differences between NetWorker versions, and finally the download location links are provided.

Additional key documentation you should – in my mind, must – review before upgrading includes the release notes, the compatibility guide and, of course, the ever handy updating from a prior version guide. That’s in addition to checking the standard installation guides.

Now if you’ll excuse me, I have a geeky data protection weekend ahead of me as I upgrade my lab to NetWorker 9.2.

Basics – Using the vSphere Plugin to Add Clients for Backup

Jul 24 2017
 

It’s a rapidly changing trend – businesses increasingly want the various Subject Matter Experts (SMEs) running applications and essential services to be involved in the data protection process. In fact, in the 2016 Data Protection Index, somewhere in the order of 93% of respondents said this was extremely important to their business.

It makes sense, too. Backup administrators do a great job, but they can’t be expected to know everything about every product deployed and protected within the organisation. The old way of doing things was to force the SMEs to learn how to use the interfaces of the backup tools. That doesn’t work so well. Like the backup administrators having their own sphere of focus, so too do the SMEs – they understandably want to use their tools to do their work.

What’s more, if we do find ourselves in a disaster situation, we don’t want backup administrators to become overloaded and a bottleneck to the recovery process. The more those operations are spread around, the faster the business can recover.

So in the modern data protection environment, we have to work together and enable each other.


In a distributed control model, the goal will be for the NetWorker administrator to define the protection policies needed, based on the requirements of the business. Once those policies are defined, enabled SMEs should be able to use their tools to work with those policies.

One of the best examples of that is for VMware protection in NetWorker. Using the plugins provided directly into the vSphere Web Client, the VMware administrators can attach and detach virtual machines from protection policies that have been established in NetWorker, and initiate backups and recoveries as they need.

In the video demo below, I’ll take you through the process whereby the NetWorker administrator defines a new virtual machine backup policy, then the VMware administrator attaches a virtual machine to that policy and kicks it off. It’s really quite simple, and it shows the power that you get when you enable SMEs to interact with data protection from within the comfort of their own tools and interfaces. (Don’t forget to ensure you switch to 720p/HD in order to see what’s going on within the session.)


Don’t forget – if you find the NetWorker Blog useful, you’ll be sure to enjoy Data Protection: Ensuring Data Availability.

Jul 21 2017
 

I want to try something different with this post. Rather than the usual post with screen shots and descriptions, I wanted instead to do a demo video showing just how easy it is to do file level recovery (FLR) from NetWorker VMware Image Level Backup thanks to the new NVP or vProxy system in NetWorker 9.

The video below steps you through the entire FLR process for a Linux virtual machine. (If your YouTube settings don’t default to it, be sure to switch the video to High Def (720) or otherwise the text on the console and within NMC may be difficult to read.)

Don’t forget – if you find the information on the NetWorker Blog useful, I’m sure you’ll get good value out of my latest book, Data Protection: Ensuring Data Availability.

Jul 11 2017
 

NetWorker 9 modules for SQL, Exchange and SharePoint now make use of ItemPoint to support granular recovery.

ItemPoint leverages NetWorker’s ability to live-mount a database or application backup from compatible media, such as Advanced File Type devices or Data Domain Boost.

I thought I’d step through the process of performing a table level recovery out of a SQL Server backup – as you’ll see below, it’s actually remarkably straightforward to run granular recoveries in the new configuration. For my lab setup, I installed the Microsoft 180 day evaluation* license of Windows 2012 R2, and in the same spirit, the 180 day evaluation license for SQL Server 2014 (Standard).

Next off, I created a database and within that database, a table. I grabbed a list of English-language dictionary words and populated a table with rows consisting of the words and a unique ID key – just for something simple to test with.

Installing NetWorker on the Client

After getting the database server and a database ready, the next process was to install the NetWorker client within the Windows instance in order to do backup and recovery. After installing the standard NetWorker filesystem client using the base NetWorker for Windows installer, I went on to install the NetWorker Module for Microsoft Applications, choosing the SQL option.

In case you haven’t installed a NMM v9 plugin yet, I thought I’d annotate/show the install process below.

After you’ve unpacked the NMM zip file, you’ll want to run the appropriate setup file – in this case, NWVSS.

NMM SQL Install 01

You’ll have to do the EULA acceptance, of course.

NMM SQL Install 02

After you’ve agreed and clicked Next, you’ll get to choose what options in NMM you want to install.

NMM SQL Install 03

I chose to run the system configuration checker, and you definitely should too. This is an absolute necessity in my mind – the configuration checker will tell you if something isn’t going to work. It works through a gamut of tests to confirm that the system you’re attempting to install NMM on is compatible, and provides guidance if any of those tests aren’t passed. Obviously as well, since I wanted to do SQL backup and recovery, I also selected the Microsoft SQL option. After this, you click Check to start the configuration check process.

Depending on the size and scope of your system, the configuration checker may take a few minutes to run, but after it completes, you’ll get a summary report, such as below.

NMM SQL Install 04

Make sure to scroll through the summary and note there’s no errors reported. (Errors will have a result of ‘ERROR’ and will be in red.) If there is an error reported, you can click the ‘Open Detailed Report…’ button to open up the full report and see what actions may be available to rectify the issue. In this case, the check was successful, so it was just a case of clicking ‘Next >’ to continue.

NMM SQL Install 05

Next you have to choose whether to configure the Windows firewall. If you’re using a third party firewall product, you’ll typically want to do the firewall configuration manually and choose ‘Do not configure…’. Choose the appropriate option for your environment and click ‘Next >’ to continue again.

NMM SQL Install 06

Here’s where you get to the additional options for the plugin install. I chose to enable the SQL Granular Recovery option, and enabled all the SQL Server Management Studio options, per the above. You’ll get a warning when you click Next here to ensure you’ve got a license for ItemPoint.

NMM SQL Install 07

I verified I did have an ItemPoint license and clicked Yes to continue. If you’re going with granular recovery, you’ll be prompted next for the mount point directories to be used for those recoveries.

NMM SQL Install 08

In this case, I was happy to accept the default options and actually start the install by clicking the ‘Install >’ button.

NMM SQL Install 09

The installer will then do its work, and when it completes you’ll get a confirmation window.

NMM SQL Install 10

That’s the install done – the next step of course is configuring a client resource for the backup.

Configuring the Client in NMC

The next step is to create a client resource for the SQL backups. Within NMC, go into the configuration panel, right-click on Clients and choose to create a new client via the wizard. The sequence I went through was as follows.

NMM SQL Config 01

Once you’ve typed the client name in, NetWorker is going to be able to reach out to the client daemons to coordinate configuration. My client was ‘win02’, and as you can see from the client type, a ‘Traditional’ client was the one to pick. Clicking ‘Next >’, you get to choose what sort of backup you want to configure.

NMM SQL Config 02

At this point the NetWorker server has contacted the client nsrexecd process and identified what backup/recovery options there are installed on the client. I chose ‘SQL Server’ from the available applications list. ‘Next >’ to continue.

NMM SQL Config 03

I didn’t need to change any options here (I wanted to configure a VDI backup rather than a VSS backup, so I left ‘Block Based Backup’ disabled). Clicking ‘Next >’ from here lets you choose the databases you want to backup.

NMM SQL Config 04

I wanted to backup everything – the entire WIN02 instance, so I left WIN02 selected and clicked ‘Next >’ to continue the configuration.

NMM SQL Config 05

Here you’ll be prompted for the accessing credentials for the SQL backups. Since I don’t run Active Directory at home, I was just using Windows authentication, so in actual fact I entered the ‘Administrator’ username and the password, but you can change it to whatever you need as part of the backup. Once you’ve got the correct authentication details entered, ‘Next >’ to continue.

NMM SQL Config 06

Here’s where you get to choose SQL specific options for the backup. I elected to skip simple databases for incremental backups, and enabled 6-way striping for backups. ‘Next >’ to continue again.

NMM SQL Config 07

The Wizard then prompts you to confirm your configuration options, and I was happy with them, so I clicked ‘Create’ to actually have the client resource created in NetWorker.

NMM SQL Config 08

The resource was configured without issue, so I was able to click Finish to complete the wizard. After this, it was just a case of adding the client to an appropriate policy and then running that policy from within NMC’s monitoring tab.

NMM SQL Config 09

And that was it – module installed, client resource configured, and backup completed. Next – recovery!

Doing a Granular Recovery

To do a granular recovery – a table recovery – I jumped across via remote desktop to the Windows host and launched SQL Management Studio. First thing, of course, was to authenticate.

NMM SQL GLR 01

Once I’d logged on, I clicked the NetWorker plugin option, highlighted below:

NMM SQL GLR 02

That brought up the NetWorker plugin dialog, and I went straight to the Table Restore tab.

NMM SQL GLR 03

In the table restore tab, I chose the NetWorker server, the SQL server host, the SQL instance, then picked the database I wanted to restore from, as well as the backup. (Because there was only one backup, that was a pretty simple choice.) Next was to click Run to initiate the recovery process. Don’t worry – the Run here refers to running the mount; nothing is actually recovered yet.

NMM SQL GLR 04

While the mounting process runs you’ll get output of the process as it is executing. As soon as the database backup is mounted, the ItemPoint wizard will be launched.

NMM SQL GLR 05

When ItemPoint launches, it’ll prompt via the Data Wizard for the source of the recovery. In this case, work with the NetWorker defaults, as the source type (Folder) and Source Folder will be automatically populated as a result of the mount operation previously performed.

NMM SQL GLR 06

You’ll be prompted to provide the SQL Server details here and whether you want to connect to a single database or the entire server. In this case, I went with just the database I wanted – the Silence database. Clicking Finish then opens up the data browser for you.

NMM SQL GLR 07

You’ll see the browser interface is pretty straightforward – expand the backup down to the Tables area so you can select the table you want to restore.

NMM SQL GLR 08

Within ItemPoint, you don’t so much restore a table as copy it out of the backup region. So you literally can right-click on the table you want and choose ‘Copy’.

NMM SQL GLR 09

Logically then the next thing you do is go to the Target area and choose to paste the table.

NMM SQL GLR 10

Because that table still existed in the database, I was prompted to confirm what the pasted table would be called – in this case, just dbo.ImportantData2. Clicking OK then kicks off the data copy operation.

NMM SQL GLR 11

Here you can see the progress indicator for the copy operation. It keeps you up to date on how many rows have been processed, and the amount of time it’s taken so far.

NMM SQL GLR 12

At the end of the copy operation, you’ll have details provided about how many rows were processed, when it was finished and how long it took to complete. In this case I pulled back 370,101 rows in 21 seconds. Clicking Close will return you to the NetWorker Plugin where the backup will be dismounted.

NMM SQL GLR 13

And there you have it. Clicking “Close” will close down the plugin in SQL Management Studio, and your table level recovery has been completed.

ItemPoint GLR for SQL Server is really quite straightforward, and I heartily recommend the investment in the ItemPoint aspect of the plugin so as to get maximum benefit out of your SQL, Exchange or SharePoint backups.


* I have to say, it really irks me that Microsoft don’t have any OS pricing for “non-production” use. I realise the why – that way too many licenses would be finagled into production use. But it makes maintaining a home lab environment a complete pain in the posterior. Which is why, folks, most of my posts end up being around Linux, since I can run CentOS for free. I’d happily pay a couple of hundred dollars for Windows server licenses for a lab environment, but $1000-$2000? Ugh. I only have limited funds for my home lab, and it’s no good exhausting your budget on software if you then don’t have hardware to run it on…

Would you buy a dangerbase?

Jun 07 2017
 

Databases. They’re expensive, aren’t they?

What if I sold you a Dangerbase instead?

What’s a dangerbase!? I’m glad you asked. A dangerbase is functionally almost exactly the same as a database, except it may be a little bit more lax when it comes to controls. Referential integrity might slip. Occasionally an insert might accidentally trigger a background delete. Nothing major though. It’s twenty percent cheaper, with only four times the risk of one of those pesky ‘databases’! (Oh, you might need 15% more infrastructure to run it on, but you don’t have to worry about that until implementation.)

Dangerbases. They’re the next big thing. They have a marketshare that’s doubling every two years! Two years! (Admittedly that means they’re just at 0.54% marketshare at the moment, but that’s double what it was last year!)

A dangerbase is a stupid idea. Who’d trust storing their mission critical data in a dangerbase? The idea is preposterous.

Sadly, dangerbases get considered all too often in the world of data protection.


What’s a dangerbase in the world of data protection? Here’s just some examples:

  • Relying solely on an on-platform protection mechanism. Accidents happen. Malicious activities happen. You need to always ensure you’ve got a copy of your data outside of the original production platform it is created and maintained on, regardless of what protection you’ve got in place there. And you should at least have one instance of each copy in a different physical location to the original.
  • Not duplicating your backups. Whether you call it a clone or a copy or a duplication doesn’t matter to me here – it’s the effect we’re looking for, not individual product nomenclature. If your backup isn’t copied, it means your backup represents a single point of failure in the recovery process.
  • Using post-process deduplication. (That’s something I covered in detail recently.)
  • Relying solely on RAID when you’re doing deduplication. Data Invulnerability Architecture (DIA) isn’t just a buzzterm, it’s essential in a deduplication environment.
  • Turning your databases into dangerbases by doing “dump and sweep”. Plugins have existed for decades. Dump and sweep is an expensive waste of primary storage space and introduces a variety of risk into your data protection environment.
  • Not having a data lifecycle policy! Without it, you don’t have control over capacity growth within your environment. Without that, you’re escalating your primary storage costs unnecessarily, and placing strain on your data protection environment – strain that can easily break it.
  • Not having a data protection advocate, or data protection architect, within your organisation. If data is the lifeblood of a company’s operations, and information is money, then failing to have a data protection architect/advocate within the organisation is like not bothering with having finance people.
  • Not having a disaster recovery policy that integrates into a business continuity policy. DR is just one aspect of business continuity, but if it doesn’t actually slot into the business continuity process smoothly, it’s as likely going to hinder than help the company.
  • Not understanding system dependencies. I’ve been talking about system dependency maps or tables for years. Regardless of what structure you use, the net effect is the same: the only way you can properly protect your business services is to know what IT systems they rely on, and what IT systems those IT systems rely on, and so on, until you’re at the root level.

That’s just a few things, but hopefully you understand where I’m coming from.

I’ve been living and breathing data protection for more than twenty years. It’s not just a job, it’s genuinely something I’m passionate about. It’s something everyone in IT needs to be passionate about, because it can literally make the difference between your company surviving or failing in a disaster situation.

In my book, I cover all sorts of considerations and details from a technical side of the equation, but the technology in any data protection solution is just one aspect of a very multi-faceted approach to ensuring data availability. If you want to take data protection within your business up to the next level – if you want to avoid having the data protection equivalent of a dangerbase in your business – check my book out. (And in the book there’s a lot more detail about integrating into IT governance and business continuity, a thorough coverage of how to work out system dependencies, and all sorts of details around data protection advocates and the groups that they should work with.)

Architecture Matters: Protection in the Cloud (Part 2)

Jun 05 2017
 

(Part 1).

Particularly when we think of IaaS style workloads in the Cloud, there are two key approaches that can be used for data protection.

The first is snapshots. Snapshots fulfil part of a data protection strategy, but we do always need to remember with snapshots that:

  • They’re an inefficient storage and retrieval model for long-term retention
  • Cloud or not, they’re still essentially on-platform

As we know, and something I cover in my book quite a bit – a real data protection strategy will be multi-layered. Snapshots undoubtedly can provide options around meeting fast RTOs and minimal RPOs, but traditional backup systems will deliver a sufficient recovery granularity for protection copies stretching back weeks, months or years.

Stepping back from data protection itself – public cloud is a very different operating model to traditional in-datacentre infrastructure spending. The classic in-datacentre infrastructure procurement process is an up-front investment designed around 3- or 5-year depreciation schedules. For some businesses that may mean a literal up-front purchase to cover the entire time-frame (particularly so when infrastructure budget is only released for the initial deployment project), and for others with more fluid budget options, there’ll be an investment into infrastructure that can be expanded over the 3- or 5-year solution lifetime to meet systems growth.

Cloud – public Cloud – isn’t costed or sold that way. It’s a much smaller billing window and costing model; use a GB of RAM, pay for a GB of RAM. Use a GHz of CPU, pay for a GHz of CPU. Use a GB of storage, pay for a GB of storage. Public cloud costing models often remind me of Master of the House from Les Miserables, particularly this verse:

Charge ’em for the lice, extra for the mice
Two percent for looking in the mirror twice
Here a little slice, there a little cut
Three percent for sleeping with the window shut
When it comes to fixing prices
There are a lot of tricks I knows
How it all increases, all them bits and pieces
Jesus! It’s amazing how it grows!

Master of the House, Les Miserables.

That’s the Cloud operating model in a nutshell. Minimal (or no) up-front investment, but you pay for every scintilla of resource you use – every day or month.

If you, say, deploy a $30,000 server into your datacentre, you then get to use it as much or as little as you want, without any further costs beyond power and cooling*. With Cloud, you won’t be paying that $30,000 initial fee, but you will pay for every MHz, KB of RAM and byte of storage consumed within every billing period.

If you want Cloud to be cost-effective, you have to be able to optimise – you have to effectively game the system, so to speak. Your in-Cloud services have to be maximally streamlined. We’ve become inured to resource wastage in the datacentre because resources have been cheap for a long time. RAM size/speed grows, CPU speed grows, as does the number of cores, and storage – well, storage seems to have an infinite expansion capability. Who cares if what you’re doing generates 5 TB of logs per day? Information is money, after all.

To me, this is just the next step in the somewhat lost art of programmatic optimisation. I grew up in the days of 8-bit computing**, and we knew back then that CPU, RAM and storage weren’t infinite. This didn’t end with 8-bit computing, though. When I started in IT as a Unix system administrator, swap file sizing, layout and performance was something that formed a critical aspect of your overall configuration, because if – Jupiter forbid – your system started swapping, you needed a fighting chance that the swapping wasn’t going to kill your performance. Swap file optimisation was, to use a Bianca Del Rio line, all about the goal: “Not today, Satan.”

That’s Cloud, now. But we’re not so much talking about swap files as we are resource consumption. Optimisation is critical. A failure to optimise means you’ll pay more. The only time you want to pay more is when what you’re paying for delivers a tangible, cost-recoverable benefit to the business. (I.e., it’s something you get to charge someone else for, either immediately, or later.)


If we think about backup, it’s about getting data from location A to location B. In order to optimise it, you want to do two distinct things:

  • Minimise the number of ‘hops’ that data has to make in order to get from A to B
  • Minimise the amount of data that you need to send from A to B.

If you don’t optimise that, you end up in a ‘classic’ backup architecture that we used to rely so much on in the 90s and early 00s, such as:

Cloud Architecture Matters 1

(In this case I’m looking just at backup services that land data into object storage. There are situations where you might want higher performance than what object offers, but let’s stick just with object storage for the time being.)

I don’t think this diagram is actually good at giving the full picture. There’s another way I like to draw the diagram, and it looks like this:

Cloud Architecture Matters 2

In the Cloud, you’re going to pay for the systems you’re running for business purposes no matter what. That’s a cost you have to accept, and the goal is to ensure that whatever services or products you’re on-selling to your customers using those services will pay for the running costs in the Cloud***.

You want to ensure you can protect data in the Cloud, but sticking to architectures designed at the time of on-premises infrastructure – and physical infrastructure at that – is significantly sub-optimal.

Think of how traditional media servers (or in NetWorker parlance, storage nodes) needed to work. A media server is designed to be a high performance system that funnels data coming from client to protection storage. If a backup architecture still heavily relies on media servers, then the cost in the Cloud is going to be higher than you need it – or want it – to be. That gets worse if a media server needs to be some sort of highly specced system encapsulating non-optimised deduplication. For instance, one of NetWorker’s competitors provides details on their website of hardware requirements for deduplication media servers, so I’ve taken these specifications directly from their website. To work with just 200 TB of storage allocated for deduplication, a media server for that product needs:

  • 16 CPU Cores
  • 128 GB of RAM
  • 400 GB SSD for OS and applications
  • 2 TB of SSD for deduplication databases
  • 2 TB of 800 IOPs+ disk (SSD recommended in some instances) for index cache

For every 200 TB. Think on that for a moment. If you’re deploying systems in the Cloud that generate a lot of data, you could very easily find yourself having to deploy multiple systems such as the above to protect those workloads, in addition to the backup server itself and the protection storage that underpins the deduplication system.

Or, on the other hand, you could work with an efficient architecture designed to minimise the number of data hops, and minimise the amount of data transferred:

CloudBoost Workflow

That’s NetWorker with CloudBoost. Unlike that competitor, a single CloudBoost appliance doesn’t just allow you to address 200TB of deduplication storage, but 6 PB of logical object storage. 6 PB, not 200 TB. All that using 4 – 8 CPUs and 16 – 32GB of RAM, and with a metadata sizing ratio of 1:2000 (i.e., every 100 GB of metadata storage allows you to address 200 TB of logical capacity). Yes, there’ll be SSD optimally for the metadata, but noticeably less than the competitor’s media server – and with a significantly greater addressable range.

NetWorker and CloudBoost can do that because the deduplication workflow has been optimised. In much the same way that NetWorker and Data Domain work together, within a CloudBoost environment, NetWorker clients will participate in the segmentation, deduplication, compression (and encryption!) of the data. That’s the first architectural advantage: rather than needing a big server to handle all the deduplication of the protection environment, a little bit of load is leveraged in each client being protected. The second architectural advantage is that the CloudBoost appliance does not pass the data through. Clients send their deduplicated, compressed and encrypted data directly to the object storage, minimising the data hops involved****.

To be sure, there are still going to be costs associated with running a NetWorker+CloudBoost configuration in public cloud – but that will be true of any data protection service. That’s the nature of public cloud – you use it, you pay for it. What you do get with NetWorker+CloudBoost though is one of the most streamlined and optimised public cloud backup options available. In an infrastructure model where you pay for every resource consumed, it’s imperative that the backup architecture be as resource-optimised as possible.

IaaS workloads will only continue to grow in public cloud. If your business uses NetWorker, you can take comfort in being able to still protect those workloads while they’re in public cloud, and doing it efficiently, optimised for maximum storage potential with minimised resource cost. Remember always: architecture matters, no matter where your infrastructure is.


Hey, if you found this useful, don’t forget to check out Data Protection: Ensuring Data Availability.


 


* Yes, I am aware there’ll be other costs beyond power and cooling when calculating a true system management price, but I’m not going to go into those for the purposes of this blog.

** Some readers of my blog may very well recall earlier computing models. But I started with a Vic-20, then the Commodore-64, and both taught me valuable lessons about what you can – and can’t – fit in memory.

*** Many a company has been burnt by failing to cost that simple factor, but in the style of Michael Ende, that is another story, for another time.

**** Linux 64-bit clients do this now. Windows 64-bit clients are supported in NetWorker 9.2, coming soon. (In the interim Windows clients work via a storage node.)

May 23 2017
 

I’m going to keep this one short and sweet. In Cloud Boost vs Cloud Tier I go through a few examples of where and when you might consider using Cloud Boost instead of Cloud Tier.

One interesting thing I’m noticing of late is a variety of people talking about “VTL in the Cloud”.


I want to be perfectly blunt here: if your vendor is talking to you about “VTL in the Cloud”, they’re talking to you about transferring your workloads rather than transforming your workloads. When moving to the Cloud, about the worst thing you can do is lift and shift. Even in Infrastructure as a Service (IaaS), you need to closely consider what you’re doing to ensure you minimise the cost of running services in the Cloud.

Is your vendor talking to you about how they can run VTL in the Cloud? That’s old hat. It means they’ve lost the capacity to innovate – or at least, lost interest in it. They’re not talking to you about a modern approach, but just repeating old ways in new locations.

Is that really the best that can be done?

In a coming blog article I’ll talk about the criticality of ensuring your architecture is streamlined for running in the Cloud; in the meantime I just want to make a simple point: talking about VTL in the Cloud isn’t a “modern” discussion – in fact, it’s quite the opposite.

May 23 2017
 

Introduction

A seemingly straightforward question – what constitutes a successful backup? – may not engender the same response from everyone you ask. On the surface, you might suggest the answer is simply “a backup that completes without error”, and that’s part of the answer, but it’s not the complete answer.


Instead, I’m going to suggest there’s actually at least ten factors that go into making up a successful backup, and explain why each one of them is important.

The Rules

One – It finishes without a failure

This is the most simple explanation of a successful backup. One that literally finishes successfully. It makes sense, and it should be a given. If a backup fails to transfer the data it is meant to transfer during the process, it’s obviously not successful.

Now, there’s a caveat here, something I need to cover off. Sometimes you might encounter situations where a backup completes successfully but triggers or produces a spurious error as it finishes – i.e., you’re told it failed, but it actually succeeded. Is that a successful backup? No. Not in a useful way, because it’s encouraging you to ignore errors or demanding manual cross-checking.

Two – Any warnings produced are acceptable

Sometimes warnings will be thrown during a backup. It could be that a file had to be re-read, or a file was opened at the time of backup (e.g., on a Unix/Linux system) and could only be partially read.

Some warnings are acceptable, some aren’t. Some warnings that are acceptable on one system may not be acceptable on another. Take for instance, log files. On a lot of systems, if a log file is being actively written to when the backup is running, it could be that the warning of an incomplete capture of the file is acceptable. If the host is a security logging system and compliance/auditing requirements dictate all security logs are to be recoverable, an open-file warning won’t be acceptable.

Three – The end-state is captured and reported on

I honestly couldn’t tell you the number of times over the years I’ve heard of situations where a backup was assumed to have been running successfully, then when a recovery is required there’s a flurry of activity to determine why the recovery can’t work … only to find the backup hadn’t been completing successfully for days, weeks, or even months. I really have dealt with support cases in the past where critical data that had to be recovered was unrecoverable due to a recurring backup failure – and one that had been going on, being reported in logs and completion notifications, day-in, day-out, for months.

So, a successful backup is also a backup where the end-state is captured and reported on. The logical result is that if the backup does fail, someone knows about it and is able to choose an action for it.

When I first started dealing with NetWorker, that meant checking the savegroup completion reports in the GUI. As I learnt more about the importance of automation, and systems scaled (my system administration team had a rule: “if you have to do it more than once, automate it”), I built parsers to automatically interpret savegroup completion results and provide emails that would highlight backup failures.
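(As a purely illustrative sketch of that sort of automation – not the parser I used, and with the log location and match string being assumptions you’d adapt for your own environment – the modern equivalent against a NetWorker 9 policy log can start out as simply as rendering the log and flagging anything that failed:

# nsr_render_log "/nsr/logs/policy/VMware/Virtual Machines/backup_705081.raw" | grep -i fail

Anything that turns up would then get rolled into an email or a ticket.)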

As an environment scales further, automated parsing needs to scale as well – hence the necessity of products like Data Protection Advisor, where you get everything from simple dashboards for overnight success ratios, with drill-downs and root cause analysis, all the way up to SLA adherence reports and beyond.

In short, a backup needs to be reported on to be successful.

Four – The backup method allows for a successful recovery

A backup exists for one reason alone – to allow the retrieval and reconstruction of data in the event of loss or corruption. If the way in which the backup is run doesn’t allow for a successful recovery, then the backup should not be counted as a successful backup, either.

Open files are a good example of this – particularly if we move into the realm of databases. For instance, on a regular Linux filesystem (e.g., XFS or EXT4), it would be perfectly possible to configure a filesystem backup of an Oracle server. No database plugin, no communication with RMAN, just a rolling sweep of the filesystem, writing all content encountered to the backup device(s).

But it wouldn’t be recoverable. It’s a crash-consistent backup, not an application-consistent backup. So, a successful backup must be a backup that can be successfully recovered from, too.

Five – If an off-site/redundant copy is required, it is successfully performed

Ideally, every backup should get a redundant copy – a clone. Practically, this may not always be the case. The business may decide, for instance, that ‘bronze’ tiered backups – say, of dev/test systems, do not require backup replication. Ultimately this becomes a risk decision for the business and so long as the right role(s) have signed off against the risk, and it’s deemed to be a legally acceptable risk, then there may not be copies made of specific types of backups.

But for the vast majority of businesses, there will be backups for which there is a legal/compliance requirement for backup redundancy. As I’ve said before, your backups should not be a single point of failure within your data protection environment.

So, if a backup succeeds but its redundant copy fails, the backup should, to a degree, be considered to have failed. This doesn’t mean you have to necessarily do the backup again, but if redundancy is required, it means you do have to make sure the copy gets made. That then hearkens back to requirement three – the end state has to be captured and reported on. If you’re not capturing/reporting on end-state, it means you won’t be aware if the clone of the backup has succeeded or not.

Six – The backup completes within the required timeframe

You have a flight to catch at 9am. Because of heavy traffic, you don’t arrive at the airport until 1pm. Did you successfully make it to the airport?

It’s the same with backups. If, for compliance reasons you’re required to have backups complete within 8 hours, but they take 16 to run, have they successfully completed? They might exit without an error condition, but if SLAs have been breached, or legal requirements have not been met, it technically doesn’t matter that they finished without error. The time it took them to exit was, in fact, the error condition. Saying it’s a successful backup at this point is sophistry.

Seven – The backup does not prevent the next backup from running

This can happen in one of two different ways. The first is actually a special condition of rule six – even if there are no compliance considerations, if a backup meant to run once a day takes longer than 24 hours to complete, then by extension it’s going to prevent the next backup from running. This becomes a double failure – not only does the earlier backup overrun, but the next backup doesn’t run because the earlier backup is blocking it.

The second way is not necessarily related to backup timing – this is where a backup completes, but it leaves the system in a state that prevents the next backup from running. This isn’t necessarily a common thing, but I have seen situations where, for whatever reason, the way a backup finished prevented the next backup from running. Again, that becomes a double failure.

Eight – It does not require manual intervention to complete

There are two effective categories of backups – those that are started automatically, and those that are started manually. A backup may in fact be started manually (e.g., in the case of an ad-hoc backup), but it should still be able to complete without manual intervention.

As soon as manual intervention is required in the backup process, there’s a much greater risk of the backup not completing successfully, or within the required time-frame. This is, effectively, about designing the backup environment to reduce risk by eliminating human intervention. Think of it as one step removed from the classic challenge that if your backups are required but don’t start without human intervention, they likely won’t run. (A common problem with ‘strategies’ around laptop/desktop self-backup requirements.)

There can be workarounds for this – for example, if you need to trigger a database dump as part of the backup process (e.g., for a database without a plugin), then it could be a password needs to be entered, and the dump tool only accepts passwords interactively. Rather than having someone actually manually enter the password, the dump command could instead be automated with tools such as Expect.

Nine – It does not unduly impact access to the data it is protecting

(We’re in the home stretch now.)

A backup should be as light-touch as possible. The best example perhaps of a ‘heavy touch’ backup is a cold database backup. That’s where the database is shut down for the duration of the backup, and it’s a perfect example of a backup directly impacting/impeding access to the data being protected. Sometimes it’s more subtle though – high performance systems may have limited IO and system resources to handle the streaming of a backup, for instance. If system performance is degraded by the backup, then the backup should be considered unsuccessful.

I liken this to uptime vs availability. A server might be up, but if the performance of the system is so poor that users can’t effectively consume the service it offers, it’s not really available. That’s where, for instance, systems like ProtectPoint can be so important – in high performance systems it’s not just about getting a high speed backup, but about limiting the load on the database server during the backup process.

Ten – It is predictably repeatable

Of course, there are ad-hoc backups that might only ever need to be run once, or backups that you may never need to run again (e.g., pre-decommissioning backup).

The vast majority of backups within an environment though will be repeated daily. Ideally, the result of each backup should be predictably repeatable. If the backup succeeds today, and there’s absolutely no change to the systems or environment, then it should be reasonable to expect the backup will succeed tomorrow. That doesn’t remove the requirement for end-state capturing and reporting; it does mean though that the backup results shouldn’t effectively be random.

In Summary

It’s easy to understand why the simplest answer (“it completes without error”) can be so easily assumed to be the whole answer to “what constitutes a successful backup?” There’s no doubt it forms part of the answer, but if we think beyond the basics, there are definitely a few other contributing factors to achieving really successful backups.

Consistency, impact, recovery usefulness and timeliness, as well as all the other rules outlined above also come into how we can define a truly successful backup. And remember, it’s not about making more work for us, it’s about preventing future problems.


If you’ve thought the above was useful, I’d suggest you check out my book, Data Protection: Ensuring Data Availability. Available in paperback and Kindle formats.