Preston de Guise

Feb 07 2018
 

The world is changing, and data protection is changing with it. (OK, that sounds like an ad for Veridian Dynamics, but I promise I’m serious.)

One of the areas in which data protection is changing is that backup environments are growing in terms of deployments. It’s quite common these days to see multiple backup servers deployed in an environment – whether that’s due to acquisitions, required functionality, network topology or physical locations, the reason doesn’t really matter. What does matter is that as you increase the number of systems providing protection within an environment, you want to be able to still manage and monitor those systems centrally.

Data Protection Central (DPC) was released earlier this month, and it’s designed from the ground up as a modern, HTML5 web-based system to allow you to monitor your Avamar, NetWorker and Data Domain environments, providing health and capacity reporting on systems and backup. (It also builds on the Multi Systems Manager for Avamar to allow you to perform administrative functions within Avamar without leaving the DPC console – and, well, more is to come on that front over time.)

I’ve been excited about DPC for some time. You may remember a recent post of mine talking about Data Domain Management Center (DDMC); DPC isn’t (at the moment at least) a replacement for DDMC, but it’s built in the same spirit of letting administrators have easy visibility over their entire backup and recovery environment.

So, what’s involved?

Well, let’s start with the price. DPC is $0 for NetWorker and Avamar customers. That’s a pretty good price, right? (If you’re looking for the product page on the support website by the way, it’s here.)

You can deploy it in one of two ways: if you’ve got a SLES server deployed within your environment that meets the requirements, you can download a .bin installer to drop DPC onto that system. The other way – and quite a simple way, really – is to download a VMware OVA file to allow you to easily deploy it within your virtual infrastructure. (Remember, one of the ongoing themes of DellEMC Data Protection is to allow easy virtual deployment wherever possible.)

So yesterday I downloaded the OVA file and today I did a deployment. From start to finish, including gathering screenshots of its operation, that deployment, configuration and use took me about an hour.

When you deploy the OVA file, you’ll get prompted for configuration details so that there’s no post-deployment configuration you have to muck around with:

Deploying DPC as an OVA – Part 1

At this point in the deployment, I’ve already selected where the virtual machine will deploy, and what the disk format is. (If you are deploying into a production environment with a number of systems to manage, you’ll likely want to follow the recommendations for thick provisioning. I chose thin, since I was deploying it into my lab.)

You fill in standard networking properties – IP address, gateway, DNS, etc. Additionally, per the screen shot below, you can also immediately attach DPC into your AD/LDAP environment for enterprise authentication:

DPC Deployment, LDAP

I get into enough trouble at home for IT complexity as it is, so I don’t run LDAP (any more) – there wasn’t anything else for me to do there.

The deployment is quite quick, and after you’re done, you’re ready to power on the virtual machine.

DPC Deployment, ready to power on

In fact, one of the things you’ll want to be aware of is that the initial power on and configuration is remarkably quick. (After power-on, the system was ready to let me log on within 5 minutes or so.)

It’s an HTML5 interface – that means there’s no Java Web Start or anything like that; you simply point your web browser at the FQDN or IP address of the DPC server, log in, and access the system. (The documentation also includes details for changing the SSL certificate.)

DPC Login Screen

DPC follows Dell’s interface guidelines, so it’s quite a crisp, easy-to-navigate interface. The documentation includes details of your initial login ID and password, and of course, following best practices for security, you’re prompted to change that default password on first login:

DPC Changing the Default Password

After you’ve logged in, you get to see the initial, default dashboard for DPC:

DPC First Login

Of course, at this point, it looks a wee bit blank. That makes sense – we haven’t added any systems to the environment yet. But that’s easily fixed, by going to System Management in the left-hand column.

DPC System Management

System management is quite straightforward – the icons directly under “Systems” and “Groups” are for add, edit and delete, respectively. (Delete simply removes a system from DPC; it doesn’t un-deploy the system, of course.)

When you click the add button, you’re prompted for the type of system you want to add to DPC. (Make sure you check out the version requirements in the documentation, available on the support page.) Adding systems is a very straightforward operation, as well. For instance, for Data Domain:

DPC Adding a Data Domain

Adding an Avamar server is likewise quite simple:

DPC Adding an Avamar Server

And finally, adding a NetWorker server:

DPC Adding a NetWorker Server

Now, you’ll notice here that DPC prompts you that there’s some additional configuration to do on the NetWorker server; it’s about configuring the NetWorker RabbitMQ system to be able to communicate with DPC. For now, that’s a manual process. After following the instructions in the documentation, I also added the following to my /etc/rc.d/rc.local file on my Linux-based NetWorker/NMC server to ensure it happened on every reboot, too:

# Re-establish the nsrmqctl monitor of the DPC server (andoria) on each boot, per the DPC documentation
/bin/cat <<EOF | /opt/nsr/nsrmq/bin/nsrmqctl
monitor andoria.turbamentis.int
quit
EOF
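(One extra note from me rather than the documentation: on systemd-based Linux distributions, /etc/rc.d/rc.local generally needs to be marked executable – e.g., chmod +x /etc/rc.d/rc.local – before it will actually run at boot.)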

It’s not just NetWorker, Avamar and Data Domain you can add – check out the list here:

DPC Systems you can add

Once I added all my systems, I went over to look at the Activities > Audit pane, which showed me:

DPC Activity Audit

Look at those times there – it took me all of 8 minutes to change the password on first login, then add 3 Data Domains, an Avamar server and a NetWorker server to DPC. DPC has clearly been designed to enable rapid deployment and a quick time to readiness. And guess how many times I’d used DPC before? None.

Once systems have been added to DPC and it’s had time to poll the various servers you’re monitoring, you start getting the dashboards populated. For instance, shortly after their addition, my lab DDVE systems were getting capacity reporting:

DPC Capacity Reporting (DD)

You can drill into capacity reporting by clicking on the capacity report dashboard element to get a tabular view covering Data Domain and Avamar systems:

DPC Detailed Capacity Reporting

On that detailed capacity view, you see basic capacity details for Data Domains, and as you can see down the right-hand side, details of each MTree on the Data Domain as well. (My Avamar server is reported there, too.)

Under Health, you’ll see a quick view of all the systems you have configured and DPC’s assessment of their current status:

DPC System Health

In this case, I had two systems reported as unhealthy – one of my DDVEs had an email configuration problem I lazily had not gotten around to fixing, and likewise, my NetWorker server had a licensing error I hadn’t bothered to investigate and fix. Shamed by DPC, I jumped onto both and fixed them, pronto! That meant when I went back to the dashboards, I got an all clear for system health:

DPC Detailed Dashboard

I wanted to correct those zeroes, so I fired off a backup in NetWorker, which resulted in DPC updating pretty damn quickly to show something was happening:

DPC Detailed Dashboard, Backup Running

Likewise, when the backup completed and cloning started, the dashboard was updated quite promptly:

DPC Detailed Dashboard, Clone Running

You can also see details of what’s been going on via the Activities > System view:

DPC Activities – Systems

Then, with a couple of backup and clone jobs run, the Detailed Dashboard was updated a little more:

DPC, Detailed Dashboard More Use

Now, I mentioned before that DPC takes on some Multi Systems Manager functionality for Avamar, viz.:

DPC, Avamar Systems Management

So that’s back in the System Management view. Clicking the horizontal ‘…’ item next to a system lets you launch the individual system management interface, or in the case of Avamar, also manage policy configuration.

DPC, Avamar Policy View

In that policy view, you can create new policies, initiate jobs, and edit existing configuration details – all without having to go into the traditional Avamar interface:

DPC, Avamar Schedule Configuration

DPC, Avamar Retention Configuration

DPC, Avamar Policy Editing

That’s pretty much all I’ve got to say about DPC at this point in time – other than to highlight the groups function in System Management. By defining groups of resources (however you want to), you can then filter dashboard views not only for individual systems, but for groups too, allowing quick and easy review of very specific hosts:

DPC System Management – Groups

In my configuration I’ve grouped systems by whether they’re associated with an Avamar backup environment or a NetWorker backup environment, but you can configure groups however you need. Maybe you have services broken up by state or country, or maybe you have them distributed by customer or by the service you’re providing. Regardless of how you’d like to group them, you can filter through to them easily in the DPC dashboards.

So there you go – that’s DPC v1.0.1. It’s honestly taken me more time to get this blog article written than it took me to deploy and configure DPC.

Note: Things I didn’t show in this article:

  • Search and Recovery – That’s where you’d add a DP Search system (I don’t have DP-Search deployed in my lab)
  • Reports – That’s where you’d add a DPA server, which I don’t have deployed in my lab either.

Search and Recovery lets you springboard into the awesome DP-Search web interface, and Reports will drill into DPA and extract the most popular reports people tend to access in DPA, all within DPC.

I’m excited about DPC and the potential it holds over time. And if you’ve got an environment with multiple backup servers and Data Domains, you’ll get value out of it very quickly.

Basics: Planning A Recovery Service

Jan 30 2018
 

Introduction

In Data Protection: Ensuring Data Availability, I talk quite a lot about what you need to understand and plan as part of a data protection environment. I’m often reminded of the old saying from clothing and carpentry – “measure twice, cut once”. The lesson in that statement, of course, is that rushing into something headlong may make your work more problematic. Taking the time to properly plan what you’re doing can, in a lot of instances (and data protection is one such instance), make the entire process easier. This post isn’t meant to be a replacement for the various planning chapters in my book – but I’m sure it’ll have some useful tips regardless.

We don’t back up just as something to do; in fact, we don’t protect data just as something to do, either. We protect data both to shield our applications and services (and therefore our businesses) from failures, and to ensure we can recover it if necessary. So with that in mind, what are some essential activities in planning a recovery service?


First: Do you know what the data is?

Data classification isn’t something done during a data protection cycle. Maybe one day it will be, when AI and machine learning are sufficiently advanced; in the interim, though, it requires input from people – IT, the business, and so on. Of course, there’s nothing physically preventing you from planning and implementing a recovery service without performing data classification; I’d go so far as to suggest that an easy majority of businesses do exactly that. That doesn’t mean it’s an ideal approach though.

Data classification is all about understanding the purpose of the data, who cares about it, how it is used, and so on. It’s a collection of seemingly innocuous yet actually highly important questions. It’s something I cover quite a bit in my book, and for the very good reason that I honestly believe a recovery service can be made simpler, cheaper and more efficient if it’s complemented by a data classification process within the organisation.

Second: Does the data need to exist?

That’s right – does it need to exist? This is another essential but oft-overlooked part of achieving a cheaper, simpler and more efficient recovery service: data lifecycle management. Every 1TB you can eliminate from your primary storage systems, for the average business at least, is going to yield anywhere between 10 and 30TB of savings in protection storage (RAID, replication, snapshots, backup and recovery, long term retention, etc.). While for some businesses that number may be smaller, for the majority of mid-sized and larger businesses, that 10-30TB saving is likely to go much, much higher – particularly as the criticality of the data increases.

Without a data lifecycle policy, bad things happen over time:

  • Keeping data becomes habitual rather than based on actual need
  • As ‘owners’ of data disappear (e.g., change roles, leave the company, etc.), reluctance to delete, prune or manage the data tends to increase
  • Apathy or intransigence towards developing a data lifecycle programme increases.

Businesses that avoid data classification and data lifecycle condemn themselves to the torment of Sisyphus – constantly trying to roll a boulder up a hill only to have it fall back down again before they get to the top. This manifests in many ways, of course, but in designing, acquiring and managing a data recovery service it usually hits the hardest.

Third: Does the data need to be protected?

I remain a firm believer that it’s always better to back up too much data than not enough. But that’s a default, catch-all position rather than one which should be the blanket rule within the business. Part of data classification and data lifecycle will help you determine whether you need to enact specific (or any) data protection models for a dataset. It may be test database instances that can be recovered at any point from production systems; it might be randomly generated data that has no meaning outside of a very specific use case, or it might be transient data merely flowing from one location to another that does not need to be captured and stored.

Remember the lesson from data lifecycle – every 1TB eliminated from primary storage can eliminate 10-30TB of data from protection storage. The next logical step after that is to be able to accurately answer the question, “do we even need to protect this?”

Fourth: What recovery models are required?

At this point, we’ve not talked about technology. This question gets us a little closer to working out what sort of technology we need, because once we have a fair understanding of the data we need to offer recovery services for, we can start thinking about what types of recovery models will be required.

This will essentially involve determining how recoveries are done for the data, such as:

  • Full or image level recoveries?
  • Granular recoveries?
  • Point in time recoveries?

Some data may not need every type of recovery model deployed for it. For some data, granular recoverability is just as important as complete recoverability; for other types of data, it could be that the only way to recover it is image/full – where granular recoveries would simply leave the data corrupted or useless. Does all data require point in time recovery? Much will, but some may not.

Another aspect of the recovery model to consider, of course, is how much users will be involved in recoveries. Self-service for admins? Self-service for end-users? All operator-run? Chances are it’ll be a mix, depending on those previous recovery model questions (e.g., you might allow self-service individual email recovery, but a full Exchange recovery is not going to be an end-user initiated task).

Fifth: What SLOs/SLAs are required?

Regardless of whether your business has Service Level Objectives (SLOs) or Service Level Agreements (SLAs), you’ll potentially have to meet a variety of them depending on the nature of the failure, the criticality and age of the data, and so on. (For the rest of this section, I’ll use ‘SLA’ as a generic term for both SLA and SLO.) In fact, there’ll be up to three different categories of SLAs you have to meet:

  • Online: These types of SLAs are for immediate or near-immediate recoverability from failure; they’re meant to keep the data online rather than having to seek to retrieve it from a copy. This will cover options such as continuous replication (e.g., fully mirrored storage arrays), continuous data protection (CDP), as well as more regular replication and snapshot options.
  • Nearline: This is where backup and recovery, archive, and long term retention (e.g., compliance retention of backups/archives) comes into play. Systems in this area are designed to retrieve the data from a copy (or in the case of archive, a tiered, alternate platform) when required, as opposed to ensuring the original copy remains continuously, or near to continuously available.
  • Disaster: These are your “the chips are down” SLAs, which’ll fall into business continuity and/or isolated recovery. Particularly in the event of business continuity, they may overlap with either online or nearline SLAs – but they can also diverge quite a lot. (For instance, in a business continuity situation, data and systems for ‘tier 3’ and ‘tier 4’ services, which may otherwise require a particular level of online or nearline recoverability during normal operations, might be disregarded entirely until full service levels are restored.)

Not all data may require all three of the above, and even if it does, unless you’re in a hyperconverged or converged environment, it’s quite possible that, if you’re a backup administrator, you only need to consider some of the above, with other aspects being undertaken by storage teams, etc.

Now you can plan the recovery service (and conclusion)

And because you’ve gathered the answers to the above, planning and implementing the recovery service is now the easy bit! Trust me on this – working out what a recovery service should look like for the business when you’ve gathered the above information is a fraction of the effort compared to when you haven’t. Again: “Measure twice, cut once.”

If you want more in-depth information on above, check out chapters in my book such as “Contextualizing Data Protection”, “Data Life Cycle”, “Business Continuity”, and “Data Discovery” – not to mention the specific chapters on protection methods such as backup and recovery, replication, snapshots, continuous data protection, etc.

Jan 26 2018
 

When NetWorker and Data Domain are working together, some operations can be done as a virtual synthetic full. It sounds like a tautology – virtual synthetic. In this basics post, I want to explain the difference between a synthetic full and a virtual synthetic full, so you can understand why this is actually a very important function in a modernised data protection environment.

The difference between the two operations is actually quite simple, and best explained through comparative diagrams. Let’s look at the process of creating a synthetic full, from the perspective of working with AFTDs (still less challenging than synthetic fulls from tape), and working with Data Domain Boost devices.

Synthetic Full vs Virtual Synthetic Full

On the left, we have the process of creating a synthetic full when backups are stored on a regular AFTD device. I’ve simplified the operation, since it does happen in memory rather than requiring staging, etc. Effectively, the NetWorker server (or storage node) will read the various backups that need to be reconstituted into a new, synthetic full, up into memory, and as chunks of the new backup are constructed, they’re written back down onto the AFTD device as a new saveset.

When a Data Domain is involved though, the server gets a little lazier – instead, it simply has the Data Domain construct the synthetic full virtually. Remember, at the back end on the Data Domain, it’s all deduplicated segments of data, along with metadata maps that define each complete ‘file’ that was sent to the system. (In the case of NetWorker, by ‘file’ I’m referring to a saveset.) So the Data Domain assembles details of a new full without any data being sent over the network.

The difference is simple, but profound. In a traditional synthetic full, the NetWorker server (or storage node) is doing all the grunt work. It’s reading all the data up into itself, combining it appropriately and writing it back down. If you’ve got a 1TB full backup and 6 incremental backups, it’s having to read all that data – 1TB or more – up from disk storage, process it, and write another ~1TB backup back down to disk. With a virtual synthetic full, the Data Domain is doing all the heavy lifting. It’s being told what it needs to do, but it’s doing the reading and processing, and doing it more efficiently than a traditional data read.
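To put some purely illustrative numbers around that difference – the incremental sizes in the sketch below are assumptions of mine, not figures from the example above:

# Rough comparison of the data the backup server itself must move for a synthetic full.
# All figures are illustrative assumptions - substitute your own.
FULL_GB=1024       # size of the prior full backup
INCR_GB=50         # assumed average size of each incremental
INCR_COUNT=6       # number of incrementals since the full

TRADITIONAL_READ=$(( FULL_GB + INCR_GB * INCR_COUNT ))
echo "Traditional synthetic full: server reads ~${TRADITIONAL_READ} GB and writes ~${FULL_GB} GB back to the AFTD"
echo "Virtual synthetic full: the Data Domain stitches deduplicated segments; effectively no saveset data crosses the network"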

So, there’s actually a big difference between synthetic full and virtual synthetic full, and virtual synthetic full is anything but a tautology.

The Perils of Tape

Jan 23 2018
 

In the 90s, if you wanted to define a “disaster recovery” policy for your datacentre, part of that policy revolved around tape. Back then, tape wasn’t just for operational recovery, but also for disaster recovery – and even isolated recovery. In the 90s of course, we weren’t talking about “isolated recovery” in the same way we do now, but depending on the size of the organisation, or its level of exposure to threats, you might occasionally find a company that would make three copies of data – the on-site tape, the off-site tape, and then the “oh heck, if this doesn’t work we’re all ruined” tape.

These days, whatever you might want to use tape for within a datacentre (and yes, I’ll admit that sometimes it may still be necessary, though I’d argue far less necessary than the average tape deployment), something you most definitely don’t want to use it for is disaster recovery.

The reasoning is simple – time. To understand what I mean, let’s look at an example.

Let’s say your business has a used data footprint for all production systems of just 300 TB. I’m not talking total storage capacity here, just literally the occupied space across all DAS, SAN and NAS storage just for production systems. For the purposes of our discussion, I’m also going to say that based on dependencies and business requirements, disaster recovery means getting back all production data. Ergo, getting back 300 TB.

Now, let’s also say that your business is using LTO-7 media. (I know LTO-8 is out, but not everyone using tape updates as soon as a new format becomes available.) The key specifications on LTO-7 are as follows:

  • Uncompressed capacity: 6 TB
  • Max speed without compression: 300 MB/s
  • Typical rated compression: 2.5:1

Now, I’m going to stop there. When it comes to tape, I’ve found the quoted compression ratios to be wildly at odds with reality. Back when everyone used to quote a 2:1 compression ratio, I found that tapes yielding 2:1 or higher were the exception, and the norm was more likely 1.4:1 or 1.5:1. So, I’m not going to assume you’re going to get compression in the order of 2.5:1 just because the tape manufacturers say you will. (They also say tapes will reliably be OK on shelves for decades…)

A single LTO-7 tape, if we’re reading or writing at full speed, will take around 5.5 hours to go end-to-end. That’s very generously assuming no shoe-shining and that the read or write can be accomplished as a single process – either via multiplexing, or a single stream. Let’s say you get 2:1 compression on the tape. That’s 12 TB, but it still takes 5.5 hours to complete (assuming a linear increase in tape speed – and at 600 MB/s, that’s getting iffy – but again, bear with me).

Drip, drip, drip.

Now, tape drives are always expensive – so you’re not going to see enough tape drives to load one tape per drive, fill the tape, and bang, your backup is done for your entire environment. Tape libraries exist to allow the business to automatically switch tapes as required, avoiding that 1:1 relationship between backup size and the number of drives used.

So let’s say you’ve got 3 tape drives. That means in 5.5 hours, using 2:1 compression and assuming you can absolutely stream at top speed for that compression ratio, you can recover 36 TB. (If you’ve used tape for as long as me, you’ll probably be chuckling at this point.)

So, that’s 36 TB in the first 5.5 hours. At this point, things are already getting a little absurd, so I’ll not worry about the time it takes to unload and load new media. The next 36 TB is done in another 5.5 hours. If we continue down that path, our first 288 TB will be recovered after 44 hours. We might then assume our final 12TB fitted snugly on a single remaining tape – which will take another 5.5 hours to read, getting us to a disaster recovery time of 49.5 hours.
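If you want to plug in your own numbers, the arithmetic above is easily scripted. The following is just a back-of-the-envelope sketch using the same assumptions as this example (300 TB to recover, LTO-7 native capacity, an optimistic 2:1 compression ratio, 3 drives, 5.5 hours per end-to-end tape pass) – swap in your own figures:

#!/bin/bash
# Back-of-the-envelope tape disaster recovery time estimate - every figure below is an assumption to replace.
DATA_TB=300          # used production data to recover
NATIVE_TB=6          # LTO-7 native capacity per tape
COMPRESSION=2        # assumed (optimistic) compression ratio
DRIVES=3             # available tape drives
HOURS_PER_TAPE=5.5   # end-to-end read time per tape at full streaming speed

TAPES=$(( (DATA_TB + NATIVE_TB * COMPRESSION - 1) / (NATIVE_TB * COMPRESSION) ))
PASSES=$(( (TAPES + DRIVES - 1) / DRIVES ))
HOURS=$(echo "$PASSES * $HOURS_PER_TAPE" | bc)
echo "$TAPES tapes across $DRIVES drives = $PASSES passes, ~${HOURS} hours (excluding load, seek and rewind)"

Run with the figures above, that prints 25 tapes, 9 passes and roughly 49.5 hours – the same number we arrived at manually, and that’s before we get realistic about multiplexing and tape handling.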

Drip, drip, drip.

Let me be perfectly frank here: those numbers are ludicrous. The only way you have a hope of achieving that sort of read speed in a disaster recovery for that many tapes is if you are literally recovering a 300 TB filesystem filled almost exclusively with 2GB files that each can be compressed down by 50%. If your business production data doesn’t look like that, read on.

Realistically, in a disaster recovery situation, I’d suggest you’re going to look at at least 1.5x the end-to-end read/write time. For all intents and purposes, that assumes the tapes are written with say, 2-way multiplexing and we can conduct 2 reads across it, where each read is able to use filemark-seek-forward (FSFs) to jump over chunks of the tape.

5.5 hours per tape becomes 8.25 hours. (Even that, I’d say, is highly optimistic.) At that point, the 49.5 hour recovery is a slightly more believable 74.25 hour recovery. Slightly. That’s over 3 days. (Don’t forget, we still haven’t factored in tape load, tape rewind, and tape unload times into the overall scope.)

Drip, drip, drip.

What advantage do we get if we read from disk instead of tape? Well, if we’re going to go down that path, we’re going to get instant access (no load, unload or seek times), and concurrency (highly parallel reads – recovering fifty systems at a time, for instance).

But here’s the real kicker. Backup isn’t disaster recovery. And particularly, tape backup is disaster recovery in the same way that a hermetically sealed, 100% barricaded house is safety from a zombie apocalypse. (Eventually, you’re going to run out of time, i.e., food and water.)

This is, quite frankly, where the continuum comes into play:

More importantly, this is where picking the right data protection methodology for the SLAs (the RPOs and RTOs) is essential, viz.:

Storage Admin vs Backup Admin

Backup and recovery systems might be a fallback position when all other types of disaster recovery have been exhausted, but at the point you’ve decided to conduct a disaster recovery of your entire business from physical tape, you’re in a very precarious position. It might have been a valid option in the 90s, but in a modern data protection environment, there’s a plethora of higher level systems that should be built into a disaster recovery strategy. I’ve literally seen examples of companies that, faced with having to do a complete datacentre recovery from tape, took a month or more to get back operational. (If ever.)

Like the old saying goes – if all you have is a hammer, everything looks like a nail. “Backup only” companies have a tendency to encourage you to think of backup as a disaster recovery function, or even to think of disaster recovery/business continuity as someone else’s problem. Talking disaster recovery is easiest when you’re able to talk to someone across the entire continuum.

Otherwise you may very well find your business perfectly safely barricaded into a hermetically sealed location away from the zombie horde, watching your food and water supplies continue to decrease hour by hour, day by day. Drip, drip, drip.

Planning holistic data protection strategies is more than just backup – and it’s certainly more than just tape. That’s why Data Protection: Ensuring Data Availability looks at the entire data storage protection spectrum. And that’s why I work for DellEMC, too. It’s not just backup. It’s about the entire continuum.

 

Jan 18 2018
 

Data Domain Management Center (DDMC) is a free virtual appliance available for customers with Data Domain to provide a web interface for monitoring and managing multiple Data Domains from the same location. Even if you’ve only got one Data Domain in your environment, it can provide additional functionality for you.

The system resource requirements for DDMC are trivially small, viz.:

Size   | # DDs | vCPU | vRAM | HDD Space (Install+Data)
Small  | 1-25  | 1    | 2    | 40+50
Medium | 1-50  | 2    | 3    | 40+100
Large  | 1-75  | 2    | 4    | 40+200

If you’ve used the standard per-system Data Domain web interface, it’s very likely you’ll be at home instantly with DDMC.

After you’ve logged in, you’ll get a “quick view” dashboard of the system, featuring everyone’s favourite graphs – the donut.

DDMC_01 - Dashboard

In my case, I’ve got DDMC running to monitor two DDVEs I have in my lab – so there’s not a lot to see here. If there are any alerts on any of the monitored systems (and DDMC includes itself in monitored systems, which is why you’ll get one more system than the number of DDs you’re monitoring in the ‘Alerts’ area), that’ll be red, drawing your eye straight away to details that may need attention. Each of those widgets is pretty straightforward – health, capacity thresholds, replication, capacity used, etc. Nothing particularly out of the ordinary there.

DDMC is designed from the ground up to very quickly let you see the status of your entire Data Domain fleet. For instance, looking under Health > Status, you see a very straightforward set of ‘traffic light’ views:

DDMC_02 Status

This gives you a list of systems, their current status, and a view as to what protocols and key features are enabled. You’ll note three options above “HA Readiness”. These are, in left-to-right order:

  • All systems
  • Groups – Administrator-configurable collections of Data Domains (e.g., by physical datacentre location, such as “Moe”, “Subiaco” or “Tichborne”, or by datacentre role, such as “Core”, “ROBO”, “DMZ”, etc.)
  • Tenants – DDMC is tenant aware, and lets you view information presented by tenancy as well.

There’s a variety of options available to you in DDMC for centralised management – for instance, being able to roll out OS updates to multiple Data Domains from a central location. But there’s also some great centralised reporting and monitoring for you as well. For instance, under Capacity Management, you can quickly get views such as:

DDMC_03 Capacity Management

From this pane you can very quickly see capacity utilisation on each Data Domain, with combined stats across the fleet. You can also change what sort of period of time you’re viewing the information for – the default graph in the above screen shot for instance shows weekly capacity utilisation over the most recent one month period on one of my DDVEs.

What’s great though is that underneath “Management”, you’ll also see an option for Projected. This is where DDMC gives you a great view of your systems – what does it project the capacity utilisation of the system will be, based on either a default range or your own selected range of dates?

DDMC_04 Projected Capacity

In my case, above, DDMC is presenting a sudden jump in projected capacity on the day of the report being run (January 18, 2018) simply because that was the day I basically doubled the amount of data being sent to the Data Domain, after a fairly lengthy trend of reasonably consistent weekly backup cycles. You’ll note though that it projects out three key dates:

  • 100% (Full)
  • 90% (Critical)
  • 80% (Warning)

Now, Data Domain will keep working fine up to the moment you hit 100% full, at which point it obviously can’t accept any more data. The critical and warning levels are pretty standard recommended capacity monitoring levels, particularly for deduplication storage. 80% is a good threshold for determining whether you need to order more capacity or not, and have it arrive in time. 90% is the level for environments where you prefer to run closer to the line – or an alert that you may want to manually check out why the capacity is that high. So there’s nothing unusual with having 80 and 90% alert levels – they’re actually incredibly handy.

I’m not going to go through each option within DDMC, but I will pause the review with Systems Inventory:

DDMC_05 - Systems Inventory

Within the Inventory area in DDMC, you can view quick details of the OS version and other information for each Data Domain, perform an upgrade, and edit various aspects of the system information. In particular, the area I’m showing above is the Thresholds area. DDMC doesn’t hard-set the threshold alerts to 80% and 90%; instead, if you have particular Data Domains that need different threshold notification levels, you can make those changes here, ensuring that your threshold alerts and projections are accurate to your requirements.

DDMC isn’t meant to be a complex tool; the average Data Domain/backup administrator will likely find it simple and intuitive to use with very little time taken just wandering around in the interface. If you’ve got an operations team, it’s the sort of thing you’ll want the operators to have access to in order to keep an eye on your fleet; if you’re an IT or capacity manager you might use it as a starting point for keeping an eye on capacity utilisation, and if you’re a backup or storage administrator in an environment with multiple Data Domains, you’ll quickly get used to referring to the dashboard and management options to make your life simpler.

Also, at $0 and with such simple infrastructure requirements to run it, it’s not really something you have to think twice about.

 

Dec 30 2017
 

With just a few more days of 2017 left, I thought it opportune to make the last post of the year a summary of some of what we’ve seen in the field of data protection in 2017.

2017 Summary

It’s been a big year, in a lot of ways, particularly at DellEMC.

Towards the end of 2016, but definitely leading into 2017, NetWorker 9.1 was released. That meant 2017 started with a bang, courtesy of the new NetWorker Virtual Proxy (NVP, or vProxy) backup system. This replaced VBA, allowing substantial performance improvements, and some architectural simplification as well. I was able to generate some great stats right out of the gate with NVP under NetWorker 9.1, and that applied not just to Windows virtual machines but also to Linux ones, too. NetWorker 9.1 with NVP allows you to recover tens of thousands or more files from image level backup in just a few minutes.

In March I released the NetWorker 2016 usage survey report – the survey ran from December 1, 2016 to January 31, 2017. That reminds me – the 2017 Usage Survey is still running, so you’ve still got time to provide data to the report. I’ve been compiling these reports now for 7 years, so there’s a lot of really useful trends building up. (The 2016 report itself was a little delayed in 2017; I normally aim for it to be available in February, and I’ll do my best to ensure the 2017 report is out in February 2018.)

Ransomware and data destruction made some big headlines in 2017 – repeatedly. GitLab hit 2017 running with a massive data loss in January, which they subsequently blamed on a backup failure, when in actual fact it was a staggering process and people failure. It reminds one of the old manager #101 credo, “If you ASSuME, you make an ASS out of U and ME”. GitLab’s issue may have at a very small level been a ‘backup failure’, but only in so much that everyone in the house thinking it was someone else’s turn to fill the tank of the car, and running out of petrol, is a ‘car failure’.

But it wasn’t just Gitlab. Next generation database users around the world – specifically, MongoDB – learnt the hard way that security isn’t properly, automatically enabled out of the box. Large numbers of MongoDB administrators around the world found their databases encrypted or lost as default security configurations were exploited on databases left accessible in the wild.

In fact, Ransomware became such a common headache in 2017 that it fell prey to IT’s biggest meme – the infographic. Do a quick Google search for “Ransomware Timeline” for instance, and you’ll find a plethora of options around infographics about Ransomware. (And who said Ransomware couldn’t get any worse?)

Appearing in February 2017 was Data Protection: Ensuring Data Availability. Yes, that’s right, I’m calling the release of my second book on data protection a big event in the realm of data storage protection in 2017. Why? This is a topic which is insanely critical to business success. If you don’t have a good data protection process and strategy within your business, you could literally lose everything that defines the operational existence of your business. There are three defining aspects I see in data protection considerations now:

  • Data is still growing
  • Product capability is still expanding to meet that growth
  • Too many businesses see data protection as a series of silos, unconnected – storage, virtualisation, databases, backup, cloud, etc. (Hint: They’re all connected.)

So on that basis, I do think a new book whose focus is to give a complete picture of the data storage protection landscape is important to anyone working in infrastructure.

And on the topic of stripping the silos away from data protection, 2017 well and truly saw DellEMC cement its lead in what I refer to as convergent data protection. That’s the notion of combining data protection techniques from across the continuum to provide new methods of ensuring SLAs are met, impact is eliminated, and data hops are minimised. ProtectPoint was first introduced to the world in 2015, and has evolved considerably since then. ProtectPoint allows primary storage arrays to integrate with data protection storage (e.g., VMAX3 to Data Domain) so that those really huge databases (think 10TB as a typical starting point) can have instantaneous, incremental-forever backups performed – all application integrated, but no impact on the database server itself. ProtectPoint though was just the starting position. In 2017 we saw the release of Hypervisor Direct, which draws a line in the sand on what Convergent Data Protection should be and do. Hypervisor direct is there for your big, virtualised systems with big databases, eliminating any risk of VM-stun during a backup (an architectural constraint of VMware itself) by integrating RecoverPoint for Virtual Machines with Data Domain Boost, all while still being fully application integrated. (Mark my words – hypervisor direct is a game changer.)

Ironically, in a world where target-based deduplication should be a “last resort”, we saw tech journalists get irrationally excited about a company heavy on marketing but light on functionality promote their exclusively target-deduplication data protection technology as somehow novel or innovative. Apparently, combining target based deduplication and needing to scale to potentially hundreds of 10Gbit ethernet ports is both! (In the same way that releasing a 3-wheeled Toyota Corolla for use by the trucking industry would be both ‘novel’ and ‘innovative’.)

Between VMworld and DellEMC World, there were some huge new releases by DellEMC this year though, by comparison. The Integrated Data Protection Appliance (IDPA) was announced at DellEMC world. IDPA is a hyperconverged backup environment – you get delivered to your datacentre a combined unit with data protection storage, control, reporting, monitoring, search and analytics that can be stood up and ready to start protecting your workloads in just a few hours. As part of the support programme you don’t have to worry about upgrades – it’s done as an atomic function of the system. And there’s no need to worry about software licensing vs hardware capacity: it’s all handled as a single, atomic function, too. For sure, you can still build your own backup systems, and many people will – but for businesses who want to hit the ground running in a new office or datacentre, or maybe replace some legacy three-tier backup architecture that’s limping along and costing hundreds of thousands a year just in servicing media servers (AKA “data funnel$”), IDPA is an ideal fit.

At DellEMC World, VMware running in AWS was announced – imagine that, just seamlessly moving virtual machines from your on-premises environment out to the world’s biggest public cloud as a simple operation, and managing the two seamlessly. That became a reality later in the year, and NetWorker and Avamar were the first products to support actual hypervisor level backup of VMware virtual machines running in a public cloud.

Thinking about public cloud, Data Domain Virtual Edition (DDVE) became available in both the Azure and AWS marketplaces for easy deployment. Just spin up a machine and get started with your protection. That being said, if you’re wanting to deploy backup in public cloud, make sure you check out my two-part article on why Architecture Matters: Part 1, and Part 2.

And still thinking about cloud – this time specifically about cloud object storage, you’ll want to remember the difference between Cloud Boost and Cloud Tier. Both can deliver exceptional capabilities to your backup environment, but they have different use cases. That’s something I covered off in this article.

There were some great announcements at re:Invent, AWS’s yearly conference, as well. Cloud Snapshot Manager was released, providing enterprise grade control over AWS snapshot policies. (Check out what I had to say about CSM here.) Also released in 2017 was DellEMC’s Data Domain Cloud Disaster Recovery, something I need to blog about ASAP in 2018 – that’s where you can actually have your on-premises virtual machine backups replicated out into a public cloud and instantiate them as a DR copy with minimal resources running in the cloud (e.g., no in-Cloud DDVE required).

2017 also saw the release of Enterprise Copy Data Analytics – imagine having a single portal that tracks your Data Domain fleet world wide, and provides predictive analysis to you about system health, capacity trending and insights into how your business is going with data protection. That’s what eCDA is.

NetWorker 9.2 and 9.2.1 came out as well during 2017 – that saw functionality such as integration with Data Domain Retention Lock, database integrated virtual machine image level backups, enhancements to the REST API, and a raft of other updates. Tighter integration with vRealize Automation, support for VMware image level backup in AWS, optimised object storage functionality and improved directives – the list goes on and on.

I’d be remiss if I didn’t mention a little bit of politics before I wrap up. Australia got marriage equality – I, myself, am finally now blessed with the challenge of working out how to plan a wedding (my boyfriend and I are intending to marry on our 22nd anniversary in late 2018 – assuming we can agree on wedding rings, of course), and more broadly, politics again around the world managed to remind us of the truth to that saying by the French Philosopher, Albert Camus: “A man without ethics is a wild beast loosed upon this world.” (OK, I might be having a pointed glance at Donald Trump over in America when I say that, but it’s still a pertinent thing to keep in mind across the political and geographic spectrums.)

2017 wasn’t just about introducing converged data protection appliances and convergent data protection, but it was also a year where more businesses started to look at hyperconverged administration teams as well. That’s a topic that will only get bigger in 2018.

The DellEMC data protection family got a lot of updates across the board that I haven’t had time to cover this year – Avamar 7.5, Boost for Enterprise Applications 4.5, Enterprise Copy Data Management (eCDM) 2, and DDOS 6.1! Now that I sit back and think about it, my January could be very busy just catching up on things I haven’t had a chance to blog about this year.

I saw some great success stories with NetWorker in 2017, something I hope to cover in more detail into 2018 and beyond. You can see some examples of great success stories here.

I also started my next pet project – reviewing ethical considerations in technology. It’s certainly not going to be just about backup. You’ll see the start of the project over at Fools Rush In.

And that’s where I’m going to leave 2017. It’s been a big year and I hope, for all of you, a successful year. 2018, I believe, will be even bigger again.

Basics – Prior Recovery Details

Dec 19 2017
 

If you need to find out details about what has recently been recovered with a NetWorker server, there are a few different ways to achieve it.

NMC, of course, offers recovery reports. These are particularly good if you’ve got admins (e.g., security/audit people) who only access NetWorker via NMC – and good as a baseline report for auditing teams. Remember that NMC will retain records for reporting for a user defined period of time, via the Reports | Reports | Data Retention setting:

Reports Data Retention Menu

Data Retention Setting

The actual report you’ll usually want to run for recovery details is the ‘Recover Details’ report:

NMC Recover Details Report

The other way you can go about it is to use the command nsrreccomp, which retrieves details about completed recoveries from the NetWorker jobs database. Now, the jobs database is typically pruned every 72 hours in a default install (you can change how long the jobs database is retained). Getting a list of completed recoveries (successful or otherwise) is as simple as running the command nsrreccomp -L:

[root@orilla tmp]# nsrreccomp -L
name, start time, job id, completion status
recover, Tue Dec 19 10:28:25 2017(1513639705), 838396, succeeded
DNS_Serv_Rec_20171219, Tue Dec 19 10:39:19 2017(1513640359), 838404, failed
DNS_Server_Rec_20171219_2, Tue Dec 19 10:41:48 2017(1513640508), 838410, failed
DNS_Server_Rec_20171219_3, Tue Dec 19 10:43:55 2017(1513640635), 838418, failed
DNS_Server_Rec_20171219, Tue Dec 19 10:47:43 2017(1513640863), 838424, succeeded

You can see in the above list that I made a few recovery attempts that generated failures (deliberately picking a standalone ESX server that didn’t have a vProxy to service it as a recovery target, then twice forgetting to change my recovery destination) so that we could see the list includes successful and failed jobs.
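Since that output is plain, comma-separated text, it’s also easy to filter from the shell – for instance, based on the format shown above, a quick way to list just the failed recoveries:

nsrreccomp -L | grep 'failed$'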

You’ll note the second last field in each line of output is actually the Job ID associated with the recovery. You can actually use this with nsrreccomp to retrieve the full output of an individual job, viz.:

[root@orilla tmp]# nsrreccomp -R 838396
 Recovering 1 file from /tmp/ into /tmp/recovery
 Volumes needed (all on-line):
 Backup.01 at Backup_01
 Total estimated disk space needed for recover is 8 KB.
 Requesting 1 file(s), this may take a while...
 Recover start time: Tue 19 Dec 2017 10:28:25 AEDT
 Requesting 1 recover session(s) from server.
 129290:recover: Successfully established direct file retrieve session for save-set ID '1345653443' with adv_file volume 'Backup.01'.
 ./tostage.txt
 Received 1 file(s) from NSR server `orilla'
 Recover completion time: Tue 19 Dec 2017 10:28:25 AEDT

The same full-output retrieval can be used for any recovery performed. For a virtual machine recovery it’ll be quite verbose – a snippet is as follows:

[root@orilla tmp]# nsrreccomp -R 838424
Initiating virtual machine 'krell_20171219' recovery on vCenter caprica.turbamentis.int using vProxy langara.turbamentis.int.
152791:nsrvproxy_recover: vProxy Log Begins ===============================================
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] Logging to '/opt/emc/vproxy/runtime/logs/vrecoverd/RecoverVM-5e9719e5-61bf-4e56-9b68-b4931e2af5b2.log' on host 'langara'.
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] Release: '2.1.0-17_1', Build number: '1', Build date: '2017-06-21T21:02:28Z'
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] Changing log level from INFO to TRACE.
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 INFO: [@(#) Build number: 67] Created RecoverVM session for "krell_20171219", logging to "/opt/emc/vproxy/runtime/logs/vrecoverd/RecoverVM-5e9719e5-61bf-4e56-9b68-b4931e2af5b2.log"...
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] Starting restore of VM "krell_20171219". Logging at level "TRACE" ...
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] RecoverVmSessions supplied by client:
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] {
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] "Config": {
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] "SessionId": "5e9719e5-61bf-4e56-9b68-b4931e2af5b2",
159373:nsrvproxy_recover: vProxy Log: 2017/12/19 10:47:19 NOTICE: [@(#) Build number: 67] "LogTag": "@(#) Build number: 67",

Now, if you’d like more than 72 hours retention in your jobs database, you can set it to a longer period of time (though don’t get too excessive – you’re better off capturing jobs database details periodically and saving to an alternate location than trying to get the NetWorker server to retain a huge jobs database) via the server properties. This can be done using nsradmin or NMC – NMC shown below for reference:

NMC Server Properties

NMC Server Properties JobsDB Retention
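If you’d rather stay on the command line, the same change can be made with nsradmin against the NSR server resource. A sketch follows – note that the exact attribute name (“jobsdb retention in hours”) is from my memory of recent NetWorker releases and may vary between versions, so print the resource first to confirm it on your server:

# Attribute name below is assumed from recent NetWorker releases - confirm with a 'print' of the NSR resource first
nsradmin -s orilla
nsradmin> . type: NSR
nsradmin> show name; jobsdb retention in hours
nsradmin> print
nsradmin> update jobsdb retention in hours: 168
Update? y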

There’s not much else to grabbing recovery results out of NetWorker – it’s certainly useful to know of both the NMC and command line options available to you. (Of course, if you want maximum flexibility on recovery reporting, you should definitely check out Data Protection Advisor (DPA), which is available automatically under any of the Data Protection Suite licensing models.)

 

Cloud Snapshot Manager

Dec 15 2017
 

Recently at Amazon re:Invent, DellEMC announced Cloud Snapshot Manager (CSM), the first iteration of an enterprise grade data protection control portal for native cloud snapshots.

What’s important about CSM is that it’s not trying to re-invent the wheel – it’s not about hacking a new snapshot protocol into third party clouds like Amazon, but about leveraging the existing tools. How is this different from a roll-your-own snapshot script, then? Well, it all comes down to how you want to manage your snapshots – hence the name. CSM is about giving you a true, enterprise grade data protection overlay to the native snapshot functionality, letting you set policies around snapshot frequencies, retention times, and instance selection.

Cloud Snapshot Manager

Now, you might think it’s odd that I’m talking about snapshots – why wouldn’t we be trying to back this up? Well, just because we move into the public cloud it doesn’t mean we abandon a data protection continuum. Remember the continuum from my previous post:

Backup and recovery in public cloud serves a purpose, but just like for on-premises systems, you’ll sometimes want faster recoverability options than a traditional backup will give you; or (and this is particular to PaaS within the cloud) you’ll want to be able to provide protection for something that you can’t necessarily deploy a traditional backup option against. Hence, just as they are within the datacentre, snapshots can form an important part of a holistic data protection strategy in the public cloud.

CSM lets you define snapshot policies, perform ad-hoc snapshots, and perform snapshot recoveries, all from within an easy-to-use web portal hosted within the DellEMC cloud. Instances can be protected by selection or tagging – and most importantly, CSM regularly rediscovers your environment, so moving protected systems between layers of snapshot policies can be as simple as changing instance tags.
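As a trivial illustration of the tagging approach – and this is purely a sketch, since the tag key and value below are hypothetical and whatever tags CSM matches on are defined by your own policies – moving an instance into a different protection policy can be as simple as:

# Hypothetical tag key/value and instance ID - CSM matches whatever tags your policies are configured to look for
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=snapshot-policy,Value=gold-daily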

CSM was in interactive development for some time, with a lot of customers making use of it. It brings two distinct benefits – enterprise management, of course, but also policy-based removal of snapshots as they expire. This can pay for itself by eliminating the risk of bill-bloat as snapshots get taken but not cleaned up. By managing the snapshots from a central location, you keep a tighter control over the bill that comes from the storage associated with those snapshots.

If your business is using public cloud, Cloud Snapshot Manager is something you, as a data protection administrator, really should be looking into. And the cool thing about that is you can do it with a 30 day free trial. Because there’s nothing to install, it’s as simple as signing up to a portal to get started. Currently CSM works with Amazon Web Services – if you’re interested, you should check out the official page on CSM for more details.

 

The rise of the hyperconverged administrator

Dec 14 2017
 

Hyperconverged is a hot topic in enterprise IT these days, and deservedly so. The normal approach to enterprise IT infrastructure, after all, is a bit like assembling a jigsaw puzzle over a few years. Over successive years it might be time to refresh storage, then compute, then backup, in any combination or order. So you’re piecing together the puzzle over an extended period of time, and wanting to make sure that you end up in a situation where all the pieces – sometimes purchased up to five years apart – will work well together. That’s before you start to deal with operating system and application compatibility considerations, too.


Unless your business is actually in infrastructure service delivery, none of this is of value to the business. The value starts when people can start delivering new business services on top of infrastructure – not building the infrastructure itself. I’m going to (perhaps controversially to some IT people) suggest that if your IT floor or office still has signs scattered around the place along the lines of “Unix Admin Team”, “Windows Admin Team”, “Storage Admin Team”, “Network Admin Team”, “Security Team”, “VMware Admin Team”, “Backup Admin Team”, etc., then your IT is premised on an architectural jigsaw approach that will continue to lose relevance to businesses who are seeking true, cloud-like agility within their environment. Building a new system, or delivering a new service, in that silo-based environment is a multi-step process where someone creates a service ticket, which goes for approval, then once approved, goes to the network team to allocate an IP address, before going to the storage team to confirm sufficient storage, then moving to the VMware team to allocate a virtual machine, before that’s passed over to the Unix or Windows team to prepare the system, then the ticket gets migrated to the backup team and … five weeks have passed. If you’re lucky.

Ultimately, a perception of “it’s cheaper” is only part of the reason why businesses look enviously at cloud – it’s also the promise of agility, which is what the business needs more than anything else from IT.

If a business is either investigating, planning or implementing hyperconverged, it’s doing it for efficiency and to achieve cloud-like agility. At the point that hyperconverged lands, those signs in the office have to come down, and the teams have to work together. Administration stops being a single field of expertise and becomes an entire-stack vertical operation.

Let’s look at that (just) from the perspective of data protection. If we think of a typical data protection continuum (excluding data lifecycle management’s archive part), it might look as simple as the following:

Now I’ve highlighted those functions in different colours, for the simple fact that while they’re all data protection, in a classic IT environment, they’re rather differentiated from one another, viz.:

Primary vs Protection Storage

The first three – continuous availability, replication and snapshots – all, for the most part in traditional infrastructure IT shops, fit into the realm of primary storage functions. On the other hand, operational backup and recovery, as well as long term retention, both fit into models around protection storage (regardless of what your protection storage is). So, as a result, you’ll typically end up with the following administrative separation:

Storage Admin vs Backup Admin

And that’s where the traditional IT model has sat for a long time – storage admins, and backup admins. Both working on different aspects of the same root discipline – data protection. That’s where you might find, in those traditionally silo’d IT shops, a team of storage administrators and a team of backup administrators. Colleagues to be sure, though not part of the same team.

But a funny thing happens when hyperconverged enters the picture. Hyperconverged merges storage and compute, and to a degree, networking, all into the same bucket of resources. This creates a different operational model which the old silo’d approach just can’t handle:

Hyperconverged Administration

Once hyperconverged is in the picture, the traditional delineation between storage administration and backup administration disappears. Storage becomes just one part of the integrated stack. Protection storage might share the hyperconverged environment (depending on the redundancy therein, or the operational budget), or it might be separate. Separate or not though, data protection by necessity is going to become a lot more closely intertwined.

In fact, in a hyperconverged environment, it’s likely to be way more than just storage and backup administration that becomes intertwined – it really does go all the way up the stack. So much so that I can honestly say every environment where I’ve seen hyperconverged deployed and then fail is one where the traditional silos have remained in place within IT, with processes and intra-group communications effectively handled by service/trouble tickets.

So here’s my tip as we reach the end of the 2017 tunnel and now see 2018 barrelling towards us: if you’re a backup administrator and hyperconverged is coming into your environment, this is your perfect opportunity to make the jump into being a full spectrum data protection administrator (and expert). (This is something I cover in a bit more detail in my book.) Within hyperconverged, you’ll need to understand the entire data protection stack, and this is a great opportunity to grow. It’s also a fantastic opportunity to jump out of pure operational administration and start looking at the bigger picture of service availability, too, since the automation and service catalogue approach to hyperconverged is going to let you free yourself substantially from the more mundane aspects of backup administration.

“Hyperconverged administrator” – it’s something you’re going to hear a lot more of in 2018 and beyond.

 

NetWorker 2017 Usage Survey

Dec 01 2017
 

It seems like only a few weeks ago, 2017 was starting. But here we are again, and it’s time for another NetWorker usage survey. If you’re a recent blog subscriber, you may not have seen previous surveys, so here’s how it works:

Every year a survey is run on the NetWorker blog to capture data on how businesses are using NetWorker within their environment. As per previous years, the survey runs from December 1 to January 31. At the end of the survey, I analyse the data, crunch the numbers, sacrifice a tape to the data protection deities and generate a report about how NetWorker is being used in the community.

My goal isn’t just for the report to be a simple regurgitation of the data input by respondents. It’s good to understand the patterns that emerge, too. Is deduplication more heavily used in the Americas, or APJ? Who keeps data for the longest? Is there any correlation between the longevity of NetWorker use and the number of systems being protected? You can see last year’s survey results here.


To that end, it’s time to run the 2017 NetWorker survey. Once again, I’m also going to give away a copy of my latest book, Data Protection: Ensuring Data Availability. All you have to do in order to be in the running is to be sure to include your email address in the survey. Your email address will only be used to contact you if you win.

The survey should hopefully only take you 5-10 minutes.

The survey has now closed. Results will be published mid-late February 2018.
