Feb 07 2018

The world is changing, and data protection is changing with it. (OK, that sounds like an ad for Veridian Dynamics, but I promise I’m serious.)

One of the areas in which data protection is changing is that backup environments are growing in terms of deployments. It’s quite common these days to see multiple backup servers deployed in an environment – whether that’s due to acquisitions, required functionality, network topology or physical locations, the reason doesn’t really matter. What does matter is that as you increase the number of systems providing protection within an environment, you still want to be able to manage and monitor those systems centrally.

Data Protection Central (DPC) was released earlier this month, and it’s designed from the ground up as a modern, HTML5 web-based system to allow you to monitor your Avamar, NetWorker and Data Domain environments, providing health and capacity reporting on systems and backup. (It also builds on the Multi Systems Manager for Avamar to allow you to perform administrative functions within Avamar without leaving the DPC console – and, well, more is to come on that front over time.)

I’ve been excited about DPC for some time. You may remember a recent post of mine talking about Data Domain Management Center (DDMC); DPC isn’t (at the moment at least) a replacement for DDMC, but it’s built in the same spirit of letting administrators have easy visibility over their entire backup and recovery environment.

So, what’s involved?

Well, let’s start with the price. DPC is $0 for NetWorker and Avamar customers. That’s a pretty good price, right? (If you’re looking for the product page on the support website by the way, it’s here.)

You can deploy it in one of two ways: if you’ve got a SLES server deployed within your environment that meets the requirements, you can download a .bin installer to drop DPC onto that system. The other way – and quite a simple way, really – is to download a VMware OVA file to allow you to easily deploy it within your virtual infrastructure. (Remember, one of the ongoing themes of DellEMC Data Protection is to allow easy virtual deployment wherever possible.)

So yesterday I downloaded the OVA file and today I did a deployment. From start to finish, including gathering screenshots of its operation, that deployment, configuration and use took me about an hour or so.

When you deploy the OVA file, you’ll get prompted for configuration details so that there’s no post-deployment configuration you have to muck around with:

Deploying DPC as an OVA – Part 1

At this point in the deployment, I’ve already selected where the virtual machine will deploy, and what the disk format is. (If you are deploying into a production environment with a number of systems to manage, you’ll likely want to follow the recommendations for thick provisioning. I chose thin, since I was deploying it into my lab.)

You fill in standard networking properties – IP address, gateway, DNS, etc. Additionally, per the screen shot below, you can also immediately attach DPC into your AD/LDAP environment for enterprise authentication:

DPC Deployment, LDAP

I get into enough trouble at home over IT complexity as it is, so I don’t run LDAP (any more); there wasn’t anything else for me to do there.

The deployment is quite quick, and after you’re done, you’re ready to power on the virtual machine.

DPC Deployment, ready to power on

In fact, one of the things you’ll want to be aware of is that the initial power on and configuration is remarkably quick. (After power-on, the system was ready to let me log on within 5 minutes or so.)

It’s an HTML5 interface – that means there’s no Java Web Start or anything like that; you simply point your web browser at the FQDN or IP address of the DPC server, and you’ll get to log in and access the system. (The documentation also includes details for changing the SSL certificate.)

DPC Login Screen

DPC follows Dell’s interface guidelines, so it’s quite a crisp, easy-to-navigate interface. The documentation includes details of your initial login ID and password, and of course, following security best practices, you’re prompted to change that default password on first login:

DPC Changing the Default Password

After you’ve logged in, you get to see the initial, default dashboard for DPC:

DPC First Login

Of course, at this point, it looks a wee bit blank. That makes sense – we haven’t added any systems to the environment yet. But that’s easily fixed, by going to System Management in the left-hand column.

DPC System Management

System management is quite straightforward – the icons directly under “Systems” and “Groups” are for add, edit and delete, respectively. (Delete simply removes a system from DPC; it doesn’t un-deploy the system, of course.)

When you click the add button, you’re prompted for the type of system you want to add into DPC. (Make sure you check out the version requirements in the documentation, available on the support page.) Adding systems is a very straightforward operation, as well. For instance, for Data Domain:

DPC Adding a Data Domain

Adding an Avamar server is likewise quite simple:

DPC Adding an Avamar Server

And finally, adding a NetWorker server:

DPC Adding a NetWorker Server

Now, you’ll notice here, DPC prompts you that there’s some added configuration to do on the NetWorker server; it’s about configuring the NetWorker rabbitmq system to be able to communicate with DPC. For now, that’s a manual process. After following the instructions in the documentation, I also added the following to my /etc/rc.d/rc.local file on my Linux-based NetWorker/NMC server to ensure it happened on every reboot, too:

/bin/cat <<EOF | /opt/nsr/nsrmq/bin/nsrmqctl
monitor andoria.turbamentis.int
quit
EOF
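If your NetWorker server uses systemd rather than rc.local, the same persistence could be sketched as a one-shot unit instead. This is purely a hypothetical example – the hostname is my lab server, and the networker.service dependency is an assumption you’d want to verify against your own distribution:

```ini
# Hypothetical: /etc/systemd/system/dpc-nsrmq-monitor.service
[Unit]
Description=Register DPC monitoring with the NetWorker nsrmq service
# Assumed dependency name - check what your distro calls the NetWorker unit
After=network-online.target networker.service

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'printf "monitor andoria.turbamentis.int\nquit\n" | /opt/nsr/nsrmq/bin/nsrmqctl'

[Install]
WantedBy=multi-user.target
```

You’d then enable it with `systemctl enable dpc-nsrmq-monitor.service` so it runs on each boot.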

It’s not just NetWorker, Avamar and Data Domain you can add – check out the list here:

DPC Systems you can add

Once I added all my systems, I went over to look at the Activities > Audit pane, which showed me:

DPC Activity Audit

Look at those times there – it took me all of 8 minutes to change the password on first login, then add 3 Data Domains, an Avamar Server and a NetWorker server to DPC. DPC has been excellently designed to enable rapid deployment and time to readiness. And guess how many times I’d used DPC before? None.

Once systems have been added to DPC and it’s had time to poll the various servers you’re monitoring, you start getting the dashboards populated. For instance, shortly after their addition, my lab DDVE systems were getting capacity reporting:

DPC Capacity Reporting (DD)

You can drill into capacity reporting by clicking on the capacity report dashboard element to get a tabular view covering Data Domain and Avamar systems:

DPC Detailed Capacity Reporting

On that detailed capacity view, you see basic capacity details for the Data Domains, and as you can see down the right-hand side, details of each MTree on the Data Domain as well. (My Avamar server is reported there, too.)

Under Health, you’ll see a quick view of all the systems you have configured and DPC’s assessment of their current status:

DPC System Health

In this case, I had two systems reported as unhealthy – one of my DDVEs had an email configuration problem I lazily had not gotten around to fixing, and likewise, my NetWorker server had a licensing error I hadn’t bothered to investigate and fix. Shamed by DPC, I jumped onto both and fixed them, pronto! That meant when I went back to the dashboards, I got an all clear for system health:

DPC Detailed Dashboard

I wanted to correct those 0’s, so I fired off a backup in NetWorker, which resulted in DPC updating pretty damn quickly to show something was happening:

DPC Detailed Dashboard, Backup Running

Likewise, when the backup completed and cloning started, the dashboard was updated quite promptly:

DPC Detailed Dashboard, Clone Running

You can also see details of what’s been going on via the Activities > System view:

DPC Activities – Systems

Then, with a couple of backup and clone jobs run, the Detailed Dashboard was updated a little more:

DPC, Detailed Dashboard More Use

Now, I mentioned before that DPC takes on some Multi Systems Manager functionality for Avamar, viz.:

DPC, Avamar Systems Management

So that’s back in the Systems Management view. Clicking the horizontal ‘…’ item next to a system lets you launch the individual system management interface, or in the case of Avamar, also manage policy configuration.

DPC, Avamar Policy View

In that policy view, you can create new policies, initiate jobs, and edit existing configuration details – all without having to go into the traditional Avamar interface:

DPC, Avamar Schedule Configuration

DPC, Avamar Retention Configuration

DPC, Avamar Policy Editing

That’s pretty much all I’ve got to say about DPC at this point in time – other than to highlight the groups function in System Management. By defining groups of resources (however you like), you can then filter dashboard views not only for individual systems, but for groups too, allowing quick and easy review of very specific hosts:

DPC System Management – Groups

In my configuration I’ve grouped systems by whether they’re associated with an Avamar backup environment or a NetWorker backup environment, but you can configure groups however you need. Maybe you have services broken up by state, or country, or maybe you have them distributed by customer or the service you’re providing. Regardless of how you’d like to group them, you can filter through to them in DPC dashboards easily.

So there you go – that’s DPC v1.0.1. It’s honestly taken me more time to get this blog article written than it took me to deploy and configure DPC.

Note: Things I didn’t show in this article:

  • Search and Recovery – That’s where you’d add a DP-Search system (I don’t have DP-Search deployed in my lab)
  • Reports – That’s where you’d add a DPA server, which I don’t have deployed in my lab either.

Search and Recovery lets you springboard into the awesome DP-Search web interface, and Reports will drill into DPA and extract the most popular reports people tend to access in DPA, all within DPC.

I’m excited about DPC and the potential it holds over time. And if you’ve got an environment with multiple backup servers and Data Domains, you’ll get value out of it very quickly.

Mar 27 2017

I’d like to take a little while to talk to you about licensing. I know it’s not normally considered an exciting subject (usually at best people think of it as a necessary-evil subject), but I think it’s common to see businesses not take full advantage of the potential data protection licensing available to them from Dell EMC. Put it this way: I think if you take the time to read this post about licensing, you’ll come away with some thoughts on how you might be able to expand a backup system to a full data protection system just thanks to some very handy licensing options available.

When I first started using NetWorker, the only licensing model was what I’d refer to as feature based licensing. If you wanted to do X, you bought a license that specifically enabled NetWorker to do X. The sorts of licenses you would use included:

  • NetWorker Base Enabler – To enable the actual base server itself
  • OS enablers – Called “ClientPack” enablers, these would let you backup operating systems other than the operating system of the NetWorker server itself (ClientPack for Windows, ClientPack for Unix, ClientPack for Linux, etc).
  • Client Count enablers – Increasing the number of clients you can backup
  • Module enablers – Allowing you to say, backup Oracle, or SQL, or Exchange, etc.
  • Autochanger enablers – Allowing you to connect autochangers of a particular slot count (long term NetWorker users will remember short-slotting too…)

That’s a small excerpt of the types of licences you might have deployed. Over time, some licenses got simplified or even removed – the requirement for ClientPack enablers, for instance, was dropped quite some time ago, and the database licenses were simplified by being condensed into licenses for Microsoft databases (NMM) and licenses for databases and applications (NMDA).

Feature based licensing is, well, confusing. I’d go so far as to suggest it’s anachronistic. As a long-term NetWorker user, I occasionally get asked what a feature based licensing set might look like, or what might be required to achieve X, and even for me, having dealt with feature based licenses for 20 years, it’s not fun.

The problem – and it’s actually a serious one – with feature based licensing is you typically remain locked, for whatever your minimum budget cycle is, into what your backup functionality is. Every new database, set of clients, backup device or special requirement has to be planned well in advance to make sure you have the licenses you need. How often is that really the case? I’m into my 21st year of working with backup and I still regularly hear stories of new systems or projects coming on-line without full consideration of the data protection requirements.

In this modern age of datacentre infrastructure where the absolute requirement is agility, using feature-based licensing is like trying to run on a treadmill that’s submerged waist-deep in golden syrup.

There was, actually, one other type of NetWorker licensing back then – in the ‘old days’, I guess I can say: an Enterprise license. That enabled everything in one go, but required yearly audits to ascertain usage and appropriate maintenance costs, etc. It enabled convenient use but from a price perspective it only suited upper-echelon businesses.

Over time to assist with providing licensing agility, NetWorker got a second license type – capacity licensing. This borrowed the “unlimited features” aspect of enterprise-based licensing, and worked on the basis of what we refer to as FETB – Front End TB. The simple summary of FETB is “if you did a full backup of everything you’re protecting, how big would it be?” (In fact, various white-space components are typically stripped out – a 100 GB virtual machine for instance that’s thickly provisioned but only using 25GB would effectively be considered to contribute just 25 GB to the capacity.)

The beauty of the capacity license scheme is that it doesn’t matter how many copies you generate of your data. (An imaginary BETB (“Back End TB”) license would be unpleasant in the extreme – limiting you to the total stored capacity of your backups.) So that FETB license applies regardless of whether you just keep all your backups for 30 days, or whether you keep all your backups for 7 years. (If you keep all your backups for 7 years, read this.)
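As a rough illustration of that difference, here’s a back-of-the-envelope sketch in shell – the capacities are made-up lab numbers, not anything from a real licensing tool:

```shell
#!/bin/sh
# Hypothetical FETB vs BETB comparison. Assumes we know the *used*
# capacity of each protected client in GB (a 100 GB thin-provisioned
# VM using 25 GB contributes just 25 GB to the figure).
used_gb="25 400 1200"

# FETB: one full backup of everything being protected.
fetb_gb=0
for gb in $used_gb; do
  fetb_gb=$((fetb_gb + gb))
done

# An imaginary BETB scheme would count every retained copy -
# e.g. 30 daily fulls kept for a month.
betb_gb=$((fetb_gb * 30))

echo "FETB: ${fetb_gb} GB (what capacity licensing measures)"
echo "BETB: ${betb_gb} GB (what it thankfully does not)"
```

The point being: whether you keep those backups for 30 days or 7 years, only the front-end figure matters to the license.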

A FETB license lets you adjust your backup functionality as the business changes around you. Someone deploys Oracle but you’ve only had to backup SQL Server before? Easy – just install NMDA and start backing Oracle up. The business makes the strategic decision to switch from Hyper-V to VMware? No problem – there’s nothing to change from a licensing perspective.

But, as I say in my book, backup and recovery, as a standalone topic is dead. That’s why Dell EMC has licensing around Data Protection Suite. In fact, there’s a few different options to suit different tiers of organisations. If you’ve not heard of Data Protection Suite licensing, you’ve quite possibly been missing out on a wealth of opportunities for your organisation.

Let’s start with the first variant that was introduced, Data Protection Suite for Backup. (In fact, it was originally just Data Protection Suite.) DPS for Backup has been expanded as other products have been released, and now includes:

DPS for Backup

Think about that – from a single wrapper license (DPS for Backup), you get access to 6 products. Remember before when I said the advantage of NetWorker capacity licensing over ‘feature’ licensing was the ability to adapt to changes in the business requirements for backup? This sort of license expands on that ability even more so. You might start today using NetWorker to protect your environment, but in a year’s time your business needs to setup some remote offices that are best served by Avamar. With DPS for Backup, you don’t need to go and buy Avamar licenses, you just deploy Avamar. Equally, the strategic decision might be made to give DBAs full control over their backup processes, so it makes sense to give them access to shared protection storage via Data Domain Boost for Enterprise Applications (DDBEA), instead of needing to be configured for manual backups in NetWorker. The business could decide to start pushing some long term backups from NetWorker out to Cloud object storage – that’s easy, just deploy a CloudBoost virtual machine because you can. You can mix and match your licenses as you need. Just as importantly, you can deploy Data Protection Advisor at the business layer to provide centralised reporting and monitoring across the entire gamut, and you can take advantage of Data Protection Search to easily find content regardless of whether it was NetWorker or Avamar that protected it.

Data Protection Suite for Backup is licensed – like the NetWorker Capacity model – via FETB. So if you license for say, 500 TB, you can slice and dice that however you need between NetWorker, Avamar and DDBEA, and get CloudBoost, DPA and DP-Search rolled in. Suddenly your backup solution is a much broader data protection solution, just thanks to a license model!

If you’re not an existing NetWorker or Avamar site, but you’re looking for some increased efficiencies in your application backups/backup storage, or a reduction in the capacity licensing for another product, you might instead be interested in DPS for Applications:

DPS for Applications

Like DPS for Backup, DPS for Applications is a FETB capacity license. You get to deploy Boost for Enterprise Apps and/or ProtectPoint to suit your requirements, you get Data Protection Advisor to report on your protection status, and you also get the option to deploy Enterprise Copy Data Management (eCDM). That lets you set policies on application protection – e.g., “There must always be 15 copies of this database”. The application administration team can remain in charge of backups, but to assuage business requirements, policies can be established to ensure systems are still adequately protected. And ProtectPoint: whoa, we’re talking serious speed there. Imagine backing up a 10TB or 50TB database, not 20% faster, but 20 times faster. That’s ProtectPoint – Storage Integrated Data Protection.

Let’s say you’re an ultra-virtualised business. There are few, if any, physical systems left, and you don’t want to think of your data protection licensing in terms of FETB, which might be quite variable – instead, you want to look at a socket-based licensing count. If that’s the case, you probably want to look at Data Protection Suite for Virtual Machines:

DPS for Virtual Machines

DPS for Virtual Machines is targeted for the small to medium end of town to meet their data protection requirements in a richly functional way. On a per socket (not per-core) license model, you get to protect your virtual infrastructure (and, if you need to, a few physical servers) with Avamar, using image based and agent-based backups in whatever mix is required. You also get RecoverPoint for Virtual Machines. RecoverPoint gives you DVR-like Continuous Data Protection that’s completely storage independent, since it operates at the hypervisor layer. Via an advanced journalling system, you get to deliver very tight SLAs back to the business with RTOs and RPOs in the seconds or minutes, something that’s almost impossible with just standard backup. (You can literally choose to roll back virtual machines on an IO-by-IO basis. Or spin up testing/DR copies using the same criteria.) You also get DPA and DP-Search, too.

There’s a Data Protection Suite for archive bundle as well if your requirements are purely archiving based. I’m going to skip that for the moment so I can talk about the final licensing bundle that gives you unparalleled flexibility for establishing a full data protection strategy for your business; that’s Data Protection Suite for Enterprise:

DPS for Enterprise

Data Protection Suite for Enterprise returns to the FETB model but it gives you ultimate flexibility. On top of it all you again get Data Protection Advisor and Data Protection Search, but then you get a raft of data protection and archive functionality, all again in a single bundled consumption model: NetWorker, Avamar, DDBEA, CloudBoost, RecoverPoint for Virtual Machines, ProtectPoint, AppSync, eCDM, and all the flavours of SourceOne. In terms of flexibility, you couldn’t ask for more.

It’s easy when we work in backup to think only in terms of the main backup product we’re using, but two things have become urgently apparent:

  • It’s no longer just about backup – To stay relevant, and to deliver value and results back to the business, we need to be thinking about data protection strategies rather than backup and recovery strategies. (If you want proof of that change from my perspective, think of my first book title vs the second – the first was “Enterprise Systems Backup and Recovery”; the second, “Data Protection”.)
  • We need to be more agile than “next budget cycle” – Saying you can’t do anything to protect a newly emerged or altering workload until you get budget next year to do it is just a recipe for disaster. We need, as data protection professionals, to be able to pick the appropriate tool for each workload and get it operational now, not next month or next year.

Licensing: it may at the outset appear to be a boring topic, but I think it’s actually pretty damn exciting in terms of what a flexible licensing policy like the Data Protection Suite allows you to offer back to your business. I hope you do too, now.

Hey, you’ve made it this far, thanks! I’d love it if you bought my book, too! (In Kindle format as well as paperback.)


The designing of backup environments

Feb 07 2012

The cockatrice was a legendary beast – a two-legged dragon with the head of a rooster – that could, amongst other things, turn people to stone with a glance. It was somewhat akin to a basilisk, but a whole lot uglier, and looked like it had been designed by a committee.

You may be surprised to know that there are cockatrice backup environments out there. Such an environment can be just as ugly as the mythical cockatrice, and just as dangerous, turning even a hardened backup expert to stone as he or she tries to sort through the “what-abouts?”, the “where-ares?” and the “who-does?”

These environments are typically quite organic, and have grown and developed over years, usually with multiple staff having been involved and/or responsible, but no one staff member having had sufficient ownership (or longevity) to establish a single unifying factor within the environment. That in itself would be challenging enough, but to really make the backup environment a cockatrice, there’ll also be a lack of documentation.

In such environments, it’s quite possible that the environment is largely acting like a backup system, but through a combination of sheer luck and a certain level of procedural adherence, typically by operators who have remained in the environment for long enough. These are the systems for which, when the question “But why do you do X?” is asked, the answer is simply, “Because we’ve always done X.”

In this sort of system, new technologies have typically just been tacked on, sometimes shoe-horned into “pretending” they work just as the old systems did, and sometimes not used at their peak efficiency because of the general reluctance to change that such systems engender. (A classic example can be seen where a deduplication system is tacked onto an existing backup environment, but is treated like a standard VTL or a standard backup-to-disk region, without any consideration for the particularities involved in using deduplication storage.)

The good news is, these environments can be fixed, and turned into true backup systems. To do so, four decisions need to be made:

  1. To embrace change. The first essential step is to eliminate the “it’s always been done this way before” mentality. This doesn’t allow for progress, or change, at all, and if there’s one common factor in any successful business, it’s the ability to change. This is not just representative of the business itself, but for each component of the business – and that includes backup.
  2. To assign ownership. A backup system requires both a technical owner and a management owner. Ideally, the technical owner will be the Data Protection Advocate for the company or business group, and the management owner will be both an individual, and the Information Protection Advisory Council. (See here.)
  3. To document. The first step to pulling order out of chaos (or even general disarray and disconnectedness) is to start documenting the environment. “Document! Document! Document!”, you might hear me cry as I write this line – and you wouldn’t be too far wrong. Document the system configuration. Document the rebuild process. Document the backup and recovery processes. Sometimes this documentation will be reference to external materials, but a good chunk of it will be material that your staff have to develop themselves.
  4. To plan. Organic growth is fine. Uncontrolled organic or haphazard growth is not. You need to develop a plan for the backup environment. This will be possible once the above aspects have been tackled, but two key parts to that plan should be:
    • How long will the system, in its current form, continue to service our requirements?
    • What are some technologies we should be starting to evaluate now, or at least stay abreast of, for consideration when the system has to be updated?

With those four decisions made, and implemented, the environment can be transfigured from a hodge-podge of technologies with no real unifying principle other than conformity to prior usage patterns into a collection of synergistic tools working seamlessly to optimise the data backup and recovery operations of the company.

Check in – New Year’s Resolutions

Jan 31 2012

Resolutions Check-in

In December last year I posted “7 new years backup resolutions for companies”. Since it’s the end of January 2012, I thought I’d check in on those resolutions and suggest where a company should be up to on them, as well as offering some next steps.

  1. Testing – The first resolution related to ensuring backups are tested. By now at least an informal testing plan should be in place if none were before. The next step will be to deal with some of the aspects below so as to allow a group to own the duty of generating an official data protection test plan, and then formalise that plan.
  2. Duplication – There should be documented details of what is and what isn’t duplicated within the backup environment. Are only production systems duplicated? Are only production Tier 1 systems duplicated? The first step towards achieving satisfactory duplication/cloning of backups is to note the current level of protection and expand outwards from that. The next step will be to develop tier guidelines to allow a specification of what type of backup receives what level of duplication. If there are already service tiers in the environment, this can serve as a starting point, slotting existing architecture and capability onto those tiers. Where existing architecture is insufficient, it should be noted and budgets/plans should be developed next to deal with these short-falls.
  3. Documentation – As I mentioned before, the backup environment should be documented. Each team that is involved in the backup process should have assigned at least one individual to write documentation relating to their sections (e.g., Unix system administrators would write Unix backup and recovery guidelines, Windows system administrators would do the same for Windows, and so on). This should actually involve 3 people: the writer, the peer reviewer, and the manager or team leader who accepts the documentation as sufficiently complete. The next step after this will be to hand over documentation to the backup administrator(s), who will be responsible for collation, contribution of their sections, and periodic re-issuing of the documents for updates.
  4. Training – If staff (specifically administrators and operators) had previously not been trained in backup administration, a training programme should be in the works. The next step, of course, will be to arrange budget for that training.
  5. Implementing a zero error policy – First step in implementing a zero error policy is to build the requisite documents: an issues register, an exceptions register, and an escalations register. Next step will be to adjust the work schedules of the administrators involved to allow for additional time taken to resolve the ‘niggly’ backup problems that have been in the environment for some time as the switchover to a zero error policy is enacted.
  6. Appointing a Data Protection Advocate – The call should have gone out for personnel (particularly backup and/or system administrators) to nominate themselves for the role of DPA within the organisation, or if it is a multi-site organisation, one DPA per site. By now, the organisation should be in a position to decide who becomes the DPA for each site.
  7. Assembling an Information Protection Advisory Council (IPAC) – Getting the IPAC in place is a little more effort because it’s going to involve more groups. However, by now there should be formal recognition of the need for this council, and an informal council membership. The next step will be to have the first formal meeting of the council, where the structure of the group and the roles of the individuals within the group are formalised. Additionally, the IPAC may very well need to make the final decision on who is the DPA for each site, since that DPA will report to them on data protection activities.

It’s worth remembering at this point that while these tasks may seem arduous at first, they’re absolutely essential to a well running backup system that actually meshes with the needs of the business. In essence: the longer they’re put off, the more painful they’ll be.

How are you going?

Jan 27 2012

Continuing on from my post about dark data last week, I want to spend a little more time on data awareness classification and distribution within an enterprise environment.

Dark data isn’t the end of the story, and it’s time to introduce the entire family of data-awareness concepts. These are:

  • Data – This is both the core data managed and protected by IT, and all other data throughout the enterprise which is:
    • Known about – The business is aware of it;
    • Managed – This data falls under the purview of a team in terms of storage administration (ILM);
    • Protected – This data falls under the purview of a team in terms of backup and recovery (ILP).
  • Dark Data – To quote the previous article, “all those bits and pieces of data you’ve got floating around in your environment that aren’t fully accounted for”.
  • Grey Data – Grey data is previously discovered dark data for which no decision has been made as yet in relation to its management or protection. That is, it’s now known about, but has not been assigned any policy or tier in either ILM or ILP.
  • Utility Data – This is data subsequently classified out of the grey data state into a state where it is known to have value, but is neither managed nor protected, because it can be recreated. It may be decided that the cost (in time) of recreating the data is lower than the cost (both in literal dollars and in staff time) of managing and protecting it.
  • Noise – This isn’t really data at all, but rather all the “bits” (no pun intended) left over which are neither regular data, grey data nor utility data. In essence, this is irrelevant data, which someone or some group may be keeping for unnecessary reasons, and which should be considered eligible for either deletion, or archival and deletion.
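The states above, and the legal moves between them, amount to a small state machine: dark data must pass through the grey state before a final decision is made. This is an illustrative Python sketch of the scheme as described, not any particular tool's implementation:

```python
from enum import Enum, auto


class Awareness(Enum):
    DATA = auto()          # known, managed and protected
    DARK = auto()          # not yet discovered
    GREY = auto()          # discovered, no ILM/ILP decision yet
    UTILITY = auto()       # valued but recreatable; deliberately unmanaged
    NOISE = auto()         # irrelevant; eligible for deletion/archival


# Legal transitions implied by the classification scheme: discovery turns
# dark data grey, and grey data must be promptly classified onward.
TRANSITIONS = {
    Awareness.DARK: {Awareness.GREY},
    Awareness.GREY: {Awareness.DATA, Awareness.UTILITY, Awareness.NOISE},
    Awareness.DATA: set(),      # stays data (subject to normal ILM/ILP)
    Awareness.UTILITY: set(),
    Awareness.NOISE: set(),
}


def classify(current, target):
    """Return the new state if the move is legal, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move {current.name} directly to {target.name}")
    return target
```

Note that there's no arrow straight from dark to data: even obviously important data, once discovered, needs an explicit ILM/ILP decision before it counts as managed and protected.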

The distribution of data by awareness within the enterprise may resemble something along the following lines:

Data Awareness Percentage Distribution

That is, ideally the largest percentage of data should be regular data which is known, managed and protected. For most organisations, the next biggest share will be dark data – the data that hasn’t been discovered yet. Ideally, however, once regular and dark data are removed from the distribution, at most 20% of data should remain, broken up such that at least half of it is utility data, with the last 10% split evenly between grey data and noise.
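Those rules of thumb can be encoded and applied as a sanity check against any measured breakdown. The figures below are one illustrative split consistent with the discussion, not a prescription:

```python
# Illustrative target distribution; the exact regular/dark split will
# vary considerably by organisation.
TARGET = {
    "data": 60,       # known, managed, protected
    "dark": 20,       # undiscovered
    "utility": 10,    # recreatable, deliberately unmanaged
    "grey": 5,        # discovered, awaiting classification
    "noise": 5,       # eligible for deletion
}


def check_distribution(dist):
    """Apply the rules of thumb from the text to a percentage breakdown."""
    assert sum(dist.values()) == 100, "percentages must sum to 100"
    residual = 100 - dist["data"] - dist["dark"]
    assert residual <= 20, "non-regular, non-dark data should be at most 20%"
    assert dist["utility"] >= residual / 2, "utility should be at least half the residual"
    return True
```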

The logical implications of this layout should be reasonably straightforward:

  1. At all times the majority of data within an organisation should be known, managed and protected.
  2. It should be expected that at least 20% of the data within an organisation is undiscovered, or decentralised.
  3. Once data is discovered, it should exist in a ‘grey’ state for a very short period of time; ideally it should be reclassified as soon as possible into data, utility data or noise. In particular, data left in a grey state for an extended period of time represents just as dangerous a potential data loss situation as dark data.

It should be noted that regular data, even in this awareness classification scheme, will still be subject to regular data lifecycle decisions (archive, tiering, deletion, etc.) In that sense, primary data eligible for deletion isn’t really noise, because it’s previously been managed and protected; noise really is ex dark-data that will end up being deleted, either as an explicit decision, or due to a failure at some future point after the decision to classify it as ‘noise’, having never been managed or protected in a centralised, coordinated manner.

Equally, utility data doesn’t refer to, say, QA or test databases that replicate the content of production databases. These will again fall under the standard data umbrella, in that information lifecycle management and protection policies will have been established for them, regardless of what those policies actually are.

If we bring this back to roles, then it’s clear that a pivotal role of both the DPAs (Data Protection Advocates) and the IPAC (Information Protection Advisory Council) within an organisation should be the rapid coordination of classification of dark data as it is discovered into one of the data, utility data or noise states.

7 New Year’s Backup Resolutions for Companies

Dec 272011

New Year’s resolutions for backup

I’d like to suggest that companies be prepared to make (and keep!) seven New Year’s resolutions in the field of backup and recovery:

  1. We will test our backups: If you don’t have a testing regime in place, you don’t have a backup system at all.
  2. We will duplicate our backups: Your backup system should not be a single point of failure. If you’re not cloning, replicating or duplicating your backups in some form, your backup system could be the straw that breaks the camel’s back when a major issue occurs.
  3. We will document our backups: As with testing, if your backup environment is undocumented, it’s not a system. All you’ve got is a collection of backups from which, if the right people are around at the right time and in the right frame of mind, you might get a recovery. If you want a backup system in place, you not only have to test your backups, you also have to keep them well documented.
  4. We will train our administrators and operators: It never ceases to amaze me the number of companies that deploy enterprise backup software and then insist that administrators and operators just learn how to use it themselves. While the concept of backup is actually pretty simple (“hey, you, back it up or you’ll lose it!”), the practicality of it can be a little more complex, particularly given that as an environment grows in size, so does the scope and the complexity of a backup system. If you don’t have some form of training (whether it’s internal, by an existing employed expert, or external), you’re at the edge of the event horizon, peering over into the abyss.
  5. We will implement a zero error policy: Again, there’s no such thing as a backup system when there’s no zero error policy. No ifs, no buts, no maybes. If you don’t rigorously implement a zero error policy, you’re flipping a coin every time you do a recovery, regardless of what backup product you use. (To learn more about a zero error policy, check out the trial podcast I did where that was the topic.)
  6. We will appoint a Data Protection Advocate: There’s a lot of data “out there” within a company, not necessarily under central IT control. Someone needs to be thinking about it. That someone should be the Data Protection Advocate (DPA). This person should be tasked with being the somewhat annoying person who is present at every change control meeting, raising her or his hand and saying “But wait, how will this affect our ability to protect our data?” That person should also be someone who wanders around the office(s) looking under desks for those pesky departmental servers and “test” boxes that are deployed, the extra hard drives attached to research machines, etc. If you have multiple offices, you should have a DPA per office. (The role of the DPA is outlined in this post, “What don’t you backup?“)
  7. We will assemble an Information Protection Advisory Council (IPAC): Sitting at an equal tier to the change control board, and reporting directly to the CTO/CIO/CFO, the IPAC will liaise with the DPA(s) and the business to make sure that everyone is across the contingencies that are in place for data protection, and be the “go-to” point for the business when it comes to putting new functions in place. They should be the group that sees a request for a new system or service and collectively liaises with the business and IT to ensure that the information generated by that system/service is protected. (If you want to know more about an IPAC and its role in the business, check out “But where does the DPA fit in?“)

And there you have it – the New Year’s resolutions for your company. You may be surprised – while there’ll be a little effort getting these in place, once they’re there, you’ll find backup, recovery, and the entire information protection process a lot easier to manage, and a lot more reliable.

Aug 242011

In yesterday’s post, I suggested that it was time for businesses to recognise and set up a new role – the Data Protection Advocate (DPA). This would be the key person tasked within the organisation to think of data protection scenarios, potential gaps, etc., and be the advocate for ensuring that data generated by or on behalf of the company is protected.

However, a DPA by him or herself is probably not going to achieve much within an organisation, so the next step is to try to work out where the DPA fits within the organisational structure. For that, we need a diagram. And here’s one I prepared earlier:

Data Protection Advocate Org Chart

Assuming there are multiple backup administrators within an organisation, there will be fewer DPAs than there are administrators. So, nominally, backup administrators will in some way or another report through to the DPA.

The DPA would logically need to liaise with a large group of people within the organisation. At bare minimum, this would be:

  • Key users – These are the people in each business group who just “know” what is done. They’re the long-term people, the “go to” people within each department. They’re going to have a lot of intrinsic knowledge that the DPA should be regularly mining.
  • Function owners – Previously we’d have called these people the department heads, but functional ownership within businesses is shifting to be broader as traditional employee/management interaction continues to change, so “function owners” seems more appropriate.
  • IT Team Leaders – IT obviously represents a significant portion of the data iceberg within a company, and therefore the DPA should be liaising with each of the team leaders – including storage, virtualisation, networking, security, etc., as well as the traditional server teams.
  • HR/Finance – Smaller organisations traditionally see HR and Finance as a combined group. In larger organisations this will obviously not be the case. Regardless, both HR and Finance will have a very strong understanding of the types of data they need kept and protected. You could argue that this is no different from any other group, but HR and Finance data is usually at the core of the “business critical” data we protect, and thus deserve to be singled out.
  • Legal – Somewhere, someone has to have an understanding of the legal ramifications of (a) choosing not to protect some data or (b) how long data should be kept for. In larger organisations, IT people should be able to consult with someone from corporate legal to get a very clear and straight forward answer.

The DPA however does not work in isolation once the requirements have been gathered. This person will then coordinate with (and be a voting member of) the Information Protection Advisory Council. That will be a group of reasonably senior people within the organisation from across a spectrum including IT, Finance and traditional business functions, who are empowered to make decisions that affect the entire company in relation to data protection policies on behalf of the board. For want of a better term, this is the “policy team” for data and information protection. You’ll note that I’ve switched at this level from referring to Data Protection to referring to Information Protection. That’s quite deliberate. The DPA will be concerned with the minutiae of data within the organisation. The IPAC should be able to focus on the broader information view, instead.

Logically, this group will sit at an organisational level on par with the most senior Change Control Board. That board will, for the average organisation, report directly to the CIO.

So there you have it – a new role, and a new group.

Have you appointed a DPA yet? Have you started forming your IPAC yet? If not, get cracking!

Aug 232011

I think this is a question that the average company wholly fails to understand. You see, when it’s asked, people start thinking about their servers – “data X is backed up, data Y can be reconstructed, so we don’t back that up…”

At the end of this article though, I hope you’ll want to take a walk.

At this point, the average backup administrator is responsible for just the backups of servers and storage systems to which discrete agents can be connected. Yet this is woefully inadequate, and demonstrates a wholly inappropriate level of planning within a company: the person or people responsible for core data protection don’t get buy-in or oversight on all data protection.

What else is there within an environment? Well, quite a lot, potentially.

You’ve got the obvious things of course – end user desktops and laptops. Is there potential for local data storage on those machines? If there is, is that data protected?

You’ve got the slightly less obvious things – smart phones with critical business contacts, memos, etc., on them. Is that data routinely being synced? What is it being synced to? Is that synced data accessible if, say, the person leaves? Is that synced data backed up?

Moving right along past the “easy” questions, we’ve got the start of the really tricky questions – look at all the appliances within the organisation. No, I’m not talking about microwaves and toaster ovens in the kitchenettes on each floor. I’m talking about those boxes in racks that don’t have either a traditional operating system or an NDMP agent on them.

The network switches.

The fibre-channel switches.

The PABXs.

The encryption routers.

The encryption FC routers.

And so on.

All of these sorts of devices have configuration/state data on them. A month or so ago, I was talking to another third party consultant at a site, and that person whispered to me, with a slightly deer-in-the-headlights facial expression, “Their SAN FC zoning hasn’t even been saved to the switches, because they’re older and they can’t schedule the outage to save the config.”

And I thought, what sort of bizarro world have I entered? Because I’d bet money that if the running state wasn’t committed, it certainly wasn’t backed up either.
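Capturing appliance configurations doesn't need anything exotic; most managed switches will dump their running configuration over SSH. Here's a hedged sketch of the idea – the device names and show commands below are hypothetical, and the right command varies by vendor, so check your own switch documentation:

```python
import subprocess
from datetime import date
from pathlib import Path

# Hypothetical inventory: device name -> (ssh target, command that dumps
# the running configuration). Illustrative only; commands differ by vendor.
DEVICES = {
    "core-sw1": ("admin@core-sw1", "show running-config"),
    "san-fc1": ("admin@san-fc1", "configshow"),
}


def backup_command(target, show_cmd):
    """Build the ssh invocation that captures a device's config."""
    return ["ssh", "-o", "BatchMode=yes", target, show_cmd]


def backup_all(devices, outdir):
    """Dump each device's running config to a dated file under outdir."""
    outdir = Path(outdir) / date.today().isoformat()
    outdir.mkdir(parents=True, exist_ok=True)
    for name, (target, show_cmd) in devices.items():
        result = subprocess.run(
            backup_command(target, show_cmd),
            capture_output=True, text=True, timeout=60,
        )
        if result.returncode != 0:
            print(f"FAILED: {name}: {result.stderr.strip()}")
            continue
        (outdir / f"{name}.cfg").write_text(result.stdout)
```

Schedule something like this nightly, point the output directory at storage your backup product already protects, and the “unsaved zoning config” horror story above simply can’t happen to you.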

So, here’s my challenge to you, as a backup administrator – take ownership and become a Data Protection Advocate. I know, EMC have a product called DPA, but IT is rife with overloaded TLAs, so this is just another one. You need to stop being just the backup administrator, and start being the company’s Data Protection Advocate (DPA).

And how do you do that? You take a walk:

  1. Grab a notepad or an iPad and a suitable writing implement, be that pen or finger.
  2. Go into the server room.
  3. Note every bit of non-server equipment in that room.
  4. Next, start wandering around the offices.
  5. Note the electronic devices people are using. Smartphones? Tablets? PDAs? (Don’t laugh – I actually saw someone still using a Palm V just three weeks ago.)
  6. Ask at least two or three random people in each workgroup where they save their files to.
  7. Now go to your manager’s office.
  8. Tell your manager you want to have the title of DPA, and explain why.

I would suggest that very few organisations, if any, have actually formalised and thought through just how much data goes unprotected on a daily basis. As such, it’s time for a new breed of backup administrators. Why? Because it’s damn unlikely that anyone else in the organisation will have anywhere near your level of appreciation for data protection – because it’s part of your job.

Do you want to be a Backup Administrator, or do you want to be a Data Protection Advocate?

I previously said that backup administrators should be part of the change control process, but realistically this isn’t the case. In fact, the DPA for the organisation should be part of the change control process. That person should be tasked with speaking out on behalf of the data – how will it be protected? How will it be recovered? If it can’t be protected, how can the risk be ameliorated?

What don’t you backup?

Are you ready to be a DPA?

If you are, read on at “But where does the DPA fit in?”
