Jan 242017

In 2013 I undertook the endeavour to revisit some of the topics from my first book, “Enterprise Systems Backup and Recovery: A Corporate Insurance Policy”, and expand it based on the changes that had happened in the industry since the publication of the original in 2008.

A lot had happened since that time. At the point I was writing my first book, deduplication was an emerging trend, but tape was still entrenched in the datacentre. While backup to disk was an increasingly common scenario, it was (for the most part) mainly used as a staging activity (“disk to disk to tape”), and backup to disk use was either dumb filesystems or Virtual Tape Libraries (VTL).

The Cloud, seemingly ubiquitous now, was still emerging. Many (myself included) struggled to see how the Cloud was any different from outsourcing with a bit of someone else’s hardware thrown in. Now, core tenets of Cloud computing that made it so popular (e.g., agility and scaleability) have been well and truly adopted as essential tenets of the modern datacentre, as well. Indeed, for on-premises IT to compete against Cloud, on-premises IT has increasingly focused on delivering a private-Cloud or hybrid-Cloud experience to their businesses.

When I started as a Unix System Administrator in 1996, at least in Australia, SANs were relatively new. In fact, I remember around 1998 or 1999 having a couple of sales executives from this company called EMC come in to talk about their Symmetrix arrays. At the time the datacentre I worked in was mostly DAS with a little JBOD and just the start of very, very basic SANs.

When I was writing my first book the pinnacle of storage performance was the 15,000 RPM drive, and flash memory storage was something you (primarily) used in digital cameras only, with storage capacities measured in the hundreds of megabytes more than gigabytes (or now, terabytes).

When the first book was published, x86 virtualisation was well and truly growing into the datacentre, but traditional Unix platforms were still heavily used. Their decline and fall started when Oracle acquired Sun and killed low-cost Unix, with Linux and Windows gaining the ascendency – with virtualisation a significant driving force by adding an economy of scale that couldn’t be found in the old model. (Ironically, it had been found in an older model – the mainframe. Guess what folks, mainframe won.)

When the first book was published, we were still thinking of silo-like infrastructure within IT. Networking, compute, storage, security and data protection all as seperate functions – separately administered functions. But business, having spent a decade or two hammering into IT the need for governance and process, became hamstrung by IT governance and process and needed things done faster, cheaper, more efficiently. Cloud was one approach – hyperconvergence in particular was another: switch to a more commodity, unit-based approach, using software to virtualise and automate everything.

Where are we now?

Cloud. Virtualisation. Big Data. Converged and hyperconverged systems. Automation everywhere (guess what? Unix system administrators won, too). The need to drive costs down – IT is no longer allowed to be a sunk cost for the business, but has to deliver innovation and for many businesses, profit too. Flash systems are now offering significantly more IOPs than a traditional array could – Dell EMC for instance can now drop a 5RU system into your datacentre capable of delivering 10,000,000+ IOPs. To achieve ten million IOPs on a traditional spinning-disk array you’d need … I don’t even want to think about how many disks, rack units, racks and kilowatts of power you’d need.

The old model of backup and recovery can’t cut it in the modern environment.

The old model of backup and recovery is dead. Sort of. It’s dead as a standalone topic. When we plan or think about data protection any more, we don’t have the luxury of thinking of backup and recovery alone. We need holistic data protection strategies and a whole-of-infrastructure approach to achieving data continuity.

And that, my friends, is where Data Protection: Ensuring Data Availability is born from. It’s not just backup and recovery any more. It’s not just replication and snapshots, or continuous data protection. It’s all the technology married with business awareness, data lifecycle management and the recognition that Professor Moody in Harry Potter was right, too: “constant vigilance!”

Data Protection: Ensuring Data Availability

This isn’t a book about just backup and recovery because that’s just not enough any more. You need other data protection functions deployed holistically with a business focus and an eye on data management in order to truly have an effective data protection strategy for your business.

To give you an idea of the topics I’m covering in this book, here’s the chapter list:

  1. Introduction
  2. Contextualizing Data Protection
  3. Data Lifecycle
  4. Elements of a Protection System
  5. IT Governance and Data Protection
  6. Monitoring and Reporting
  7. Business Continuity
  8. Data Discovery
  9. Continuous Availability and Replication
  10. Snapshots
  11. Backup and Recovery
  12. The Cloud
  13. Deduplication
  14. Protecting Virtual Infrastructure
  15. Big Data
  16. Data Storage Protection
  17. Tape
  18. Converged Infrastructure
  19. Data Protection Service Catalogues
  20. Holistic Data Protection Strategies
  21. Data Recovery
  22. Choosing Protection Infrastructure
  23. The Impact of Flash on Data Protection
  24. In Closing

There’s a lot there – you’ll see the first eight chapters are not about technology, and for a good reason: you must have a grasp on the other bits before you can start considering everything else, otherwise you’re just doing point-solutions, and eventually just doing point-solutions will cost you more in time, money and risk than they give you in return.

I’m pleased to say that Data Protection: Ensuring Data Availability is released next month. You can find out more and order direct from the publisher, CRC Press, or order from Amazon, too. I hope you find it enjoyable.

Basics – Configuring a reports-only user

 NetWorker, Security  Comments Off on Basics – Configuring a reports-only user
May 252015

Something that’s come up a few times in the last year for me has been a situation where a NetWorker user has wanted to allow someone to access NetWorker Management Console for the purpose of running reports, but not allow them any administrative access to NetWorker.

It turns out it’s very easy to achieve this, and you actually have a couple of options on the level of NetWorker access they’ll get.

Let’s look first at the minimum requirements – defining a reports only user.

To do that, you first go into NetWorker Management Console as an administrative user, and go across to the Setup pane.

You’ll then create a new user account:

New User Account in NMC

Within the Create User dialog, be certain to only select Console User as the role:

NMC new user dialog

At this point, you’ve successfully created a user account that can run NMC reports, but can’t administer the NetWorker server.

However, you’re then faced with a decision. Do you want a reports-only user that can “look but don’t touch”, or do you want a reports-only user that can’t view any of the NetWorker configuration (or at least, anything other than can be ascertained by the reports themselves)?

If you want your reports user to be able to run reports and you’re not fussed about the user being able to view the majority of your NetWorker configuration, you’re done at this point. If however your organisation has a higher security focus, you may need to look at adjusting the basic Users NetWorker user group. If you’re familiar with it, you’ll know this has the following configuration:NetWorker Users Usergroup

This usergroup in the default configuration allows any user in the NetWorker datazone to:

  • Monitor NetWorker
  • Recover Local Data
  • Backup Local Data

The key there is any user*@*. Normally you want this to be set to *@*, but if you’re a particularly security focused organisation you might want to tighten this down to only those users and system accounts authorised to perform recoveries. The same principle applies here. Let’s say I didn’t want the reports user to see any of the NetWorker configuration, but I did want any root, system or pmdg user in the environment to still have that basic functionality. I could change the Users usergroup to the following:

Modified NetWorker Users usergroup

With this usergroup modified, logging in as the reports user will show a very blank NMC monitoring tab:

NMC-monitoring reports user

Similarly, the client list (as an example) will be quite empty too:

NMC-config reports user

Now, it’s worth mentioning there are is a key caveat you should consider here – some modules may be designed in anticipation that the executing user for the backup or recovery (usually an application user with sufficient privileges) will at least be a member of the Users usergroup. So if you tighten the security against your reports user to this level, you’ll need to be prepared to increase the steps in your application onboarding processes to ensure those accounts are added to an appropriate usergroup (or a new usergroup).

But in terms of creating a reports user that’s not privileged to control NetWorker, it’s as easy as the steps above.

Records retention and NMC

 Basics, Best Practice, Security  Comments Off on Records retention and NMC
Dec 102014

For those of us who have been using NetWorker for a very long time, we can remember back to when the NetWorker Management Console didn’t exist. If you wanted reports in those days, you wrote them yourself, either by parsing your savegroup completion results, processing the NetWorker daemon.log, or interrogating mminfo.

Over time since its introduction, NMC has evolved in functionality and usefulness. These days there are still some things that I find easier to do on the command line, but more often than not I find myself reaching for NMC for various administrative functions. Reporting is one of those.

(Just a quick interrupt. The NetWorker Usage Survey is happening again. Every year I ask readers to participate and tell me a bit about their environment. It’s short – I promise! – you only need around 5 minutes to answer the questions. When you’re finished reading this article, I’d really appreciate if you could jump over and do the survey.) 

There’s a wealth of reports in NMC, but some of the ones I find particularly useful often end up being:

  • User auditing
  • Success/failure results and percentages
  • Backup volume over time
  • Deduplication statistics

In order to get maximum use out of those, you want to make sure those details are kept for as long as you need them. In newer versions of NetWorker, if you go to the Enterprise Console and check out the Reports menu, you’ll see an option labelled “Data Retention”, and the default values are as follows:

default NMC data retention values

Those values are OK for using NMC reporting just for casual checking, but if you’re intending to perform longer-term checking, reporting or compliance based auditing, you might want to extend those values somewhat. Based on conversations with a couple of colleagues, I’m inclined to extend everything except for the Completion Message section to at 3 years in sites where longer-term compliance and auditing reporting is required. The completion messages are generally a little bigger in scope, and I’d be inclined to limit those to 3 months at the most. So that means the resulting fields would look like:

alternate NMC data retention values

Ultimately the values you set in the NMC Reports Data Retention area should be specific to the requirements of your business, but be certain to check them out and tweak the defaults as necessary to align them with your needs.

(Hey, now you’ve finished reading this article, just a friendly reminder: The NetWorker Usage Survey is happening again. Every year I ask readers to participate and tell me a bit about their environment. It’s short – I promise! – you only need around 5 minutes to answer the questions. When you’re finished reading this article, I’d really appreciate if you could jump over and do the survey.)


Jan 242011

When IDATA was beta testing NetWorker 7.6 SP1, my colleagues in New Zealand were responsible for testing the DD/Boost functionality. This, as you may have heard, allows for tighter integration between NetWorker and Data Domain systems, in much the same way that Data Domain has previously integrated with NetBackup.

I’m now doing a DD/Boost implementation, and I’ve got to say, I’m pretty impressed at the integration. At the moment, this is a standalone Data Domain 670, which we’ll be cloning out to physical tape from, so my satisfaction with the integration level has nothing to do with replication. I’ll cover that off when I implement Boost replication.

The first thing that impressed me was that under Boost, the Data Domain device types can be configured with parallelism greater than 1 without affecting the deduplication ratio. That means that a datazone won’t end up with so many devices as it would have to under a normal VTL or ADV_FILE dedupe configuration, which is a nice bonus. (And also better for licensing, too.)

The thing that really gave me a head spin though was the reporting integration. Having done some target based dedupe work before this in NetWorker, I’d been finding it frustrating that I couldn’t drill down and find out what sort of dedupe ratios clients and filesystems were getting. Boost is the answer:

Dedupe ratio

This, as you can imagine, is pretty cool reporting. Not only can you see what systems are getting great deduplication ratios, it’ll make it easy as pie to find the ones that aren’t.

Going up a level, the same applies to clients, too:

Client dedupe summary

When you can isolate clients, filesystems and data that doesn’t deduplicate well, you can do any or all of the following:

  • Send data direct to physical tape if necessary;
  • Send data to slower, non-deduplicating disk backup;
  • Send data to the deduplication device, but immediately clone and stage out as a priority.

I think I’m going to have a long and productive affair with Boost.

Dec 022009

One of the areas where administrators have been rightly able to criticise NetWorker has been the lack of reporting or auditing options to do with recoveries. While some information has always been retrievable from the daemon logs, it’s been only basic and depends on keeping the logs. (Which you should of course always do.)

NetWorker 7.6 however does bring in recovery reporting, which starts to rectify those criticisms. Now in the enterprise reporting section, you’ll find the following section:

  • NetWorker Recover
    • Server Summary
    • Client Summary
    • Recover Details
    • Recover Summary over Time

Of these reporting options, I think the average administrator will want the bottom two the most, unless they operate in an environment where clients are billed for recoveries.

Let’s look at the Recover Summary over Time report:

Recover summary over time

This presents a fairly simple summary of the recoveries that have been done on a per-client basis, including the number of files recovered, the amount of data recovered and the breakdown of successful vs failed recovery actions.

I particularly like the Recover Details report though:

Recover Details report

(Click the picture to see the entire width.)

As you can see there, we get a per user breakdown of recovery activities, when they were started, how long they took, how much data was recovered, etc.

These reports are a brilliant and much needed addition to NetWorker reporting capabilities, and I’m pleased to see EMC has finally put them into the product.

There’s probably one thing still missing that I can see administrators wanting to see – file lists of recovery sessions. Hopefully 7.(6+x) would see that report option though.