Sep 07, 2016
 

In previous posts I’ve talked about options around database backups – specifically whether you’d use a NetWorker module or, say, DDBoost for Enterprise Applications. There are a lot of architectural positives to having the database administrators in control of the backup, but sometimes you’ll want the backups to be controlled and coordinated by NetWorker. It could be that your organisation doesn’t have DBAs on staff and needs backup administrators to have more hands-on control over the environment, or it could be that you have a policy to fully integrate database backup and recovery operations within NetWorker.

I’ve been going through a re-setup of my lab environment recently and today I wanted to spend a bit of time outlining how easy it is with NetWorker 9 (and NMDA v9) to configure Oracle backups, perform them, and do the recoveries as well – particularly if you’re a backup admin rather than a database admin.

With a freshly installed Oracle 12 instance on CentOS 6.7, I went through the process of installing and configuring NetWorker backups.

First you need to install the base NetWorker client package. (I always install the Extended client package for my lab servers, unless I’m specifically testing otherwise.) Once that’s been installed, you can install the appropriate NMDA package:
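On a Linux host that boils down to a couple of rpm installs. Here’s a minimal sketch – the package file names and versions below are illustrative only, so check the actual files in your NetWorker media kit:

# Base and extended NetWorker client packages (names/versions are examples)
rpm -ivh lgtoclnt-9.0.x.x86_64.rpm lgtoxtdclnt-9.0.x.x86_64.rpm

# NetWorker Module for Databases and Applications
rpm -ivh lgtonmda-9.0.x.x86_64.rpm

# Start the client daemons
/etc/init.d/networker start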

01 NMDA Plugin Install

You’ll note at the end of the installation it tells you there may be additional post-install steps to perform. I forgot to do that, which generated an “oops” moment later – I’ll get to it at the appropriate time. But yes, there is a post-install operation you need to perform for Oracle databases.

Anyway, with the plugin installed and NetWorker started on the client, I jumped over to NMC to configure database backups for this system using the wizard:

02 New Client Wizard 01

Just choose “New Client Wizard” to start a step-by-step configuration process for Oracle backups for the newly installed system. The first thing you’re prompted for of course is the host name and what type of backup you’re intending to configure.

03 New Client Wizard 02

Hitting next, you’ll have NetWorker interrogate the client software to determine what backup modules and options are available and you’ll get to pick what you want to do:

04 New Client Wizard 03

And yes, it really is that simple – just select Oracle and hit Next.

05 New Client Wizard 04

The above part of the wizard covers the absolute basics about the configuration, and unless you’re planning on backing up the database over DDBoost-FC, you’ll be fine to leave the options as they are. Click Next to continue.

06 New Client Wizard 05

Here you get to choose between three different backup options – a normal scheduled backup, a custom scheduled backup, or a scheduled backup of disk backups – the last effectively allowing you to sweep up RMAN disk backups executed by the DBAs. In this case I wanted to go with the basics and kept it on Typical scheduled backup. Next to continue.

07 New Client Wizard 06

It’s on this form that you’ll definitely need a bit of an understanding of the Oracle setup. NetWorker managed to extract the Oracle home directory (presumably by interrogating /etc/oratab), but it needed me to specify the path to the tnsnames.ora directory. (That’s going to depend on your install of Oracle of course.)
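If you’re not sure what to put in those fields, a quick look on the client usually answers it – a sketch, with the usual caveat that the paths depend entirely on how Oracle was installed:

# ORACLE_SID and ORACLE_HOME pairs, one line per instance
cat /etc/oratab

# tnsnames.ora normally lives under the Oracle home's network/admin directory
ls -l $ORACLE_HOME/network/admin/tnsnames.ora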

The wizard offers two different forms of authentication – OS authentication or database authentication. Because I’d just set up the database in a pretty basic way, I went with OS level authentication. (The alternative is to ensure there’s a fully configured backup user within the database and to use database authentication. That’s actually the more appropriate way if you have DBAs on staff. If you’re working on your own you might want to stick with the more basic OS authentication.)

So I supplied the username for Oracle (remember the base NetWorker client software runs as root/administrator, so it can su to the appropriate account), and the SID for the database instance I was configuring backups for. Next.
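Before hitting Next it’s worth confirming OS authentication will actually work for the account and SID you’ve just given the wizard. A lab sketch, with “oracle” and “orcl” standing in for whatever your installation uses:

# Switch to the Oracle software owner and point at the instance
su - oracle
export ORACLE_SID=orcl

# If OS authentication is configured correctly this connects without a password
sqlplus / as sysdba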

08 New Client Wizard 07

You then get confirmation of the options that are going to be configured and the choice between going back, cancelling the wizard or creating the client instance. I clicked Create. At the end of the creation you’ll get information as to whether it was done successfully or not.

Next up, it was necessary to create a new workflow for Oracle backups. I went to an Adhoc policy I have defined for backups I don’t automatically run each day in my lab, and started the creation of a new workflow. The first dialog is as follows:

09 New Workflow 01

This gives you the core details of the workflow – workflow name, when it executes, whether it automatically executes, etc. Name it how you need to, configure a Group consisting of the Oracle client(s) database backup instances, and then click Add to add the backup action.

10 New Workflow 02

Because this is a small database I elected to make every backup a full. If you talk to most DBAs you’ll find there’s a trade-off between the space savings of incremental backups and the change in recovery procedures they bring. (While most of those procedural changes are mitigated by backing up to disk, it’s quite common for environments to draw a line between databases that get a full backup every day and those that get an extended full+incremental configuration.)

With the levels/schedule set, I hit Next to move onto the next page of the dialog:

11 New Workflow 03

It’s on this dialog you’ll choose what storage node will handle the backup, how long it will be retained for and, most importantly, what pool it will be sent to. I wanted mine to go to my DDVE system, so I switched the pool over from Default to one I’d created called BoostBackup.

Moving on by clicking Next:

12 New Workflow 04

On the above dialog form you’ll get to define some more granular details about the backup process – how notifications are handled, number of retries, and overrides. I didn’t need to change anything here for what I was setting up, so I clicked Next to continue through the wizard to the Summary form.

13 New Workflow 05

The summary of the new action was pretty much what I was expecting so it was time to Configure.

14 New Workflow 06

With the action successfully created I could click OK to finish working on the Workflow and jump across to the Monitoring tab to start the new workflow:

15 Start Workflow

Right-clicking the workflow and choosing Start will prompt you to confirm that you do want the job run now; once you’ve given that confirmation your backup should kick off.
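(If you prefer the command line, nsrpolicy can kick off the same workflow – a sketch using my lab policy and workflow names; double-check the exact flags against the command reference for your release:)

# Start the Oracle workflow within the Adhoc policy on demand
nsrpolicy start -p Adhoc -w Oracle

# Watch its progress
nsrpolicy monitor -p Adhoc -w Oracle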

Except! Remember that bit where I said I was a bit of a doofus and didn’t do the post-install configuration step? Well, I forgot to link the NetWorker module library to Oracle’s libobk.so file, meaning the job failed. However, since NetWorker saves the output of RMAN, it was pretty easy to jump into the policy logs and see exactly what went wrong, viz.:

17 Oops My Mistake

That RMAN/Oracle error code and text tells the whole story – unable to allocate a backup channel because there’s no linkage to an SBT_TAPE device type. (Remember that with Oracle, any external plugin – NetWorker, Avamar, DDBEA, NetBackup, etc. – slots in using Oracle’s SBT_TAPE device type, a legacy name from how we used to back up.)
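The fix is the post-install step I’d skipped: pointing Oracle’s libobk.so at the NMDA library. A sketch of what that looks like on 64-bit Linux – the library name and location vary by platform and NMDA release, so treat these paths as placeholders and follow the NMDA install guide:

# Link Oracle's SBT interface to the NetWorker module library
ln -s /usr/lib64/libnsrora.so $ORACLE_HOME/lib/libobk.so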

With that corrected by creating the appropriate symlink (which is of course completely documented in the NMDA install guide that I didn’t check!), the backup ran to completion, quickly:

18 Successful Backup
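If you want belt-and-braces confirmation from the command line that the save sets landed in the right pool, mminfo does the job (client and pool names here are from my lab):

# List the Oracle save sets written to the Boost pool for this client
mminfo -q "client=dbase1,pool=BoostBackup" -r "savetime(25),level,sumsize,name"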

Now a backup is one thing, but recoveries are the real crux of the matter! And Oracle recoveries can be completely performed within NMC these days using the NMC Recovery interface. While your DBAs might want to run the recovery from the Oracle server if they’re available, empowering backup administrators to craft recovery processes when there are no DBAs available is just as useful.

Warning: I’m working through an example recovery scenario. You should not follow this blindly if you’re using it in your environment. This is a lab test only. Always adapt your recovery process to the activities and recovery requirements at hand, and always work with the appropriate documentation, processes and know-how!

19 NMC Recovery 01

The first step is to choose the host you want to recover (in my case, dbase1), and choose the type of recovery you want to configure (Oracle). Hit Next to continue.

20 NMC Recovery 02

Your options are pretty straight forward here – recover to a duplicate database instance, or recover to the original database. I chose to do an original database recovery and clicked Next.

21 NMC Recovery 03

This dialog is pretty similar to that backup configuration dialog I showed earlier – provide the appropriate configuration details for the database and the authentication method required.

22 NMC Recovery 04

You get an option between just recovering specified archived redo log files, or the entire database/specific database elements. I was doing a full recovery so I kept with the default selection and clicked Next.

23 NMC Recovery 05

Here you get to choose what specific tablespaces/data files you want to recover. This is particularly handy if you’ve say, had a single tablespace accidentally deleted and just need to recover that. Again, I wanted to recover everything so I clicked Next to continue.

24 NMC Recovery 06

Unless you’re working with a DBA who says otherwise, or have already got the database in a startup/mount mode, you’ll likely want to click Yes here to have NetWorker handle that for you.

25 NMC Recovery 07

Here I got the choice to recover datafiles to alternate locations; I left them as-is and clicked Next.

26 NMC Recovery 08

Here’s where you choose how many channels you want to use for the recovery, when you want to recover to, and whether you want the database automatically started at the end of the recovery process.

Once you’ve worked through those options, NMC will show you the RMAN recovery script it’s created, and give you the option to edit it:

27 NMC Recovery 09

(You can even save a copy of the RMAN script in case you want to reference it later, or hand it over to the DBA to complete.)
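For illustration, a wizard-generated restore script follows the same general shape as any RMAN restore/recover run over an SBT channel. The following is a generic sketch only – not the exact script NMC produced for my recovery – and, as per the warning above, not something to run blindly:

# Generic shape of an RMAN full restore/recover via the SBT interface
rman target / <<'EOF'
RUN {
  ALLOCATE CHANNEL ch1 TYPE 'SBT_TAPE';
  RESTORE DATABASE;
  RECOVER DATABASE;
  RELEASE CHANNEL ch1;
}
EOF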

Clicking Next, you’re invited to confirm storage node details and optionally change the volumes to be used for the recovery:

28 NMC Recovery 10

Once you click past here you can give the recovery a name and choose to start it:

29 NMC Recovery 11

As soon as you click “Run Recovery” the recovery process will start. Here’s a few dialogs showing output during the recovery process:

30 NMC Recovery 12

31 NMC Recovery 13

And the completed recovery:

32 NMC Recovery 14

There you have it. A complete Oracle configuration, backup and recovery.

(As I said before, that was a lab recovery – if you’re doing a real recovery the steps may be much the same, but you’ll still need to customise them for your database, so make sure you perform any recovery as appropriate for your environment and circumstances.)

Overall though it’s fair to say that Oracle backup and recovery with NetWorker is simple and straight-forward.

Betting the company

Jun 15, 2016
 

Short of networking itself, backup and recovery systems touch more of your infrastructure than anything else. So it’s pretty common for any backup and recovery specialist to be asked how we can protect a ten or sometimes even twenty year old operating system or application.

Sure you can backup Windows 2012, but what about NT 4?

Sure you can backup Solaris 11, but what about Tru64 v5?

Sure you can backup Oracle 12, but what about Oracle 8?

These really are questions we get asked.

I get these questions. I even have an active Windows 2003 SMB server sitting in my home lab running as an RDP jump-point. My home lab.

Gambling the lot

So it’s probably time for me to admit: I’m not really speaking to backup administrators with this article, but to the broader infrastructure teams and, probably more so, the risk officers within companies.

Invariably we get asked whether we can backup AncientOS 1.1 or DefunctDatabase 3.2 because those systems are still in production use within the business. Sometimes they’re even running pseudo-mission-critical services, but more often than not they’re simply running essential services the business has deemed too costly to migrate to another platform.

I’m well aware of this. In 1999 I was the primary system administrator involved in a Y2K remediation project for a SAP deployment. The system as deployed was running on an early version of Oracle 8 as I recall (it might have been Oracle 7 – it was 17 years ago…), sitting on Tru64 with an old (even for then) version of SAP. The version of the operating system, the version of Oracle, the version of SAP and even things like the firmware in the DAS enclosures attached were all unsupported by the various vendors for Y2K.

The remediation process was tedious and slow because we had to do piecemeal upgrades of everything around SAP and beg for Y2K compliance exceptions from Oracle and Digital for specific components. Why? When the business had deployed SAP two years before, they’d spent $5,000,000 or so customizing it to the nth degree, and upgrading it would require a similarly horrifically expensive remediation customization project. It was, quite simply, easier and cheaper to risk periphery upgrades around the application.

It worked. (As I recall, the only system in the company that failed over the Y2K transition was the Access database put together at the last minute by some tech-boffin-project manager designed to track any Y2K incidents over the entire globe for the company. I’ve always found there to be beautiful irony in that.)

This is how these systems limp along within organisations. It costs too much to change them. It costs too much to upgrade them. It costs too much to replace them.

And so day by day, month by month, year by year, the business continues to bet that bad things won’t happen. And what’s the collateral for the bet? Well it could be the company itself. If it costs that much to change them, upgrade them or to replace them, what’s the cost going to be if they fail completely? There’s an old adage of a CEO and a CIO talking, and the CIO says: “Why are you paying all this money to train people? What if you train them and they leave?” To which the CEO responds, “What if we don’t train them and they stay?” I think this is a similar situation.

I understand. I sympathise – even empathise, but we’ve got to find a better way to resolve this problem, because it’s a lot more than just a backup problem. It’s even more than a data protection problem. It’s a data integrity problem, and that creates an operational integrity problem.

So why is the question “do you support X?” asked when the original vendor for X doesn’t even support it any more – and may not have done for a decade or more?

The question is not really whether we can supply backup agents or backup modules old enough to work with systems no longer supported by their vendor of origin, or whether you can get access to a knowledge base that stretches back far enough to include details of those systems. Supply? Yes. Officially support? How much official support do you get from the vendor of origin?

I always think in these situations there’s a broader conversation to be had. Those legacy applications and operating systems are a sea anchor to your business at a time when you increasingly have to be able to steer and move the ship faster and with greater agility. Those scenarios where you’re reliant on technology so old it’s no longer supported are exactly those sorts of scenarios that are allowing startups and younger, more agile competitors to swoop in and take customers from you. And it’s those scenarios that also leave you exposed to an old 10GB ATA drive failing, or a random upgrade elsewhere in the company finally and unexpectedly resulting in that critical or essential system no longer being able to access the network.

So how do we solve the problem?

Sometimes there’s a simple workaround – virtualisation. If it’s an old x86-based platform, particularly Windows, there’s a good chance the system can be virtualised so it can at least run on modern hardware. That doesn’t solve the ‘supported’ problem, but it does mean greater protection: image level backups regardless of whether there’s an agent for the internal virtual machine, and snapshots and replication to reduce the requirement to ever have to consider a BMR. Being old, those systems usually hold minimal data, so that type of protection is not an issue.

But the real solution comes from being able to modernise the workload. We talk about platforms 1, 2 and 3 – platform 1 is the old mainframe approach to the world, platform 2 is the classic server/desktop architecture we’ve been living with for so long, and platform 3 is the new, mobile and cloud approach to IT. Some systems even get classified as platform ‘2.5’ – that interim step between the current and the new. What’s the betting that old curmudgeonly system that’s holding your business back from modernising is more like platform 1.5?

One way you can modernise is to look at getting innovative with software development. Increasing requirements for agility will drive more IT departments back to software development for platform 3 environments, so why not look at this as an opportunity to grow that development environment within your business? That’s where the EMC Federation can really swing in to help: Pivotal Labs is premised on new approaches to software development. Agile may seem like a buzz-word, but if you can cut software development down from 12-24 months to 6-12 weeks (or less!), doesn’t that mitigate many of the cost reasons to avoid dealing with the legacy platforms?

The other way of course is with traditional consulting approaches. Maybe there’s a way that legacy application can be adapted, or archived, in such a way that the business functions can be continued but the risk substantially reduced and the platform modernised. That’s where EMC’s consultancy services come in, where our content management services come in, and where our broad exposure to hundreds of thousands of customer environments comes in. Because I’ll be honest: your problems aren’t actually unique; you’re not the only business dealing with legacy system components, and while there may be industry-specific or even customer-specific aspects that are tricky, there’s a very, very good chance that somewhere, someone has gone through the same situation. The solution could very well be tailored specifically for your business, but the processes and tools used to get you to that solution don’t necessarily have to be bespoke.

It’s time to stop asking just whether those ancient and unsupported operating systems and applications can be backed up, and start asking how they can be modernised so they stop holding the business back.

Client Load: Filesystem and Database Backups

Feb 03, 2016
 

A question I get asked periodically is “can I backup my filesystem and database at the same time?”

As is often the case, the answer is: “it depends”.

Or, to put it another way: it depends on what the specific client can handle at the time.

For the most part, backup products have a fairly basic design requirement: get the data from the source (let’s say “the client”, ignoring options like ProtectPoint for the moment) to the destination (protection storage) as quickly as possible. The faster the better, in fact. So if we want backups done as fast as possible, wouldn’t it make sense to backup the filesystem and any databases on the client at the same time? Well – the answer is “it depends”, and it comes down to the impact on the client and the compatibility of the client with the process.

First, let’s consider compatibility – if both the filesystem and database backup process use the same snapshot mechanism for instance, and only one can have a snapshot operational at any given time, that immediately rules out doing both at once. That’s the most obvious scenario, but the more subtle one almost comes back to the age-old parallelism problem: how fast is too fast?

If we’re conducting a complete filesystem read (say, in the case of a full backup) while simultaneously reading an entire database, and the filesystem and database both reside on the same physical LUN, there is the potential the two reads will be counter-productive: if the underlying physical LUN is in fact a single disk, you’re practically guaranteed that’s the case, for instance. We wouldn’t normally want RAID-less storage for pretty much anything in production, but just slipping RAID into the equation doesn’t guarantee we can achieve both reads simultaneously without impact to the client – particularly if the client is already doing other things. Production things.

Virtualisation doesn’t write a blank cheque, either; image level backups with databases in the image are a bit of a holy grail in the backup industry, but even in those situations where it may be supported, it’s not supported for every database type. So it’s still more common than not to see situations where you have virtual/image level backups of the guest for crash consistency on the file and operating system components, and then an in-guest database agent running for that true, guaranteed database recoverability. Do you want a database backup and an image based backup happening at the same time? Your hypervisor is furiously reading the image file while the in-guest agent is furiously reading the database.

In each case that’s just at a per-client level. Zooming out a bit: in a datacentre with hundreds or thousands of hosts all accessing shared storage via shared networking, usually via shared compute resources as well, “how long is a piece of string?” becomes an exponentially harder question as the number of shared resources, and the number of items sharing those resources, come into play.

Unless you have an overflow of compute resources and SSD offering more IO than your systems can ever need, “can I backup my filesystem and databases at the same time?” is very much a non-trivial question. In fact, answering it becomes a bit of an art, as does all performance tuning. So rather than directly answering the question, I’ll make a few suggestions to be considered along the way as you answer it for your environment:

  • Recommendation: Particularly for traditional filesystem agent + traditional database agent backups, never start the two within five minutes of each other, and preferably leave a half-hour gap between starts. I.e., overlap is OK, but starting them concurrently should be avoided where possible.
  • Recommendation: Make sure the two functions can be concurrently executed. I.e., if one blocks the other from running at the same time, you have your answer.
  • Remember: It’s all parallelism. Rather than a former CEO leaping around stage shouting “developers, developers, developers!” imagine me leaping around shouting “parallelism, parallelism, parallelism!”* At the end of the day each concurrent filesystem backup uses a unit of parallelism and each concurrent database backup uses a unit of parallelism, so if you exceed what the client can naturally do based on memory, CPU, network or disk resources, you have your answer. (See the quick check after this list.)
  • Remember: Backup isn’t ABC, it’s CDE – Compression, Deduplication, Encryption. Each function will adjust the performance characteristics of the host you’re backing up – sometimes subtly, sometimes not so. Compression and encryption are easier to understand: if you’re doing either as a client-CPU function you’re likely going to be hammering the host. Deduplication gets trickier of course – you might be doing a bit more CPU processing on the host, but over a shorter period of time if the net result is a 50-99% reduction in the amount of data you’re sending.
  • Remember: You need the up-close and big picture view. It’s rare we have systems so isolated any more that you can consider this in the perspective of a single host. What’s the rest of the environment doing or likely to be doing?
  • Remember: ‘More magic’ is better than ‘magic’. (OK, it’s unrelated, but it’s always a good story to tell.)
  • Most importantly: Test. Once you’ve looked at your environment, once you’ve worked out the parallelism, once you’re happy the combined impact of a filesystem and database backup won’t go beyond the operational allowances on the host – particularly on anything remotely approaching mission critical – test it.
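As a starting point for the parallelism arithmetic above, check what the client is actually configured for before stacking workloads on it. A quick nsradmin sketch – substitute your own server and client names; the print command returns the client’s configured parallelism:

# nsradmin -s backupserver
nsradmin> show name; parallelism
nsradmin> print type: NSR client; name: dbase1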

If you were hoping there was an easy answer, the only one I can give you is “don’t” – but that’s just making a blanket assumption that you can never or should never do it. It’s the glib, easy answer; the real answer is that only you can answer the question.

But trust me: when you do, it’s immensely satisfying.

On another note: I’m pleased to say I made it into the EMC Elect programme for another year – that’s every year since it started! If you’re looking for some great technical people within the EMC community (partners, employees, customers) to keep an eye on, make sure you check out the announcement page.


* Try saying “parallelism, parallelism, parallelism!” three times fast when you had a speech impediment as a kid. It doesn’t look good.

Dec 22, 2015
 

As we approach the end of 2015 I wanted to spend a bit of time reflecting on some of the data protection enhancements we’ve seen over the year. There’s certainly been a lot!

Protection

NetWorker 9

NetWorker 9 was of course a big part of the changes in the data protection landscape in 2015, but it’s by no means the only advancement we saw. I covered some of the advances in NetWorker 9 in my initial post about it (NetWorker 9: The Future of Backup), but to summarise just a few of the key new features, we saw:

  • A policy based engine that unites backup, cloning, snapshot management and protection of virtualisation into a single, easy to understand configuration. Data protection activities in NetWorker can be fully aligned to service catalogue requirements, and the easier configuration engine actually extends the power of NetWorker by offering more complex configuration options.
  • Block based backups for Linux filesystems – speeding up backups for highly dense filesystems considerably.
  • Block based backups for Exchange, SQL Server, Hyper-V, and so on – NMM for NetWorker 9 is a block based backup engine. There’s a whole swathe of enhancements in NMM version 9, but the 3-4x backup performance improvement has to be a big win for organisations struggling against existing backup windows.
  • Enhanced snapshot management – I was speaking to a customer only a few days ago about NSM (NetWorker Snapshot Management), and his reaction to NSM was palpable. Wrapping NAS snapshots into an effective and coordinated data protection policy with the backup software orchestrating the whole process from snapshot creation, rollover to backup media and expiration just makes sense as the conventional data storage protection and backup/recovery activities continue to converge.
  • ProtectPoint Integration – I’ll get to ProtectPoint a little further below, but being able to manage ProtectPoint processes in the same way NSM manages file-based snapshots will be a big win as well for those customers who need ProtectPoint.
  • And more! – VBA enhancements (notably the native HTML5 interface and a CLI for Linux), NetWorker Virtual Edition (NVE), dynamic parallel savestreams, NMDA enhancements, restricted datazones and scaleability all got a boost in NetWorker 9.

It’s difficult to summarise everything that came in NetWorker 9 in so few words, so if you’ve not read it yet, be sure to check out my essay-length ‘summary’ of it referenced above.

ProtectPoint

In the world of mission critical databases where impact minimisation on the application host is a must yet backup performance is equally a must, ProtectPoint is an absolute game changer. To quote Alyanna Ilyadis, when it comes to those really important databases within a business,

“Ideally, you’d want the performance of a snapshot, with the functionality of a backup.”

Think about the real bottleneck in a mission critical database backup: the data gets transferred (even best case) via fibre-channel from the storage layer to the application/database layer before being passed across to the data protection storage. Even if you direct-attach data protection storage to the application server, or even if you mount a snapshot of the database at another location, you still have the fundamental requirement to:

  • Read from production storage into a server
  • Write from that server out to protection storage

ProtectPoint cuts the middle-man out of the equation. By integrating storage level snapshots with application layer control, the process effectively becomes:

  • Place database into hot backup mode
  • Trigger snapshot
  • Pull database out of hot backup mode
  • Storage system sends backup data directly to Data Domain – no server involved

That in itself is a good starting point for performance improvement – your database is only in hot backup mode for a few seconds at most. But then the real power of ProtectPoint kicks in. You see, when you first configure ProtectPoint, a block based copy from primary storage to Data Domain storage starts in the background straight away. With Change Block Tracking incorporated into ProtectPoint, the data transfer from primary to protection storage kicks into high gear – only the changes between the last copy and the current state at the time of the snapshot need to be transferred. And the Data Domain handles creation of a virtual synthetic full from each backup – full backups daily at the cost of an incremental. We’re literally seeing backup performance improvements in the order of 20x or more with ProtectPoint.

There are some great videos explaining what ProtectPoint does and the sorts of problems it solves, and even how it integrates into NetWorker 9.

Database and Application Agents

I’ve been in the data protection business for nigh on 20 years, and if there’s one thing that’s remained remarkably consistent throughout that time it’s that many DBAs are unwilling to give up control over the data protection configuration and scheduling for their babies.

It’s actually understandable for many organisations. In some places it’s entrenched habit, and in those situations you can integrate data protection for databases directly into the backup and recovery software. For other organisations though, there are complex scheduling requirements based on batch jobs, data warehousing activities and so on which can’t possibly be controlled by a regular backup scheduler. Those organisations need to initiate the backup job for a database not at a particular time, but when it’s the right time – and based on the amount of data or the amount of processing, that could be a highly variable time.

The traditional problem with backups for databases and applications being handled outside of the backup product is that the backup data ends up being written to primary storage, which is expensive. It’s normally more than one copy, too – I’d hazard a guess that 3-5 copies is the norm for most database backups when they’re being written to primary storage.

The Database and Application agents for Data Domain allow a business to sidestep all these problems by centralising the backups for mission critical systems onto highly protected, cost effective, deduplicated storage. The plugins work directly with each supported application (Oracle, DB2, Microsoft SQL Server, etc.) and give the DBA full control over managing the scheduling of the backups while ensuring those backups are stored under management of the data protection team. What’s more, primary storage is freed up.

Formerly known as “Data Domain Boost for Enterprise Applications” and “Data Domain Boost for Microsoft Applications”, the Database and Application Agents respectively reached version 2 this year, enabling new options and flexibility for businesses. Don’t just take my word for it though: check out some of the videos about it here and here.

CloudBoost 2.0

CloudBoost version 1 was released last year and I’ve had many conversations with customers interested in leveraging it over time to reduce their reliance on tape for long term retention. You can read my initial overview of CloudBoost here.

2015 saw the release of CloudBoost 2.0. This significantly extends the storage capabilities for CloudBoost, introduces the option for a local cache, and adds the option for a physical appliance for businesses that would prefer to keep their data protection infrastructure physical. (You can see the tech specs for CloudBoost appliances here.)

With version 2, CloudBoost can now scale to 6PB of cloud managed long term retention, and every bit of that data pushed out to a cloud is deduplicated, compressed and encrypted for maximum protection.

Spanning

Cloud is a big topic, and a big topic within that big topic is SaaS – Software as a Service. Businesses of all types are placing core services in the Cloud to be managed by providers such as Microsoft, Google and Salesforce. Office 365 Mail is proving very popular for businesses who need enterprise class email but don’t want to run the services themselves, and Salesforce is probably the most likely mission critical SaaS application you’ll find in use in a business.

So it’s absolutely terrifying to think that SaaS providers don’t really backup your data. They protect their infrastructure from physical faults and from their own faults, but their SLAs around data deletion are pretty straight forward: if you deleted it, they can’t tell whether it was intentional or an accident. (And if it was an intentional delete, they certainly can’t tell whether it was authorised or not.)

Data corruption and data deletion in SaaS applications are far too common an occurrence, and for many businesses, sadly, it’s only after that happens for the first time that people become aware of what those SLAs do and don’t cover them for.

Enter Spanning. Spanning integrates with the native hooks provided in Salesforce, Google Apps and Office 365 Mail/Calendar to protect the data your business relies on so heavily for day to day operations. The interface is dead simple, the pricing is straight forward, but the peace of mind is priceless. 2015 saw the introduction of Spanning for Office 365, which has already proven hugely popular, and you can see a demo of just how simple it is to use Spanning here.

Avamar 7.2

Avamar got an upgrade this year, too, jumping to version 7.2. Virtualisation got a big boost in Avamar 7.2, with new features including:

  • Support for vSphere 6
  • Scaleable up to 5,000 virtual machines and 15+ vCenters
  • Dynamic policies for automatic discovery and protection of virtual machines within subfolders
  • Automatic proxy deployment: This sees Avamar analyse the vCenter environment and recommend where to place virtual machine backup proxies for optimum efficiency. Particularly given the updated scaleability in Avamar for VMware environments, taking the hassle out of proxy placement is going to save administrators a lot of time and guess-work. You can see a demo of it here.
  • Orphan snapshot discovery and remediation
  • HTML5 FLR interface

That wasn’t all though – Avamar 7.2 also introduced:

  • Enhancements to the REST API to cover tenant level reporting
  • Scheduler enhancements – you can now define the start dates for your annual, monthly and weekly backups
  • You can browse replicated data from the source Avamar server in the replica pair
  • Support for DDOS 5.6 and higher
  • Updated platform support including SLES 12, Mac OS X 10.10, Ubuntu 12.04 and 14.04, CentOS 6.5 and 7, Windows 10, VNX2e, Isilon OneFS 7.2, plus a 10Gbe NDMP accelerator

Data Domain 9500

Already the market leader in data protection storage, EMC continued to stride forward with the Data Domain 9500, a veritable beast. Some of the quick specs of the Data Domain 9500 include:

  • Up to 58.7 TB per hour (when backing up using Boost)
  • 864TB usable capacity for active tier, up to 1.7PB usable when an extended retention tier is added. That’s the actual amount of storage; so when deduplication is added that can yield actual protection data storage well into the multiple-PB range. The spec sheet gives some details based on a mixed environment where the data storage might be anywhere from 8.6PB to 86.4PB
  • Support for traditional ES30 shelves and the new DS60 shelves.

Actually it wasn’t just the Data Domain 9500 that was released this year from a DD perspective. We also saw the release of the Data Domain 2200 – the replacement for the SMB/ROBO DD160 appliance. The DD2200 supports more streams and more capacity than the previous entry-level DD160, being able to scale from a 4TB entry point to 24TB raw when expanded to 12 x 2TB drives. In short: it doesn’t matter whether you’re a small business or a huge enterprise: there’s a Data Domain model to suit your requirements.

Data Domain Dense Shelves

The traditional ES30 Data Domain shelves have 15 drives. 2015 also saw the introduction of the DS60 – dense shelves capable of holding sixty disks. With support for 4TB drives, that means a single 5RU Data Domain DS60 shelf can hold as much as 240TB in drives.

The benefits of high density shelves include:

  • Better utilisation of rack space (60 drives in one 5RU shelf vs 60 drives in 4 x 3RU shelves – 12 RU total)
  • More efficient for cooling and power
  • Scale as required – each DS60 takes 4 x 15 drive packs, allowing you to start with just one or two packs and build your way up as your storage requirements expand

DDOS 5.7

Data Domain OS 5.7 was also released this year, and includes features such as:

  • Support for DS60 shelves
  • Support for 4TB drives
  • Support for ES30 shelves with 4TB drives (DD4500+)
  • Storage migration support – migrate those older ES20 style shelves to newer storage while the Data Domain stays online and in use
  • DDBoost over fibre-channel for Solaris
  • NPIV for FC, allowing up to 8 virtual FC ports per physical FC port
  • Active/Active or Active/Passive port failover modes for fibre-channel
  • Dynamic interface groups are now supported for managed file replication and NAT
  • More Secure Multi-Tenancy (SMT) support, including:
    • Tenant-units can be grouped together for a tenant
    • Replication integration:
      • Strict enforcing of replication to ensure source and destination tenant are the same
      • Capacity quota options for destination tenant in a replica context
      • Stream usage controls for replication on a per-tenant basis
    • Configuration wizards support SMT for
    • Hard limits for stream counts per Mtree
    • Physical Capacity Measurement (PCM) providing space utilisation reports for:
      • Files
      • Directories
      • Mtrees
      • Tenants
      • Tenant-units
  • Increased concurrent Mtree counts:
    • 256 Mtrees for Data Domain 9500
    • 128 Mtrees for each of the DD990, DD4200, DD4500 and DD7200
  • Stream count increases – DD9500 can now scale to 1,885 simultaneous incoming streams
  • Enhanced CIFS support
  • Open file replication – great for backups of large databases, etc. This allows the backup to start replicating before it’s even finished.
  • ProtectPoint for XtremIO

Data Protection Suite (DPS) for VMware

DPS for VMware is a new socket-based licensing model for mid-market businesses that are highly virtualized and want an effective enterprise-grade data protection solution. Providing Avamar, Data Protection Advisor and RecoverPoint for Virtual Machines, DPS for VMware is priced based on the number of CPU sockets (not cores) in the environment.

DPS for VMware is ideally suited for organisations that are either 100% virtualised or just have a few remaining machines that are physical. You get the full range of Avamar backup and recovery options, Data Protection Advisor to monitor and report on data protection status, capacity and trends within the environment, and RecoverPoint for a highly efficient journaled replication of critical virtual machines.

…And one minor thing

There was at least one other bit of data protection news this year, and that was me finally joining EMC. I know in the grand scheme of things it’s a pretty minor point, but after years of wanting to work for EMC it felt like I was coming home. I had worked in the system integrator space for almost 15 years and have a great appreciation for the contribution integrators bring to the market. That being said, getting to work from within a company that is so focused on bringing excellent data protection products to the market is an amazing feeling. It’s easy from the outside to think everything is done for profit or shareholder value, but EMC and its employees have a real passion for their products and the change they bring to IT, business and the community as a whole. So you might say that personally, me joining EMC was the biggest data protection news for the year.

In Summary

I’m willing to bet I forgot something in the list above. It’s been a big year for Data Protection at EMC. Every time I’ve turned around there’s been new releases or updates, new features or functions, and new options to ensure that no matter where the data is or how critical the data is to the organisation, EMC has an effective data protection strategy for it. I’m almost feeling a little bit exhausted having come up with the list above!

So I’ll end on a slightly different note (literally). If after a long year working with or thinking about Data Protection you want to chill for five minutes, listen to Kate Miller-Heidke’s cover of “Love is a Stranger”. She’s one of the best artists to emerge from Australia in the last decade. It’s hard to believe she did this cover over two years ago now, but it’s still great listening.

I’ll see you all in 2016! Oh, and don’t forget the survey.

Jobquery finally gets the duct tape removed

Jan 08, 2010
 

Some time ago, I posted that EMC had added a jobquery utility to allow probing of the NetWorker jobs database (the one created/maintained by nsrjobd). Unfortunately, at the time, jobquery had been somewhat muzzled – you could give it commands, and it would spit output back to you, but it would never give you any indication that it was waiting for you. No prompting, no nothing. It made working with jobquery somewhat of a hassle.

Thankfully though, as of NetWorker 7.6, the duct tape has been fully removed and jobquery will happily give you that meta-information:

[root@nox ~]# jobquery
NetWorker jobs query utility.
Use the "help" command for help.
jobquery> help
Legal commands are:
print [query] (set current query)
show [attrlist]
types
all
quit
help [command]
. [query]
? [command]
Where:
query ::= attrlist
attrlist ::= attribute [; attribute]*
attribute ::= name [: [value [, value]* ]
jobquery> types
Known types: job indication, save job, probe job,
savegroup job, session info, index save job,
savefs job, bootstrap save job, utility job,
active job db;

Now sure, in previous versions you could type commands in and get output, but not having a prompt to tell you when jobquery was waiting, when it was working, or when (as I originally thought) it was hanging on startup made it far harder to use than it should have been – a prompt is a fairly critical part of a useful user interface.

Having a responsive user interface makes jobquery a little nicer to work with. For instance, let’s look at the “save job” type and run a backup. There are a lot of fields in the “save job” resource type, so I’m going to limit the output as follows:

jobquery> show command:; group name:; host:; job state:; level:
jobquery> print type: save job
                     command: 
"save -s nox.anywebdb.com -g "Staging Servers" -LL -f - -m nox
 -t 1262601936 -l incr -q -W 78 -N /d/nsr/01 /d/nsr/01";
                  group name: Staging Servers;
                        host: nox;
                   job state: COMPLETED;
                       level: incr;

Then, once the backup starts, if I run the same query again (abbreviating the output, and remembering it shows all jobs in the database!), I can see:

                     command: 
savepnpc -s nox.anywebdb.com -g archon -LL -f - -m archon -t 1262818820 -l in
cr -q -W 78 -N / /;
                  group name: archon;
                        host: archon;
                   job state: ACTIVE;
                       level: incr;

However, the “ACTIVE” state simply means that the job is actively queued, not that it’s actually sending data. If you want to only see jobs that are actively backing up rather than just simply active, you’d look for a job state of “SESSION ACTIVE”:

jobquery> print type: save job; job state: SESSION ACTIVE
                     command: 
savepnpc -s nox.anywebdb.com -g archon -LL -f - -m archon -t 1262818820 -l in
cr -q -W 78 -N / /;
                  group name: archon;
                        host: archon;
                   job state: SESSION ACTIVE;
                       level: incr;

                     command: 
savepnpc -s nox.anywebdb.com -g archon -LL -f - -m archon -t 1262818814 -l in
cr -q -W 78 -N /Volumes/Yu /Volumes/Yu;
                  group name: archon;
                        host: archon;
                   job state: SESSION ACTIVE;
                       level: incr;

What does this mean? For a start, it provides a way of checking (outside of NMC) which jobs are currently queued to run, and which jobs are actually running. As is always the case, the better you can monitor regardless of circumstance, the more likely you are to understand your server’s state. Now that jobquery at least tells us when it’s waiting for input, I’m looking forward to properly exploring what it can do. I previously said that I anticipated it being a useful tool; now I know it’s going to be one, and I’ll do some further digging/testing and a future post on using it to track activities.
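(One more aside: because jobquery reads its commands from standard input, the queries above can be scripted rather than typed interactively – a minimal sketch, assuming it’s run on the NetWorker server itself:)

# Feed jobquery the same query via a here-document so it can be
# dropped into a monitoring script or cron job
jobquery <<'EOF'
show command:; group name:; host:; job state:; level:
print type: save job; job state: SESSION ACTIVE
EOF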
May 07, 2009
 

Since I have more than a passing interest in databases, I always try to stay apprised of the Oracle module for NetWorker. It therefore surprised me a few days ago to see that v5 of the module had been released in March. I guess my excuse is that March was an insanely busy month for me between work and travel. (Well, that’s my excuse, and I’m sticking to it.)

So yesterday I downloaded v5 of the module (for Linux), and spun it up. This is a version I really, really like.

Now, here’s a few bullet points before I get to the most impressive feature:

  • No longer supports Oracle 9i or lower; if you want older, unsupported versions of Oracle you have to use an older version of the module.
  • Requires features that exist only in NetWorker 7.5.x as the underlying client.
  • Must have the regular NetWorker client installed and running in order for the module software to correctly install and activate.
  • Can work with the 7.4.x NetWorker server with the exception that what I’m about to describe below doesn’t work with a 7.4 server.
  • Now has a client configuration wizard that works within NMC and makes Oracle backup configuration a breeze.

Honestly, if you’re about to do a new NetWorker install into a site that has Oracle, skip everything else and install 7.5.1. This is one of those compelling reasons to move to 7.5.x.

The Oracle client configuration wizard is integrated into NMC’s wizards. Right-click on a client in the configuration panel, choose “Client Backup Configuration -> New”, and you’re off and running:

Oracle Client Configuration Step 1

Oracle Client Configuration Step 2

Note that you won’t reach this point if you’ve disabled ‘nsrauth’ authentication on the backup server. I had done so on my lab server as a test on Monday, and spent half an hour trying to work out a … rather inexact … error message.

Oracle Client Configuration Step 3

Oracle Client Configuration Step 4

The above step is where things get fun. Note that if you are given these details, you don’t even need to log onto the client to set up an nsrnmo script any longer. This is the start of A Really Good Thing.

Also, I should note, in the above screen shot, because I was using a temporary database installed just for a few tests and I was in a rush, I used the sys account for connecting to the target database. No, you shouldn’t ever do that – create a backup user and use that account, please.
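(For reference, what that backup account might look like – a rough sketch only; the user name and password are arbitrary, and your DBA will have their own standards for this:)

# Run as a privileged OS user (e.g. the oracle account) on the database host
sqlplus / as sysdba <<'EOF'
CREATE USER rman_backup IDENTIFIED BY "SomeStrongPassword1";
-- RMAN target connections require SYSDBA privilege
GRANT SYSDBA TO rman_backup;
EOF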

Note that Oracle, and the Oracle Listener, must both be running on the client in order to clear the above step.

After the above, we then start to get into the ‘regular’ client configuration options:

Oracle Client Configuration Step 5

Oracle Client Configuration Step 6

Oracle Client Configuration Step 7

This summary screen shows you what you’re going to get as far as the configuration is concerned – including the RMAN script that has been automatically generated for you:

Oracle Client Configuration Step 8

Confirmation of sweet success:

Oracle Client Configuration Step 9

The finished client in NMC:

Oracle Client Configuration Step 10

Once configured, you’re ready to start backing up straight away. Honestly, it couldn’t be simpler.

As a closing note, I know some other backup products have had Oracle backup wizards for some time, so I’m not claiming EMC is the first with this style of setup, but I do think it’s a great feature to see included now.