The rise of the hyperconverged administrator

Dec 14, 2017

Hyperconverged is a hot topic in enterprise IT these days, and deservedly so. The normal approach to enterprise IT infrastructure, after all, is a bit like assembling a jigsaw puzzle over a few years: in any given year it might be time to refresh storage, then compute, then backup, in any combination or order. So you’re piecing together the puzzle over an extended period of time, trying to make sure that all the pieces – sometimes purchased up to five years apart – will work well together. And that’s before you start to deal with operating system and application compatibility considerations.


Unless your business is actually in infrastructure service delivery, none of this is of value to the business. The value starts when people can begin delivering new business services on top of infrastructure – not building the infrastructure itself. I’m going to suggest (perhaps controversially to some IT people) that if your IT floor or office still has signs scattered around along the lines of “Unix Admin Team”, “Windows Admin Team”, “Storage Admin Team”, “Network Admin Team”, “Security Team”, “VMware Admin Team”, “Backup Admin Team”, and so on, then your IT is premised on an architectural jigsaw approach that will continue to lose relevance to businesses seeking true, cloud-like agility within their environment. Building a new system, or delivering a new service, in that silo-based environment is a multi-step process: someone creates a service ticket, which goes for approval; once approved, it goes to the network team to allocate an IP address, then to the storage team to confirm sufficient storage, then to the VMware team to allocate a virtual machine, which is passed over to the Unix or Windows team to prepare the system; then the ticket gets migrated to the backup team and … five weeks have passed. If you’re lucky.

Ultimately, a perception of “it’s cheaper” is only part of the reason why businesses look enviously at cloud – it’s also the promise of agility, which is what the business needs more than anything else from IT.

If a business is either investigating, planning or implementing hyperconverged, it’s doing it for efficiency and to achieve cloud-like agility. At the point that hyperconverged lands, those signs in the office have to come down, and the teams have to work together. Administration stops being a single field of expertise and becomes an entire-stack vertical operation.

Let’s look at that (just) from the perspective of data protection. If we think of a typical data protection continuum (excluding data lifecycle management’s archive part), it might look as simple as the following:

Now I’ve highlighted those functions in different colours, for the simple fact that while they’re all data protection, in a classic IT environment, they’re rather differentiated from one another, viz.:

Primary vs Protection Storage

The first three – continuous availability, replication and snapshots – all, for the most part in traditional infrastructure IT shops, fit into the realm of functions of primary storage. On the other hand, operational backup and recovery, as well as long term retention, both fit into models around protection storage (regardless of what your protection storage is). As a result, you’ll typically end up with the following administrative separation:
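That traditional split can be made concrete with a simple sketch – the mapping below is illustrative only (the tier and function names are mine, not any product’s API):

```python
# Illustrative mapping of data protection functions to the storage tier
# that traditionally owns them in a silo-based IT shop.
PROTECTION_FUNCTIONS = {
    "continuous availability": "primary storage",
    "replication": "primary storage",
    "snapshots": "primary storage",
    "operational backup and recovery": "protection storage",
    "long term retention": "protection storage",
}

def functions_for_tier(tier):
    """Return the data protection functions a given storage tier handles."""
    return [fn for fn, t in PROTECTION_FUNCTIONS.items() if t == tier]
```

In a hyperconverged model, the point of the article is precisely that this mapping collapses: every function belongs to the one integrated stack.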

Storage Admin vs Backup Admin

And that’s where the traditional IT model has sat for a long time – storage admins and backup admins, both working on different aspects of the same root discipline: data protection. In those traditionally silo’d IT shops you might find a team of storage administrators and a team of backup administrators. Colleagues, to be sure, though not part of the same team.

But a funny thing happens when hyperconverged enters the picture. Hyperconverged merges storage and compute, and to a degree, networking, all into the same bucket of resources. This creates a different operational model which the old silo’d approach just can’t handle:

Hyperconverged Administration

Once hyperconverged is in the picture, the traditional delineation between storage administration and backup administration disappears. Storage becomes just one part of the integrated stack. Protection storage might share the hyperconverged environment (depending on the redundancy therein, or the operational budget), or it might be separate. Separate or not though, data protection by necessity is going to become a lot more closely intertwined.

In fact, in a hyperconverged environment, it’s likely to be way more than just storage and backup administration that becomes intertwined – it really does go all the way up the stack. So much so that I can honestly say every environment where I’ve seen hyperconverged deployed and fail has been one where the traditional silos remained in place within IT, with processes and intra-group communications effectively handled by service/trouble tickets.

So here’s my tip as we reach the end of the 2017 tunnel and now see 2018 barrelling towards us: if you’re a backup administrator and hyperconverged is coming into your environment, this is your perfect opportunity to make the jump into being a full spectrum data protection administrator (and expert). (This is something I cover in a bit more detail in my book.) Within hyperconverged, you’ll need to understand the entire data protection stack, and this is a great opportunity to grow. It’s also a fantastic opportunity to jump out of pure operational administration and start looking at the bigger picture of service availability, too, since the automation and service catalogue approach to hyperconverged is going to let you free yourself substantially from the more mundane aspects of backup administration.

“Hyperconverged administrator” – it’s something you’re going to hear a lot more of in 2018 and beyond.

Hey, while you’re here, please take a few minutes to fill out the NetWorker Usage Survey for 2017!

Sep 16, 2009

A while ago, I ran a post titled Ethical Obligations of Backup Administrators. Following up from that, I now want to talk about the procedural obligations implicit in the role of backup administrator.

Now, to start with: if you think that the primary procedural obligation of a backup administrator is to ensure that the backups run, then you’re focused on the starting obligation rather than the end obligation. (This is a primary topic of consideration in my book.)

Before I set out the procedural obligations, I need to define recoverable. You may think this is a self-evident definition – however, if it were, a lot of problems that regularly occur in backup systems wouldn’t happen at all. Thus, by recoverable I mean the following:

  1. The item that was backed up can be retrieved from the backup media.
  2. The item that is retrieved from the backup media is usable as a replacement to the data that was backed up.
  3. The item can be retrieved within the required window.

A backup should not be deemed to be recoverable unless it meets all three of the above requirements. No ifs, no buts, no maybes. (Indeed, it’s worth noting that many “soft” recovery failures are caused by a failure to meet the third requirement – in mission critical systems, getting the data back in time is just as important as getting the data back at all.)
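The three-part test above can be expressed as a short sketch. The field names here are hypothetical – they simply assume you record the outcome of a trial recovery somewhere:

```python
from dataclasses import dataclass

@dataclass
class RecoveryTest:
    """Result of a trial recovery (hypothetical fields for illustration)."""
    retrieved: bool         # 1. item could be read back from the backup media
    usable: bool            # 2. retrieved item works as a replacement for the original
    elapsed_minutes: float  # how long the recovery actually took
    window_minutes: float   # 3. the required recovery window

def is_recoverable(test: RecoveryTest) -> bool:
    """A backup is recoverable only if all three requirements hold."""
    return (test.retrieved
            and test.usable
            and test.elapsed_minutes <= test.window_minutes)
```

Note that a recovery that succeeds in 90 minutes against a 60-minute window still fails this test – which is exactly the “soft” failure mode mentioned above.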

Since most people work well with lists, I’ll define these procedural obligations as a list, ordered in priority starting at the highest:

  1. To ensure that all required data is recoverable. By “data” I’m not just referring to raw data, but all items, files, information, databases, systems, etc., designated as requiring recovery.
  2. To maintain a zero error policy. There is no such thing as 100% certainty, but the closest you can get to it is by maintaining a zero error policy. In essence, by maintaining a zero error policy, you become immediately aware of any issues that may compromise the above rule.
  3. To maintain documentation for the environment. No system is complete without documentation. In particular, if someone with adequate skills cannot interact with it after reading the documentation, then the system is not documented and is not a system.
  4. To maintain an issues register. This is somewhat implicit in the maintenance of a zero error policy, but it is worth remembering that not all issues in a backup system are to do with errors. Issues may be that department heads approve of, or insist on non-standard backups, or that a system went into production without adequate testing, etc.
  5. To be across ongoing capacity management and forecasting requirements. A backup system can’t reliably work if it could halt due to capacity constraints at any random moment or minor data growth. Thus, the backup administrator must have a finger on the pulse of the capacity of the system.
  6. To maintain reports. A backup system does not work in isolation, and thus a backup administrator must ensure that reports (both daily/operational and long term/management) are accurate and timely.
  7. To document all data that is not required for recovery. There should be no “unknowns” in a backup system. Thus, any systems or data that are designated to not require recovery (e.g., QA systems) must be documented as such, and periodically rechecked to confirm this remains the case.
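On the capacity obligation in particular (item 5), even a crude forecast beats none. Here’s a minimal sketch – assuming daily usage samples and simple linear extrapolation, which is an illustration rather than a real forecasting model:

```python
def days_until_full(usage_samples, capacity):
    """Estimate days until capacity is exhausted, given a list of daily
    usage samples, by extrapolating the average daily growth rate."""
    if len(usage_samples) < 2:
        raise ValueError("need at least two daily samples")
    # Average growth per day across the sampled period.
    growth_per_day = (usage_samples[-1] - usage_samples[0]) / (len(usage_samples) - 1)
    if growth_per_day <= 0:
        return float("inf")  # flat or shrinking usage: no projected exhaustion
    remaining = capacity - usage_samples[-1]
    return remaining / growth_per_day
```

For example, daily samples of 100, 110 and 120 TB against a 200 TB pool project roughly eight days of headroom – the kind of number a backup administrator should always have to hand.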

As I said at the outset, many of these obligations are implicit in the role of backup administrator. However, for organisations wanting to formalise their processes and their role descriptions, and thus achieve higher guarantees of reliability within their backup system, clearly documenting these obligations is vital.
