Jan 24 2017
 

In 2013 I undertook the endeavour to revisit some of the topics from my first book, “Enterprise Systems Backup and Recovery: A Corporate Insurance Policy”, and expand it based on the changes that had happened in the industry since the publication of the original in 2008.

A lot had happened since that time. At the point I was writing my first book, deduplication was an emerging trend, but tape was still entrenched in the datacentre. While backup to disk was an increasingly common scenario, it was for the most part used as a staging activity (“disk to disk to tape”), and the backup to disk targets were either dumb filesystems or Virtual Tape Libraries (VTLs).

The Cloud, seemingly ubiquitous now, was still emerging. Many (myself included) struggled to see how the Cloud was any different from outsourcing with a bit of someone else’s hardware thrown in. Now, the core tenets that made Cloud computing so popular (e.g., agility and scalability) have been well and truly adopted as essentials of the modern datacentre, too. Indeed, to compete against Cloud, on-premises IT has increasingly focused on delivering a private-Cloud or hybrid-Cloud experience to the business.

When I started as a Unix System Administrator in 1996, at least in Australia, SANs were relatively new. In fact, I remember around 1998 or 1999 having a couple of sales executives from this company called EMC come in to talk about their Symmetrix arrays. At the time the datacentre I worked in was mostly DAS with a little JBOD and just the start of very, very basic SANs.

When I was writing my first book the pinnacle of storage performance was the 15,000 RPM drive, and flash memory storage was something you used primarily in digital cameras, with capacities measured in hundreds of megabytes rather than gigabytes (or now, terabytes).

When the first book was published, x86 virtualisation was well and truly growing into the datacentre, but traditional Unix platforms were still heavily used. Their decline and fall started when Oracle acquired Sun and killed low-cost Unix; Linux and Windows gained the ascendancy, with virtualisation a significant driving force, adding an economy of scale that couldn’t be found in the old model. (Ironically, it had been found in an older model – the mainframe. Guess what folks, mainframe won.)

When the first book was published, we were still thinking of silo-like infrastructure within IT. Networking, compute, storage, security and data protection all as separate functions – separately administered functions. But business, having spent a decade or two hammering into IT the need for governance and process, became hamstrung by IT governance and process and needed things done faster, cheaper, more efficiently. Cloud was one approach – hyperconvergence in particular was another: switch to a more commodity, unit-based approach, using software to virtualise and automate everything.

Where are we now?

Cloud. Virtualisation. Big Data. Converged and hyperconverged systems. Automation everywhere (guess what? Unix system administrators won, too). The need to drive costs down – IT is no longer allowed to be a sunk cost for the business, but has to deliver innovation and, for many businesses, profit too. Flash systems are now offering significantly more IOPS than a traditional array ever could – Dell EMC, for instance, can now drop a 5RU system into your datacentre capable of delivering 10,000,000+ IOPS. To achieve ten million IOPS on a traditional spinning-disk array you’d need… well, I don’t even want to think about how many disks, rack units, racks and kilowatts of power.

The old model of backup and recovery can’t cut it in the modern environment.

The old model of backup and recovery is dead. Sort of. It’s dead as a standalone topic. When we plan or think about data protection any more, we don’t have the luxury of thinking of backup and recovery alone. We need holistic data protection strategies and a whole-of-infrastructure approach to achieving data continuity.

And that, my friends, is where Data Protection: Ensuring Data Availability was born. It’s not just backup and recovery any more. It’s not just replication and snapshots, or continuous data protection. It’s all the technology, married with business awareness, data lifecycle management and the recognition that Professor Moody in Harry Potter was right, too: “constant vigilance!”

Data Protection: Ensuring Data Availability

This isn’t a book about just backup and recovery because that’s just not enough any more. You need other data protection functions deployed holistically with a business focus and an eye on data management in order to truly have an effective data protection strategy for your business.

To give you an idea of the topics I’m covering in this book, here’s the chapter list:

  1. Introduction
  2. Contextualizing Data Protection
  3. Data Lifecycle
  4. Elements of a Protection System
  5. IT Governance and Data Protection
  6. Monitoring and Reporting
  7. Business Continuity
  8. Data Discovery
  9. Continuous Availability and Replication
  10. Snapshots
  11. Backup and Recovery
  12. The Cloud
  13. Deduplication
  14. Protecting Virtual Infrastructure
  15. Big Data
  16. Data Storage Protection
  17. Tape
  18. Converged Infrastructure
  19. Data Protection Service Catalogues
  20. Holistic Data Protection Strategies
  21. Data Recovery
  22. Choosing Protection Infrastructure
  23. The Impact of Flash on Data Protection
  24. In Closing

There’s a lot there – you’ll see the first eight chapters are not about technology, and for good reason: you must have a grasp of those non-technology aspects before you can start considering everything else; otherwise you’re just doing point-solutions, and eventually point-solutions will cost you more in time, money and risk than they give you in return.

I’m pleased to say that Data Protection: Ensuring Data Availability is released next month. You can find out more and order direct from the publisher, CRC Press, or order from Amazon, too. I hope you find it enjoyable.

Jun 11 2015
 

I’m back in a position where I’m having a lot of conversations with customers who are looking at infrastructure change.

What I’m finding remarkable in every single one of these conversations is the pervasiveness of Cloud considerations in data protection. I’m not just talking Spanning for your SaaS systems (though that gets people sitting up and taking notice every time), I’m talking about businesses that are looking towards Cloud to deal with something that has been a feature of datacentres for decades: tape.

I’ve mentioned CloudBoost before, and I personally think this is an exciting topic.


An absolute ‘classic’ model now with NetWorker is to have it coupled with Data Domain systems, with backups duplicated between the Data Domain systems and tape removed – at least for the daily and weekly backup cycle. Data Domain Extended Retention is getting a lot of interest in companies, but without a doubt there are still people who look at a transition to deduplication as a phased approach: start with short-term backups going to deduplication, and keep those legacy tape libraries around to handle tape-out for monthly backups.

That certainly has appeal for businesses that want to stretch their tape investment out for the longest possible time, especially if they have long-term backups already sitting on tape.

But every time I talk to a company about deploying Data Domain for their backups, before I even get to talk about CloudBoost and other functions, I’m getting asked: hey, can we look at moving our long-term backups to Cloud instead of tape at a later date?

You certainly can – CloudBoost is here now, and whether you’re ready to start shifting longer-term compliance style backups out to Cloud now, or in a year or two’s time, it’ll be there waiting for you.

Over time (as the surveys have shown), backup to disk has increased in NetWorker environments to over 90% use. The basic assumption for years had been that disk will kill tape. People say it every year. What I’m now seeing is that Cloud could very well be the enabler for that final death of tape. Many businesses are entirely killing tape already thanks to deduplication: I know of many who are literally pulling their tapes back from their offsite storage vendor and ingesting them into their Data Domains – particularly those with Extended Retention. But some businesses don’t want to keep all those long-term backups on local disk, deduplicated or not – they want a different economic scale, and they see Cloud as delivering that economy of scale.

I find it fascinating that I’m being so regularly asked by people: can we ditch tape and go to Cloud? That, to me, signals a seismic shift among the remaining users of tape.

[Side note: Sure, you’ll find related links from me below about tape. Just because I wrote something 1-3 years ago doesn’t mean I can’t change my opinion 🙂 ]

May 16 2015
 

Introduced alongside NetWorker 8.2 SP1 is integration with a new EMC product, CloudBoost.

The purpose of CloudBoost is to allow a NetWorker server to write deduplicated backups from its datazone out to one of a number of different types of cloud (e.g., EMC ECS Storage Service, Google Cloud Storage, Azure Cloud Storage, Amazon S3, etc.) in an efficient form.

The integration point is quite straightforward, designed to simplify the configuration within NetWorker.

A CloudBoost system is a virtual appliance that can be deployed within your VMware vSphere environment. The appliance is an “all in one” system that includes:

  • NetWorker 8.2 SP1 storage node/client software
  • CloudBoost management console
  • CloudBoost discovery service

One of the nifty functions CloudBoost performs to make deduplicated storage to the cloud efficient is splitting metadata from actual content. The metadata covers all the vital information the CloudBoost appliance needs in order to access content from the object store it maintains in the selected cloud. While the metadata is backed up to the cloud, all metadata operations happen against the local copy, significantly speeding up access and maintenance operations. (And everything written out to the cloud is encrypted with AES-256, keeping it safe from prying eyes.)
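To make that split a little more concrete, here’s a tiny conceptual sketch in Python. To be clear, this is not CloudBoost code: the chunk size, the naming scheme and the point at which encryption would be applied are all assumptions on my part, purely to illustrate the idea of bulk content going to the object store while the chunk map stays local.

    import hashlib

    CHUNK_SIZE = 4 * 1024 * 1024      # illustrative chunk size only (an assumption)

    cloud_objects = {}                # stands in for the object store in the selected cloud
    local_metadata = {}               # local copy of the chunk map; also replicated to the cloud

    def store(name, data):
        """Split content into chunks, send unique chunks to the 'cloud', keep the map locally."""
        chunk_ids = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            chunk_id = hashlib.sha256(chunk).hexdigest()        # content-addressed chunk name
            if chunk_id not in cloud_objects:                   # only chunks not already present travel
                cloud_objects[chunk_id] = chunk                 # a real appliance would AES-256 encrypt here
            chunk_ids.append(chunk_id)
        local_metadata[name] = chunk_ids                        # metadata operations stay local and fast
        cloud_objects["meta/" + name] = repr(chunk_ids).encode()  # metadata copy also protected in the cloud

    def read(name):
        """Reassemble content using the local metadata, fetching chunks from the 'cloud'."""
        return b"".join(cloud_objects[cid] for cid in local_metadata[name])

    store("clone-0001", b"long-term backup content " * 1000)
    assert read("clone-0001") == b"long-term backup content " * 1000

The takeaway is simply that lookups and housekeeping never need a round trip to the cloud, because the chunk map lives locally, while the bulk content (encrypted before it leaves) sits in the object store.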

A CloudBoost appliance can logically address 400TB of storage in the cloud. With deduplication ratios estimated at up to 4x, based on data analysis performed by EMC, that might equate to up to 1.6PB of protected data – and it can be any data that NetWorker has backed up.

Once a CloudBoost appliance has been deployed (VM provisioning and connection to a supported cloud storage system) and integrated into NetWorker as a storage node with an in-built AFTD, getting long-term data out to the cloud is as simple as executing a clone operation against the required data, with the CloudBoost storage node as the destination. And since the data is written via the CloudBoost appliance’s embedded NetWorker storage node, recovering from backups that have been sent to the cloud is as simple as running a recovery and selecting the copy held on the CloudBoost appliance.

In other words, once it’s been set up, it’s business as usual for a NetWorker administrator or operator.
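If it helps to picture the workflow, it boils down to something like the following sketch in Python. This is purely conceptual: the device names are made up and it is not how NetWorker is actually driven; it just models “clone adds a copy, recovery reads back whichever copy you select”.

    # Purely illustrative model of the clone-then-recover workflow described above.
    saveset_copies = {
        "client1:/data 2015-05-01": ["local deduplication device"],   # the original backup copy
    }

    def clone(saveset, destination):
        """Cloning adds another copy of an existing save set on the nominated destination."""
        saveset_copies[saveset].append(destination)

    def recover(saveset, preferred=None):
        """Recovery reads back whichever copy the operator selects (or the first available)."""
        copies = saveset_copies[saveset]
        source = preferred if preferred in copies else copies[0]
        return f"recovering '{saveset}' from {source}"

    # Push the long-term copy out to the cloud-backed storage node, then recover from that copy.
    clone("client1:/data 2015-05-01", "CloudBoost storage node (cloud)")
    print(recover("client1:/data 2015-05-01", preferred="CloudBoost storage node (cloud)"))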

To get a thorough understanding of how CloudBoost and NetWorker integrate, I suggest you read the Release Notes and Integration Guide (you’ll need to log into the EMC support website to view those links). Additionally, there’s an excellent overview video you can watch here:

 

May 31 2013
 


The other day I stumbled across a link to an article, Why you should stop buying servers. The title was interesting enough to grab my attention so I had a quick peruse through it. It’s an article about why you should start using the cloud rather than buying local infrastructure.

While initially I was reasonably skeptical of cloud, that view has been tempered over time. When handled correctly, cloud or cloud-like services will definitely be a part of the business landscape for some time to come. (I personally suspect we’ll see pendulum swings on cloud services in pretty much exactly the same way as we see pendulum swings on outsourcing.)

The linchpin in that statement, though, is when handled correctly; in this case, I was somewhat concerned at the table showing the merits of cloud servers vs local servers when it came to backup:

Cloud vs Local Servers

This comparison, to me, shows a key question people aren’t yet asking of cloud services companies:

Do you understand backup?

It’s not a hard question, but it does deserve hard answers.

To say that a remote snapshot of a virtual server represents an offsite backup in a single instance may be technically true (minus fine print on whether or not application/database consistent recovery can be achieved), but it’s hardly the big picture on backup policies and processes. In fact, it’s about as atomic as you can get.

I had the pleasure of working with an IaaS company last year to help formulate their backup strategy; their intent was clear: to make sure they were offering business suitable and real backup policies for potential customers. So, to be blunt: it can be done.

As someone who has worked in backup my entire professional career, the above table scares me. In a single instance it might be accurate (might); as part of a full picture, it doesn’t even scratch the surface. Perhaps what best sums up my concerns with this sort of information is this rollover at the top of the table:

Sponsor content rollover

Several years back now, I heard an outsourcer manager crowing about getting an entire outsourcing deal signed, with strict requirements for backup and penalties for non-conformance, that didn’t once mention the word recovery. It’s your data, it’s your business, and you have a right and an obligation to ask a cloud services provider:

Do you understand backup?

 

Jun 02 2012
 

Those who regularly follow my blog know that I see cloud as a great unknown when it comes to data protection. It’s still an evolving model, and many cloud vendors take the process of backup and data protection a little too cavalierly – pushing it onto the end users. Some supposedly “enterprise” vendors won’t even let you see what their data protection options are until you sign an NDA.

Recently I’ve been working with a cloud service provider to build a fairly comprehensive backup model, and it’s greatly reassuring to see companies starting to approach cloud with a sensible, responsible approach to data protection processes. It’s a good change to witness, and it’s proven to me that my key concerns with data protection in the cloud originated from poor practices. Take that problem away, and cloud data protection becomes a lot better.

Stepping back from the enterprise level, one thing I’m quite cognisant of as a “backup expert” is designing my own systems for recovery. I have a variety of backup options in use that provide local protection, but providing off-site protection is a little more challenging. Removable hard-drives stored elsewhere exist more for disaster recovery purposes – best used for data that doesn’t change frequently, or for data you don’t need to recover instantly – such as media.

Inevitably though, for personal backups that are off-site as quickly as possible, cloud represents an obvious option, so long as your link is fast enough.

Some time ago, I used Mozy, but found it somewhat unsatisfying to use. I could never quite bring myself to paying for the full service, and once they introduced their pricing changes, I was rather grateful I’d abandoned it – too pricey, and prone on the Mac at least to deciding it needed to start all backups from scratch again.

So a bit of digging around led me to Crashplan. Specifically, I chose the “CrashPlan+ Family Unlimited Monthly Subscription” option. It costs me $12 US a month – I could bring that down to an effective $6 US monthly charge by paying up-front, but I prefer the minimised regular billing option over a single, up-front hit.

Crashplan+ Family Unlimited allows me to backup as much data as I want from up to 10 computers, all tied to the same account. Since it has clients for Windows, Mac OS X, Linux and Solaris, I’m fairly covered for options. (In fact, so far I’ve only been working on getting Mac OS X clients backing up.)

On standard ADSL2, with an uplink speed currently maxing out at 600Kbps, I don’t have the luxury of backing up everything I have to a cloud provider. At last count, Darren and I have about 30TB of allocated storage at home, of which about 10TB is active storage. So, contrary to everything I talk about, I have to run an inclusive backup policy for cloud backups – I select explicitly what I want backed up.

That being said, I’ve managed in the last few months, given a host of distractions, including moving house, to push a reasonable chunk of non-recreatable data across to Crashplan:

Crashplan Report

That’s the first thing I like about Crashplan – I get a weekly report showing how much data I’m protecting, how much of it has been backed up, and what machines that data belongs to. (I like reports.)

As an aside, for the purposes of backing up over a slow link where I have to be selective, I classify data as follows:

  • Non-recreatable – Data that I can’t recreate “as is”: Email, documents, iTunes purchased music, etc.;
  • Recreatable – Data which is a distillation of other content – e.g., the movies I’ve encoded from DVD for easy access;
  • Archival – Data that I can periodically take archive copies of and have no urgent Recovery Point Objective (RPO) for – e.g., virtual machines for my lab, etc.

For both recreatable and archival content, the solution is to take what I describe as “local offsite” copies – offline copies that are not stored in my house are sufficient. However, it’s the non-recreatable content that I need to get truly offsite copies of. In this instance, it’s not just having an offsite copy that matters, but having an offsite copy that’s accessible relatively quickly from any location, should I need it. That’s where cloud backup comes in, for me.

But there’s more than weekly reports to like about Crashplan. For a start, it intelligently handles cumulative selection. That’s where I have a large directory structure where the long-term intent is to back up the entire parent directory, but I want to be able to cumulatively add content from subdirectories before switching over. For example, I have the following parent directory on my Drobo I need to protect:

  • /Volumes/Alteran/Documents

However, there’s over 200 GB of data in there, and I didn’t want a single backup to take that long to complete, so I cumulatively added:

  • /Volumes/Alteran/Documents/• Sync
  • /Volumes/Alteran/Documents/Backgrounds
  • /Volumes/Alteran/Documents/Music
  • etc

Once all of these individual subdirectory backups were complete, I could switch them off and immediately switch on /Volumes/Alteran/Documents without any penalty. This may seem like a common sense approach, but it’s not something you can assume will happen. So recently, with no net impact to the overall amount of data I was backing up, I was able to make that switch (there’s a small sketch of the idea below the screenshot):

Backup Selections
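Here’s a small, stdlib-only Python sketch of why that switch is penalty-free: if the backup engine tracks what it has already stored, changing the selection from individual subdirectories to the parent only queues files it hasn’t seen before. This is my illustration of the behaviour I observed, not CrashPlan’s actual implementation.

    already_backed_up = set()   # what the destination already holds (tracked by path here, an assumption)

    def files_under(selections, all_files):
        """Expand a set of selected directories into the files they cover."""
        return {f for f in all_files
                if any(f == s or f.startswith(s.rstrip("/") + "/") for s in selections)}

    def run_backup(selections, all_files):
        """Send only files covered by the selection that haven't been backed up before."""
        to_send = files_under(selections, all_files) - already_backed_up
        already_backed_up.update(to_send)
        return sorted(to_send)

    files = ["/Volumes/Alteran/Documents/Backgrounds/a.jpg",
             "/Volumes/Alteran/Documents/Music/b.mp3",
             "/Volumes/Alteran/Documents/notes.txt"]

    # First pass: only the subdirectories are selected, so only their files are sent.
    print(run_backup({"/Volumes/Alteran/Documents/Backgrounds",
                      "/Volumes/Alteran/Documents/Music"}, files))
    # Second pass: the selection switches to the parent, but only notes.txt is new.
    print(run_backup({"/Volumes/Alteran/Documents"}, files))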

Crashplan offers some neat additional tricks, too. For a start, if you want, you can configure Crashplan to back up to a local drive as well. Handy if you don’t have any other backup options available. (I’m not using that functionality, but between cross-machine synchronisation with archive, Time Machine and other backup options, I’m fairly covered there already.) You can also have your friends back up to you rather than to Crashplan itself – which would be useful in a household where you want all the data to go across to Crashplan from one central computer for ease of network control:

External backup options

The meat of a backup product though is being able to restore data, and Crashplan performs admirably on that front. The restore interface, while somewhat plain, is straightforward and easy to understand:

Recovery Interface

One of the things I like about the recovery interface is how it leads you from one logical step to another, as evidenced by the text directly under the main file selection box:

  1. First choose what you want to recover
  2. Optionally change what version you want to recover
  3. Optionally change the permissions for the recovered files
  4. Optionally change the folder you recover to
  5. Choose what to do with existing files

All of these are the sorts of standard questions you’d expect to deal with, but rather than being hidden in a menu somewhere, they’re out in the open, and configured as hyperlinks to immediately draw the attention of the user.

Overall I have to say I’m fairly happy with Crashplan. I trialled it first for free, then upgraded to the Family+ plan once I saw it would suit my needs. As a disclaimer, I did have one incident where I logged a support case and it took Crashplan 12 days to respond to me, which I found totally unacceptable and poor support on their behalf; but I’ll accept it was an isolated incident, on the basis of their subsequent apology and feedback from other Crashplan users via Twitter that this was a highly abnormal experience.

If you’re looking for a way of backing up your personal data where offsite and accessibility are key criteria, Crashplan is certainly a good direction to look.  While the Crashplan user interface may not be as slick looking as other applications, it works, and it leads you logically from one set of selections to the next.

[Edit, 2012-12-21]

A few months have gone by since that post, and I’m now up to over 1.5TB backed up to Crashplan across 6 computers, 2 x Linux, 4 x Macs. I remain very confident in Crashplan.

Clouds and bandaids

Nov 09 2011
 

I’m going to stir the pot a bit here and suggest that a reasonable percentage of businesses considering cloud deployments have their heads in the sand over where their problems really lie.

In short, any company that sees cloud – public cloud – as a solution to becoming more flexible and cost competitive doesn’t actually understand the real problem it’s facing.

That problem? Business/IT alignment.

One of the oft-touted advantages of public cloud is that it’s about enabling businesses to react more quickly to changing IT requirements than their own IT departments do.

How many people can honestly look at that description and think that the solution is to cut off the IT limb from the business body?

The solution – the real solution – is to address whatever divide there is between the business and IT. And have no doubt – if blame is to be laid, it will likely need to be laid in equal measure at the feet of both IT and the rest of the business.

Shifting to public cloud instead says that the business itself is unwilling to seriously look at its own function, behaviour and attitudes and address real issues. It’s the business hoping for an “abracadabra” solution rather than working on a real solution.

It’s like having an arterial bleed, and sticking a bandaid on it.

37 Signals and the “end of the IT department”

Mar 02 2011
 

The folks over at 37 Signals published a little piece of what I would have to describe as crazy fiction, about how the combination of cloud and more technically savvy users means that we’re now seeing the end of the IT department.

I thought long and hard about writing a rebuttal here, but quite frankly, their lack of logic made me too mad to publish the article on my main blog, where I try to be a little more polite.

So, if you don’t mind a few strong words and want to read a rebuttal to 37 Signals, check out my response here.

Nov 18 2009
 

So The Register has a story about how Microsoft is edging closer to delivering its cloud-based system, Azure.

It seems inept that through the entire article, there wasn’t a single mention of the Sidekick Debacle. As you may remember, that debacle was sponsored by ‘Danger’, a Microsoft subsidiary. If you think Microsoft weren’t involved because Danger was a subsidiary, think again.

If we can learn anything from this, it’s that too many people like to close one eye and half shut the other one to make sure they don’t see all those dark and dangerous storm clouds racing around their silver linings.

Based on Microsoft’s track record, I wouldn’t trust Azure for a minute with a KB of my data even if they were paying me. Not until there’s an industry-wide alliance for certifying cloud based solutions and ensuring vendors actually treat customer data as if it were their own most sensitive and important data. Not until Microsoft are a gold member of that alliance and have come out of their first two audits with shining colours.

Until then when it comes to Azure, all I see are dark Clouds with no silver linings.

Nov 02 2009
 

Over at The Register, there’s a story, “Gmail users howl over Halloween Outage“. As readers may remember, I discussed in The Scandalous Truth about Clouds that there needs to be significant improvements in the realm of visibility and accountability from Cloud vendors if it is to achieve any form of significant trust.

The fact that there was a Gmail outage for some users wasn’t what caught my attention in this article – it seems that there’s almost always some users who are experiencing problems with Google Mail. What really got my goat was this quote:

Some of the affected users say they’re actually paying to use the service. And one user says that although he represents an organization with a premier account – complete with a phone support option – no one is answering Google’s support line. Indeed, our call to Google’s support line indicates the company does not answer the phone after business hours. But the support does invite you leave a message and provide an account pin number. Google advertises 24/7 phone support for premier accounts, which cost about $50 per user per year.

Do No Evil, huh, Google? What would you call an unstaffed 24×7 support line for people who pay for 24×7 support?

It’s time for the cloud hype to be replaced by some cold hard reality checks: big corporates, no matter “how nice” they claim to be, will as a matter of indifference trample on individual end-users time and time again. Cloud is all about big corporates and individual end users. If we don’t get some industry regulation/certification/compliance soon, then as people continue to buy into the cloud hype, we’re going to keep seeing stories of data loss and data unavailability – and the frequency will continue to increase.

Shame Google, shame.

Oct 17 2009
 

Your cloud based data may be hanging by a thread and you wouldn’t even know.

Clouds: Is your data hanging by a thread?

Introduction

The recent Sidekick debacle proved one thing: it’s insufficient to “just trust” companies that are currently offering cloud based services. Instead, industry standards and regulations must be developed to permit use of the term.

I’ll be blunt: as per previous articles here, I don’t believe in “The Cloud” as a fundamental paradigm shift. In the private instance I see it as a way of charging more for delivering the same thing, and in the public instance (as exemplified by Sidekick) as something which may be fundamentally unreliable as the sole repository of data.

Regardless of that however, it’s clear that the “cloud” moniker will be around for a while, and businesses will continue to trade on providing “cloud” services (and thus being buzzword compliant). So, like it or lump it, we need to come up with some rules.

Recently SNIA has started an initiative to try to set up some standards for Cloud based activities. However, as is SNIA’s right, and their focus, this primarily looks at data management, which is less than half of the equation for public cloud services. The lion’s share of the equation for public cloud services, as proven by the Sidekick debacle, is trust.

Currently the cloud computing industry is like the wild west. Lots of people are running around promising fabulous new things that can solve any number of problems. But when those fabulous new things fail or fall over even temporarily, a lot of people can be negatively affected.

How can people trust that their cloud data is safe? Regulation is a good starting point.

If you are one of those people who, at the first hint of the word regulation, throws up their hands and says “that’s too much government intervention”, then I’d invite you to stop and think for a few minutes about the global financial crisis. If you’re one of those people who insists “industries should be self-regulating”, I’d invite you to look at a certain Microsoft subsidiary called Danger that was offering a service called Sidekick. In short, self-regulation doesn’t work without rigid transparency.

So, what needs to be done?

Well, there are three key factors that need to be addressed in order to achieve true and transparent trust within cloud based businesses. These are:

  • Foundation of ethical principles of operation
  • Periodic certified (mandatory) audit process
  • Reporting

Let’s look at each of these individually.

Ethical Principles of Operation

Whenever I start thinking about ethics in IT, I think of two different yet equally applicable sayings:

  • Common sense is not that common (usually incorrectly attributed to Voltaire)
  • When you assume you make an ass out of u and me. (Unknown source.)

Extending beyond the notion of “cloud”, we can say that companies should strive to understand the ethical requirements of data hosting, so as to ensure that whenever they hold data for and on behalf of another company or individual they:

  1. At all times aim to keep the data available within the stated availability times/percentages.
  2. At all times ensure the data is recoverable.
  3. At all times be prepared to hand over said data on request/on termination of services.

These should be self-evident: if the situation were reversed, we would expect the same thing. Companies that offer cloud services should work such ethical goals into their mission requirements and into the individual goals of every employee. (If the company offers cloud application services as well as just data services, the same applies.)

Mandatory, Periodic, Independently Certified Auditing of Compliance

In a perfect world, ethics alone would be sufficient to garner trust. However, as we all know, we need more than ethics in order to generate trust. Trust will primarily come from mandatory periodic independently certified auditing of compliance to ethical principles of cloud data storage.

What does this mean?

So let’s look at each word in that statement to understand what a company* should have to do in order to offer “cloud” data/services:

  • mandatory – it must, in order to keep referring to itself as “cloud”
  • periodic – every 6-12 months (more likely every 12 months – 6 would be preferable in the fast moving world of the internet however)
  • independently – to be done by companies or consultants who do not have any affiliation that would cause a conflict of interest
  • certified auditing – said companies or consultants doing the auditing must have certification from SNIA for following appropriate practices
  • compliance – if found to be non-compliant, SNIA (or some other designated agency) must post a warning on their web-site within 1 month of the audit, and the company be given 3 months to rectify the issue. If after 3 months they have not, then SNIA should flag them as non-compliant. This should also result in the company taking down any reference to “cloud”.

Obviously unless legally enforced, a company could choose to sidestep the entire compliancy check and just declare themselves to be cloud services regardless. Therefore there must be a “Known Compliant” list kept up to date, country-by-country, that would be advertised not only by SNIA but by actual cloud-compliant companies which partake in the process, so that end-users and businesses could reference this to determine who have exhibited certified levels of trust.

In order to achieve that certification, companies would need to be able to demonstrate to the auditor that they have:

  • Designed their systems for sufficient redundancy
  • Designed adequate backup and per-customer data recoverability options (see note below)
  • Have disaster recovery/contingency planning in place
  • Have appropriate change controls to manage updates to infrastructure or services

Note/Aside regarding adequate backup and per-customer data recoverability options. Currently this is in an entirely laughable and inappropriate state. If companies wish to offer cloud based data services, and encourage users to store their data within their environment, they must also offer backup/recovery services for that data. They may choose to make this a “local-sync” style option – keeping a replica of the cloud-data on a designated local machine for the user – or, if not done this way, they must offer a minimum level of data recoverability service to their users. For example, something even as basic as “Any file stored in our service for more than 24 hours will be recoverable for 6 weeks from time of storage.” I.e., it doesn’t necessarily have to be the same level of data recovery we expect from private enterprise networks, but it must be something.
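To show how little is actually being asked for, here’s a minimal Python sketch of that example guarantee being expressed and checked. The 24-hour and 6-week figures come straight from the example above; everything else (names, the treatment of deletion) is an assumption for illustration only.

    from datetime import datetime, timedelta

    MIN_AGE_FOR_GUARANTEE = timedelta(hours=24)   # files held longer than this are covered...
    RECOVERY_WINDOW = timedelta(weeks=6)          # ...and must be recoverable for this long after storage

    def must_be_recoverable(stored_at, deleted_at, now):
        """True if the provider is still obliged to be able to recover this file."""
        held_for = (deleted_at or now) - stored_at
        if held_for < MIN_AGE_FOR_GUARANTEE:
            return False                              # never qualified for the guarantee
        return now <= stored_at + RECOVERY_WINDOW     # still inside the recovery window

    now = datetime(2009, 10, 17)
    print(must_be_recoverable(datetime(2009, 9, 20), datetime(2009, 10, 1), now))   # True
    print(must_be_recoverable(datetime(2009, 8, 1), datetime(2009, 9, 1), now))     # False: window has lapsed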

It would be easy – and entirely inappropriate – to say that, instead of all this auditing, companies must simply publish all the above information. However, that represents a potential data security issue, and it also potentially gives away business-sensitive information, so I’m firmly against that idea. The only workable alternative to that, however, is the certified auditing process.

Reporting

Currently there is far too cavalier an approach to reporting by cloud vendors about the state of their systems. Reporting must be publicly available, fulfilling the following categories:

  1. Compliancy – companies should ensure that any statement of compliancy is up to date.
  2. Availability – companies should keep their availability percentage (e.g., “99.9% available”) publicly visible, in the way that many primary industries publish their “days without an injury” statistics (there’s a small sketch of this calculation after this list).
  3. Failures – companies must publish failure status reports/incident updates at minimum every half an hour, starting from the time of the incident and finishing after the incident is resolved. It’s important for cloud vendors to start to realise that their products may be used by anyone, anywhere in the world, so it’s not sufficient to just wake IT staff on an incident; management or other staff must be available to ensure that updates continue to be generated without requiring IT staff to stop working on resolution. I.e., round-the-clock services require round-the-clock reporting.
  4. Incident reports – all incidents that result in unavailability should have a report generated, which will be reviewed by the auditor at the next compliancy check.
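For the sake of being concrete, here’s a small Python sketch of the two calculations items 2 and 3 imply: an availability percentage fit for publication, and the half-hourly update schedule an unresolved incident should generate. It’s illustrative only, not a prescription for how any particular vendor should implement it.

    from datetime import datetime, timedelta

    def availability_percent(period_start, period_end, outages):
        """Availability = (total time - downtime) / total time, expressed as a percentage."""
        total = (period_end - period_start).total_seconds()
        down = sum((end - start).total_seconds() for start, end in outages)
        return 100.0 * (total - down) / total

    def update_times(incident_start, incident_end, interval=timedelta(minutes=30)):
        """Status updates due every half hour from incident start, plus one at resolution."""
        t, due = incident_start, []
        while t < incident_end:
            due.append(t)
            t += interval
        return due + [incident_end]

    month_start, month_end = datetime(2009, 10, 1), datetime(2009, 11, 1)
    outage = (datetime(2009, 10, 31, 9, 0), datetime(2009, 10, 31, 11, 15))
    print(round(availability_percent(month_start, month_end, [outage]), 3))   # 99.698 for this month
    print(len(update_times(*outage)))                                         # 6 updates for a 2h15m incident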

In conclusion

Does this sound like a lot of work? Well, yes.

It’s all too easy for those of us in IT to take a cavalier attitude towards user data – they should know how to back up, they should understand the risks, they should … well, you get the picture. Yes, there’s a certain level of education we would like to see in end users, but think of the flip-side. They’re not IT people. They don’t necessarily think like IT people. For the most part, they’ve been trained not to think about backup and data protection because it’s not something that’s been pushed home within the operating systems they’re using. (A trend that seems to be readily reversing in Mac OS X thanks to Time Machine.)

Ultimately, cloud failures can’t be palmed off with trite statements that users should have kept local copies of their data. Cloud services are being marketed and promoted as “data available anywhere” style systems, which creates an expectation of protection and availability.

So in short, while this is potentially a lot of work to set up, it’s necessary. It should be considered a moral imperative. In order to actually garner trust, the current wild-west approach to Clouds must be reined in and given certified processes that enable users (or at least trusted IT advisers of users) to confidently point at a service and say: “that’s been independently checked: it’s trustworthy“.

Anything short of this would be a scandalous statement about deniability, legal weaseling out of responsibility and a “screw you” attitude towards end-user data.


* Obviously some individuals, moving forward, may in various ways choose to offer cloud access. Due to hosting and bandwidth, it’s likely in most instances that such access would be as a virtual private cloud – a cloud that’s “out there” in internet land, but is available only to select users. As such, it would fall into the realm of private clouds, which will undoubtedly have a do whatever the hell you feel like doing approach. However, in the event of individuals rather than corporates specifically offering full public-cloud style access to data, there should be a moniker for “uncertified” individual cloud offerings – available only to individuals; never to corporates.