Architecture Matters: Protection in the Cloud (Part 2)

Jun 05, 2017

(This post follows on from Part 1.)

Particularly when we think of IaaS-style workloads in the Cloud, there are two key approaches that can be used for data protection.

The first is snapshots. Snapshots fulfil part of a data protection strategy, but we always need to remember that:

  • They’re an inefficient storage and retrieval model for long-term retention
  • Cloud or not, they’re still essentially on-platform

As we know – and as I cover in my book quite a bit – a real data protection strategy will be multi-layered. Snapshots can undoubtedly provide options around meeting fast RTOs and minimal RPOs, but traditional backup systems will deliver sufficient recovery granularity for protection copies stretching back weeks, months or years.

Stepping back from data protection itself – public cloud is a very different operating model to traditional in-datacentre infrastructure spending. The classic in-datacentre infrastructure procurement process is an up-front investment designed around 3- or 5-year depreciation schedules. For some businesses that may mean a literal up-front purchase to cover the entire time-frame (particularly so when infrastructure budget is only released for the initial deployment project), and for others with more fluid budget options, there’ll be an investment into infrastructure that can be expanded over the 3- or 5-year solution lifetime to meet systems growth.

Cloud – public Cloud – isn’t costed or sold that way. It’s a much smaller billing window and costing model; use a GB of RAM, pay for a GB of RAM. Use a GHz of CPU, pay for a GHz of CPU. Use a GB of storage, pay for a GB of storage. Public cloud costing models often remind me of Master of the House from Les Miserables, particularly this verse:

Charge ’em for the lice, extra for the mice
Two percent for looking in the mirror twice
Here a little slice, there a little cut
Three percent for sleeping with the window shut
When it comes to fixing prices
There are a lot of tricks I knows
How it all increases, all them bits and pieces
Jesus! It’s amazing how it grows!

Master of the House, Les Miserables.

That’s the Cloud operating model in a nutshell. Minimal (or no) up-front investment, but you pay for every scintilla of resource you use – every day or month.

If you, say, deploy a $30,000 server into your datacentre, you then get to use that as much or as little as you want, without any further costs beyond power and cooling*. With Cloud, you won’t be paying that $30,000 initial fee, but you will pay for every MHz, KB of RAM and byte of storage consumed within every billing period.
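To make that trade-off concrete, here’s a minimal break-even sketch. Every figure in it – the $30,000 server from the example above and the assumed monthly cloud bill – is a hypothetical assumption for illustration, not real pricing:

```python
# Illustrative only: compare an up-front capital purchase with a metered
# cloud bill. The numbers are assumptions; the point is the shape of the
# two cost curves.
UPFRONT_COST = 30_000.0    # one-off in-datacentre server purchase
CLOUD_MONTHLY = 700.0      # assumed monthly bill for equivalent cloud resources

def breakeven_months() -> int:
    """First month in which cumulative cloud spend exceeds the up-front cost."""
    months = 0
    while CLOUD_MONTHLY * months <= UPFRONT_COST:
        months += 1
    return months

# Under these assumptions the metered bill overtakes the $30,000 purchase
# in month 43 - inside a typical 5-year (60-month) depreciation window.
```

Shift the assumed monthly figure and the break-even point moves dramatically – which is exactly why optimising consumption matters so much in a metered model.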

If you want Cloud to be cost-effective, you have to be able to optimise – you have to effectively game the system, so to speak. Your in-Cloud services have to be maximally streamlined. We’ve become inured to resource wastage in the datacentre because resources have been cheap for a long time. RAM size/speed grows, CPU speed grows, as does the number of cores, and storage – well, storage seems to have an infinite expansion capability. Who cares if what you’re doing generates 5 TB of logs per day? Information is money, after all.

To me, this is just the next step in the somewhat lost art of programmatic optimisation. I grew up in the days of 8-bit computing**, and we knew back then that CPU, RAM and storage weren’t infinite. This didn’t end with 8-bit computing, though. When I started in IT as a Unix system administrator, swap file sizing, layout and performance was something that formed a critical aspect of your overall configuration, because if – Jupiter forbid – your system started swapping, you needed a fighting chance that the swapping wasn’t going to kill your performance. Swap file optimisation was, to use a Bianca Del Rio line, all about the goal: “Not today, Satan.”

That’s Cloud, now. But we’re not so much talking about swap files as we are resource consumption. Optimisation is critical. A failure to optimise means you’ll pay more. The only time you want to pay more is when what you’re paying for delivers a tangible, cost-recoverable benefit to the business. (I.e., it’s something you get to charge someone else for, either immediately, or later.)

[Image: Cloud Cost]

If we think about backup, it’s about getting data from location A to location B. In order to optimise it, you want to do two distinct things:

  • Minimise the number of ‘hops’ that data has to make in order to get from A to B
  • Minimise the amount of data that you need to send from A to B

If you don’t optimise those, you end up with a ‘classic’ backup architecture of the kind we relied on so heavily in the 90s and early 00s, such as:

[Image: Cloud Architecture Matters 1]

(In this case I’m looking just at backup services that land data into object storage. There are situations where you might want higher performance than what object offers, but let’s stick just with object storage for the time being.)

I don’t think this diagram is actually good at giving the full picture. There’s another way I like to draw the diagram, and it looks like this:

[Image: Cloud Architecture Matters 2]

In the Cloud, you’re going to pay for the systems you’re running for business purposes no matter what. That’s a cost you have to accept, and the goal is to ensure that whatever services or products you’re on-selling to your customers using those services will pay for the running costs in the Cloud***.

You want to ensure you can protect data in the Cloud, but sticking to architectures designed at the time of on-premises infrastructure – and physical infrastructure at that – is significantly sub-optimal.

Think of how traditional media servers (or in NetWorker parlance, storage nodes) needed to work. A media server is designed to be a high performance system that funnels data coming from clients to protection storage. If a backup architecture still heavily relies on media servers, then the cost in the Cloud is going to be higher than you need it – or want it – to be. That gets worse if a media server needs to be some sort of highly specced system encapsulating non-optimised deduplication. For instance, one of NetWorker’s competitors publishes hardware requirements for its deduplication media servers on its website, and the specifications below are taken directly from there. To work with just 200 TB of storage allocated for deduplication, a media server for that product needs:

  • 16 CPU Cores
  • 128 GB of RAM
  • 400 GB SSD for OS and applications
  • 2 TB of SSD for deduplication databases
  • 2 TB of 800+ IOPS disk (SSD recommended in some instances) for index cache

For every 200 TB. Think on that for a moment. If you’re deploying systems in the Cloud that generate a lot of data, you could very easily find yourself having to deploy multiple systems such as the above to protect those workloads, in addition to the backup server itself and the protection storage that underpins the deduplication system.
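To put those per-200 TB requirements in perspective, here’s a quick back-of-envelope sketch. The `SPEC_PER_200TB` figures simply copy the specifications quoted above; the helper function is my own illustration, not any vendor’s sizing tool:

```python
# Aggregate footprint of per-200 TB media servers at scale, using the
# competitor specifications quoted in the post.
SPEC_PER_200TB = {"cpu_cores": 16, "ram_gb": 128, "os_ssd_gb": 400,
                  "dedupe_db_ssd_tb": 2, "index_cache_tb": 2}

def media_server_footprint(total_tb: int):
    """Servers required (one per 200 TB) and their aggregate resource bill."""
    servers = -(-total_tb // 200)   # ceiling division
    return servers, {k: v * servers for k, v in SPEC_PER_200TB.items()}

servers, totals = media_server_footprint(1000)   # protect 1 PB
# 1 PB needs 5 such media servers: 80 cores, 640 GB of RAM, 10 TB of
# deduplication-database SSD - before the backup server and storage.
```

In a metered billing model, every one of those cores and gigabytes is a recurring line item, every month.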

Or, on the other hand, you could work with an efficient architecture designed to minimise the number of data hops, and minimise the amount of data transferred:

[Image: CloudBoost Workflow]

That’s NetWorker with CloudBoost. Unlike that competitor, a single CloudBoost appliance doesn’t just allow you to address 200 TB of deduplication storage, but 6 PB of logical object storage. 6 PB, not 200 TB. All that using 4-8 CPUs and 16-32 GB of RAM, and with a metadata sizing ratio of 1:2000 (i.e., every 100 GB of metadata storage allows you to address 200 TB of logical capacity). Yes, there’ll optimally be SSD for the metadata, but noticeably less than the competitor’s media server – and with a significantly greater addressable range.

NetWorker and CloudBoost can do that because the deduplication workflow has been optimised. In much the same way that NetWorker and Data Domain work together, within a CloudBoost environment, NetWorker clients will participate in the segmentation, deduplication, compression (and encryption!) of the data. That’s the first architectural advantage: rather than needing a big server to handle all the deduplication of the protection environment, a little bit of load is leveraged in each client being protected. The second architectural advantage is that the CloudBoost appliance does not pass the data through. Clients send their deduplicated, compressed and encrypted data directly to the object storage, minimising the data hops involved****.
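The client-side portion of that workflow can be sketched in a few lines. To be clear, this is a toy illustration of source-side deduplication in general – fixed-size segments and SHA-256 fingerprints – not CloudBoost’s actual segmentation, compression or encryption scheme:

```python
# Toy source-side deduplication: the client segments data, fingerprints
# each segment, and only ships segments the object store hasn't seen.
import hashlib

SEGMENT_SIZE = 4096          # fixed-size segmentation for simplicity
object_store = {}            # stand-in for cloud object storage: hash -> bytes

def backup(data: bytes):
    """Send only previously unseen segments; return (sent, skipped) counts."""
    sent = skipped = 0
    for i in range(0, len(data), SEGMENT_SIZE):
        segment = data[i:i + SEGMENT_SIZE]
        digest = hashlib.sha256(segment).hexdigest()
        if digest in object_store:
            skipped += 1     # duplicate: only a reference needs recording
        else:
            object_store[digest] = segment   # would be compressed + encrypted
            sent += 1
    return sent, skipped

sent, skipped = backup(b"A" * 8192 + b"B" * 4096)
# The two identical 4 KB "A" segments deduplicate to a single stored object.
```

Because the fingerprinting happens at the client, duplicate data never crosses the network at all – which is precisely the “minimise the amount of data sent” optimisation discussed earlier.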

To be sure, there are still going to be costs associated with running a NetWorker+CloudBoost configuration in public cloud – but that will be true of any data protection service. That’s the nature of public cloud – you use it, you pay for it. What you do get with NetWorker+CloudBoost though is one of the most streamlined and optimised public cloud backup options available. In an infrastructure model where you pay for every resource consumed, it’s imperative that the backup architecture be as resource-optimised as possible.

IaaS workloads will only continue to grow in public cloud. If your business uses NetWorker, you can take comfort in being able to still protect those workloads while they’re in public cloud, and doing it efficiently, optimised for maximum storage potential with minimised resource cost. Remember always: architecture matters, no matter where your infrastructure is.

Hey, if you found this useful, don’t forget to check out Data Protection: Ensuring Data Availability.


* Yes, I am aware there’ll be other costs beyond power and cooling when calculating a true system management price, but I’m not going to go into those for the purposes of this blog.

** Some readers of my blog may very well recall earlier computing models. But I started with a Vic-20, then the Commodore-64, and both taught me valuable lessons about what you can – and can’t – fit in memory.

*** Many a company has been burnt by failing to cost that simple factor, but in the style of Michael Ende, that is another story, for another time.

**** Linux 64-bit clients do this now. Windows 64-bit clients are supported in NetWorker 9.2, coming soon. (In the interim Windows clients work via a storage node.)

May 23, 2017

I’m going to keep this one short and sweet. In Cloud Boost vs Cloud Tier I go through a few examples of where and when you might consider using Cloud Boost instead of Cloud Tier.

One interesting thing I’m noticing of late is a variety of people talking about “VTL in the Cloud”.


I want to be perfectly blunt here: if your vendor is talking to you about “VTL in the Cloud”, they’re talking to you about transferring your workloads rather than transforming your workloads. When moving to the Cloud, about the worst thing you can do is lift and shift. Even in Infrastructure as a Service (IaaS), you need to closely consider what you’re doing to ensure you minimise the cost of running services in the Cloud.

Is your vendor talking to you about how they can run VTL in the Cloud? That’s old hat. It means they’ve lost the capacity to innovate – or at least, lost interest in it. They’re not talking to you about a modern approach, but just repeating old ways in new locations.

Is that really the best that can be done?

In a coming blog article I’ll talk about the criticality of ensuring your architecture is streamlined for running in the Cloud; in the meantime I just want to make a simple point: talking about VTL in the Cloud isn’t a “modern” discussion – in fact, it’s quite the opposite.

Jan 24, 2017

In 2013 I undertook the endeavour to revisit some of the topics from my first book, “Enterprise Systems Backup and Recovery: A Corporate Insurance Policy”, and expand it based on the changes that had happened in the industry since the publication of the original in 2008.

A lot had happened since that time. At the point I was writing my first book, deduplication was an emerging trend, but tape was still entrenched in the datacentre. While backup to disk was an increasingly common scenario, it was (for the most part) mainly used as a staging activity (“disk to disk to tape”), and backup to disk deployments used either dumb filesystems or Virtual Tape Libraries (VTLs).

The Cloud, seemingly ubiquitous now, was still emerging. Many (myself included) struggled to see how the Cloud was any different from outsourcing with a bit of someone else’s hardware thrown in. Now, the core tenets of Cloud computing that made it so popular (e.g., agility and scalability) have been well and truly adopted as essential tenets of the modern datacentre, too. Indeed, to compete against Cloud, on-premises IT has increasingly focused on delivering a private-Cloud or hybrid-Cloud experience to the business.

When I started as a Unix System Administrator in 1996, at least in Australia, SANs were relatively new. In fact, I remember around 1998 or 1999 having a couple of sales executives from this company called EMC come in to talk about their Symmetrix arrays. At the time the datacentre I worked in was mostly DAS with a little JBOD and just the start of very, very basic SANs.

When I was writing my first book the pinnacle of storage performance was the 15,000 RPM drive, and flash memory storage was something you (primarily) used in digital cameras only, with storage capacities measured in the hundreds of megabytes more than gigabytes (or now, terabytes).

When the first book was published, x86 virtualisation was well and truly growing into the datacentre, but traditional Unix platforms were still heavily used. Their decline and fall started when Oracle acquired Sun and killed low-cost Unix, with Linux and Windows gaining the ascendancy – with virtualisation a significant driving force by adding an economy of scale that couldn’t be found in the old model. (Ironically, it had been found in an older model – the mainframe. Guess what folks, mainframe won.)

When the first book was published, we were still thinking of silo-like infrastructure within IT. Networking, compute, storage, security and data protection all as separate functions – separately administered functions. But business, having spent a decade or two hammering into IT the need for governance and process, became hamstrung by IT governance and process and needed things done faster, cheaper and more efficiently. Cloud was one approach – hyperconvergence in particular was another: switch to a more commodity, unit-based approach, using software to virtualise and automate everything.

Where are we now?

Cloud. Virtualisation. Big Data. Converged and hyperconverged systems. Automation everywhere (guess what? Unix system administrators won, too). The need to drive costs down – IT is no longer allowed to be a sunk cost for the business, but has to deliver innovation and, for many businesses, profit too. Flash systems are now offering significantly more IOPS than a traditional array could – Dell EMC, for instance, can now drop a 5RU system into your datacentre capable of delivering 10,000,000+ IOPS. To achieve ten million IOPS on a traditional spinning-disk array you’d need … I don’t even want to think about how many disks, rack units, racks and kilowatts of power you’d need.

The old model of backup and recovery can’t cut it in the modern environment.

The old model of backup and recovery is dead. Sort of. It’s dead as a standalone topic. When we plan or think about data protection any more, we don’t have the luxury of thinking of backup and recovery alone. We need holistic data protection strategies and a whole-of-infrastructure approach to achieving data continuity.

And that, my friends, is where Data Protection: Ensuring Data Availability is born from. It’s not just backup and recovery any more. It’s not just replication and snapshots, or continuous data protection. It’s all the technology married with business awareness, data lifecycle management and the recognition that Professor Moody in Harry Potter was right, too: “constant vigilance!”

Data Protection: Ensuring Data Availability

This isn’t a book about just backup and recovery because that’s just not enough any more. You need other data protection functions deployed holistically with a business focus and an eye on data management in order to truly have an effective data protection strategy for your business.

To give you an idea of the topics I’m covering in this book, here’s the chapter list:

  1. Introduction
  2. Contextualizing Data Protection
  3. Data Lifecycle
  4. Elements of a Protection System
  5. IT Governance and Data Protection
  6. Monitoring and Reporting
  7. Business Continuity
  8. Data Discovery
  9. Continuous Availability and Replication
  10. Snapshots
  11. Backup and Recovery
  12. The Cloud
  13. Deduplication
  14. Protecting Virtual Infrastructure
  15. Big Data
  16. Data Storage Protection
  17. Tape
  18. Converged Infrastructure
  19. Data Protection Service Catalogues
  20. Holistic Data Protection Strategies
  21. Data Recovery
  22. Choosing Protection Infrastructure
  23. The Impact of Flash on Data Protection
  24. In Closing

There’s a lot there – you’ll see the first eight chapters are not about technology, and for a good reason: you must have a grasp on the other bits before you can start considering everything else, otherwise you’re just doing point-solutions, and eventually just doing point-solutions will cost you more in time, money and risk than they give you in return.

I’m pleased to say that Data Protection: Ensuring Data Availability is released next month. You can find out more and order direct from the publisher, CRC Press, or order from Amazon, too. I hope you find it enjoyable.

Jun 11, 2015

I’m back in a position where I’m having a lot of conversations with customers who are looking at infrastructure change.

What I’m finding remarkable in every single one of these conversations is the pervasiveness of Cloud considerations in data protection. I’m not just talking Spanning for your SaaS systems (though that gets people sitting up and taking notice every time), I’m talking about businesses that are looking towards Cloud to deal with something that has been a feature of datacentres for decades: tape.

I’ve mentioned CloudBoost before, and I personally think this is an exciting topic.


An absolute ‘classic’ model now with NetWorker is to have it coupled with Data Domain systems, with backups duplicated between the Data Domain systems and tape removed – at least for the daily and weekly backup cycle. Data Domain Extended Retention is getting a lot of interest in companies, but without a doubt there are still some people who look at a transition to deduplication as a phased approach: start with short-term backups going to deduplication, and keep those legacy tape libraries around to handle tape-out for monthly backups.

That certainly has appeal for businesses that want to stretch their tape investment out for the longest possible time, especially if they have long-term backups already sitting on tape.

But every time I talk to a company about deploying Data Domain for their backups, before I get to talk about CloudBoost and other functions, I’m getting asked: hey, can we, say, look towards moving our long-term backups to Cloud instead of tape at a later date?

You certainly can – CloudBoost is here now, and whether you’re ready to start shifting longer-term compliance-style backups out to Cloud now, or in a year or two years’ time, it’ll be there waiting for you.

Over time (as the surveys have shown), backup to disk has increased in NetWorker environments to over 90% use. The basic assumption for years had been disk will kill tape. People say it every year. What I’m now seeing is Cloud could very well be the enabler for that final death of tape. Many businesses are entirely killing tape already thanks to deduplication: I know of many who are literally pulling their tapes back from their offsite storage vendor and ingesting them back into their Data Domains – particularly those with Extended Retention. But some businesses don’t want to keep all those long-term backups on local disk, deduplicated or not – they want a different economic scale, and they see Cloud as delivering that economy of scale.

I find it fascinating that I’m being so regularly asked by people: can we ditch tape and go to Cloud? That to me is a seismic shift on the remaining users of tape.

[Side note: Sure, you’ll find related links from me below about tape. Just because I wrote something 1-3 years ago I can’t change my opinion 🙂 ]

May 16, 2015

Introduced alongside NetWorker 8.2 SP1 is integration with a new EMC product, CloudBoost.

The purpose of CloudBoost is to allow a NetWorker server to write deduplicated backups from its datazone out to one of a number of different types of cloud (e.g., EMC ECS Storage Service, Google Cloud Storage, Azure Cloud Storage, Amazon S3, etc.) in an efficient form.

The integration point is quite straightforward, designed to simplify the configuration within NetWorker.

A CloudBoost system is a virtual appliance that can be deployed within your VMware vSphere environment. The appliance is an “all in one” system that includes:

  • NetWorker 8.2 SP1 storage node/client software
  • CloudBoost management console
  • CloudBoost discovery service

One of the nifty functions that CloudBoost performs in order to make deduplicated storage to the cloud efficient is a splitting of metadata and actual content. The metadata comprises all the vital information the CloudBoost appliance needs in order to access content from the object store it places in the selected cloud. While the metadata is backed up to the cloud, all metadata operations happen against the local copy, thereby significantly speeding up access and maintenance operations. (And everything written out to cloud is done so using AES-256 encryption, keeping it safe from prying eyes.)

A CloudBoost appliance can address 400 TB of deduplicated storage in the cloud. With deduplication ratios estimated at up to 4x (based on data analysis performed by EMC), that might equate to up to 1.6 PB of logical, protected data – and it can be any data that NetWorker has backed up.
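As a quick sanity check of those numbers – remembering the 4x figure is an estimate, and real-world deduplication ratios depend entirely on the data being protected:

```python
# Logical (protected) capacity implied by addressable storage and an
# assumed deduplication ratio. The 4x ratio is EMC's estimate, not a
# guarantee.
ADDRESSABLE_TB = 400
DEDUPE_RATIO = 4

def logical_capacity_pb(addressable_tb: float, ratio: float) -> float:
    """Logical capacity in PB for a given deduplication ratio."""
    return addressable_tb * ratio / 1000   # using 1000 TB = 1 PB

# 400 TB addressable at 4x -> 1.6 PB of protected data.
```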

Once a CloudBoost appliance has been deployed (VM provisioning, plus connection to a supported cloud storage system) and integrated into NetWorker as a storage node with an in-built AFTD, getting long-term data out to the cloud is as simple as executing a clone operation against the required data, with the CloudBoost storage node as the destination. Since the data is written via the CloudBoost embedded NetWorker storage node, recovery from backups that have been sent to the cloud is equally simple: execute a recovery and select the copy on the CloudBoost appliance.

In other words, once it’s been set up, it’s business as usual for a NetWorker administrator or operator.

To get a thorough understanding of how CloudBoost and NetWorker integrate, I suggest you read the Release Notes and Integration Guide (you’ll need to log into the EMC support website to view those links). Additionally, there’s an excellent overview video you can watch here:


May 31, 2013


The other day I stumbled across a link to an article, Why you should stop buying servers. The title was interesting enough to grab my attention so I had a quick peruse through it. It’s an article about why you should start using the cloud rather than buying local infrastructure.

While initially I was reasonably skeptical of cloud, that view has been tempered over time. When handled correctly, cloud or cloud-like services will definitely be a part of the business landscape for some time to come. (I personally suspect we’ll see pendulum swings on cloud services in pretty much exactly the same way as we see pendulum swings on outsourcing.)

The linchpin in that statement above though is when handled correctly; in this case, I was somewhat concerned at the table showing the merits of cloud servers vs local servers when it came to backup:

[Image: Cloud vs Local Servers]

This comparison to me shows a key question people aren’t yet asking of cloud services companies:

Do you understand backup?

It’s not a hard question, but it does deserve hard answers.

To say that a remote snapshot of a virtual server represents an offsite backup in a single instance may be technically true (minus fine print on whether or not application/database consistent recovery can be achieved), but it’s hardly the big picture on backup policies and processes. In fact, it’s about as atomic as you can get.

I had the pleasure of working with an IaaS company last year to help formulate their backup strategy; their intent was clear: to make sure they were offering business suitable and real backup policies for potential customers. So, to be blunt: it can be done.

As someone who has worked in backup my entire professional career, the above table scares me. In a single instance it might be accurate (might); as part of a full picture, it doesn’t even scratch the surface. Perhaps what best sums up my concerns with this sort of information is this rollover at the top of the table:

[Image: Sponsor content rollover]

Several years back now, I heard an outsourcer manager crowing about getting an entire outsourcing deal signed, with strict requirements for backup and penalties for non-conformance, that didn’t once mention the word recovery. It’s your data, it’s your business, and you have a right and an obligation to ask a cloud services provider:

Do you understand backup?


Jun 02, 2012

Those who regularly follow my blog know that I see cloud as a great unknown when it comes to data protection. It’s still an evolving model, and many cloud vendors take the process of backup and data protection a little too cavalierly – pushing it onto the end users. Some supposedly “enterprise” vendors won’t even let you see what their data protection options are until you sign an NDA.

Recently I’ve been working with a cloud service provider to build a fairly comprehensive backup model, and it’s greatly reassuring to see companies starting to approach cloud with a sensible, responsible approach to data protection processes. It’s a good change to witness, and it’s proven to me that my key concerns with data protection in the cloud originated from poor practices. Take that problem away, and cloud data protection becomes a lot better.

Stepping back from the enterprise level, one thing I’m quite cognisant of as a “backup expert” is designing my own systems for recovery. I have a variety of backup options in use that provide local protection, but providing off-site protection is a little more challenging. Removable hard-drives stored elsewhere exist more for disaster recovery purposes – best used for data that doesn’t change frequently, or for data you don’t need to recover instantly – such as media.

Inevitably though, for personal backups that are off-site as quickly as possible, cloud represents an obvious option, so long as your link is fast enough.

Some time ago, I used Mozy, but found it somewhat unsatisfying to use. I could never quite bring myself to paying for the full service, and once they introduced their pricing changes I was rather grateful I’d abandoned it – too pricey, and prone (on the Mac at least) to deciding it needed to start all backups from scratch again.

So a bit of digging around led me to Crashplan. Specifically, I chose the “CrashPlan+ Family Unlimited Monthly Subscription” option. It costs me $12 US a month – I could bring that down to an effective $6 US monthly charge by paying up-front, but I prefer the minimised regular billing option over a single, up-front hit.

Crashplan+ Family Unlimited allows me to backup as much data as I want from up to 10 computers, all tied to the same account. Since it has clients for Windows, Mac OS X, Linux and Solaris, I’m fairly covered for options. (In fact, so far I’ve only been working on getting Mac OS X clients backing up.)

On standard ADSL2, with an uplink speed currently maxing out at 600Kbps, I don’t have the luxury of backing up everything I have to a cloud provider. At last count, Darren and I have about 30TB of allocated storage at home, of which about 10TB is active storage. So, contrary to everything I talk about, I have to run an inclusive backup policy for cloud backups – I select explicitly what I want backed up.

That being said, I’ve managed in the last few months, given a host of distractions, including moving house, to push a reasonable chunk of non-recreatable data across to Crashplan:

[Image: Crashplan Report]

That’s the first thing I like about Crashplan – I get a weekly report showing how much data I’m protecting, how much of it has been backed up, and what machines that data belongs to. (I like reports.)

As an aside, for the purposes of backing up over a slow link where I have to be selective, I classify data as follows:

  • Non-recreatable – Data that I can’t recreate “as is”: Email, documents, iTunes purchased music, etc.;
  • Recreatable – Data which is a distillation of other content – e.g., the movies I’ve encoded from DVD for easy access;
  • Archival – Data that I can periodically take archive copies of and have no urgent Recovery Point Objective (RPO) for – e.g., virtual machines for my lab, etc.

For both recreatable and archival content, the solution is to take what I describe as “local offsite” copies – offline copies that are not stored in my house are sufficient. However, it’s the non-recreatable content that I need to get truly offsite copies of. In this instance, it’s not just having an offsite copy that matters, but having an offsite copy that’s accessible relatively quickly from any location, should I need. That’s where cloud backup comes in, for me.

But there’s more than weekly reports to like about Crashplan. For a start, it intelligently handles cumulative selection. That’s where I have a large directory structure where the long-term intent is to backup the entire parent directory, but I want to be able to cumulatively add content from subdirectories before switching over. For example, I have the following parent directory on my Drobo I need to protect:

  • /Volumes/Alteran/Documents

However, there’s over 200 GB of data in there, and I didn’t want a single backup to take that long to complete, so I cumulatively added:

  • /Volumes/Alteran/Documents/• Sync
  • /Volumes/Alteran/Documents/Backgrounds
  • /Volumes/Alteran/Documents/Music
  • etc

Once all of these individual subdirectories backups were complete, I could switch them off and immediately switch on /Volumes/Alteran/Documents without any penalty. This may seem like a common sense approach, but it’s not something you can assume to happen. So recently, with no net impact to the overall amount of data I was backing up, I was able to make that switch:

[Image: Backup Selections]

Crashplan offers some neat additional tricks, too. For a start, if you want, you can configure Crashplan to back up to a local drive, too. Handy if you don’t have any other backup options available. (I’m not using that functionality, but between cross-machine synchronisation with archive, Time Machine and other backup options, I’m fairly covered there already.) You can also have your friends back up to you rather than to Crashplan – which would be useful in a household where you want all the data to go across to Crashplan from one central computer for ease of network control:

[Image: External backup options]

The meat of a backup product though is being able to restore data, and Crashplan performs admirably on that front. The restore interface, while somewhat plain, is straightforward and easy to understand:

[Image: Recovery Interface]

One of the things I like about the recovery interface is how it leads you from one logical step to another, as evidenced by the text directly under the main file selection box:

  1. First choose what you want to recover
  2. Optionally change what version you want to recover
  3. Optionally change the permissions for the recovered files
  4. Optionally change the folder you recover to
  5. Choose what to do with existing files

All of these are the sorts of standard questions you’d expect to deal with, but rather than being hidden in a menu somewhere, they’re out in the open, presented as hyperlinks that immediately draw the user’s attention.

Overall I have to say I’m fairly happy with Crashplan. I trialled it for free first, then upgraded to the Family+ plan once I saw it would suit my needs. As a disclaimer, I did have one incident where I logged a support case and it took Crashplan 12 days to respond – which I found totally unacceptable, and poor support on their behalf – but I’ll accept it was an isolated incident on the basis of their subsequent apology, and feedback from other Crashplan users via Twitter that this was a highly abnormal experience.

If you’re looking for a way of backing up your personal data where offsite storage and accessibility are key criteria, Crashplan is certainly a good direction to look. While the Crashplan user interface may not be as slick-looking as other applications’, it works, and it leads you logically from one set of selections to the next.

[Edit, 2012-12-21]

A few months have gone by since that post, and I’m now up to over 1.5 TB backed up to Crashplan across 6 computers (2 x Linux, 4 x Mac). I remain very confident in Crashplan.

Clouds and bandaids

 General Technology, General thoughts  Comments Off on Clouds and bandaids
Nov 092011

I’m going to stir the pot a bit here and suggest that a reasonable percentage of businesses considering cloud deployments have their heads in the sand over where their problems really lie.

In short, any company that sees cloud – public cloud – as the solution to becoming more flexible and cost-competitive doesn’t actually understand the real problem it’s facing.

That problem? Business/IT alignment.

One of the oft-touted advantages of public cloud is that it enables businesses to react more quickly to changing IT requirements than their own IT departments can.

How many people can honestly look at that description and think that the solution is to cut off the IT limb from the business body?

The solution – the real solution – is to address whatever divide there is between the business and IT. And have no doubt: if blame is to be laid, it will likely need to be laid in equal measure at the feet of both IT and the rest of the business.

Shifting to public cloud instead says that the business itself is unwilling to seriously look at its own function, behaviour and attitudes and address real issues. It’s the business hoping for an “abracadabra” solution rather than working on a real solution.

It’s like having an arterial bleed, and sticking a bandaid on it.

37 Signals and the “end of the IT department”

 Architecture, Aside, Data loss, General Technology, General thoughts  Comments Off on 37 Signals and the “end of the IT department”
Mar 022011

The folks over at 37 Signals published a little piece of what I would have to describe as crazy fiction, about how the combination of cloud and more technically savvy users means that we’re now seeing the end of the IT department.

I thought long and hard about writing a rebuttal here, but quite frankly, their lack of logic made me too mad to publish the article on my main blog, where I try to be a little more polite.

So, if you don’t mind a few strong words and want to read a rebuttal to 37 Signals, check out my response here.

Nov 182009

So The Register has a story about how Microsoft is edging closer to delivering its cloud-based system, Azure.

It seems inept that, through the entire article, there wasn’t a single mention of the Sidekick debacle. As you may remember, that debacle was sponsored by ‘Danger’, a Microsoft subsidiary. If you think Microsoft weren’t involved because Danger was a subsidiary, think again.

If we can learn anything from this, it’s that too many people like to close one eye and half shut the other one to make sure they don’t see all those dark and dangerous storm clouds racing around their silver linings.

Based on Microsoft’s track record, I wouldn’t trust Azure with a KB of my data for a minute, even if they were paying me. Not until there’s an industry-wide alliance for certifying cloud-based solutions and ensuring vendors actually treat customer data as if it were their own most sensitive and important data. Not until Microsoft are a gold member of that alliance and have come out of their first two audits with flying colours.

Until then when it comes to Azure, all I see are dark Clouds with no silver linings.