Earlier in the year, I wrote a post, “Technology is rarely the issue”. In that post, I said:

As techos though, let’s be honest. The technology is rarely the issue. Or to be more accurate, if there’s an issue, technology is the tip of the iceberg – the visible tip. And using the iceberg analogy, you know I mean that technology is rarely going to be the majority of the issue.

Now it’s time for the follow-up.

In that article, I was effectively talking about specific situations – e.g., when someone says to you, “product X is crap; we’ve had it for Y months and it still doesn’t work properly”. While sometimes it will mean that product X is bad, it usually means that the wrong product was purchased, or there wasn’t enough training, or it’s being misused.

In this post, I want to turn from the specific, to the generic, and suggest that it is rarely going to be the case moving forward that technology is the solution. It will be part of the solution, but we are moving out of a situation where a single piece of technology is the entire solution to a problem. In fact I’d suggest that it was rarely the case anyway, but we must be more aware as technology continues to get more powerful – it’s not a magic bullet.

For this reason, the core emerging technology – that we must continue to demand from vendors, and continue to support the development of – is interoperability. This may come from open standards, or it may come from virtualisation, but it has to become core to all future technology.

Why?

We no longer have the luxury of mass swapping and changing of technology. Martin Glassborow, AKA Storagebod, wrote in “Migration is a way of life”:

One of the things which is daunting is the sheer amount of data that we are beginning to ingest and the fact that we are currently looking at a ‘grow forever’ archive; everything we ingest, we will keep forever.

Even though we are less than two years into the project, we are already thinking about the next refresh of technology. And what is really daunting is that with our data growth; once we start refreshing, I suspect that we will never stop.

Not only will we be storing petabytes of new content every year; we will be moving even more old content between technologies every year. We are already looking at moving many hundreds of terabytes into the full production system without impacting operations and with little to no downtime.

While Martin’s organisation is undoubtedly at the “big data” end of town, it reflects a growing problem for many organisations – the shrinking grace period. Previously we had scenarios where capital expenditure periods of, say, 3 years’ worth of equipment purchase would have short implementation periods, followed by long-term controlled and pre-allocated growth periods, followed by the final preparatory process leading into the next CapEx cycle.

This is increasingly becoming a luxury. As data growth continues, regardless of whether that data is hosted locally or externally, mass migration projects will become a thing of the past. It’s not possible to stop a business long enough to do a migration. Migrations have to run seamlessly and synchronously in the background, transparent to users and the business, and the only way this will happen is via interoperability.

The two methods to achieve this are compatible APIs/protocols and virtualisation. In the cloud space for instance, whatever brick level storage is chosen, only a fool would deploy their business storage on just a single cloud provider. So you need two different providers, and you need to be able to interface with the same storage at both providers without every access step being an “If writing to Cloud X, this way, else that way.”
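To make the “no if-Cloud-X-else” idea a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: ObjectStore, InMemoryStore and MirroredStore are names I’ve made up for illustration, not any provider’s API, and the in-memory class simply stands in for whatever adapter you’d write per provider. The point is only that the calling code talks to one interface and never branches on the provider.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Hypothetical provider-neutral interface (not a real vendor API)."""

    @abstractmethod
    def put(self, key, data):
        """Store the bytes in `data` under `key`."""

    @abstractmethod
    def get(self, key):
        """Return the bytes stored under `key`, or raise KeyError."""


class InMemoryStore(ObjectStore):
    """Stand-in for a real per-provider adapter, so the sketch runs anywhere."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


class MirroredStore(ObjectStore):
    """Writes to every provider; reads from the first one that has the key."""

    def __init__(self, *stores):
        if not stores:
            raise ValueError("at least one store is required")
        self._stores = stores

    def put(self, key, data):
        for store in self._stores:
            store.put(key, data)

    def get(self, key):
        for store in self._stores:
            try:
                return store.get(key)
            except KeyError:
                continue  # try the next provider
        raise KeyError(key)


# The business-facing code only ever sees `storage`.
cloud_x = InMemoryStore("cloud-x")
cloud_y = InMemoryStore("cloud-y")
storage = MirroredStore(cloud_x, cloud_y)
storage.put("invoices/2011-06.pdf", b"example bytes")
```

Swapping provider Y for provider Z then means writing one new adapter class, not touching every place the business reads or writes data.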

For locally accessible storage, virtualisation is critical – not just at the OS layer, but also at the storage layer. That way, it doesn’t matter whether you’re currently buying vendor X, Y or Z arrays and storage – and which ones are currently active. It should all be transparent to the business.

This is why technology is not the solution. Or rather, specific technology is not the solution. It’s the application of technology, and the interoperability of currently deployed technologies that will be the solution every time.

If you’re not thinking along these lines, you’re still staring into the past.

 

How often have you heard these two memes?

“Tape Sucks”

“Tape is dead”

Oh it just goes on and on, and on and on and on. One might think that I’m having a dig at EMC and Data Domain here – particularly in light of my response on another topic’s comment thread here. And while some folk at EMC and Data Domain would technically be in my sights on this post, there’ll equally be folks from NetApp and a plethora of other vendors who think that tape is dead. So I’m not so much picking on any company, just the meme itself.

It’s the same story, over and over again. Some new whiz-bang product comes out, and people jump onto the “tape is dead” bandwagon again. Only like a really bad villain in a superhero movie, tape just won’t die. It has more lives than every cat in the world combined.

Sure, its use has evolved over time. I’m the first to admit that. When I first started in backup myself, the notion of backing up to disk was a complete anathema. After all, I had to beg, borrow, plead and promise long on-call shifts just to get a couple of extra 2GB spindles for my backup server to handle indices and temp space. Why would I have been so crazy as to backup to such an expensive medium? Tape, on the other hand, was much cheaper.

Over time disk became cheaper and had higher capacities, but it still isn’t as cheap or as high capacity as tape over the long haul. Where it exceeds tape every time is on the economics of access. You need that data back straight away? Then it needs to come back from disk, not tape. There’s no load times, etc., when it comes to disk.

And so over time as disk became cheaper, we (the industry) evolved backups to use tape as secondary, long term or high capacity storage. Backup to disk, keep the most frequently recovered backups on that medium (i.e., the most recent), and keep copies on tape. As space fills, we shift those older backups off to tape, and keep using disk for the high frequency recoveries. Disk also smooths out those pesky shoe-shining issues we see in highly varied streaming speeds to tape, too.
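As a rough illustration of that staging behaviour, here’s a small Python sketch. It is not how any particular backup product does it; the Backup record, the stage_to_tape function and the 80% high watermark are all assumptions of mine, chosen just to show “oldest goes to tape first once disk fills”.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    name: str
    age_days: int
    size_gb: float


def stage_to_tape(on_disk, disk_capacity_gb, high_watermark=0.8):
    """Split backups into (keep_on_disk, move_to_tape).

    The oldest backups are moved first, until disk usage is at or
    below the (assumed) high watermark.
    """
    limit = disk_capacity_gb * high_watermark
    keep = sorted(on_disk, key=lambda b: b.age_days)  # newest first
    used = sum(b.size_gb for b in keep)
    to_tape = []
    while keep and used > limit:
        oldest = keep.pop()  # last item is the oldest
        used -= oldest.size_gb
        to_tape.append(oldest)
    return keep, to_tape


backups = [
    Backup("friday", age_days=4, size_gb=300),
    Backup("saturday", age_days=3, size_gb=300),
    Backup("sunday", age_days=2, size_gb=300),
    Backup("monday", age_days=1, size_gb=300),
]
keep, to_tape = stage_to_tape(backups, disk_capacity_gb=1000)
# keep holds monday and sunday; to_tape holds friday and saturday
```

The recent backups, which are the ones most likely to be asked for, stay on the fast medium; everything else lands on the cheap one.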

So it’s a win-win solution, and it’s going to stay that way for some time to come. Tape may have evolved, but it’s still better, cheaper, and more reliable for longer term storage. Curtis Preston has an excellent summary of this point here, for what it’s worth.

Will the “tape is dead” people come around to reality? Probably not. Adherents to the repeated meme don’t always give up so easily. After all, there’s even people who still believe in a flat earth.

 

If you’re not familiar with the term jumping the shark, you might want to read up on the history of it over at Wikipedia. The basic premise though comes direct from the decline and fall of Happy Days, and is summed up as:

Jumping the shark is a widely used idiom, first employed to describe a moment in the evolution of a television show, characterized by absurdity, when a particular show abandons its core premises and begins a decline in quality that is beyond recovery.

Now, I’ve worked for companies that have had partnerships with EMC for the last 11 years, and you can take what I’m about to say with whatever grain of salt or assumption of bias, but I’d like to think that I’m actually speaking more from the Australian perspective of not tooting one’s horn in a way that goes completely overboard.

It started with a tweet by John Martin, aka @life_no_borders:

When a customer says “NetApp is contributing more auditable financial benefit than every other technology vendor combined” it makes me proud

If you’re not familiar with him, John is a senior NetApp employee in Australia. (Obviously John, as an Australian, had a different perspective here to me.)

Once this was Tweeted by John it was picked up and retweeted by a couple of other NetApp twitterers, and I want to make myself 100% clear here:

I am not, in any way, questioning what John has said. A customer may very well have said that. That’s not my point.

My point is that the statement jars. It’s odd. It’s jarring – there’s some compliments that … let’s just say … come out wrong. Like the time someone told one of the directors at my previous company, “You should bottle Preston’s blood.” Sure, there’s a compliment, but sometimes – particularly in the IT world – an unbelievably good compliment just comes across as jumping the shark.

I wasn’t the only person who thought this. Matt Davis aka @da5is, an extremely good techo in the storage industry, tweeted:

“one vendor saved me more money than all others” is either fanboism or incompetence.

Now, to be accused of fanboism in say, the competition between Apple, Microsoft and Google is a daily event for some people. I know people call me an Appletard and an Apple Fanboi regularly, and I’ve argued that this isn’t the case. And despite their opinions of me on that front, I value their insight.

However raising it against a person or a company on the storage front isn’t something you regularly hear, yet it comes close to explaining that jarring sensation when reading John’s original quote.

John replied to Matt at this point:

@da5is Neither fanboi or incompetence, just brilliant execution from a NetApp reseller and services team delivering on their promises.

Phil Jaenke aka @rootwyrm also weighed in helping to highlight the silliness of the entire thing:

@life_no_borders @da5is No, it really isn’t. No vendor would sit down and take it, and I know two that would cut off NetApp’s legs to win.

So what’s the story here? Did NetApp earn such a quote from a customer? Very likely – and very likely that every other vendor out there and any decent systems integrator worth their salt out there has also received a similar compliment at some point or another.

Accurate customer quote or not, I just don’t see an average manager looking at the quote and saying “OK, you’ve sold me”. If I were a betting man, I’d probably look at the average Australian manager and lay $10 that they’d think something along the lines of: “NetApp, did you just jump the shark?”

NetApp Jumps the Shark

What do you think? Did they jump the shark?

 

I periodically see indignant tweets and comments by people that if you sell something to a client, then you’re at worst being unethical, or at best being idiotic to say that you like to consider customer relations as partnerships.

This has reached the point where I’ll no longer sit back and listen to cynics who think that as soon as you start selling you either cease being human, or cease being able to think symbiotically.

Insisting that companies cannot, and should not, refer to clients as partners, is at worst toxic and at best, demeaning to all parties.

Now, I’m not going to say that there aren’t instances where some companies jump on the bandwagon and like to insinuate a partnership but stick to a traditional “stick whatever badge you need on that widget to sell it” sales approach. Of course that is going to happen.

But to tar all companies that sell, or integrators with that brush? Pah! Think again.

I’ve worked in some form of consulting pretty much all my career. I started as a trainee consultant, and when that programme was dying I transferred across to a Unix system administration team. Even as an “end customer” I still had my own customers, and as the company I was working for started taking on outsourcing contracts, I started being a consultant again. That was followed by a brief stint in the less than compatible world of finance, and since then I’ve remained in consulting.

Consult! Consult! Consult!

Consulting, systems integration, however you want to think about it, does not work well when customers are treated as meat – as paying clients to service the next bill. That leads to a succession of one-off engagements and implementations. Rape a company of budget, move on to the next and pillage that, too. It’s not a sustainable model. Or rather, unless you’re a global company and trade on some pre-established name, that model doesn’t get you very far. Pretty soon you get a crap name in the market and you start driving yourself out of business. You’ll blame the technology you’re using, and switch to another product, or another vendor, exhaust a new set of customers, and move on again.

There’s only one sustainable model in consulting and systems integration, and that’s the model where you engage with clients in a partnership. I’m not talking about looking for joint ventures; I’m talking about basic recognition of fundamental business cooperation, viz:

  • I want to help you succeed at what you do;
  • If you succeed at what you do, you’ll be able to help me succeed by buying things from me.

Symbiotic? Or parasitic? A cynic would say parasitic, and they’d be wrong. Or they’d come from the “everything should be free except for what I do” school of business. You know – the people who think that the only company entitled to put markup on a widget, or make a profit, is themselves.

It’s actually a symbiotic relationship, because it recognises that a relationship can actually be of mutual benefit to both parties. It doesn’t have to be about one “winning” and one “losing”, or “one making money” and “one spending money”.

The absolute basis of my belief in this is covered in my “13 traits of a great consultant” post. In particular, point 11 sums up exactly why a customer/client relationship should become a partnership:

Solve the problem, don’t answer the question – From an IT perspective, I use this example: an engineer, if asked a question by a customer, will do his or her utmost to answer the question as exactingly as possible. A consultant will look past the direct question and aim to solve the problem that led the customer to ask the question. Or in other words: if it doesn’t have a yes/no answer, no question is asked in isolation.

If you just have a customer/client relationship, then all you get is an engineering relationship. “Yes, we can sell you widget X. What, you thought widget X did Y? But you didn’t ask? Thank you for shopping, no refunds!” Do you really want that sort of relationship? Going down that path, you get a plethora of situations where technology is blamed for non-technical issues – and indeed, it happens at both the client and the sales side.

Form a symbiotic partnership though, and the relationship is far more wholesome and useful. From the sales side of it, satisfied customers whom you consistently deliver expected results to are repeat customers; repeat customers form the basis of predictable sales and earnings, and as time goes on provide valuable feedback to your growth as a company, too. From the client side, you get solutions that are tailored to your needs by people who you know and trust – and you know and trust them because they’re very much aware of your business requirements, constraints and operational models. A partner in fact will be able to help you through the rougher times – regardless of whether that’s unexpected staff changes without handover, or simply when needing a leaner approach that sacrifices scope only, rather than quality and scope. A partner will have the experience of working within your organisation and be able to deliver faster, more efficiently, and with less impact to your operational processes.

So, the next time someone suggests to you that you can’t have a partnership in a sales/client model, or that consultants/system integrators can’t form symbiotic relationships with your business, consider this one question:

Do you want a supplier you can trust, or a box dropper?

Rarely, if ever, will the answer be the latter.

 

Martin Glassborow, aka @storagebod, and I had a bit of a discussion via Twitter, which came down to the following:

  • Martin feels the default backup policy within an environment should be to backup nothing;
  • I feel the default backup policy within an environment should be to backup everything.

Now the interesting thing is, we both actually meet in the middle, but just start from different points.

Martin has discussed his reasoning behind his default policy here, in “Don’t BackUp“, which I encourage you to read before continuing. There is, indeed, as Martin suggested in a tweet to me last night, a nice absolutism in either approach – don’t backup, or backup everything. Yet, neither is really the case.

My approach – that being to start with “backup everything” – starts with the following assumptions:

  1. Hardware can fail.
  2. Software can fail.
  3. Humans can make errors.
  4. Processes can fail.

By my very nature I think I’m perfectly suited to working in the backup space. I’ve always been into backup. On the Vic-20, when I was learning to program, I’d always save my programs onto two different tapes. On the Commodore 64, I’d always save my programs and documents onto two different disks. When I went to the PC, I’d always have a copy on a hard drive, and a copy on a floppy drive.

Martin’s approach is this:

Making it policy that nothing gets backed-up unless requested takes out all ambiguity. There can be no assumptions about what is being backed-up, it makes it someone’s responsibility as opposed to an assumed default.

There is, undoubtedly, logic in what Martin suggests, but it’s not a logical starting point I can personally reconcile myself with, for the fundamental reason that it (IMHO) assumes that everyone who interacts with the system understands the system and the nature of their interaction.

It in fact runs completely contrary to an axiom in user desktop/laptop backup approaches – if you leave backups up to the users, nothing will get backed up. That holds true for pretty much every business I’ve ever interacted with, from the most, to the least technical.

It’s for that reason, that lack of total systems awareness and data responsibility from all users of any environment, that my approach starts from the other end. Backup everything.

But I don’t really mean it. I abhor wastage. Recently, I’ve learnt that wastage comes in many forms, which is why the decision to move interstate and re-evaluate what I/we own has been cleansing. (See the article “deconstruction of falling stars” over at my personal blog for a bit more on that front.)

As I abhor wastage, I don’t actually believe you should backup everything within your environment. Sure, some vendors might like that notion – infinite tapes, disk, storage, snapshots, you name it. But it’s neither practical nor commercial reality to do this.

No, there is a middle ground. For me, the sweet spot is what I always come back to:

It is always better to backup a little more than you need, and waste some storage media, than it is to not backup quite enough, and be unable to recover.

So if your tape usage is say, 5-10% higher than it should be, or your VTL/B2D environment is 5-10% bigger than it really needs to be, I’m not concerned. (If it’s a crazy amount, like 100% more, then there’s a problem – a serious problem that has arisen from a lack of capacity planning, etc.)

I’ve seen IT sites where NetWorker agents have been deployed on every server within the environment, and when I’ve done a coverage analysis, I’ve seen servers that have this as the saveset:

/etc/hosts

Just that. Nothing more, nothing less. (You couldn’t get much less anyway.) I’ve equally seen sites where not only was a hot backup done of the production Oracle database via a module, but the database files were backed up as part of the filesystem backup, and then export/dumps were generated and backed up as well. Overkill? Yes. Was everything recoverable? Yes.

Both are very clear examples of wastage, but I’ll tell you the difference.

The latter one – backing up too much – is time and money wastage. Neither is pleasant, both can hurt the bottom line of a company, yet that’s where it stops.

The former – backing up only what is explicitly requested, nothing more – is corporate wastage. There’s a little bit of monetary wastage involved (why spend the money on an agent to backup a single file?) – the real wastage though is that it could waste the company. Unable to recover legally required files because someone forgot to request them to be backed up? Hello, lawsuit loss. Unable to recover financial data that proves your company has correctly paid its taxes because someone forgot to request it to be backed up? Hello, double tax payments. For me it triggers thoughts of every possible nightmare scenario a company might experience, right through to total dissolution and loss of the company itself.

In my book, I make the differentiation between what I call inclusive and exclusive backup products. I define:

  • An inclusive backup product is one where you have to explicitly specify what gets backed up. By default, nothing is backed up unless you specify it.
  • An exclusive backup product is one where you have to explicitly specify what doesn’t get backed up. By default, everything is selected and you have to winnow that selection down yourself.

The first, I consider to be the hallmark of a workgroup backup product approach. Cost reduction is the primary focus of this approach. The second, I consider to be a fundamental requirement for a product to earn the “enterprise backup product” badge of honour. Without this, there is a distinct lack of trust.
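The difference between the two defaults is easy to see in a few lines of Python. This is purely illustrative: the function names, paths and patterns below are made up, and real products express selections in their own configuration syntax rather than like this.

```python
from fnmatch import fnmatchcase


def inclusive_selection(all_paths, include_patterns):
    """Inclusive product: nothing is backed up unless a pattern names it."""
    return [
        p for p in all_paths
        if any(fnmatchcase(p, pat) for pat in include_patterns)
    ]


def exclusive_selection(all_paths, exclude_patterns):
    """Exclusive product: everything is backed up unless a pattern excludes it."""
    return [
        p for p in all_paths
        if not any(fnmatchcase(p, pat) for pat in exclude_patterns)
    ]


# Hypothetical server contents; the last path is data nobody told the
# backup administrator about.
paths = [
    "/etc/hosts",
    "/home/finance/ledger.db",
    "/tmp/scratch.dat",
    "/var/app/new-data.db",
]

requested_only = inclusive_selection(paths, ["/etc/hosts"])
# requested_only == ["/etc/hosts"]

everything_but_tmp = exclusive_selection(paths, ["/tmp/*"])
# everything_but_tmp == ["/etc/hosts", "/home/finance/ledger.db",
#                        "/var/app/new-data.db"]
```

With the inclusive default, the new database silently goes unprotected until somebody remembers to ask. With the exclusive default, it is protected from day one, and the only cost of forgetting is a bit of wasted media.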

While I can understand Martin’s starting point, and that he moves more to the middle of making sure the right things are backed up, I can’t agree that this is the best approach.

I’ve seen, heard of, and witnessed too many IT war stories.

 

One of the stories I sometimes hear from companies is that some technology X doesn’t work in their environment because X sucks, or X is broken, or X … well, you get the picture.

Years ago, when I first got into backup, the main reasons I had to do recovery were due to system or hardware failures. Hard drive reliability was IMHO much lower, operating systems were frequently less stable, etc. Reliability was about getting to 99% availability, let alone 99.9% or anything grandiose like that.
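If those percentages feel abstract, the arithmetic is simple enough to sketch in Python. The only assumption here is a 365-day year; the function name is mine.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Hours of downtime per year permitted by a given availability figure."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


two_nines = downtime_hours_per_year(99.0)    # roughly 87.6 hours, over 3.5 days
three_nines = downtime_hours_per_year(99.9)  # roughly 8.76 hours
```

So “just” 99% still allows more than three and a half days of outage a year, which gives a sense of how much failure-driven recovery work that era involved.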

These days, hardware/OS/app failure is, I’d suggest, one of the least likely reasons for a recovery being conducted in most organisations. Instead, it’s mainly related to soft issues – user error, audits, compliance checking, etc.

There’s a point here, and I’m almost ready to make it.

Back when I first started with backup, I’d have agreed that technology could be firmly blamed for a lot of errors. These days? Rarely – even when I blame it.

I periodically go on a rant about just how painful Linux is sometimes, but at the core I also admit that it’s a lack of training and time on my part – I’ve not made learning the ins and outs of Linux firewalls a field of study in the past, so now that I’m having to construct them by hand for a personal project it’s about as fun as tasering myself in the genitals. Technology is partly the problem – as is always the case with Linux, it’s designed for programmers and developers to manipulate, not for end users, or people like me who have concentrated on other things and just want the damn thing to work.

Ahem, where was I?

The simple fact is that we often blame technology because it’s easy. It’s like kids picking on the “easy target” at school with bullying; we bully technology and blame it for all our woes and issues because well, it doesn’t really fight back. (Hopefully we’ll get out of this habit before the singularity…)

As techos though, let’s be honest. The technology is rarely the issue. Or to be more accurate, if there’s an issue, technology is the tip of the iceberg – the visible tip. And using the iceberg analogy, you know I mean that technology is rarely going to be the majority of the issue.

The ‘issue’ iceberg in IT looks like this:

[Image: the issue iceberg]

It’s probably best here that I stop and differentiate between issues and problems. A problem to me, is an isolated or an atomic failure – like, a faulty tape drive, or a failed hard drive. They’re clearly technology related, but they’re not really issues. An issue is a deeper, systemic and compound failure. E.g., something like “on any one day, 30% of my backups fail”, or “Performance across all systems is generally 50% worse at end of month”, etc.

When technology gets blamed in those instances, I’m reminded of someone who say, never has their car serviced, then when it eventually breaks down complains that the car was a lemon. Was it that the car failed the person, or more accurately that the person failed the car?

As I said, it’s easy to blame the thing that can’t defend itself.

In environments with ongoing, long-term issues, there reaches a point where you have to sit back and ponder – is the technology causing the issue, or is the environment causing the technology to have an issue?

The inevitable and hard truth is that in some cases, it’s the latter, not the former.

Let’s consider a basic scenario – the “on any given day 30% of our backups fail” scenario. So, does that mean that on any given day 30% of servers crash and reboot during the backup? Or does the backup software agent crash on 30% of servers when a backup is attempted? Maybe, in the most exceptional of circumstances, this may be the case.

In reality though? In reality we have to start looking at the rest of that iceberg:

[Image: the rest of the iceberg]

High systemic failure rates, if attributed to the deployed technology, should result in a lawsuit. How often do we see that happening?

>cue the cicadas<

That’s right.

When there are systemic failure rates, a business must, eventually, turn to face the truth that they have to review their:

  • Policies – Are there any governing rules to the company which are contributing to the problem? For instance, does the company require the technology to be adapted in such a way that it wasn’t designed for? This can be hard and real policies, or they can be implicitly allowed policies – such as empire building.
  • Processes – Are there operating methods which are triggering the issue? Imagine a business for instance where change control has become such a consuming process that backup failures are repeatedly allowed to occur because a change window isn’t available. Is that the fault of the backup technology?
  • People and Education – I’m not suggesting that staff at sites are incompetent. Far from it. Incompetent is such a harsh, unpleasant word that in the 15+ years I’ve been consulting, it’s been a very rarely used word. Education though is a factor. No, I’m not picking on people without tertiary skills, but training is a factor. For example, managers who have no day to day technical experience may decide that some technology, based on a half hour vendor pitch, is easy enough that staff won’t need training in it. If said staff then go on to, say, accidentally delete a LUN from a production server because they weren’t trained, how is that the fault of the SAN?

Navel gazing, introspection, call it what you will, it’s not always a pleasant task. It’s about objectively looking at how we’re doing things, and asking, “are we partly to blame?”

Yet, if you aren’t prepared to do this, you’re doomed (yes, doomed) to keep making the same mistake again, and again, and again. The pile of failed technology builds up, the quest for the silver bullet becomes more frenetic, and the chances of a major failure happening increase. In the worst scenarios, it can become decidedly toxic.

But it doesn’t need to be. Evaluating your processes, your policies and your people (particularly the training of your people) can be – well, cathartic. And the benefits to the business, in terms of literal cost savings and efficiencies, ensure that the introspection is well worth it.

As a consultant, you might assume that it’s my job to ensure that customers buy the best and the most expensive technology out there that I can sell them. That’s a cynical attitude that comes from a few shoddy operators. As a consultant, my job is to partner with you and your company and help you achieve your best. (If you think I’m just blowing smoke up your proverbial, check my “13 traits of a great consultant” article.)

Sometimes that means highlighting that there are issues, not problems, and those issues require a deeper fix than plugging in a new piece of technology.

 

It struck me recently while working on a report that there’s 7 distinct challenges in data protection, and that we can only address those challenges when we’re completely across them.

Most sites with enterprise backup will be aware of a few of these challenges, but as soon as you lose sight of some of them, you’ve lost focus on the goal.

They are:

  1. Budget
  2. Communication
  3. Regulatory Compliance
  4. Age
  5. Volume
  6. Search
  7. Formalisation

Each of these on their own represents a particular obstacle or hurdle that needs to be overcome. I should also stress – these are issues for data protection as a whole, and that’s not necessarily limited just to backup and recovery.

What’s even more important, is when you look at that list, it’s clear that any issues your site is having are not unique. Every company has to deal with the same challenges, and therefore you don’t have to feel that your solution must be unique. It just simply has to fit.

And there’s a world of difference – and cost – between “unique” and “fit”.

Let’s look at each of those challenges individually and explain what I mean.

Budget

Something I mention a bit in my book, and when I run training courses, is that I could take the entire budget, for an entire organisation, spend it solely on data protection activities, and still not come up with a solution that is 100% proof positive against any form of data loss or contingency that may happen. There’s always another contingency or potential problem looming around the corner. Sure, it might end up being something like “asteroid hits the earth” or “pandemic kills 99% of the human population”, but the net fact is: you can’t pre-emptively deal with every single possible scenario that may occur.

So it all becomes a game of “risk vs cost”. What’s the risk of it happening? What’s the cost of preparing for it? What’s the cost of it happening and not being prepared? What’s the risk that there’s nothing you can do about it?

As soon as you can start boiling everything down to “risk vs cost” you can actually prepare your data protection needs appropriately.
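One simple way to put numbers on “risk vs cost” is an expected-loss comparison, sketched below in Python. This is my own simplification rather than a formal method: the function name, the effectiveness parameter (a stand-in for “what’s the risk that there’s nothing you can do about it?”) and every figure in the examples are invented for illustration.

```python
def worth_preparing(probability_per_year, cost_if_unprepared,
                    annual_cost_of_preparing, effectiveness=1.0):
    """Rough test: does preparing avoid more expected loss than it costs?

    effectiveness is the fraction of the loss that preparation actually
    prevents (1.0 = fully, 0.0 = there is nothing you can do about it).
    """
    expected_loss_avoided = (
        probability_per_year * cost_if_unprepared * effectiveness
    )
    return expected_loss_avoided > annual_cost_of_preparing


# Hypothetical figures only.
array_failure = worth_preparing(
    probability_per_year=0.2,        # assumed: one failure every 5 years
    cost_if_unprepared=500_000,
    annual_cost_of_preparing=40_000,
    effectiveness=0.9,
)  # True: about 90,000 of expected loss avoided for 40,000 spent

asteroid_strike = worth_preparing(
    probability_per_year=1e-8,
    cost_if_unprepared=50_000_000,
    annual_cost_of_preparing=1_000_000,
    effectiveness=0.1,
)  # False: the spend dwarfs the expected loss avoided
```

Real decisions involve more than one multiplication, of course, but even this crude version makes it obvious which contingencies deserve budget and which ones you simply accept.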

Communication

Except in the smallest of businesses, there’ll be different departments. And as soon as you have different departments, you have to factor in communications between those departments. Effectively, at this point, we’re talking about IS – Information Services – rather than IT (Information Technology) getting involved. You need to have clear and effective communication between the various departments within the business and the IT group in order to ensure that everyone understands the data protection requirements. In fact, you need to have that communication for pretty much everything to work. (Otherwise you end up in a situation where people think the muck described by the 37 Signals essay is a realistic portrayal of IT.)

To form effective communication, you need a bridge between a department and IT. That bridge is IS; the IS people may actually be the same people as the IT people, but the fact remains that the communication must be held at the policy level rather than the technical level. It’s not the role of someone in department X to understand how Y is done. It’s the role of IS to take their requirements, take IT options, and present strategy and requirements to the business.

Or if you want to phrase it another way – imagine someone prancing around stage like a monkey with bad flop sweat screaming out “Communicate! Communicate! Communicate!”

It’s that important.

Regulatory Compliance

Like it or not, we’re in an age where there is regulatory compliance attached to a lot of data protection. How long should information be kept for? Does it need to be destroyed at the end of that lifetime, or can it just be kept ‘forever’ if that’s easier?

Someone, somewhere in the company, needs to be aware of the regulatory compliance requirements that affect the company. You might say this is part of communication, but usually there’s somewhat of a gulf between how long departments want to retain data for, and what they’re required to keep data for. As to which one is longer: well, flip a coin. You need to know both.

Age

Go to a museum or library. Find an old book in your language, pick it up, open it to a random page, and I bet you’ll still be able to mostly grasp what was written. As an example, I’ve read Leviathan (Thomas Hobbes) several times. It’s not necessarily easy going, but you can do it.

Can you confidently say that a document written by someone in say, WordStar 1.1, hanging around in a tired old directory on a fileserver somewhere within your environment is still readable?

While age presents particular problems to paper-based record keeping, it’s never been easier to preserve and replicate such information. Grab it early enough, and you photocopy the original, or scan/OCR it. Suddenly you’ve got the information all over again, in relatively pristine format. It might be from several hundred years ago even, if not longer. There are fictional works out there going back 2000+ years that people just casually read, for instance.

But age presents a particular problem to data protection in a digital age: it doesn’t matter squat if you can recover, or keep online a document going back 5, 10, 15 years, if you can’t actually retrieve the data within it.

So age becomes a significant planning factor. How do you ensure that not only can you retrieve a file or chunk of data from 7 years ago, or 10 years ago, but that it is actually still meaningful to someone?
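One modest starting point for that planning is simply knowing which old formats are still lying around. The sketch below walks a directory tree and counts files that have not been modified for a given number of years, grouped by extension. The function name, the 365.25-day year and the use of modification time as a stand-in for “age” are all assumptions made for illustration; it tells you nothing about whether a format is still readable, only where to start looking.

```python
import time
from collections import Counter
from pathlib import Path

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def old_files_by_extension(root, min_age_years, now=None):
    """Count files under root not modified for min_age_years, by extension."""
    now = time.time() if now is None else now
    cutoff = now - min_age_years * SECONDS_PER_YEAR
    counts = Counter()
    for path in Path(root).rglob("*"):
        # Modification time is only a rough proxy for a document's age.
        if path.is_file() and path.stat().st_mtime < cutoff:
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `old_files_by_extension("/some/share", 7).most_common(10)` (the path is a placeholder) would list the ten most common extensions among files untouched for seven or more years, which is a reasonable shortlist of formats to test-open while software that reads them still exists.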

Volume

Without a doubt, the amount of data we’re storing each year grows at a fantastic rate. Data is somewhere between air and liquid – it seems to want to expand to fill whatever storage is available, within reason. The explosion in digital media is just further exacerbating this. I’d suggest that we’re moving from the first digital age into the second at the moment; the first digital age was where data was almost naturally structured – databases are a classic example. Now though, the second digital age is all about unstructured data. Educational facilities for instance are increasingly making every lecture done by every academic available – not as a bunch of PowerPoint slides, but the actual presentation, as a video file, and often as a separate audio file, to assist people with disabilities, or distant students.

That data growth is not slowing down. I don’t see it slowing down or plateauing any time soon – and nor does most of the storage industry.

Search

It used to be that finding data stored ‘somewhere’ was akin to finding a needle in a haystack. Now, it’s a case of finding a needle in dozens or hundreds of haystacks.

It doesn’t matter how much data you store online, or retain in backups, archive, etc., if you can’t find it when you need it. It’s the sister problem to the ‘age’ issue – there’s far more than just storage involved here.

Search is big business. We see that with Google every day, but let’s consider a prime example – it used to be that filesystem/OS search tools were primarily around filename search. “Tell me part of the file name, and I’ll have a hunt around for it”, was the old approach. Now, it’s “tell me something that’s in the file, and I’ll have a hunt around for it.” I use it every day. If anything, tools like Apple’s Spotlight, for instance, have devolved my previously anal retentive approach to file storage because I don’t have to rely so much on structure any longer. I can search by content.
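The difference between the two approaches is easy to show in a few lines of Python. This is a toy sketch only: both function names are invented here, it handles plain text files and nothing else, and it scans every file on each search. Tools like Spotlight build and query an index rather than reading every file each time.

```python
from pathlib import Path


def find_by_name(root, fragment):
    """Old approach: match on part of the file name."""
    fragment = fragment.lower()
    return [p for p in Path(root).rglob("*")
            if p.is_file() and fragment in p.name.lower()]


def find_by_content(root, phrase):
    """Newer approach: match on something inside the file (plain text only)."""
    phrase = phrase.lower()
    hits = []
    for p in Path(root).rglob("*"):
        if not p.is_file():
            continue
        try:
            text = p.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file; skip it rather than abort the search
        if phrase in text.lower():
            hits.append(p)
    return hits
```

With the first function you have to remember roughly what the file was called; with the second you only have to remember something it said, which is exactly why a tidy directory structure matters less than it used to.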

That works for text. What’s coming next is searching by content for complex data and media. For instance, you can already search for audio – point your iPhone at a speaker, turn on Shazam, capture 11 seconds or so of a song and voilà, you’ve suddenly found a song based on a snippet. I imagine in 10 years’ time people who have some sense of pitch will be able to hum, sing or whistle a few bars and do the same thing. Image search is a growing area too – you can upload an image to some websites and find copies of it online, even to the point of, say, finding larger, higher resolution copies of it.

Video? Undoubtedly coming.

The first vs second digital age analogy works well here too, I think. Search was able to be relatively simple when data was mostly structured. However, with that move to unstructured data, search becomes vitally important.

Make sure you have a search strategy.

(Finally) Formalisation

Most IT departments have grown from ad-hoc, informal processes within the average company. Start with a few people hired to keep systems running, and eventually as the company grows you’ve suddenly got a team of IT staff in a full time department.

What often doesn’t grow is the formality of the documentation and processes. It’s only natural that people will want to keep these as informal as possible, and I’m not suggesting that they need to be miracles of modern communication, but the simple fact remains: if it’s not written down, it doesn’t get done.

There comes a point in any organisation where you have to be prepared to bite the bullet and admit “we have to take a more formal approach to things”. Implementing change control is a classic example; most big businesses take this for granted – yet most small businesses will start out with almost no change control process at all. Eventually though the business will hit a critical size and it becomes vitally important to actually have a real change control process.

That same jump from informal to formal is required on every level. You need formal documentation about how the network hangs together, you need formal documentation about creating new user accounts, etc. And you definitely need formal documentation about how data protection is handled within the company.

Summarising

Coming back to the original list, I can reiterate that the challenges faced in data protection are:

  1. Budget
  2. Communication
  3. Regulatory Compliance
  4. Age
  5. Volume
  6. Search
  7. Formalisation

None of those, individually, should be any surprise to anyone. Again, they’re not unique to anyone either. We all have these same issues, regardless of whether we’re a customer, an integrator, a vendor, a whatever.

As soon as you acknowledge the challenges though, you can plan to overcome them.

 

As a consultant, you get attuned to (or as some would have it, “cynical” about) certain key phrases and statements when you’re in meetings. Sometimes these statements are innocent and exactly what the person says, but usually they set the alarm bells ringing.

As a bit of winding down after a hectic 7 days, I thought I’d share the top 15 statements that cause me to start immediately trying to get deep qualification of what I’ve just been told…

What they say... | What I worry it means...
"Our backup results get filed automatically and someone reviews them." | "We have a server that hasn't successfully backed up for 6 months, but no-one's been checking the notifications."
"All our backups fit on a single tape" | "We upgrade our hardware every time this isn't the case."
"We're very selective about what we backup." | "We have critical production systems we forgot to add to our schedule."
"We don't want to get backup notifications." | "Backup? Meh."
"Our DBAs do their own backups." | "The DBAs don't believe in enterprise backup software and think dumps are better" ... OR ... "The backup administrators have lost control of the system and it's spiralling out of control."
"We don't have SLAs" | "No one wants ownership of establishing SLAs"
"We don't need SLAs" | "We trust in luck, and hope we don't ever need SLAs"
"Our users are responsible for backing up their laptops" | "Every day we're losing critical data that may be legally or fiscally required by the company."
"We don't have to do monthly backups." | "Even though we know we SHOULD do monthly backups, until someone puts it in writing, we're not going to."
"We've been asked to shrink our backup budget..." | "The business has this crazy idea that backup is an IT function and problem."
"Tape is dead" | "Someone with a vested interest in selling lots of HDD storage has visited lately."
"We do per-incident support." | "We have an Icarus support contract."
"It's too busy here to do capacity planning." | "We're wasting money as fast as we can get the budget for it."
"We don't need to {clone or otherwise duplicate} our backups." | "We're going to suffer a critical data loss situation."
"We only backup production data." | "A lot of people's work within the company is unprotected."

 

The folks over at 37 Signals published a little piece of what I would have to describe as crazy fiction, about how the combination of cloud and more technically savvy users means that we’re now seeing the end of the IT department.

I thought long and hard about writing a rebuttal here, but quite frankly, their lack of logic made me too mad to publish the article on my main blog, where I try to be a little more polite.

So, if you don’t mind a few strong words and want to read a rebuttal to 37 Signals, check out my response here.

 

When it comes time to consider refreshing the hardware in your environment, do you want to do it quickly, or properly?

Because here’s the thing: If you want to do it quickly – if you feel rushed, and want to just get it done ASAP, not seeing the point of actually doing a thorough analysis of your sizing and growth requirements, here’s what you do:

  • Guess at the number of clients you’re going to back up.
  • Guess at the amount of data you’ll be backing up from first implementation.
  • Guess at the growth rate you’ll experience over the X years you want the system to last for.
  • Guess at the number of staff you’ll need to manage it.

Then, once you’ve got those numbers down, multiply each one by at least 4.

Then, ask for twice the budget necessary to achieve those numbers – just to be on the safe side.

If you think I’m joking – I’m not; I’m deadly serious. Deciding to skip an architecture phase where you actually review your needs, your growth patterns, your staffing requirements, etc., because you’re in a hurry is a costly and damning mistake to make. So if you’re going to do it, you may as well try to make sure you can survive the budget period.

And if asking for that much budget scares the heck out of you – well, there is an alternative: conduct a proper system architecture phase. Sure, it may take a little longer to get things running, or cost a little more time/money to get the plan done, but once you’ve got that done, it’ll be gold.
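As a taste of what the “proper” route involves, here is a minimal Python sketch of the simplest sizing calculation: projecting capacity forward under compound annual growth. The function name and all the figures (20 TB, 30% growth, 5 years) are hypothetical, and a real architecture phase would work from measured growth patterns, retention and staffing rather than a single flat rate.

```python
def projected_capacity_tb(initial_tb, annual_growth, years):
    """Capacity needed at the end of each year, assuming compound growth."""
    return [initial_tb * (1 + annual_growth) ** y for y in range(years + 1)]


# Illustrative figures only: 20 TB today, 30% growth per year, 5-year lifespan.
for year, tb in enumerate(projected_capacity_tb(20, 0.30, 5)):
    print(f"Year {year}: {tb:.1f} TB")
```

Even with these modest made-up inputs the requirement nearly quadruples over five years, to roughly 74 TB, which gives some feel for why guessing and multiplying by 4 is not as outlandish as it sounds.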

© 2012 The NetWorker Blog