Ongoing trends with tape

Tape Lives

When I first started in backup and recovery, my primary backup medium was DDS-1 tapes, distributed across probably 15 servers in a computer room. Over time the number of hosts with dedicated tape drives dropped as systems were consolidated into NetWorker, and the NetWorker server got a couple of gravity-fed DDS autoloaders.

Needless to say, since that point I’ve watched lots of changes in tape technology, particularly since LTO burst onto the scene. DLT had seemingly been stagnant for years, holding a practical monopoly in the server space and suffering from a severe lack of innovation.

Despite years of various vendors trying to push the idea that tape is dead, we’ll see it remain for some time yet, mainly because it still represents an incredibly economical way of storing large amounts of backup data. Sure, you can avoid using tape if you’ve got replicated backup-to-disk storage between two sites, but that requires either a substantial MAID-style footprint or some deduplication unit – and either way it’s going to cost you a lot of money. (My personal belief is that 10TB per week of backup is the minimum cut-off for considering deduplication technologies, and there are a lot of businesses still backing up less than 10TB per week.)
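
To show why I put the cut-off around that mark, here’s a rough back-of-envelope comparison. Every number in it (media and library prices, appliance cost, retention) is an invented, illustrative assumption rather than a quote from any vendor:

```python
# Back-of-envelope only: every figure below is an invented assumption for
# illustration, not a real price or a measured deduplication ratio.

weekly_backup_tb = 8            # a site below the ~10 TB/week mark
retention_weeks = 52            # keep weekly backups for a year

# Tape side: assumed LTO-5 class media (1.5 TB native) plus a small library.
tape_cartridge_tb = 1.5
tape_cartridge_cost = 40.0      # assumed cost per cartridge
tape_library_cost = 15_000.0    # assumed cost of a small library with drives

# Dedupe side: assumed pair of replicated appliances, one per site.
dedupe_unit_cost = 60_000.0     # assumed cost per appliance

stored_tb = weekly_backup_tb * retention_weeks

tape_total = tape_library_cost + (stored_tb / tape_cartridge_tb) * tape_cartridge_cost
dedupe_total = 2 * dedupe_unit_cost

# Ignores media handling, power, floor space and operational costs on both
# sides; the point is only the order-of-magnitude gap at low weekly volumes.
print(f"Logical data retained: {stored_tb:.0f} TB")
print(f"Tape estimate:         ${tape_total:,.0f}")
print(f"Replicated dedupe:     ${dedupe_total:,.0f}")
```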

So, here’s what I see as the key continuing trends for tape:

  1. Minimised usage for primary copy – This is a no-brainer, really. Backup to disk has taken over as the primary mechanism in a significant percentage of businesses – the “B2D2T” model, so to speak. There’s no doubt that model will continue, regardless of what that initial “to disk” looks like.
  2. Fallback/secondary copy – Tape will continue to reign supreme as the preferred fallback/secondary copy of backups for some time to come. This decade is indeed the one where some form of backup to disk will become the norm for the vast majority of businesses, but when it comes to those monthly backups that need to be kept for 7+ years, etc., tape will continue to shine.
  3. Enterprise tape is squeezed down – It used to be that there were two distinct tiers of tape: enterprise technology such as LTO (unless you believed the IBM hype that said LTO was toy-tape) and commercial/consumer tape, such as AIT, DDS, etc. That enterprise technology remained largely out of reach of the smaller businesses, but as backup to disk continues to press into the nearline/immediate recovery arena, use of enterprise tape as a primary backup and recovery source will be pushed down into smaller businesses.
  4. Commercial/consumer tape is squeezed out – Those non-enterprise tape formats, such as AIT, DDS, etc., are dead. Sony discontinued AIT to work with HP et al on DDS development, and DDS effectively died at v5. Oh, HP blathers on about DDS still having a future – DDS-6/160 was released a while ago, and DDS-7/320 is supposedly in development – but these are dead-duck technologies. These non-enterprise tapes were at best unreliable formats; they gave a lot of fodder to the “tape is dodgy” meme, and the way they’re kept on life support by vendors unwilling to concede their time is past is frankly embarrassing.
  5. Deduplication will not migrate in any usable form to tape – Various companies blather about offering “deduplication out” to tape from their products, be they target or source deduplication, but writing deduplicated data to tape is fundamentally flawed and logically incompatible with the medium. Why? Deduplication requires massive amounts of random access to rehydrate efficiently, but tape is sequential-access by design (see the sketch after this list). So what actually gets written out to tape in “deduplicated” format is entire deduplication environments, which must be read back and recovered to systems before a regular recovery can be run. That doesn’t make the approach usable; it just creates situations where recoveries aren’t done unless they’re hyper-critical, because there’s too much effort involved.
  6. Hardware encryption will become the norm – Hardware encryption at the per-cartridge level was initially introduced with LTO-4, and we’ll see continued adoption as businesses become acutely aware of the potential damage caused by media theft. We’re already seeing various countries legislate to require encryption of at-rest data in particular industries, and this is driving more businesses to use hardware encryption “just in case”.
  7. We’ll continue to be told tape is dead – As sure as the sun rises each day, we’ll awake almost every day to another story about the imminent death of tape.
  8. Direct iSCSI tape drives are here – Some vendors are already selling them, and as the war between FC and IP settles, it’s logical that we’ll see tape drives and tape libraries appearing with 10GbE connections. This should make connectivity simpler and quite possibly more flexible.
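
On point 5, a toy model makes the access-pattern problem obvious. The chunk counts below are arbitrary illustrative assumptions; the only thing that matters is that the chunks one file needs end up scattered across the deduplication pool:

```python
import random

# Toy model of rehydrating one file from a deduplicated pool. Each chunk the
# file needs sits at some physical position in the pool; after months of
# backups those positions are scattered rather than contiguous.

random.seed(1)

chunk_count = 10_000        # chunks making up the restored file (assumed)
pool_size = 1_000_000       # chunk slots in the whole dedupe pool (assumed)

# Physical position of each chunk, in the order the file needs them read back.
positions = random.sample(range(pool_size), chunk_count)

# On random-access disk each chunk is one cheap read. On tape, every jump to
# a non-adjacent position forces a locate/reposition.
repositions = sum(
    1 for prev, cur in zip(positions, positions[1:]) if cur != prev + 1
)

print(f"Chunks to rehydrate: {chunk_count}")
print(f"Tape repositionings: {repositions}")
# With tape locate times measured in tens of seconds, repositioning alone
# runs into days, completely dwarfing the actual data transfer.
```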

Other predictions

OK, the above list covers the things I’m certain about. Here are a few things I’m not certain about, but have been idly speculating on for some time…

  1. QR Barcodes – Personally, I think these are a joke. However, I’m betting that someone will start selling combo tape barcodes where for each regular tape barcode you get a QR barcode, so that operators and administrators can scan them from their phones, etc. They’ll be sold as allowing a whole new level of integration, automation and control, and a few businesses will get sucked into buying them. They won’t last long, though – and that’s assuming QR barcodes themselves stay popular enough for this to happen.
  2. Tape RFID will get bigger – Some tape vendors are already selling tapes with RFID embedded. This’ll be a low-traction market for some time to come, but I suspect it’ll eventually become standard; in other words, this is an evolutionary rather than a revolutionary progression in tape.
  3. Hardware twinning with software recognition – RAIT lost its appeal years ago, though some proprietary control systems such as ACSLS still support it. I suspect we’re going to reach a point, though, where hardware-enabled tape twinning will be offered as a feature by those enterprise tape vendors who are being squeezed down. The difference will be that there’ll be APIs between the libraries/drives and the backup software that allow the backup software to see the secondary tapes as registered copies (a rough sketch of what that might look like follows this list). Why? Tracking and accountability; auditing and data tracking requirements will see to that. I don’t necessarily think this will gain a lot of traction, but I do think it’ll become an offering again.
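
For the third item, here’s an entirely hypothetical sketch of what such an API might look like from the backup software’s side. No library or backup vendor exposes this interface; the names and fields are invented purely to illustrate hardware-created copies being registered for tracking and auditing:

```python
from dataclasses import dataclass

# Hypothetical interface only: invented names, not any real vendor API.

@dataclass
class TwinnedCartridge:
    primary_barcode: str     # the cartridge the backup software wrote
    secondary_barcode: str   # the copy the library created in hardware
    library_id: str
    verified: bool           # has the library confirmed the twin is complete?

def register_twins(pairs: list[TwinnedCartridge], media_db: dict) -> None:
    """Record hardware-created copies so auditing and clone tracking see them."""
    for pair in pairs:
        if not pair.verified:
            continue         # never register a copy the hardware hasn't confirmed
        media_db.setdefault(pair.primary_barcode, []).append(pair.secondary_barcode)

# Example: the library reports one completed twin; the backup software now
# "knows" about a second, registered copy of that volume.
media_db: dict[str, list[str]] = {}
register_twins([TwinnedCartridge("A00001L5", "A00001X5", "LIB01", verified=True)], media_db)
print(media_db)              # {'A00001L5': ['A00001X5']}
```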

4 thoughts on “Ongoing trends with tape”

  1. I would have to say that I think your predictions are spot-on for the most part, including those that are a bit more circumspect. But, on the dedupe point, I think you are thinking too narrowly and have overlooked the opportunity presented by LTFS to overcome limitations with current approaches. I’ve outlined why at http://bit.ly/t0bAAh including a few other observations.

    1. Hi Chris,

      I considered LTFS when I wrote that deduped-data-on-tape still won’t fly.

      The simple fact is that while LTFS has been developed to allow a simulacrum of random access to tape, the seeks for data are still going to be wildly variable. I’ve worked with LTO in HSM environments, and the results are not pretty; if you consider that dedupe is effectively a block-level implementation of single-instance HSM, the end result is obvious.

      Having also dealt with file-level recoveries from block-level backups, the performance issues when dealing with recovery of large, highly fragmented files are … well, to be frank, hideous. If we’re talking large deduplicated data stores, again, it’s a similar sort of problem.

      So yes, it’ll “work”, but for very small values of “work”. Undoubtedly vendors will promote this as a usable solution, too, and undoubtedly some companies will buy into it. But the resource requirements will be higher, the performance lower, and the RTOs will have to be very large to support recovery from deduplicated tape – so it’s quite likely it’ll be at best a niche sort of solution.

      Cheers,
      Preston.

  2. Preston,

    I agree with you that if you attempt to put block-level deduplicated data onto tape, the seek times would be atrocious. Realistically, anything more granular than file-level deduplication is likely untenable. Moreover, using tape where a quick RTO is required will always be problematic, and this is the reason (as you know and stated) that tape is a fail-safe medium and an archive medium.

    Keeping those two applications in mind, tapping into a deduplicated data store, re-hydrating unique instances of data and building a new metadata catalog that provides information on where the data originated from, and storing that on an LTFS-enabled volume is, again, where I see deduplication and tape meaningfully converging. This doesn’t fully address your RTO comment, because the tape-based metadata catalog would still have a lot of pointers to what, in effect, would look like highly fragmented data on tape, and the start, stop, seek, backhitch, etc. joy that can come with tape could (and likely would) hamper restore performance.

    I think you will see an answer to that too (albeit I don’t think there is a huge requirement, because we are talking about data that, in a lot of cases, won’t be restored or will only be restored very infrequently). I suspect you will see hardware vendors supplying LTFS-supported/enabled hardware evolve as well to support more than 2 partitions, and that this will allow for some form of “on tape defragmentation”, as is already being done with Oracle T10KC drives. Again, I don’t know that the need for this is significant, because of the low probability of archive data needing to be restored quickly, but it is something that I think will come to pass and mark part of what I consider the meaningful convergence of deduplication and tape.

    Best,

    Chris
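
As a purely illustrative footnote to the exchange above: because LTFS presents a cartridge as an ordinary mounted filesystem, the workflow Chris describes (rehydrate out of the dedupe store, then write fully hydrated copies plus a self-describing catalog onto the volume) can be sketched in a few lines. The mount point, fields and helper below are all hypothetical:

```python
import json
import shutil
from pathlib import Path

LTFS_MOUNT = Path("/mnt/ltfs_volume")    # assumed LTFS mount point

def archive_rehydrated(files: list[Path], client: str) -> None:
    """Copy fully rehydrated files onto an LTFS volume with a sidecar catalog."""
    catalog = []
    dest_dir = LTFS_MOUNT / client
    dest_dir.mkdir(parents=True, exist_ok=True)
    for src in files:
        dest = dest_dir / src.name
        shutil.copy2(src, dest)          # fully hydrated data, no dedupe pointers
        catalog.append({
            "original_path": str(src),
            "on_tape_path": str(dest),
            "client": client,
            "size_bytes": src.stat().st_size,
        })
    # The catalog travels with the data, so the cartridge stays readable even
    # without the original backup application or deduplication store.
    (dest_dir / "catalog.json").write_text(json.dumps(catalog, indent=2))
```

Writing whole, hydrated files like this sidesteps the block-level fragmentation problem discussed above, at the cost of giving back some of the deduplication capacity savings once the data lands on tape.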
