Some sites get quite particular about their volume barcodes when it comes to physical media. This means you’ll see barcodes such as:
- Bxxxxxx – Byyyyyy – Backup volumes
- Cxxxxxx – Cyyyyyy – Clone volumes
I would call this bad barcode labelling practice, and advise that it be avoided wherever possible.
There’s a very simple reason for this: the 3am reason.
Put yourself in this scenario: at 3am you get an automated notification that – due to some backup blow-out, or operators not loading tapes (it really doesn’t matter) – the system has run out of backup media. All B* volumes in the library are full.
On the other hand, there’s a bunch of C* volumes in the library that are sitting there empty – tantalisingly empty, in fact.
There are two options: get up, get suitably clothed, and drive/train/whatever into work to load tapes yourself, or relabel some of those empty C* volumes into the Backup pool rather than the Backup Clone pool.
Unless there are severe punishments for doing so, I’m betting 99.999% of backup administrators will choose the latter option, not the former. After all, it’s the difference between being able to go back to sleep within 10 minutes or maybe not at all.
However, this decision will then cascade into other repercussions. One of the following factors will likely come into play:
- If operators beat you in the next day, they’ll possibly ship backup as well as clone media off-site, by virtue of the volumes all starting with C*. (Even if you send them an email, they may do the tape shipping before they get to the email.)
- If you choose to manually separate out the backup and the clone media, and keep the appropriate backup media in the library even though the barcode designates that it should be clone, you’ll create an ongoing management overhead that just asks for trouble and headaches.
- If you choose to manually stage the backup data on the C* volumes to freshly loaded B* volumes, you’ve just added a bunch of work to your (likely already busy) schedule.
None of these scenarios “work” from a business perspective – they’re not a suitable use of your time, either personally or for the business.
There’s a solution to this: stop treating the barcode as a definition of the data.
Using different barcodes for different categories is also the start of a slippery slope. The temptation then grows to have each pool represented by its own set of barcodes, or a global prefix for the company, etc. Pretty soon you can be left in a state where anyone with suitable familiarity with your IT can make a reasonable stab at knowing what sort of backup may be on any individual tape. I’ve seen this happen at many sites.
Security through obscurity is rarely enough by itself, but it’s a good starting point.
So there are four key reasons why I would go so far as to say that barcode categories are bad design:
- They create issues when you run out of media with one type of barcode, but you have spare media using another type of barcode.
- They encourage you to think of the barcode as defining content on the tape. You should be relying on the media database for that.
- They decrease the security of the backup media by allowing someone to make fewer guesses to determine what might be on which tape.
- Sooner or later some event will break the “rule”, and then you’ll be stuck with operational practices that no longer align with reality.
If you’re currently using barcode categories, I’d invite you to step back and consider how you could use the media database to avoid the necessity. If you’ve got pressure to use barcode categories, I’d suggest you strongly argue against it using the above four issues.
Ultimately, barcodes should exist for one reason – to allow a robot to readily move media from slots to drives, in and out of the library, etc. They’re not there to provide information to humans as to what is on the tape – and nor should they be munged to provide such information. That’s the job of the backup software – something NetWorker does quite well.
[Edit, 2010-03-25]
See Ted’s comment below about using barcode categories to differentiate per-product media in a shared (virtualised) tape library arrangement. When running with partitioned/virtualised library environments with shared physical storage between multiple backup products, that’s of course a good reason to have some barcode level differentiation – so that upon import rules can be defined to allocate media to the appropriate backup product.
There are two pieces of information on my barcodes: a “date code” and a sequential number. The sequential number has no real meaning other than just an incremental count. The “date code” is set up with two alphabetical characters representing the year and month, e.g. IF = 09(I) and 06(F), meaning 2009, June. This is the month the tape was purchased. As tapes get relabeled, deleted and re-added to the system (for whatever reason), we lose track of the age of the tape. The date code lets me know when a tape has reached a certain age, and possibly the point at which its reliability has significantly diminished. It also keeps my operators from (hopefully) recycling labels, which just causes more confusion.
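A date code like this is easy to decode mechanically. The sketch below is one possible reading of the scheme, not the commenter's actual tooling: it assumes the letters count from A = 01 (so I = 09 and F = 06, as in the example) and that the year letter is an offset from 2000. The function names and `BASE_YEAR` are hypothetical.

```python
import string

BASE_YEAR = 2000  # assumption: 'A' means 2001, so 'I' means 2009


def decode_date_code(code):
    """Decode a two-letter date code such as 'IF' into (year, month)."""
    letters = string.ascii_uppercase
    if len(code) != 2 or any(c not in letters for c in code):
        raise ValueError(f"expected two uppercase letters, got {code!r}")
    year_offset = letters.index(code[0]) + 1  # A=1, B=2, ...
    month = letters.index(code[1]) + 1        # A=January ... L=December
    if month > 12:
        raise ValueError(f"month letter out of range in {code!r}")
    return BASE_YEAR + year_offset, month


def age_in_months(code, today_year, today_month):
    """Whole months between the purchase month and the given month."""
    year, month = decode_date_code(code)
    return (today_year - year) * 12 + (today_month - month)
```

With these assumptions, `decode_date_code("IF")` gives `(2009, 6)`, and `age_in_months("IF", 2010, 3)` gives `9`.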
I can think of a situation where you want to utilize barcodes beyond your listed “one reason”……when you have multiple backup products utilizing the same physical library WITHOUT some degree of virtual library configuration for ‘uniqueness’. In my case, I have NetWorker, NetBackup, and TSM using the same ACSLS-based powderhorn library. Each product has its own devices (same make/model throughout) and its own pools of media. And while NetWorker and TSM work under an ‘additive’ inventory methodology (tell me what tapes I have and I’ll use them), NetBackup likes to grab every piece of media reported by the robotics as existing in the library unit. Thus, we had to define a barcode set ‘unique’ to the NetBackup instance, apply those media to a specific ACS pool within the ACSLS application (‘set scratch ‘ or use ‘ ‘), and use the /usr/openv/volmgr/vm.conf file on the NBU servers to set inventory filters.
EXAMPLE:
INVENTORY_FILTER = ACS 0 BY_ACS_POOL 3
INVENTORY_FILTER = ACS 2 BY_ACS_POOL 1
SCRATCH_POOL = SCRATCH
SSO_SCAN_NAME = $MASTER
SSO_SCAN_ABILITY = 9
REQUIRED_INTERFACE = $MASTER
No, this does not REQUIRE barcode segregation, but doing so made for a much more reviewable Library-centric volume audit ability and a general lack of confusion over ownership when viewed at the Library level. Note that since this implementation, we have decreased NetWorker solutions in exchange for NBU solutions, and I have moved some prior NW tape (recyclable) into the ACS pool 3 for NBU to use, despite the lack of ‘barcode match’. But I have also decreased the need for manual parsing of the ACSLS volume audit for return-from-Vault NW media….which was another good reason for segregation, since I use ‘roll your own’ script based vaulting on a 7.2.x build and Legato wants a new add/inventory per media to recognize returned vault tape.
Also note, long ago and far away, a coworker decided to execute the NBU inventory from the Media Server (i.e. Storage Node) instead of from the master (pre-6.x NBU w/ centralized DB). Unfortunately, despite my recommendations at build time, the vm.conf had NOT been updated on the MS. As such, the inventory pulled in every live NetWorker tape in the library and NBU began using the media as SCRATCH, overwriting a large-ish number of live NW tapes. It was ugly, but thank god none of the impacted tapes were called upon for recovery purposes during the window it took to resolve the issue. Goes to show that even with well thought out segregation and design, an improper configuration and an unaware end user can cause extreme havoc.
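An incident like this is the kind of thing a small pre-flight check can catch. The Python sketch below is only an illustration: it assumes vm.conf consists of `KEY = VALUE` lines as in the example above, treats lines starting with `#` as comments (an assumption), and uses hypothetical helper names. It reports which expected INVENTORY_FILTER entries are absent from a given copy of the file.

```python
def parse_vm_conf(text):
    """Collect 'KEY = VALUE' entries; a key may appear more than once."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' lines carry no settings.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries.setdefault(key.strip(), []).append(value.strip())
    return entries


def missing_filters(text, expected):
    """Return the expected INVENTORY_FILTER values absent from this vm.conf."""
    present = parse_vm_conf(text).get("INVENTORY_FILTER", [])
    return [f for f in expected if f not in present]


# Filters taken from the example vm.conf above.
EXPECTED = ["ACS 0 BY_ACS_POOL 3", "ACS 2 BY_ACS_POOL 1"]

master_conf = """\
INVENTORY_FILTER = ACS 0 BY_ACS_POOL 3
INVENTORY_FILTER = ACS 2 BY_ACS_POOL 1
SCRATCH_POOL = SCRATCH
"""

# A media server whose vm.conf was never updated with the filters.
media_server_conf = """\
SCRATCH_POOL = SCRATCH
"""

master_missing = missing_filters(master_conf, EXPECTED)
media_server_missing = missing_filters(media_server_conf, EXPECTED)
```

Here `master_missing` is empty while `media_server_missing` lists both filters, which would be the cue not to run an inventory from that host until its vm.conf is brought into line.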
Having said that, I agree wholeheartedly with your comments regarding barcode segregation at the Backup Product level; it’s unnecessary and can lead to artificial media outages that can impact backup health. Data within the application should be managed by the application, not by a manual distribution of data across artificial usage barriers.
Getting to Ted first – yes, fully agree in a partitioned library you’ll need barcode categories there so that each set of volumes can be identified for segregation for the individual backup product’s virtual library they get assigned to.
Tom – have you considered tracking via ‘olabel’? It’s an mminfo field that keeps track of the original label date of media; short of being clobbered by ‘dd’ or something along those lines, it’s pretty reliable as far as reporting that original date. If it’s not working for you, I’d be keen to discuss the whys and hows of that with you off-topic and then if necessary post an update to the article.