I’m not all that conversant with SELinux, and for the most part, disable it on systems that I configure simply because these days 99% of the systems I configure are within a lab and already heavily firewalled. When NetWorker 7.5 came out and the release notes explicitly stated that SELinux was not supported, it seemed inevitable that my involvement with SELinux would continue to decrease.

When SELinux was recently discussed on the NetWorker mailing list, I responded citing the release notes indicating it wasn’t supported. I was therefore surprised to discover there was a workaround. Responding to the thread, Rich Graves posted the following SELinux adjustments that are necessary to get NetWorker and SELinux working together. I present them unaltered, but can attest to having confirmed they do indeed work. Here’s what Rich had to say:

This has worked for me for about a year, on both client and server. The textrel_shlib change is fairly common for proprietary binaries.

semanage fcontext -a -t textrel_shlib_t "/usr/lib/nsr(/.*)?"
semanage fcontext -a -t var_log_t "/nsr/logs(/.*)?"
restorecon -R /usr/lib/nsr
restorecon -R /nsr/logs

Another approach for the logs is to edit syslog.conf and drop them in /var/log instead of /nsr/logs.

If you need to work with NetWorker and SELinux, hopefully the above tips will help.

 

Over at Daily WTF, there’s a new story, titled “Bourne Into Oblivion”, that has two facets of relevance to any backup administrator. The key points are:

  • Cascading failures.
  • Test, test, test.

In my book, I discuss both the implications of cascading failures and the need to test within a backup environment. Indeed, my ongoing attitude is that if you want to assume something about an untested backup, assume it’s failed. (Similarly, if you want to make an assumption about an unchecked backup, assume it failed too.)

While normally in backup, cascading failures come down to situations such as “the original failed, and the clone failed too”, this article points out a more common form of data loss through cascading failures – the original failure coupled with backup failure.

In the article, a shell script includes the line:

rm -rf $var1/$var2

Any long-term Unix user will shudder to think of what can happen with the above script. (I’d hazard a guess that a lot of Unix users have themselves written scripts such as the above, and suffered the consequences. What we can hope for in most situations is that we do it on well backed up personal systems rather than corporate systems with inadequate data protection!)

Something I’ve seen at several sites, however, is the unfortunate coupling of the above shell script with its execution on a system that has a host of other filesystems from across the corporate network mounted read/write. (Indeed, the first system administration group I ever worked with told me a horror story about a script with a similar command run from a system with automounts enabled under /a.)

The net result in the story at Daily WTF? Most of a corporate network wiped out by a script run with the above command where a new user hadn’t populated either $var1 or $var2, so the command effectively became:

rm -rf /

You could almost argue that there’s already been a cascading failure in the above – allowing scripts to be written that have the potential for that much data loss and allowing said scripts to be run on systems that mount many other systems.
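
For what it’s worth, a line like that can be made far less dangerous with a couple of guards. The following is only a minimal sketch: safe_remove is a made-up helper name, not something from the Daily WTF article, and it simply refuses to run when either argument is empty or when the target resolves to the root directory.

```shell
# safe_remove BASE SUB
# Hypothetical helper: removes BASE/SUB, but only if both parts are
# non-empty, the target is an existing directory, and it is not "/".
safe_remove() {
    sr_base=$1
    sr_sub=$2

    # An unset or empty variable is exactly what turned the original
    # command into "rm -rf /", so refuse to continue.
    if [ -z "$sr_base" ] || [ -z "$sr_sub" ]; then
        echo "safe_remove: empty argument, refusing to delete" >&2
        return 1
    fi

    # Resolve the real path; this also fails if the directory is missing.
    sr_target=$(cd "$sr_base/$sr_sub" 2>/dev/null && pwd -P) || {
        echo "safe_remove: $sr_base/$sr_sub is not a directory" >&2
        return 1
    }

    if [ "$sr_target" = "/" ]; then
        echo "safe_remove: target resolves to /, refusing to delete" >&2
        return 1
    fi

    rm -rf -- "$sr_target"
}
```

Calling safe_remove "$var1" "$var2" with either variable unset now fails with an error instead of expanding to rm -rf /. Running scripts with set -u, or writing "${var1:?}" in place of $var1, gets a similar effect with less code.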

The true cascading failure however was that the backup media was unusable, having been repeatedly overwritten rather than replaced. Whether this meant that the backups ran after the above incident, or that the backups couldn’t recover all required data (e.g., running an incremental on top of a tape with a previous incremental on top of a tape with a previous full, each time overwriting the previous data), or that the tapes were literally unusable due to high overuse (or indeed, all 3), the results were the same – data loss coupled with recovery loss.

With backups not being tested periodically, such errors (in some form) can creep into any environment. Obviously in the case in this article, there’s also the problem that either (a) procedures were not established regarding rotation of media or (b) procedures were not followed.

The short of it: any business that collectively thinks that either formalisation of backup processes or the rigorous checking of backups is unnecessary is just asking for data loss.

 

When I used to be a system administrator, back when 2GB hard drives were the norm, I remember an Oracle system that only needed about 4GB of storage space, but our Oracle savvy system administrator configured an environment with around 30 x 2GB drives.

That was my first introduction to spindles and their relationship to performance.

As the years have gone by, and drive capacities have increased, the spindle problem only briefly appeared to go away – not due to capacity, but due to increasing rotational speed and other improved performance characteristics of drives.

However, perhaps even more so than high performance databases, virtualisation is forcing system administrators to become reacquainted with spindle configuration issues; multi-core systems supporting dozens of virtualised servers create IO problems even within relatively small environments.

If you’re interested in reading more about spindle issues, you may want to check out this article on Search Storage – Get More IOPS per dollar with SSD, 2.5″ Drives. Regardless of the actual vendors discussed, this article is a good overview of the spindle problem. If you’re struggling with virtualised (or even database) IO performance, and were not previously aware of spindle issues, check it out for an introduction. (As practically all storage vendors are moving into high performance options involving SSDs and/or massive numbers of 2.5″ drives, the article is relevant regardless of your preferred storage platform.)

(If you’re looking for a backup angle to this posting, consider the following: as you virtualise and/or put in applications/systems with increasingly higher demands for IOPS, you affect not only primary production operations, but also protection, maintenance and management functions, including (but not limited to) backups. Sometimes performance issues that exist, but are not yet plaguing production operations, manifest first in backup situations where there are high intensity prolonged sequential reads.)

 

To me the most valuable documents produced by EMC in relation to NetWorker or the modules are the Release Notes. These accompany any cited version update and typically contain at least the following chunks of information:

  • New features to the product
  • Changes to existing behaviour
  • Fixed problems
  • Known issues and limitations

This information is gold. To anyone thinking of updating either a NetWorker server or a NetWorker module who isn’t planning on thoroughly reading the release notes first I say this: are you nuts?

I’m the first to admit that I don’t re-read the administration guides every time they are updated. There’s just too much content in them. Instead, I rely on the release notes to tell me what has been added, and if any of that is relevant to my needs, I go searching through the administration guides for said information. In fact, I consider the release notes important enough that they’re the only NetWorker documentation I ever print. Why do I print them? Because it means I can take them away from the computer and go sit down and read them carefully – very carefully.

The release notes don’t always contain all information about an update. They also may not fully elucidate on particular problems that have been fixed*.

To me the most important aspect of the release notes – the bit I check first – is the “Known problems and limitations” section. Why? This is the bit that gives you the warnings of “things that don’t work”, or “things that you may have to pay more attention to than you would otherwise think to”, i.e., what is known to not work. One would ignore these in particular at one’s own peril.

So, next time you’re thinking of updating any part of your NetWorker environment, please, make sure you download and read the release notes.


* I can attest to this when I review release notes and see LGTscABCDEF numbers that have been created in response to bug filings I’ve made … a 2-3 line entry can’t convey all the details of sometimes complex, sometimes esoteric escalations.

 

I regularly check on undrln, a site mainly focused on graphic design – or more broadly, the professional graphic arts. Recently they linked to “Keeping Clients and Projects on Task”. While the article is very much written for graphic design projects, it actually has a lot of relevance to IT projects as well … i.e., change the topics of discussion and suddenly you’ve got a valuable short reference document explaining four types of situations/failures that can occur within projects.

If you work in consulting, it’s worth reading.

 

So this morning I was looking through the stats for this blog, and I generated the list of most popular posts thus far. I can’t say any of the results surprised me. Every single one of the top 5 comes from the “Basics” series.

Number 5 on that list was Basics – Listing Files in a backup. There are a lot of people out there who want to know how to use nsrinfo in general, and specifically want to know about pulling file lists for savesets. Net result? I think it would be greatly beneficial if in NMC users could double-click on browsable savesets and get a complete listing of files therein.

Number 4 was Basics – mminfo, savetime and greater than/less than. Now, I’m not going to pretend that every person who visited that article was looking for details about how greater than and less than works in mminfo in relation to savetimes, though I suspect a reasonable percentage of people new to mminfo found that interesting. My take on it is that it proves there’s not really enough documentation about mminfo, and that mminfo needs some expansion. My personal preference? Having a full SQL-like query engine for mminfo would greatly expand the options available to NetWorker administrators.

Number 3 on the list is Basics – Changing saveset browse/retention times. As regularly as possible I try to check the search strings that have brought people to my blog (as recorded by wordpress), and I can practically guarantee that every day there are multiple combinations to do with savesets, browse and retention times. Sometimes those combinations reference nsrmm, sometimes they don’t. Clearly, extending saveset browse/retention times in NetWorker needs to be more manageable from within the GUI as a bare minimum. I’ll get to the command line in a moment.

Moving on to number 2, we have something that I get search results for every day without fail. That’s Basics – Fixing “NSR Peer information” errors. It’s actually a reasonably simple error to fix, but sometimes finding the information about it is a bit like the old needle-in-a-haystack. I’m hoping that the posting on it has helped quite a few sites to clear out the warnings/errors in their logs and reduce the amount of clutter being reported.

Finally, for number 1, a topic I’m completely unsurprised to see at the top, we have Basics – Parallelism in NetWorker. Not because it’s difficult, but because there are no absolute rules, parallelism is a topic in NetWorker that many administrators, regardless of length of time with the product, find challenging at times. Set too low, and backups may overrun. Set too high, and device contention, client slow-downs, recovery performance issues, etc., may come into play. Tuning parallelism in NetWorker has to take a lot into account.

The content of this list suggests a few things to me:

  • None of this information is out of reach in the product manuals, but since the product manuals are (necessarily) lengthy, it is logistically out of reach for a lot of users who don’t have time to read lengthy manuals.
  • EMC product management could take a few tips from the top 5 articles on my blog – I think they represent areas that could be improved within usability of the product. While parallelism is not something that can be “solved” by changes within the GUI (it is, by necessity, complex), other options, such as improving mminfo search, making saveset contents more accessible within the GUI, etc., are readily fixable.
  • It seems there might be scope for a “Getting Started with NetWorker” style manual. I think a traditional book would (a) be too expensive and (b) be unsuitable. This is the sort of information that people want readily to hand on their desktops.

On the last point, I’m interested in writing such a manual. I obviously have some experience with writing – but more so than just the book, over the years I’ve written literally thousands of pages of NetWorker instructions as part of professional services documentation, training courses, etc.

So here’s a question – would people be interested in, say, an eBook along the lines of “Getting Started with NetWorker” that covers basic operation and usage, so that rather than having to wade through the roughly 1000 pages of official documentation they’d have something shorter, geared towards day to day operation?

Let me know what you think.

 

Coming primarily from a Unix background, I’ve remained disappointed for 10+ years that NetWorker’s init script has barely changed in that time. Or rather, the only things that have really changed in the script are checks for additional software – e.g., Legato License Manager.

It’s frustrating, in a pesky sort of way, that in all this time the engineers at EMC have never bothered to implement a restart function within the script.

For just about every Unix platform in NetWorker, the only arguments that /etc/init.d/networker takes are:

  • stop – stop the NetWorker services
  • start – start the NetWorker services

That’s it. Once, a long time ago, I got frustrated, and hacked on the NetWorker init script to include a restart option, one that worked with the following logic:

  1. Issued a stop command.
  2. Waited 30 seconds.
  3. Checked to see if there were any NetWorker daemons still running.
  4. If there were NetWorker daemons still running, warn the user and abort.
  5. If there were no NetWorker daemons still running, issue a start command.
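
The original hack is long gone, so what follows is only a sketch of that logic as a stand-alone wrapper rather than a patch to the init script. The function and variable names are made up for illustration, and the check assumes that every NetWorker daemon has a process name starting with nsr, which you should confirm on your own platform.

```shell
# Assumed defaults: init script path, wait time in seconds, and the
# pattern used to spot NetWorker daemons. All can be overridden.
NW_INIT=${NW_INIT:-/etc/init.d/networker}
NW_WAIT=${NW_WAIT:-30}
NW_PATTERN=${NW_PATTERN:-^nsr}

# Succeeds if any process name matches NW_PATTERN.
nw_daemons_running() {
    pgrep "$NW_PATTERN" >/dev/null 2>&1
}

# Stop, wait, check for stragglers, then start only if the host is clean.
nw_restart() {
    "$NW_INIT" stop
    sleep "$NW_WAIT"
    if nw_daemons_running; then
        echo "nw_restart: NetWorker daemons still running, aborting" >&2
        return 1
    fi
    "$NW_INIT" start
}
```

Run as root, nw_restart either completes the stop/start cycle or aborts with a warning and leaves the remaining daemons for you to investigate.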

Over time, I got tired of inserting this hack back into the init scripts after every upgrade or reinstall, and in time even gave up keeping it around. Call it laziness or apathy on my part, but whatever you want to label it, the same applies to first Legato, then EMC engineering for not adding this absolutely basic and practically expected functionality after all these years.

Is there an RFE for this? I don’t know, but logically there should be no need for one. As systems have matured, the restart option has effectively become a default/expected setting on most platforms for init scripts. NetWorker has sadly lagged behind and still requires the administrator to manually run the stop and the start, one after the other.

A minor quibble, I know, but nevertheless a quibble.

 

It’s easy to change NetWorker directives. A few clicks here and there if you use NMC, then a couple of lines of text rattled off into the right fields, and suddenly you’ve made anywhere from small, precise changes to massive changes to a backup.

It’s for this reason that I think that modifying directives within the backup configuration should be considered important enough to warrant its own change control process. (I’ve previously talked about the backup administrator needing to be part of the change control authorisation process – this is another aspect however.)

Now, don’t get me wrong – despite what former employees may think, I’m not keen on excessive levels of red tape. In fact, I think a smart system should be designed at all times to minimise administrative overheads while ensuring that all accounting is still correctly done.

That being said, directives are, for want of a better term, dangerous. Mis-used, they can result in recovered systems being unusable – in data loss.

With this in mind, like other aspects of the backup system (adding clients, removing clients, adjusting savesets etc.), adjusting directives or applying directives to clients should also form part of change control.

Whenever directives are being changed, or applied, the following questions should be asked:

  • What is not working as desired?
  • What is the solution required?
  • What are the minimal steps required to make those changes?
  • How can system recoverability following the changes be tested?

It’s that final point that often goes missing with directives. Once, a long time ago (long enough to be NetWorker 5.5.3), a customer providing backup services to a host of companies set up a “zero error policy”, but due to budget and time constraints merely kept on adjusting directives to remove any file from the backup that couldn’t be opened/read during the backup process. The end result was unrecoverable systems.

By placing directive maintenance into the realm of change control, we don’t seek to add more red tape to the backup system, but more thought, and more consideration of the consequences of changes that may adversely affect data and systems recovery.

 

I use Parallels quite a lot within my Mac environment, and recently tried to get Solaris/AMD 64-bit installed. Even on a Mac Pro system Solaris stubbornly refuses to install in 64-bit mode, picking the 32-bit kernel every time.

So after exhausting a lot of search options, I submitted a case to Parallels support – titled:

“Solaris installer does not recognise 64-bit CPU”

Overnight, I got the first email back from Parallels support, with this response:

Escalating this ticket to our next level of Support since the issue is regarding Linux.

I half-typed an email response to correct the engineer, but then I thought better of it. If I need to explain that Solaris isn’t Linux to a support engineer, then on second thoughts, I’d prefer to have my case escalated to an engineer who (hopefully) already knows this.

[2009-07-15 Edit]

The second level support engineer I got was much more savvy in the differences between operating systems and was able to answer my question. Solaris 64-bit Parallels support is being actively worked on, so hopefully I’ll see release notes for an update to the current version “soon” (my words, not theirs) mentioning added support for Solaris 64-bit guests.

[2009-12-30 Edit]

Parallels Desktop v5 does seem much better at supporting 64-bit Solaris. There are a few tricks to getting networking going, but nothing terrible.

 

A rather smart gent whom I used to work with at another company, Mark Harvey, has in his own time been working on an open source VTL implementation that can be used for testing/lab purposes. I.e., he’s not aiming for it to be the next competitor to EDLs, FalconStor, etc., but rather, something that people can use when needing to test jukebox functionality without wanting to carry a jukebox around with them.

While Mark has primarily focused on getting his VTL software working with NetBackup, recently he’s made some progress in getting it to work (with a couple of limitations) with NetWorker.

The current limitation is that NetWorker doesn’t quite like the identity of the virtual drives – it sees them all as having the same serial number, and prohibits creating multiple drives with the same serial number. (The VTL presents differing serial numbers, but NetWorker may be working on the WWNN, which is the same on each device…)

Limit yourself to one drive though, and you’re fine.

Getting the VTL installed and configured

To get started, you first need to download the VTL code – Mark hosts it at linuxvtl.googlepages.com.

My testing was with the 2009-06-09 tar ball on a CentOS 5.3 virtual machine and NetWorker 7.5.1. I’m not going to repeat the installation instructions – I suggest you build the RPMs and install the (required) sg3_utils package, following the instructions included in Mark’s package.

Before you actually configure a jukebox in NetWorker, you need to strip down the number of devices in the VTL to 1, and the instructions below are geared towards that. Assuming you’ve not yet started the VTL software:

Create /etc/mhvtl/device.conf

Mark’s /etc/init.d/mhvtl startup script will create this file if it doesn’t exist, but we want to manually configure the file to define only one device. Below is the device.conf file I’ve used:

[root@tara mhvtl]# cat device.conf

VERSION: 2

# VPD page format:
# <page #> <Length> <x> <x+1>... <x+n>

# NOTE: The order of records is IMPORTANT...
# The 'Unit serial number:' should be last (except for VPD data)
# i.e.
# Order is : Vendor ID, Product ID, Product Rev and serial number finally
# Zero, one or more VPD entries.
#
# Each 'record' is sperated by one (or more) blank lines.
# Each 'record' starts at column 1

Library: 0 CHANNEL: 0 TARGET: 0 LUN: 0
 Vendor identification: STK
 Product identification: L700
 Product revision level: 5500
 Unit serial number: XYZZY

Drive: 1 CHANNEL: 0 TARGET: 1 LUN: 0
 Vendor identification: QUANTUM
 Product identification: SDLT600         
 Product revision level: 5500
 Unit serial number: ZF7584364         
 Max density: 0x46
 VPD: b0 04 00 02 01 00

(Note – yes, you can specify the serial number above, but no, if you create a second device with a different serial number it doesn’t yet work.)
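
Since NetWorker currently only copes with a single drive here, a quick sanity check of device.conf before starting the VTL doesn’t hurt. This is just a sketch: check_single_drive is a made-up helper name, and it assumes drive records start with “Drive:” in column 1, as in the file shown above.

```shell
# check_single_drive [CONF]
# Hypothetical helper: succeeds only if CONF holds exactly one record
# starting with "Drive:" in column 1.
check_single_drive() {
    csd_conf=${1:-/etc/mhvtl/device.conf}

    if [ ! -r "$csd_conf" ]; then
        echo "check_single_drive: cannot read $csd_conf" >&2
        return 2
    fi

    # grep -c prints the number of matching lines (0 if there are none).
    csd_count=$(grep -c '^Drive:' "$csd_conf")

    if [ "$csd_count" -ne 1 ]; then
        echo "check_single_drive: found $csd_count drives in $csd_conf, expected 1" >&2
        return 1
    fi
    return 0
}
```

With no argument it checks /etc/mhvtl/device.conf; pass another path to check a copy you’re still editing.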

Defining the library contents

After creating the device config, you need to configure the library contents – this is done by creating the file /etc/mhvtl/library_contents. Mine looks like the following:

[root@tara mhvtl]# cat library_contents
# Define how many tape drives you want in the vtl..
# The 'XYZZY_...' is the serial number assigned to
# this tape device.

Drive 1: ZF7584364

# Place holder for the robotic arm. Not really used.
Picker 1:

# Media Access Port
# (mailslots, Cartridge Access Port, <insert your favourate name here>)
# Again, define how many MAPs this vtl will contain.
MAP 1:
MAP 2:
MAP 3:
MAP 4:

# And the 'big' on, define your media and in which slot contains media.
# When the rc script is started, all media listed here will be created
# using the default media capacity.
Slot 1: 800843S3
Slot 2: 800844S3
Slot 3: 800845S3
Slot 4: 800846S3
Slot 5: 800847S3
Slot 6: 800848S3
Slot 7: 800849S3
Slot 8: 800850S3
Slot 9: 800851S3
Slot 10: 800852S3
Slot 11: 800853S3
Slot 12: 800854S3
Slot 13: 800855S3
Slot 14: 800856S3
Slot 15: 800857S3
Slot 16: 800858S3
Slot 17: 800859S3
Slot 18: 800860S3
Slot 19: 800861S3
Slot 20: 800862S3
Slot 21:
Slot 22:
Slot 23:
Slot 24:
Slot 25:
Slot 26:
Slot 27:
Slot 28:
Slot 29:
Slot 30:
Slot 31: CLN001L1
Slot 32: CLN002L1

In the above configuration, we’ve got a library with 32 presented slots, with slots 1-20 occupied by writable tapes, and slots 31-32 occupied with cleaning cartridges. Feel free to manipulate the numbers as you wish. (If you’re wondering about the choice of barcode labels, I’m terribly predictable. Every time I start a sequence of barcode labels in examples, I always start with 800843.)
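
If you want a different number of slots or a different barcode range, typing the slot list by hand gets tedious. The sketch below prints the “Slot N:” lines in the same format as the file above; gen_slots and its arguments are made up for illustration and aren’t part of Mark’s package.

```shell
# gen_slots FIRST FILLED TOTAL [SUFFIX]
# Hypothetical helper: prints TOTAL slot lines, the first FILLED of which
# carry sequential six-digit barcodes starting at FIRST.
# FIRST must not have leading zeros (the shell would read it as octal).
gen_slots() {
    gs_first=$1
    gs_filled=$2
    gs_total=$3
    gs_suffix=${4:-S3}

    gs_i=1
    while [ "$gs_i" -le "$gs_total" ]; do
        if [ "$gs_i" -le "$gs_filled" ]; then
            printf 'Slot %d: %06d%s\n' "$gs_i" "$((gs_first + gs_i - 1))" "$gs_suffix"
        else
            # Empty slot: label only, no barcode.
            printf 'Slot %d:\n' "$gs_i"
        fi
        gs_i=$((gs_i + 1))
    done
}
```

For example, gen_slots 800843 20 30 reproduces slots 1-30 of the listing above; the drive, picker, MAP and cleaning cartridge lines still need to be added by hand.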

Getting NetWorker and the VTL working together

Once the VTL has been configured, start it using the init script:

[root@tara mhvtl]# /etc/init.d/mhvtl start
vtllibrary process PID is 5315

So long as everything is working, you should see processes along the lines of:

[root@tara mhvtl]# ps -eaf | grep vtl
vtl       5310     1  0 08:56 ?        00:00:00 vtltape -q 1
vtl       5315     1  0 08:56 ?        00:00:00 vtllibrary -q 0

Looking in /opt/vtl, the default location for the VTL data, you should see the following files (suitably adjusted for any changes you make to barcodes/contents):

[root@tara mhvtl]# ls /opt/vtl
800843S3  800845S3  800847S3  800849S3  800851S3  800853S3
800855S3  800857S3  800859S3  800861S3  CLN001L1  800844S3
800846S3  800848S3  800850S3  800852S3  800854S3  800856S3
800858S3  800860S3  800862S3  CLN002L1

If we check the NetWorker inquire output, we get the following*:

[root@tara mhvtl]# inquire -l

-l flag found: searching all LUNs, which may take over 10 minutes per adapter
 for some fibre channel adapters.  Please be patient.

scsidev@0.0.0:STK     L700     5500|Autochanger (Jukebox), /dev/sg1
                                    S/N:    XYZZY     
                                    ATNN=STK     L700            XYZZY     
                                    WWNN=5123456003030303
scsidev@0.1.0:QUANTUM SDLT600  5500|Tape, /dev/nst0
                                    S/N:    ZF7584364
                                    ATNN=QUANTUM SDLT600         ZF7584364
                                    WWNN=5123456003030303

Assuming you get inquire output like the above, you next need to create your tape library. Below is the output of the jbconfig command:

[root@tara mhvtl]# jbconfig

Jbconfig is running on host tara.pmdg.lab (Linux 2.6.18-128.1.16.el5),
 and is using tara.pmdg.lab as the NetWorker server.

 1) Configure an AlphaStor Library.
 2) Configure an Autodetected SCSI Jukebox.
 3) Configure an Autodetected NDMP SCSI Jukebox.
 4) Configure an SJI Jukebox.
 5) Configure an STL Silo.

What kind of Jukebox are you configuring? [1] 2
14484:jbconfig: Scanning SCSI buses; this may take a while ...
Installing 'Standard SCSI Jukebox' jukebox - scsidev@0.0.0.

What name do you want to assign to this jukebox device? MHVTL
15814:jbconfig: Attempting to detect serial numbers on the jukebox and drives ...

15815:jbconfig: Will try to use SCSI information returned by jukebox to configure drives.

Turn NetWorker auto-cleaning on (yes / no) [yes]? yes

The following drive(s) can be auto-configured in this jukebox:
 1> sdlt600 @ 0.1.0 ==> /dev/nst0
These are all the drives that this jukebox has reported.

To change the drive model(s) or configure them as shared or NDMP drives,
 you need to bypass auto-configure. Bypass auto-configure? (yes / no) [no] no

Jukebox has been added successfully

The following configuration options have been set:

> Jukebox description to the control port and model.
> Autochanger control port to the port at which we found it.
> Networker managed tape autocleaning on.
> Barcode reading to on.
> Volume labels that match the barcodes.
> Slot intended to hold cleaning cartridge to 32.  Please insure that a
 cleaning cartridge is in that slot
> Number of times we will use a new cleaning cartridge to 5.
> Cleaning interval for the tape drives to 6 months.

You can review and change the characteristics of the autochanger and its
 associated devices using the NetWorker Management Console.

Would you like to configure another jukebox? (yes/no) [no]no

Using the VTL with NetWorker

Once you’ve got the jukebox created, start up with a simple command – plain old nsrjb:

[root@tara mhvtl]# nsrjb

Jukebox MHVTL: (Ready to accept commands)
14118:nsrjb: No volumes found in the media database...continuing.
slot  volume                      pool  barcode   volume id  recyclable
 1: -*                                  800843S3  -                    
 2: -*                                  800844S3  -                    
 3: -*                                  800845S3  -                    
 4: -*                                  800846S3  -                    
 5: -*                                  800847S3  -                    
 6: -*                                  800848S3  -                    
 7: -*                                  800849S3  -                    
 8: -*                                  800850S3  -                    
 9: -*                                  800851S3  -                    
10: -*                                  800852S3  -                    
11: -*                                  800853S3  -                    
12: -*                                  800854S3  -                    
13: -*                                  800855S3  -                    
14: -*                                  800856S3  -                    
15: -*                                  800857S3  -                    
16: -*                                  800858S3  -                    
17: -*                                  800859S3  -                    
18: -*                                  800860S3  -                    
19: -*                                  800861S3  -                    
20: -*                                  800862S3  -                    
21:                                                                        
22:                                                                        
23:                                                                        
24:                                                                        
25:                                                                        
26:                                                                        
27:                                                                        
28:                                                                        
29:                                                                        
30:                                                                        
31: -*                                  CLN001L1  -                    
32: Cleaning Tape (5 uses left)         CLN002L1  -                    
 *not registered in the NetWorker media data base

drive 1 (/dev/nst0) slot   :

(Note that I ran that about 30 seconds after the jukebox was created, so it had already transitioned into the “Ready to accept commands” state.)

The VTL isn’t built for speed, but it’s still zippy enough for lab testing. Here’s the output from a verbose label command, with timestamps added:

[root@tara mhvtl]# date ; nsrjb -Lvvv -b Default -S 1; date
Sat Jul 11 09:09:53 EST 2009
setting verbosity level to `3'
Info: Preparing to load volume `-' from slot 1 into device `/dev/nst0'.
Info: Loading volume `-' from slot `1' into device `/dev/nst0'.
Info: Load sleep for 5 seconds.
Info: Performing operation `Verify label' on device `/dev/nst0'.
Info: Operation `Verify label' in progress on device `/dev/nst0'
Info: Cannot read the current volume label `Tape label read for volume
 ? in pool ?, is not recognised by Networker: Input/output error'.
Info: nsrmmgd assumes the volume is unlabeled and will write a new label.
Info: Performing operation `Label without mount' on device `/dev/nst0'.
Info: Operation `Label without mount' in progress on device `/dev/nst0'
Info: Label: `800843S3', pool: `Default', capacity: `<NULL>'.
Info: Performing operation `Eject' on device `/dev/nst0'.
Info: Operation `Eject' in progress on device `/dev/nst0'
Info: Eject sleep for 5 seconds.
Info: Preparing to unload volume `800843S3' from device `/dev/nst0' to slot 1.
Info: Unloading volume `800843S3' from device `/dev/nst0' to slot 1.
Info: Unload sleep for 5 seconds.
Sat Jul 11 09:10:33 EST 2009
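
The two date stamps above are 40 seconds apart. If you’d rather not do that subtraction by eye, a tiny wrapper can do it for you. This is only a sketch: timed is a made-up name, and it relies on date +%s, which GNU and BSD date both support but which isn’t strictly POSIX.

```shell
# timed COMMAND [ARGS...]
# Hypothetical helper: runs the command, reports the elapsed wall-clock
# seconds on stderr, and returns the command's own exit status.
timed() {
    t_start=$(date +%s)
    "$@"
    t_rc=$?
    t_end=$(date +%s)
    echo "elapsed: $((t_end - t_start))s" >&2
    return "$t_rc"
}
```

Used as timed nsrjb -Lvvv -b Default -S 1, it prints the nsrjb output as normal, followed by a single elapsed line.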

Writing a backup, we get the obligatory screen-shot:

Screenshot showing backup to Mark's VTL


It’s still about recovery

Even though this is for lab usage only, we still need to make sure that what we write to virtual tape is what we get back. So after that backup, I ran a recovery, restoring the backup to another location. Performing checksums against the original and the recovered copy yielded:

[root@tara /]# md5sum /usr/share/doc/crash-4.0/README \
    /backup/recover_test/doc/crash-4.0/README
73568e4d9e09ce2847673dd5156cb571  /usr/share/doc/crash-4.0/README
73568e4d9e09ce2847673dd5156cb571  /backup/recover_test/doc/crash-4.0/README
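
A single file makes the point, but the same check can be run across a whole recovered tree. The following is a minimal sketch using md5sum as above; tree_sums and verify_recovery are made-up helper names, and filenames containing newlines aren’t handled.

```shell
# tree_sums DIR
# Prints "checksum  ./relative/path" for every file under DIR, sorted by
# path so that two trees can be compared line for line.
tree_sums() {
    (cd "$1" && find . -type f -exec md5sum {} + | sort -k 2)
}

# verify_recovery SOURCE RECOVERED
# Hypothetical helper: returns 0 if both trees hold the same files with
# the same checksums, 1 if they differ, 2 if a tree can't be read.
verify_recovery() {
    vr_src=$(tree_sums "$1") || return 2
    vr_dst=$(tree_sums "$2") || return 2

    if [ "$vr_src" = "$vr_dst" ]; then
        echo "verify_recovery: $1 and $2 match"
        return 0
    fi
    echo "verify_recovery: $1 and $2 differ" >&2
    return 1
}
```

For the recovery above, that would be something like verify_recovery /usr/share/doc /backup/recover_test/doc.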

Caveats

In conclusion, I’d like to offer a few caveats:

  1. In case I’ve not mentioned this enough, this is not a production VTL. Please don’t think I’m advocating it as a replacement to a full VTL.
  2. The VTL will let you back up as much as you want to any piece of media, so be careful with space management – it’s on your own head to manage media sizes, etc.
  3. The default placement of the VTL object files (i.e., media) is in /opt/vtl, which is likely to be in the root filesystem on an average Linux host. Thus, if you don’t keep an eye on media capacity, you’re going to overrun your root filesystem (or whatever filesystem the VTL data is stored in).
  4. You still need either a NetWorker autochanger license or to be running in eval mode to be able to use/configure this.
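
On the third caveat, a simple check on the filesystem holding the VTL media is easy to run by hand or from cron. This is only a sketch: the helper names and the 80% default threshold are arbitrary choices, and /opt/vtl is the default location mentioned above.

```shell
# vtl_usage_pct [DIR]
# Prints the "Capacity" percentage (digits only) that df -P reports for
# the filesystem holding DIR.
vtl_usage_pct() {
    df -P "${1:-/opt/vtl}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# vtl_check_space [DIR] [LIMIT]
# Hypothetical helper: returns 1 and warns once usage reaches LIMIT
# percent, 2 if usage could not be determined, 0 otherwise.
vtl_check_space() {
    vcs_dir=${1:-/opt/vtl}
    vcs_limit=${2:-80}
    vcs_used=$(vtl_usage_pct "$vcs_dir" 2>/dev/null)

    case $vcs_used in
        ''|*[!0-9]*)
            echo "vtl_check_space: cannot determine usage for $vcs_dir" >&2
            return 2
            ;;
    esac

    if [ "$vcs_used" -ge "$vcs_limit" ]; then
        echo "vtl_check_space: $vcs_dir filesystem is ${vcs_used}% full" >&2
        return 1
    fi
    return 0
}
```

Running vtl_check_space with no arguments checks /opt/vtl against the 80% default; a non-zero exit status is the cue to prune or resize the virtual media.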

Again, if I haven’t said this enough – this is for lab testing.

[2009-07-13 Edit]

Proving that sometimes I just don’t read the documentation sufficiently, with a little bit more digging I discovered that Mark has also implemented a mktape command that creates media with user-nominated sizes. By stopping NetWorker, deleting the VTL media, recreating the media with the nominated sizes, then restarting the VTL and NetWorker, you can control capacity using this VTL. Most importantly, that means you can simulate tape changes.

[2009-11-15]

See here for an update article covering multiple drive support, now that Mark has this working in a way which is compatible with NetWorker.


* Note – I’ve adjusted the spacing slightly in the inquire output to ensure it fits on the average browser. That’s the only manipulation that was done though.

© 2012 The NetWorker Blog