Basics – Fixing “NSR peer information” errors

If you’re using a modern NetWorker environment, chances are you’ll periodically notice entries such as the following in the daemon.log / daemon.raw files on the backup server:

39078 02/02/2009 09:45:13 PM  0 0 2 1152952640 5095 0 nox nsrexecd SYSTEM error: There is already a machine using the name: “faero”. Either choose a different name for your machine, or delete the “NSR peer information” entry for “faero” on host: “nox”

While this may look confronting, it’s actually a trivial error to fix, requiring just a minute or so of your time with nsradmin. First, note the client that the error is about, and the host that recorded the error. In this case, the error is about the client faero, while the error is being registered against the host nox.
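As a rough illustration of picking out those two names, the shell sketch below pulls the client and the reporting host out of matching log lines. It assumes the message wording is exactly as in the example above and that host names contain only letters, digits, dots, hyphens and underscores. The function name list_peer_conflicts is made up for this post and is not part of NetWorker.

```shell
#!/bin/sh
# list_peer_conflicts: hypothetical helper, not a NetWorker command.
# Reads log text on stdin and prints one "client host" pair per line
# for each distinct "NSR peer information" conflict it finds.
# Assumes the message wording shown in the example above.
list_peer_conflicts() {
    # The two captures are the client named in the error and the host that
    # logged it; any quote characters around the names are skipped.
    sed -n 's/.*already a machine using the name: [^[:alnum:]]*\([[:alnum:]._-]*\).*on host: [^[:alnum:]]*\([[:alnum:]._-]*\).*/\1 \2/p' | sort -u
}

# Usage (the log path is an assumption; adjust it to your install):
#   list_peer_conflicts < /nsr/logs/daemon.log
```

For the log entry above, this prints faero followed by nox on a single line: the first name is the peer entry to delete, the second is the host to point nsradmin at.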

To fix it, start nsradmin against the client service on nox:

# nsradmin -p nsrexec -s nox

(alternatively, you can use: nsradmin -p 390113 -s nox)

At the nsradmin> prompt, enter the command:

delete type: NSR peer information; name: faero

And answer yes when prompted to confirm. For example, the session might resemble the following:

nsradmin> delete type: NSR peer information; name: faero
                        type: NSR peer information;
               administrator: root, "user=root,host=nox";
                        name: faero;
               peer hostname: faero;
          Change certificate: ;
    certificate file to load: ;
Delete? y
deleted resource id 17.0.83.117.0.0.0.0.210.37.85.73.0.0.0.0.10.0.0.1(1)

There, you’ve done it. Note that you should be periodically scanning your daemon raw/log files for errors and trying to eliminate them. The goal should be that any error or warning reported in the file is something that you do need to worry about/investigate, rather than having a lot of “false positives” floating around in the system.
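Because this clean-up tends to recur, the nsradmin step lends itself to a small wrapper. The sketch below only builds the delete command for a given client; it does not run nsradmin itself. The function name peer_delete_command is hypothetical, and whether nsradmin still asks for confirmation when it reads commands from a file (the -i option) is something to check on your release before relying on it.

```shell
#!/bin/sh
# peer_delete_command: hypothetical helper, not shipped with NetWorker.
# Prints the nsradmin command that deletes the peer entry for one client.
peer_delete_command() {
    client=$1
    # Refuse empty names, or names with characters that could alter the query.
    case $client in
        ''|*[!A-Za-z0-9._-]*)
            echo "invalid client name: $client" >&2
            return 1
            ;;
    esac
    printf 'delete type: NSR peer information; name: %s\n' "$client"
}

# Example: review the generated command, then feed it to nsradmin on the
# host that logged the error (file name is arbitrary):
#   peer_delete_command faero > /tmp/peer.cmd
#   nsradmin -p nsrexec -s nox -i /tmp/peer.cmd
```

Keeping the command generation separate from the nsradmin call means you can look at exactly what will be deleted before anything touches the resource database.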

[Update, 2009-05-12]

I thought I’d mention that one of the most common times I see these warnings occur is after I’ve uninstalled/reinstalled NetWorker on a client, as opposed to having upgraded. Since on some clients it’s more or less necessary to uninstall/reinstall rather than upgrade, that helps explain why the information is lost periodically. My surmise is that on a new install, the NetWorker client processes generate a new ‘certificate’ or ‘identity’. This new information conflicts with the existing information the backup server holds for the client, and that conflict is what triggers the error.

It could be that other factors can cause this, but it seems that this is at least a primary cause.

29 Comments

  1. Hi Preston

    I got fed up with the following error in my NetWorker server:

    04/16/09 18:31:53 nsrexecd: GSS Legato authentication user session entry (warning): “User authentication session timed out and is now invalid.”. Session number = 0:6617e, domain = eir.com, user name = root, NetWorker Instance Name = server01

    04/16/09 18:31:53 nsrexecd: SYSTEM error: An error occured when a client attempted to acquire credentials: error: “A daemon requested the information for a user session, but the user session was not found in the list of valid sessions” session number: 0:60480, user id: (NONE).

    04/16/09 18:31:53 nsrexecd: GSS Legato authentication from server02 failed…

    04/16/09 18:31:53 nsrexecd: RPC error: Authentication error

    Are there any solutions or ideas, please?

    Thanks & regards
    VJ

    1. I can’t say I’m immediately familiar with that error, though it looks similarish to some nsrauth authentication errors I remember seeing around the early v7.3 releases of NetWorker. You might want to edit the authentication methods (covered in the administration manual) to switch from using “nsrauth” to “oldauth” for either the host (or temporarily, all machines), to see whether this makes a difference.

  2. Could this be put into a script where all that would need to be done is to add in the “offending” client?

    Example in this case: nsrdeletepeer faero
    Where “nsrdeletepeer” is the script and “faero” is (of course) the client.

    1. In theory yes, although it would need to be part of the client install (e.g., if looking for an addition to NetWorker) since it actually can happen on any host in the environment; it’s just most likely to be reported on the backup server.

  3. I have a NW server that was reinstalled (Windows 2008), and I updated all Windows-based clients successfully using the above procedure; however, the Linux clients fail with “user not on machine is not on administrator list”. How do I delete the NSR peer information on Linux clients from a Windows NW server?

    1. Well typically NSR peer information needs to be deleted from the client that is experiencing the problem. So you may want to log onto the Linux hosts and (as the root user) run the command:

      nsradmin -p 390113 -s client

      Where client is the machine you’re currently logged onto. Then, look for the NSR peer information for the server and delete it.

      Alternatively, you may also want to update any information stored in /nsr/res/servers on the client if the backup server hostname has changed (bearing in mind that doing so requires a restart of the NetWorker services on the client).

      1. Is it safe to assume I need the nsradmin utility installed on my linux client? It’s failing with nsradmin: command not found. Thanks for your assistance.

        1. The nsradmin utility is installed on every machine with the NetWorker client. I suspect it will be a search path issue for you.

          Make sure you run as root on the Linux client, and explicitly call /usr/sbin/nsradmin

          That should do the trick for you.

  4. When I re-install server-2-b-backupped I get the message ‘delete the nsr peer information’ in the daemon.raw file. When I then enter ‘delete type: NSR peer information; name:server-2-b-backupped’ in nsradmin on the backup server, I get the message “no resource to delete”!
    However, I keep failing to re-install the client.
    Please advise.

  5. Something so simple to fix, and yet it’s so hard to find the information on how to do it.

    I had a LOT of these errors and no recourse previously…

  6. Hi Preston.

    We’ve been fighting these nsr peer information errors. We’ve deleted the peer information with nsradmin and deleted the /nsr/tmp on the problem client right after. But it has not worked.

    EMC is suggesting that we configure our datazone to use oldauth instead of nsrauth. But I can’t get any accurate information on how that will impact our datazone.

    What opinion do you have about using oldauth instead of nsrauth in big datazones?

    Kind regards,
    Johannes

    1. Hi Johannes,

      Are they suggesting you cut over the entire datazone to oldauth, or just the problematic client? In this sort of scenario, if it’s just one client, I’d recommend limiting the cutover to that single client and seeing whether it improves things or not.

      I can understand why you’d be reluctant to cut the entire datazone back to oldauth; I’d only want to do this at the start of a day to give myself time to run a few test backups, etc., plus some restores, to make sure it isn’t going to have any impact. These days with the latest clients, I don’t think oldauth is really all that intended for use any more, so I’d probably need to know more of their reasoning on cutting over the entire datazone before I’d agree it’s safe.

      Cheers,
      Preston.

  7. Hi Preston,

    They are suggesting to change to oldauth on the NetWorker server. As I understand it, it will cut the entire datazone to oldauth.

    So you don’t consider this a trivial change to the datazone? Is it possible we would have problems with backup/restore after such a change? And is there any penalty on functionality?

    Your view on this is highly appreciated.

    Kind regards,
    Johannes

  8. I am getting this error in my daemon.raw file about a host that is NOT the NW server. Should I run the command on the server or the host that it is complaining about?