[comp.sys.apollo] problems with apollo's running mentor software

a89@nikhefh.nikhef.nl (Gerard Leurs) (10/05/90)

I have 2 questions for all you netlanders:

question 1:
We are running mentor software on 8 machines in our apollo domain.
Occasionally there are problems with the authorization to run the
different programs (Neted, Symed etc.) of this mentor software.

Once in a while we get (on an authorized node) a message something
like "?Authorization problem: you are not authorized to run this
software on this node".
There can be no problem with expiration of our codes, because the
problem only appears on 1 of the 8 nodes (and not on all 8). The
affected node also seems to be picked at random out of the 8.

What is going on here ?
How can it be prevented ?

Our "solution" is to shut down and reboot the node.
After doing this everything works fine ! This is however not a
very nice way to solve the authorization problem.

question 2:
Sometimes a so-called "boardstation" becomes very slow / unpleasant
to work with. When using "/com/dspst" one gets a negative
Null-process! All servers are OK on these nodes (at least they look like
they are).

What is going on ?
What can be done to solve this problem ?

For what it is worth: we are running SR10.2, with (BSD)-UNIX
as our environment. The mentor software is version 7.x.


Please email me directly for comments / suggestions :
G.Leurs@nikhef.nl (or a89@nikhapo.nikhef.nl).

Thanks.

kerr@tron.UUCP (Dave Kerr) (10/07/90)

In article <1011@nikhefh.nikhef.nl> a89@nikhefh.nikhef.nl (Gerard Leurs) writes:
>
>question 1:
>We are running mentor software on 8 machines in our apollo domain.
 ...
>Once in a while we get (on an authorized node) a message something
>like "?Authorization problem: you are not authorized to run this

First, as you no doubt already know, Mentor doesn't
officially support 10.2. However, I have seen this behavior
on 10.1/7.0 mentor systems. I noticed it after using the
date command to set the node's clock. I didn't pursue the
problem, and simply re-booted the node. The problem went
away, i.e. I was now able to run the application. Did you change
the time on the node before this happened? Does anybody know
why using date should cause this problem? 

-- 
Dave Kerr (301) 765-4453 (WIN)765-4453
tron::kerr                 Internal WEC vax mail
kerr@tron.bwi.wec.com      from an Internet site
kerr@tron.UUCP             from a smart uucp mailer

thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (10/08/90)

> >Once in a while we get (on an authorized node) a message something
> >like "?Authorization problem: you are not authorized to run this
> 
> First, as you already no doubt know Mentor doesn't
> officially support 10.2. 
Unofficially, it works pretty well, though.  We've encountered only 
two problems here --
1) Rounding is handled differently between 10.1 and 10.2.  This causes the
   LISTER portion of the idea_station test routines to 'fail'.  It turns
   out that in at least four instances (+-0.250 and +-0.750 were the only 
   ones we discovered between +100 and -100 by 0.250), 10.1 rounds AWAY
   from 0, and 10.2 rounds TOWARD 0.  This does not affect the computation,
   just the printout, of values.
2) BOARD station software doesn't display all information.  This is the only
   MENTOR product that has 'failed' at 10.2 (for us).
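
The two tie-breaking directions in item 1 can be illustrated with a small
Python sketch. It assumes the printout rounds to one decimal place (not
stated above), so that 0.250 and 0.750 are exact ties. The function names
are hypothetical, and Python's decimal rounding modes only stand in for
whatever the 10.1 and 10.2 runtime libraries actually do; the sketch does
not explain why only these four values were affected.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_DOWN

# Assumed print precision (one decimal place); not stated in the post.
ONE_DECIMAL = Decimal("0.1")


def round_tie_away_from_zero(text):
    """Round to one decimal place, ties going AWAY from 0 (10.1-like)."""
    return Decimal(text).quantize(ONE_DECIMAL, rounding=ROUND_HALF_UP)


def round_tie_toward_zero(text):
    """Round to one decimal place, ties going TOWARD 0 (10.2-like)."""
    return Decimal(text).quantize(ONE_DECIMAL, rounding=ROUND_HALF_DOWN)


if __name__ == "__main__":
    # The four values reported as differing between the two releases.
    for value in ("0.250", "-0.250", "0.750", "-0.750"):
        away = round_tie_away_from_zero(value)
        toward = round_tie_toward_zero(value)
        print(f"{value:>7}  away from 0: {away}  toward 0: {toward}")
```

Only the printed digits differ between the two modes; the underlying
value is untouched, which matches the observation that the computation
itself is not affected.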

> However I have seen this behavior
> on 10.1/7.0 mentor systems. I noticed it after using the
> date command to set the node's clock. I didn't persue the
> problem, and simply re-booted the node. The problem went
> away, ie I was now able to run the application. Did you change
> the time on the node before this happened? Does anybody know
> why using date should cause this problem? 
Mentor checks to see whether the date has been dynamically changed, either
by someone running 'timed' or by root changing the date. If it has, it
refuses to start Mentor software!  A flag needs to be kept somewhere
recording this fact, since otherwise files would be created with the
(dynamically set) new date, which _could_ cause duplicate UIDs to occur.
This flag is reset at system shutdown, which is why rebooting clears the
problem.  I've yelled at Mentor for this stupid practice, and they have
said that it won't be implemented at release 8.0 (I haven't checked the
new 7.0 version for it, but suspect that it's there).  A contact at
HP/Apollo was really frustrated, because apparently Mentor had been one of
the big guns pushing Apollo to _add_ the feature allowing dynamic date
setting!!
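
The behaviour described above (a flag set when the date is changed
dynamically, cleared at shutdown, and checked when an application starts)
can be modelled with a toy Python sketch. Every name in it is
hypothetical; where the real flag lives and how Mentor reads it is not
something I know.

```python
class Node:
    """Toy model of a node; all names here are hypothetical."""

    def __init__(self):
        # Set when the clock is changed while the node is up.
        self.date_changed_since_boot = False

    def set_date_on_the_fly(self):
        # Stands in for root running the date command or 'timed'
        # adjusting the clock without a reboot.
        self.date_changed_since_boot = True

    def reboot(self):
        # The flag does not survive a shutdown, so a reboot clears it.
        self.date_changed_since_boot = False


def may_start_application(node):
    """Refuse to start if the date was changed since the last boot."""
    return not node.date_changed_since_boot


if __name__ == "__main__":
    node = Node()
    print("fresh boot:        ", may_start_application(node))
    node.set_date_on_the_fly()
    print("after date change: ", may_start_application(node))
    node.reboot()
    print("after reboot:      ", may_start_application(node))
```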

John Thompson (jt)
Honeywell, SSEC
Plymouth, MN  55441
thompson@pan.ssec.honeywell.com

As ever, my opinions do not necessarily agree with Honeywell's or reality's.
(Honeywell's do not necessarily agree with mine or reality's, either)

chen@digital.sps.mot.com (Jinfu Chen) (10/08/90)

In article <649@tron.UUCP> kerr@tron.bwi.wec.com (Dave Kerr) writes:
>First, as you already no doubt know Mentor doesn't
>officially support 10.2. However I have seen this behavior
>on 10.1/7.0 mentor systems. I noticed it after using the
>date command to set the node's clock. I didn't persue the
>problem, and simply re-booted the node. The problem went
>away, ie I was now able to run the application. Did you change
>the time on the node before this happened? Does anybody know
>why using date should cause this problem? 

Mentor recommended NOT using the /bin/date command to change the system
time; use /sau?/calendar from the Mnemonic Debugger only. This was in some
issue of CSB (Customer Server Bulletin); the issue number escapes me. Call
Mentor's hotline for details.

krowitz@RICHTER.MIT.EDU (David Krowitz) (10/09/90)

Not being a Mentor user, I am simply speculating about the cause of
your problem ...

It occurs to me that if the time/date of the workstations running copies
of /etc/ncs/glbd (the NCS Global Location Broker Daemon) differ by more
than 5 minutes, the various copies of glbd will not update each other
correctly. When this happens, the nodes which use one of the copies which
have not been updated may be unable to see the network registry daemons.
This will result in things like /etc/passwd being unreadable, being unable
to change your password, being unable to login to another account, etc.
If Mentor is trying to read /etc/passwd to get a user ID, this would explain
the failure. If Mentor is using some NCS-based floating license server, the
failure of the multiple copies of glbd could also result in a Mentor client
workstation being unable to find the license server.
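
The 5-minute rule above amounts to a pairwise comparison of node clocks.
Here is a minimal Python sketch of that check, assuming you have already
collected each node's clock reading by some other means (from drm_admin
output, for instance). The node names and the skewed_pairs helper are
made up for illustration; this is not how glbd itself does it.

```python
from datetime import datetime, timedelta

# Threshold quoted in the post above.
MAX_SKEW = timedelta(minutes=5)


def skewed_pairs(clocks, max_skew=MAX_SKEW):
    """Return (node_a, node_b, skew) for each pair differing by more
    than max_skew. 'clocks' maps a node name to its clock reading."""
    names = sorted(clocks)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            skew = abs(clocks[a] - clocks[b])
            if skew > max_skew:
                pairs.append((a, b, skew))
    return pairs


if __name__ == "__main__":
    base = datetime(1990, 10, 9, 12, 0, 0)
    # Hypothetical readings for three nodes running glbd.
    clocks = {
        "//node_a": base,
        "//node_b": base + timedelta(minutes=2),
        "//node_c": base + timedelta(minutes=7),
    }
    for a, b, skew in skewed_pairs(clocks):
        print(f"{a} and {b} differ by {skew}")
```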

Try using /etc/ncs/drm_admin to check how many copies of /etc/ncs/glbd
each workstation thinks there are on the network and whether or not
the workstations' clocks are in sync. Fire up a copy of drm_admin and
give the command "set -o glb -h //some_node_with_glbd_running" to see
how many copies of glbd the workstation thinks should be running and
their respective time/date.


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (10/10/90)

Dave Krowitz writes regarding Mentor 'not authorized' errors--
> Not being a Mentor user, I am simply speculating about the cause of
> your problem ...
> 
> It occurs to me that if the time/date of the workstations running copies
> of /etc/ncs/glbd (the NCS Global Location Broker Daemon) differ by more
> than 5 minutes, ....

Unfortunately, it has nothing to do with anything logical or reasonable.  

Mentor has an authcode file that contains lists of authorized software for 
nodeIDs and packages, with the expiration date and encrypted key.  They 
check this every time you enter a Mentor application.

If the date has been modified on-the-fly (i.e. without a reboot), Mentor
decided to treat it as a "you must be trying to break our authorization"
error, and prevents you from running.  The fact that you can "get around"
this feature by rebooting didn't seem to impress upon them the stupidity
of the feature.

In fact, the original reason we first ran the 'timed' was to get our
nodes back _IN_ time-synch so the glbd wouldn't complain about time
skew warnings!  Mentor plain and simple blew it on this.  Everyone
that I've talked to at Mentor has agreed that it's stupid (I won't name
names :-) and that it shouldn't have been there.  They've also stated
(publicly at MUG-90) that this brain-dead feature would be removed at
release 8.0.  Now, when hell freezes over and release 8.0 finally hits
beta, we might find out whether that's true!

John Thompson (jt)
Honeywell, SSEC
Plymouth, MN  55441
thompson@pan.ssec.honeywell.com

As ever, my opinions do not necessarily agree with Honeywell's or reality's.
(Honeywell's do not necessarily agree with mine or reality's, either)

kerr@tron.bwi.WEC.COM (10/10/90)

> Not being a Mentor user, I am simply speculating about the cause of
> your problem ...

> It occurs to me that if the time/date of the workstations running copies
> of /etc/ncs/glbd (the NCS Global Location Broker Daemon) differ by more
> than 5 minutes, the various copies of glbd will not update each other
> correctly. When this happens, the nodes which use one of the copies which
> have not been updated may be unable to see the network registry daemons.

This was the problem I was trying to correct. I ran
drm_admin and saw that our clocks were skewed. Rather than
shutting the nodes down to run ex calendar, I tried the
date. Well, the clocks were no longer skewed, but Mentor
wouldn't work :-(. Rebooting (and maybe an ex calendar)
fixed the problem.

> This will result in things like /etc/passwd being unreadable, being unable
> to change your password, being unable to login to another account, etc.
> If Mentor is trying to read /etc/passwd to get a user ID, this would explain
> the failure. If Mentor is using some NCS-based floating license server, the
> failure of the multiple copies of glbd could also result in a Mentor client
> workstation being unable to find the license server.

At this release (7.1) Mentor doesn't use a floating license
server. It uses a static authorization file. The file
contains authorization codes that are a function of the
nodeid, the date of the software expiration, and the application
that you're authorized to run.
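
To make the idea of a code that is "a function of" nodeid, expiration
date and application concrete, here is a Python sketch. It is NOT
Mentor's algorithm, which I don't know; it uses a keyed hash (HMAC) as a
stand-in, and the secret, the field layout and the function names are all
invented for illustration.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical vendor secret; a real scheme would keep this private.
SECRET = b"example-vendor-secret"


def make_code(node_id, expires, application, secret=SECRET):
    """Derive a code from the node ID, expiration date and application."""
    message = f"{node_id}|{expires.isoformat()}|{application}".encode("ascii")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]


def is_authorized(entry, node_id, application, today, secret=SECRET):
    """Check one authorization-file entry against this node and date.

    'entry' is a dict with 'expires' (a date) and 'code' (a string).
    """
    expected = make_code(node_id, entry["expires"], application, secret)
    code_ok = hmac.compare_digest(expected, entry["code"])
    return code_ok and today <= entry["expires"]


if __name__ == "__main__":
    expires = date(1991, 1, 1)
    entry = {"expires": expires, "code": make_code("1a2b3", expires, "neted")}
    print(is_authorized(entry, "1a2b3", "neted", date(1990, 10, 10)))
    print(is_authorized(entry, "9z8y7", "neted", date(1990, 10, 10)))
```

Because the node ID is part of the input, a code copied to another node
fails the check, and editing the expiration date in the file invalidates
the code as well.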


Thanks for the info,
Dave
--
Dave Kerr (301) 765-4453 (WIN)765-4453
tron::kerr                 Internal WEC vax mail
kerr@tron.bwi.wec.com      from an Internet site
kerr@tron.UUCP             from a smart uucp mailer

> Try using /etc/ncs/drm_admin to check how many copies of /etc/ncs/glbd
> each workstation thinks there are on the network and whether or not
> the workstations' clocks are in sync. Fire up a copy of drm_admin and
> give the command "set -o glb -h //some_node_with_glbd_running" to see
> how many copies of glbd the workstation thinks should be running and
> their respective time/date.


>  -- David Krowitz

> krowitz@richter.mit.edu   (18.83.0.109)
> krowitz%richter.mit.edu@eddie.mit.edu
> krowitz%richter.mit.edu@mitvma.bitnet
> (in order of decreasing preference)